From Matthias.Knigge at rohde-schwarz.com Wed Nov 1 10:55:31 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 1 Nov 2017 11:55:31 +0100 Subject: [gpfsug-discuss] Combine different rules Message-ID: Hi at all, I configured a tiered storage with two pools. pool1 >> fast >> ssd pool2 >> slow >> sata First I created a fileset and a placement rule to copy the files to the fast storage. After a time of no access the files and folders should be moved to the slower storage. This could be done by a migration rule. I want to move the whole project folder to the slower storage. If a file in a project folder on the slower storage will be accessed this whole folder should be moved back to the faster storage. The rules must not run automatically. It is ok when this could be done by a cronjob over night. I am a beginner in writing rules. My idea is to write rules which listed files by date and by access and put the output into a file. After that a bash script can change the attributes of these files or rather folders. This could be done by the mmchattr command. If it is possible the mmapplypolicy command could be useful. Someone experiences in those cases? Many thanks in advance! Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 1 12:17:45 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 01 Nov 2017 12:17:45 +0000 Subject: [gpfsug-discuss] Combine different rules In-Reply-To: References: Message-ID: <1509538665.18554.1.camel@strath.ac.uk> On Wed, 2017-11-01 at 11:55 +0100, Matthias.Knigge at rohde-schwarz.com wrote: > Hi at all,? > > I configured a tiered storage with two pools.? > > pool1 ? ? ? ?>> ? ? ? ?fast ? ? ? ?>> ? ? ? ?ssd? > pool2 ? ? ? ?>> ? ? ? ?slow ? ? ? ?>> ? ? ? ?sata? > > First I created a fileset and a placement rule to copy the files to > the fast storage.? > > After a time of no access the files and folders should be moved to > the slower storage. This could be done by a migration rule. I want to > move the whole project folder to the slower storage.? Why move the whole project? Just wait if the files are not been accessed they will get moved in short order. You are really making it more complicated for no useful or practical gain. This is a basic policy to move old stuff from fast to slow disks. define(age,(DAYS(CURRENT_TIMESTAMP)-DAYS(ACCESS_TIME))) define(weighting, CASE ????????WHEN age>365 ????????????THEN age*KB_ALLOCATED ????????WHEN age<30 ????????????THEN 0 ????????ELSE ????????????KB_ALLOCATED ???????END ) RULE 'ilm' MIGRATE FROM POOL 'fast' THRESHOLD(90,70) WEIGHT(weighting) TO POOL 'slow' RULE 'new' SET POOL 'fast' LIMIT(95) RULE 'spillover' SET POOL 'slow' Basically it says when fast pool is 90% full, flush it down to 70% full, based on a weighting of the size and age. Basically older bigger files go first. The last two are critical. Allocate new files to the fast pool till it gets 95% full then start using the slow pool. Basically you have to stop allocating files to the fast pool long before it gets full otherwise you will end up with problems. Basically imagine there is 100KB left in the fast pool. I create a file which succeeds because there is space and start writing. When I get to 100KB the write fails because there is no space left in the pool, and a file can only be in one pool at a time. 
Generally programs will cleanup deleting the failed write at which point there will be space left and so the cycle goes on. You might want to force some file types onto slower disk. For example ISO images?don't really benefit from ever being on the fast disk. /* force ISO images onto nearline storage */ RULE 'iso' SET POOL 'slow' WHERE LOWER(NAME) LIKE '%.iso' You also might want to punish people storing inappropriate files on your server so /* force MP3's and the like onto nearline storage forever */ RULE 'mp3' SET POOL 'slow' ????WHERE LOWER(NAME) LIKE '%.mp3' OR LOWER(NAME) LIKE '%.m4a' OR LOWER(NAME) LIKE '%.wma' Another rule I used was to migrate files over to a certain size to the slow pool too. > > If a file in a project folder on the slower storage will be accessed > this whole folder should be moved back to the faster storage.? > Waste of time. In my experience the slow disks when not actually taking new files from a flush of the fast pools will be doing jack all. That is under 10 IOPS per second. That's because if you have everything sized correctly and the right rules people rarely go back to old files. As such the penalty for being on the slower disks is most none existent because there is loads of spare IO capacity on those disks. Secondly by the time you have spotted the files need moving the chances are your users have finished with them so moving them gains nothing. Thirdly if the users start working with those files any change to the file will result in a new file being written which will automatically go to the fast disks. It's the standard dance when you save a file; create new temporary file, write the contents, then do some renaming before deleting the old one. If you are insistent then something like the following would be a start, but moving a whole project would be a *lot* more complicated. I disabled the rule because it was a waste of time. I suggest running a similar rule that prints the files out so you can see how pointless it is. /* migrate recently accessed files back the fast disks */ RULE 'restore' MIGRATE FROM POOL 'slow' WEIGHT(KB_ALLOCATED) TO POOL 'fast' WHERE age < 1 Depending on the number of "projects" you anticipate you could allocate a project to a fileset and then move whole filesets about but I really think the idea is one of those that looks sensible at a high level but in practice is not sensible. > The rules must ?not run automatically. It is ok when this could be > done by a cronjob over night.? > I would argue strongly, very strongly that while you might want to flush the fast pool down every night to a certain amount free, you must have it set so that should it become full during the day an automatic flush is triggered. Failure to do so is guaranteed to bite you in the backside some time down the line. > I am a beginner in writing rules. My idea is to write rules which > listed files by date and by access and put the output into a file. > After that a bash script can change the attributes of these files or > rather folders.? Eh, you apply the policy and it does the work!!! More reading required on the subject I think. A bash script would be horribly slow. IBM have put a lot of work into making the policy engine really really fast. Messing about changing thousands if not millions of files with a bash script will be much much slower and is a recipe for disaster. Your users will put all sorts of random crap into file and directory names; backtick's, asterix's, question marks, newlines, UTF-8 characters etc. 
that will invariably break your bash script unless carefully escaped. There is no way for you to prevent this. It's the reason find/xargs have the -print0/-0 options, otherwise stuff will just mysteriously break on you. It's really better to just sidestep the whole issue and not process the files with scripts. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From david_johnson at brown.edu Wed Nov 1 12:21:05 2017 From: david_johnson at brown.edu (david_johnson at brown.edu) Date: Wed, 1 Nov 2017 08:21:05 -0400 Subject: [gpfsug-discuss] Combine different rules In-Reply-To: References: Message-ID: <3D17430A-B572-4E8E-8CA3-0C308D38AE7B@brown.edu> Filesets and storage pools are for the most part orthogonal concepts. You would sort your users and apply quotas with filesets. You would use storage pools underneath filesets and the filesystem to migrate between faster and slower media. Migration between storage pools is done well by the policy engine with mmapplypolicy. Moving between filesets is entirely up to you, but the path names will change. Migration within a filesystem using storage pools preserves path names. -- ddj Dave Johnson > On Nov 1, 2017, at 6:55 AM, Matthias.Knigge at rohde-schwarz.com wrote: > > Hi at all, > > I configured a tiered storage with two pools. > > pool1 >> fast >> ssd > pool2 >> slow >> sata > > First I created a fileset and a placement rule to copy the files to the fast storage. > > After a time of no access the files and folders should be moved to the slower storage. This could be done by a migration rule. I want to move the whole project folder to the slower storage. > > If a file in a project folder on the slower storage will be accessed this whole folder should be moved back to the faster storage. > > The rules must not run automatically. It is ok when this could be done by a cronjob over night. > > I am a beginner in writing rules. My idea is to write rules which listed files by date and by access and put the output into a file. After that a bash script can change the attributes of these files or rather folders. > > This could be done by the mmchattr command. If it is possible the mmapplypolicy command could be useful. > > Someone experiences in those cases? > > Many thanks in advance! > > Matthias > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From douglasof at us.ibm.com Wed Nov 1 12:36:18 2017 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Wed, 1 Nov 2017 07:36:18 -0500 Subject: [gpfsug-discuss] SC17 Spectrum Scale U/G Message-ID: Reminder: Please sign up so we have numbers for planning the happy hour. http://www.spectrumscale.org/ssug-at-sc17/ Douglas O'Flaherty IBM Spectrum Solutions -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Wed Nov 1 14:01:35 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 1 Nov 2017 15:01:35 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules In-Reply-To: <1509538665.18554.1.camel@strath.ac.uk> References: <1509538665.18554.1.camel@strath.ac.uk> Message-ID: Hi JAB, many thanks for your answer. 
Ok, some more background information: We are working with video realtime applications and uncompressed files. So one project is one folder and some subfolders. The size of one project could be more than 1TB. That is the reason why I want to move the whole folder tree. Moving old stuff to the slower storage is not the problem but moving the files back for working with the realtime applications. Not every file will be accessed when you open a project. The Clients get access via GPFS-Client (Windows) and over Samba. Another tool on storage side scan the files for creating playlists etc. While the migration the playout of the video files may not dropped. So I think the best way is to find a solution with mmapplypolicy manually or via crontab. Im must check the access time and the types of files. If I do not do this never a file will be moved the slower storage because the special tool always have access to the files. I will try some concepts and give feedback which solution is working for me. Matthias Von: Jonathan Buzzard An: gpfsug main discussion list Datum: 01.11.2017 13:18 Betreff: [Newsletter] Re: [gpfsug-discuss] Combine different rules Gesendet von: gpfsug-discuss-bounces at spectrumscale.org On Wed, 2017-11-01 at 11:55 +0100, Matthias.Knigge at rohde-schwarz.com wrote: > Hi at all, > > I configured a tiered storage with two pools. > > pool1 >> fast >> ssd > pool2 >> slow >> sata > > First I created a fileset and a placement rule to copy the files to > the fast storage. > > After a time of no access the files and folders should be moved to > the slower storage. This could be done by a migration rule. I want to > move the whole project folder to the slower storage. Why move the whole project? Just wait if the files are not been accessed they will get moved in short order. You are really making it more complicated for no useful or practical gain. This is a basic policy to move old stuff from fast to slow disks. define(age,(DAYS(CURRENT_TIMESTAMP)-DAYS(ACCESS_TIME))) define(weighting, CASE WHEN age>365 THEN age*KB_ALLOCATED WHEN age<30 THEN 0 ELSE KB_ALLOCATED END ) RULE 'ilm' MIGRATE FROM POOL 'fast' THRESHOLD(90,70) WEIGHT(weighting) TO POOL 'slow' RULE 'new' SET POOL 'fast' LIMIT(95) RULE 'spillover' SET POOL 'slow' Basically it says when fast pool is 90% full, flush it down to 70% full, based on a weighting of the size and age. Basically older bigger files go first. The last two are critical. Allocate new files to the fast pool till it gets 95% full then start using the slow pool. Basically you have to stop allocating files to the fast pool long before it gets full otherwise you will end up with problems. Basically imagine there is 100KB left in the fast pool. I create a file which succeeds because there is space and start writing. When I get to 100KB the write fails because there is no space left in the pool, and a file can only be in one pool at a time. Generally programs will cleanup deleting the failed write at which point there will be space left and so the cycle goes on. You might want to force some file types onto slower disk. For example ISO images don't really benefit from ever being on the fast disk. 
/* force ISO images onto nearline storage */ RULE 'iso' SET POOL 'slow' WHERE LOWER(NAME) LIKE '%.iso' You also might want to punish people storing inappropriate files on your server so /* force MP3's and the like onto nearline storage forever */ RULE 'mp3' SET POOL 'slow' WHERE LOWER(NAME) LIKE '%.mp3' OR LOWER(NAME) LIKE '%.m4a' OR LOWER(NAME) LIKE '%.wma' Another rule I used was to migrate files over to a certain size to the slow pool too. > > If a file in a project folder on the slower storage will be accessed > this whole folder should be moved back to the faster storage. > Waste of time. In my experience the slow disks when not actually taking new files from a flush of the fast pools will be doing jack all. That is under 10 IOPS per second. That's because if you have everything sized correctly and the right rules people rarely go back to old files. As such the penalty for being on the slower disks is most none existent because there is loads of spare IO capacity on those disks. Secondly by the time you have spotted the files need moving the chances are your users have finished with them so moving them gains nothing. Thirdly if the users start working with those files any change to the file will result in a new file being written which will automatically go to the fast disks. It's the standard dance when you save a file; create new temporary file, write the contents, then do some renaming before deleting the old one. If you are insistent then something like the following would be a start, but moving a whole project would be a *lot* more complicated. I disabled the rule because it was a waste of time. I suggest running a similar rule that prints the files out so you can see how pointless it is. /* migrate recently accessed files back the fast disks */ RULE 'restore' MIGRATE FROM POOL 'slow' WEIGHT(KB_ALLOCATED) TO POOL 'fast' WHERE age < 1 Depending on the number of "projects" you anticipate you could allocate a project to a fileset and then move whole filesets about but I really think the idea is one of those that looks sensible at a high level but in practice is not sensible. > The rules must not run automatically. It is ok when this could be > done by a cronjob over night. > I would argue strongly, very strongly that while you might want to flush the fast pool down every night to a certain amount free, you must have it set so that should it become full during the day an automatic flush is triggered. Failure to do so is guaranteed to bite you in the backside some time down the line. > I am a beginner in writing rules. My idea is to write rules which > listed files by date and by access and put the output into a file. > After that a bash script can change the attributes of these files or > rather folders. Eh, you apply the policy and it does the work!!! More reading required on the subject I think. A bash script would be horribly slow. IBM have put a lot of work into making the policy engine really really fast. Messing about changing thousands if not millions of files with a bash script will be much much slower and is a recipe for disaster. Your users will put all sorts of random crap into file and directory names; backtick's, asterix's, question marks, newlines, UTF-8 characters etc. that will invariably break your bash script unless carefully escaped. There is no way for you to prevent this. It's the reason find/xargs have the -print0/-0 options, otherwise stuff will just mysteriously break on you. 
It's really better to just sidestep the whole issue and not process the files with scripts. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 1 14:12:43 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 01 Nov 2017 14:12:43 +0000 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules In-Reply-To: References: <1509538665.18554.1.camel@strath.ac.uk> Message-ID: <1509545563.18554.3.camel@strath.ac.uk> On Wed, 2017-11-01 at 15:01 +0100, Matthias.Knigge at rohde-schwarz.com wrote: > Hi JAB,? > > many thanks for your answer.? > > Ok, some more background information:? > > We are working with video realtime applications and uncompressed > files. So one project is one folder and some subfolders. The size of > one project could be more than 1TB. That is the reason why I want to > move the whole folder tree.? > That is not a reason to move the whole folder tree. If the "project" is inactive then the files in it are inactive and the normal "this file has not been accessed" type rules will in due course move the whole lot over to the slower storage. > Moving old stuff to the slower storage is not the problem but moving > the files back for working with the realtime applications. Not every > file will be accessed when you open a project.? > Yeah but you don't want these sorts of policies kicking in automatically. Further if someone where just to check or update a summary document stored with the videos, the whole lot would get moved back to fast disk. By the sounds of it you are going to have to run manual mmapplypolicies to move the groups of files around. Automating what you want is going to be next to impossible. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From makaplan at us.ibm.com Wed Nov 1 14:43:27 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 1 Nov 2017 09:43:27 -0500 Subject: [gpfsug-discuss] Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME In-Reply-To: References: Message-ID: Thanks Jonathan B for your comments and tips on experience using mmapplypolicy and policy rules. Good to see that some of the features we put into the product are actually useful. For those not quite as familiar, and have come somewhat later to the game, like Matthias K - I have a few remarks and tips that may be helpful: You can think of and use mmapplypolicy as a fast, parallelized version of the classic `find ... | xargs ... ` pipeline. In fact we've added some "sample" scripts with options that make this easy: samples/ilm/mmfind : "understands" the classic find search arguments as well as all the mmapplypolicy options and the recent versions also support an -xargs option so you can write the classic pipepline as one command: mmfind ... -xargs ... There are debug/diagnostic options so you can see the underlying GPFS commands and policy rules that are generated, so if mmfind doesn't do exactly what you were hoping, you can capture the commands and rules that it does do and tweak/hack those. 
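A rough sketch of the two forms, with a made-up path and age test (the samples under /usr/lpp/mmfs/samples/ilm usually need to be built with make before first use):

# classic find: a single-threaded directory walk, using -print0/-0 to survive odd file names
find /gpfs/fs1/projects -type f -atime +365 -print0 | xargs -0 -r ls -ld

# roughly the same search, but driven by the parallel policy engine
cd /usr/lpp/mmfs/samples/ilm
./mmfind /gpfs/fs1/projects -type f -atime +365 -xargs ls -ld
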
Two of the most crucial and tricky parts of mmfind are available as separate scripts that can be used separately: tr_findToPol.pl : convert classic options to policy rules. mmxargs : 100% correctly deal with the problem of whitespace and/or "special" characters in the pathnames output as file lists by mmapplypolicy. This is somewhat tricky. EVEN IF you've already worked out your own policy rules and use policy RULE ... EXTERNAL ... EXEC 'myscript' you may want to use mmxargs or "lift" some of the code there-in -- because it is very likely your 'myscript' is not handling the problem of special characters correctly. FILESETs vs POOLs - yes these are "orthogonal" concepts in GPFS (Spectrum Scale!) BUT some customer/admins may choose to direct GPFS to assign to POOL based on FILESET using policy rules clauses like: FOR FILESET('a_fs', 'b_fs') /* handy to restrict a rule to one or a few filesets */ WHERE ... AND (FILESET_NAME LIKE 'xyz_%') AND ... /* restrict to filesets whose name matches a pattern */ -- marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Nov 1 14:59:22 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 1 Nov 2017 09:59:22 -0500 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules - STAGING a fileset to a particular POOL In-Reply-To: References: <1509538665.18554.1.camel@strath.ac.uk> Message-ID: Not withstanding JAB's remark that this may not necessary: Some customers/admins will want to "stage" a fileset in anticipation of using the data therein. Conversely you can "destage" - just set the TO POOL accordingly. This can be accomplished with a policy rule like: RULE 'stage' MIGRATE FOR FILESET('myfileset') TO POOL 'mypool' /* no FROM POOL clause is required, files will come from any pool - for files already in mypool, no work is done */ And running a command like: mmapplypolicy /path-to/myfileset -P file-with-the-above-policy-rule -g /path-to/shared-temp -N nodelist-to-do-the-work ... (Specifying the path-to/myfileset on the command line will restrict the directory scan, making it go faster.) As JAB remarked, for GPFS POOL to GPFS POOL this may be overkill, but if the files have been "HSMed" migrated or archived to some really slow storage like TAPE ... they an analyst who want to explore the data interactively, might request a migration back to "real" disks (or SSDs) then go to lunch or go to bed ... --marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Wed Nov 1 22:54:04 2017 From: griznog at gmail.com (John Hanks) Date: Wed, 1 Nov 2017 15:54:04 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" Message-ID: Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. 
The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? Thanks, jbh -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Thu Nov 2 07:11:58 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Thu, 02 Nov 2017 03:11:58 -0400 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: <44655.1509606718@turing-police.cc.vt.edu> On Wed, 01 Nov 2017 15:54:04 -0700, John Hanks said: > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device Check 'df -i' to make sure no file systems are out of inodes. That's From YARD at il.ibm.com Thu Nov 2 07:28:06 2017 From: YARD at il.ibm.com (Yaron Daniel) Date: Thu, 2 Nov 2017 09:28:06 +0200 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: Hi Please check mmdf output to see that MetaData disks are not full, or you have i-nodes issue. In case you have Independent File-Sets , please run : mmlsfileset -L -i to get the status of each fileset inodes. Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services - Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: John Hanks To: gpfsug Date: 11/02/2017 12:54 AM Subject: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? 
Thanks, jbh_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=Bn1XE9uK2a9CZQ8qKnJE3Q&m=WTfQpWOsmp-BdHZ0PWDbaYsxq-5Q1ZH26IyfrBRe3_c&s=SJg8NrUXWEpaxDhqECkwkbJ71jtxjLZz5jX7FxmYMBk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4376 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: From Matthias.Knigge at rohde-schwarz.com Thu Nov 2 09:07:48 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Thu, 2 Nov 2017 10:07:48 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME In-Reply-To: References: Message-ID: Thanks for this tip. I will try these commands and give feedback in the next week. Matthias Von: "Marc A Kaplan" An: gpfsug main discussion list Datum: 01.11.2017 15:43 Betreff: [Newsletter] Re: [gpfsug-discuss] Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Thanks Jonathan B for your comments and tips on experience using mmapplypolicy and policy rules. Good to see that some of the features we put into the product are actually useful. For those not quite as familiar, and have come somewhat later to the game, like Matthias K - I have a few remarks and tips that may be helpful: You can think of and use mmapplypolicy as a fast, parallelized version of the classic `find ... | xargs ... ` pipeline. In fact we've added some "sample" scripts with options that make this easy: samples/ilm/mmfind : "understands" the classic find search arguments as well as all the mmapplypolicy options and the recent versions also support an -xargs option so you can write the classic pipepline as one command: mmfind ... -xargs ... There are debug/diagnostic options so you can see the underlying GPFS commands and policy rules that are generated, so if mmfind doesn't do exactly what you were hoping, you can capture the commands and rules that it does do and tweak/hack those. Two of the most crucial and tricky parts of mmfind are available as separate scripts that can be used separately: tr_findToPol.pl : convert classic options to policy rules. mmxargs : 100% correctly deal with the problem of whitespace and/or "special" characters in the pathnames output as file lists by mmapplypolicy. This is somewhat tricky. EVEN IF you've already worked out your own policy rules and use policy RULE ... EXTERNAL ... EXEC 'myscript' you may want to use mmxargs or "lift" some of the code there-in -- because it is very likely your 'myscript' is not handling the problem of special characters correctly. FILESETs vs POOLs - yes these are "orthogonal" concepts in GPFS (Spectrum Scale!) BUT some customer/admins may choose to direct GPFS to assign to POOL based on FILESET using policy rules clauses like: FOR FILESET('a_fs', 'b_fs') /* handy to restrict a rule to one or a few filesets */ WHERE ... 
AND (FILESET_NAME LIKE 'xyz_%') AND ... /* restrict to filesets whose name matches a pattern */ -- marc of GPFS_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Nov 2 11:19:05 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 2 Nov 2017 11:19:05 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" Message-ID: One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hanks Reply-To: gpfsug main discussion list Date: Wednesday, November 1, 2017 at 5:55 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? Thanks, jbh -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 14:43:31 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 07:43:31 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: Thanks all for the suggestions. Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. We did have old .quota files laying around but removing them didn't have any impact. mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). 
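For reference, the checks being described here boil down to roughly the following (gsfs0 is the file system name; output formats vary a little by release):

mmdf gsfs0               # free blocks and inodes per NSD, grouped by storage pool
mmlsfileset gsfs0 -L -i  # per-fileset inode allocation and usage; the -i scan is slow
mmrepquota -j gsfs0      # per-fileset block and inode quota usage
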
Once I'm settled in this morning I'll try giving them a little extra space and see what happens. jbh On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < Robert.Oesterlin at nuance.com> wrote: > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: * on behalf of John > Hanks > *Reply-To: *gpfsug main discussion list > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_johnson at brown.edu Thu Nov 2 14:57:45 2017 From: david_johnson at brown.edu (David Johnson) Date: Thu, 2 Nov 2017 10:57:45 -0400 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> One thing that may be relevant is if you have snapshots, depending on your release level, inodes in the snapshot may considered immutable, and will not be migrated. Once the snapshots have been deleted, the inodes are freed up and you won?t see the (somewhat misleading) message about no space. ? ddj Dave Johnson Brown University > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have any impact. 
> > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). Once I'm settled in this morning I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert > wrote: > One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > From: > on behalf of John Hanks > > Reply-To: gpfsug main discussion list > > Date: Wednesday, November 1, 2017 at 5:55 PM > To: gpfsug > > Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" > > > > Hi all, <> > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 15:33:11 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 08:33:11 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: We have no snapshots ( they were the first to go when we initially hit the full metadata NSDs). I've increased quotas so that no filesets have hit a space quota. Verified that there are no inode quotas anywhere. mmdf shows the least amount of free space on any nsd to be 9% free. Still getting this error: [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 Scanning file system metadata, phase 1 ... Scan completed successfully. Scanning file system metadata, phase 2 ... Scanning file system metadata for sas0 storage pool Scanning file system metadata for sata0 storage pool Scan completed successfully. Scanning file system metadata, phase 3 ... Scan completed successfully. 
Scanning file system metadata, phase 4 ... Scan completed successfully. Scanning user file metadata ... Error processing user file metadata. No space left on device Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. I should note too that this fails almost immediately, far to quickly to fill up any location it could be trying to write to. jbh On Thu, Nov 2, 2017 at 7:57 AM, David Johnson wrote: > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > >> One thing that I?ve run into before is that on older file systems you had >> the ?*.quota? files in the file system root. If you upgraded the file >> system to a newer version (so these files aren?t used) - There was a bug at >> one time where these didn?t get properly migrated during a restripe. >> Solution was to just remove them >> >> >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> *From: * on behalf of John >> Hanks >> *Reply-To: *gpfsug main discussion list > > >> *Date: *Wednesday, November 1, 2017 at 5:55 PM >> *To: *gpfsug >> *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on >> device" >> >> >> >> Hi all, >> >> >> >> I'm trying to do a restripe after setting some nsds to metadataOnly and I >> keep running into this error: >> >> >> >> Scanning user file metadata ... >> >> 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with >> total 531689 MB data processed) >> >> Error processing user file metadata. >> >> Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on >> scg-gs0 for inodes with broken disk addresses or failures. >> >> mmrestripefs: Command failed. Examine previous error messages to >> determine cause. 
>> >> >> >> The file it points to says: >> >> >> >> This inode list was generated in the Parallel Inode Traverse on Wed Nov >> 1 15:36:06 2017 >> >> INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID >> MEMO(INODE_FLAGS FILE_TYPE [ERROR]) >> >> 53504 0:0 0 1 0 >> illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device >> >> >> >> >> >> /var on the node I am running this on has > 128 GB free, all the NSDs >> have plenty of free space, the filesystem being restriped has plenty of >> free space and if I watch the node while running this no filesystem on it >> even starts to get full. Could someone tell me where mmrestripefs is >> attempting to write and/or how to point it at a different location? >> >> >> >> Thanks, >> >> >> >> jbh >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfadden at us.ibm.com Thu Nov 2 15:44:08 2017 From: sfadden at us.ibm.com (Scott Fadden) Date: Thu, 2 Nov 2017 15:44:08 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: , <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: An HTML attachment was scrubbed... URL: From sfadden at us.ibm.com Thu Nov 2 15:55:12 2017 From: sfadden at us.ibm.com (Scott Fadden) Date: Thu, 2 Nov 2017 15:55:12 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: , , <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 16:13:16 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 09:13:16 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Hmm, this sounds suspicious. We have 10 NSDs in a pool called system. These were previously set to data+metaData with a policy that placed our home directory filesets on this pool. A few weeks ago the NSDs in this pool all filled up. To remedy that I 1. removed old snapshots 2. deleted some old homedir filesets 3. set the NSDs in this pool to metadataOnly 4. changed the policy to point homedir filesets to another pool. 5. ran a migrate policy to migrate all homedir filesets to this other pool After all that I now have ~30% free space on the metadata pool. Our three pools are system (metadataOnly), sas0 (data), sata0 (data) mmrestripefs gsfs0 -r fails immdieately mmrestripefs gsfs0 -r -P system fails immediately mmrestripefs gsfs0 -r -P sas0 fails immediately mmrestripefs gsfs0 -r -P sata0 is running (currently about 3% done) Is the change from data+metadata to metadataOnly the same as removing a disk (for the purposes of this problem) or is it possible my policy is confusing things? [root at scg-gs0 ~]# mmlspolicy gsfs0 Policy for file system '/dev/gsfs0': Installed by root at scg-gs0 on Wed Nov 1 09:30:40 2017. 
First line of policy 'policy_placement.txt' is: RULE 'homedirs' SET POOL 'sas0' WHERE FILESET_NAME LIKE 'home.%' The policy I used to migrate these filesets is: RULE 'homedirs' MIGRATE TO POOL 'sas0' WHERE FILESET_NAME LIKE 'home.%' jbh On Thu, Nov 2, 2017 at 8:44 AM, Scott Fadden wrote: > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > ----- Original message ----- > From: John Hanks > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... > Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... > Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson > wrote: > > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. 
> > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: * on behalf of John > Hanks > *Reply-To: *gpfsug main discussion list > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r= > WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m= > hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s= > j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From griznog at gmail.com Thu Nov 2 16:19:55 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 09:19:55 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Addendum to last message: We haven't upgraded recently as far as I know (I just inherited this a couple of months ago.) but am planning an outage soon to upgrade from 4.2.0-4 to 4.2.3-5. My growing collection of output files generally contain something like This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 08:34:22 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53506 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device With that inode varying slightly. jbh On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden wrote: > Sorry just reread as I hit send and saw this was mmrestripe, in my case it > was mmdeledisk. > > Did you try running the command on just one pool. Or using -B instead? > > What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" > ? > > Looks like it could be related to the maxfeaturelevel of the cluster. Have > you recently upgraded? Is everything up to the same level? > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > ----- Original message ----- > From: Scott Fadden/Portland/IBM > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:44 AM > > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > ----- Original message ----- > From: John Hanks > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... > Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... > Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. 
Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson > wrote: > > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: * on behalf of John > Hanks > *Reply-To: *gpfsug main discussion list > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. 
Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r= > WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m= > hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s= > j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfadden at us.ibm.com Thu Nov 2 16:41:36 2017 From: sfadden at us.ibm.com (Scott Fadden) Date: Thu, 2 Nov 2017 16:41:36 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: , <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Thu Nov 2 16:45:30 2017 From: stockf at us.ibm.com (Frederick Stock) Date: Thu, 2 Nov 2017 11:45:30 -0500 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Assuming you are replicating data and metadata have you confirmed that all failure groups have the same free space? That is could it be that one of your failure groups has less space than the others? You can verify this with the output of mmdf and look at the NSD sizes and space available. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: John Hanks To: gpfsug main discussion list Date: 11/02/2017 12:20 PM Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org Addendum to last message: We haven't upgraded recently as far as I know (I just inherited this a couple of months ago.) but am planning an outage soon to upgrade from 4.2.0-4 to 4.2.3-5. My growing collection of output files generally contain something like This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 08:34:22 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53506 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device With that inode varying slightly. jbh On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden wrote: Sorry just reread as I hit send and saw this was mmrestripe, in my case it was mmdeledisk. Did you try running the command on just one pool. Or using -B instead? What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" ? Looks like it could be related to the maxfeaturelevel of the cluster. Have you recently upgraded? Is everything up to the same level? 
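A minimal sketch of answering the level question from the command line, assuming the gsfs0 device name used elsewhere in the thread (output formats vary a little by release):

# daemon build and packages on each node
mmdiag --version
rpm -qa | grep -i gpfs

# cluster-wide minimum release level and the file system format version
mmlsconfig minReleaseLevel
mmlsfs gsfs0 -V

If the file system format version trails the daemon level, mmchfs gsfs0 -V full is the usual way to bring it forward, but that is a separate decision from chasing the ENOSPC error.
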
Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Scott Fadden/Portland/IBM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:44 AM I opened a defect on this the other day, in my case it was an incorrect error message. What it meant to say was,"The pool is not empty." Are you trying to remove the last disk in a pool? If so did you empty the pool with a MIGRATE policy first? Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: John Hanks Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:34 AM We have no snapshots ( they were the first to go when we initially hit the full metadata NSDs). I've increased quotas so that no filesets have hit a space quota. Verified that there are no inode quotas anywhere. mmdf shows the least amount of free space on any nsd to be 9% free. Still getting this error: [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 Scanning file system metadata, phase 1 ... Scan completed successfully. Scanning file system metadata, phase 2 ... Scanning file system metadata for sas0 storage pool Scanning file system metadata for sata0 storage pool Scan completed successfully. Scanning file system metadata, phase 3 ... Scan completed successfully. Scanning file system metadata, phase 4 ... Scan completed successfully. Scanning user file metadata ... Error processing user file metadata. No space left on device Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. I should note too that this fails almost immediately, far to quickly to fill up any location it could be trying to write to. jbh On Thu, Nov 2, 2017 at 7:57 AM, David Johnson wrote: One thing that may be relevant is if you have snapshots, depending on your release level, inodes in the snapshot may considered immutable, and will not be migrated. Once the snapshots have been deleted, the inodes are freed up and you won?t see the (somewhat misleading) message about no space. ? ddj Dave Johnson Brown University On Nov 2, 2017, at 10:43 AM, John Hanks wrote: Thanks all for the suggestions. Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. We did have old .quota files laying around but removing them didn't have any impact. mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). 
Once I'm settled in this morning I'll try giving them a little extra space and see what happens. jbh On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < Robert.Oesterlin at nuance.com> wrote: One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hanks < griznog at gmail.com> Reply-To: gpfsug main discussion list Date: Wednesday, November 1, 2017 at 5:55 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? Thanks, jbh _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s=j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=uLFESUsuxpmf07haYD3Sl-DpeYkm3t_r0WVV2AZ9Jk0&s=RGgSZEisfDpxsKl3PFUWh6DtzD_FF6spqHVpo_0joLY&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From griznog at gmail.com Thu Nov 2 17:16:36 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 10:16:36 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: We do have different amounts of space in the system pool which had the changes applied: [root at scg4-hn01 ~]# mmdf gsfs0 -P system disk disk size failure holds holds free KB free KB name in KB group metadata data in full blocks in fragments --------------- ------------- -------- -------- ----- -------------------- ------------------- Disks in storage pool: system (Maximum disk size allowed is 3.6 TB) VD000 377487360 100 Yes No 143109120 ( 38%) 35708688 ( 9%) DMD_NSD_804 377487360 100 Yes No 79526144 ( 21%) 2924584 ( 1%) VD002 377487360 100 Yes No 143067136 ( 38%) 35713888 ( 9%) DMD_NSD_802 377487360 100 Yes No 79570432 ( 21%) 2926672 ( 1%) VD004 377487360 100 Yes No 143107584 ( 38%) 35727776 ( 9%) DMD_NSD_805 377487360 200 Yes No 79555584 ( 21%) 2940040 ( 1%) VD001 377487360 200 Yes No 142964992 ( 38%) 35805384 ( 9%) DMD_NSD_803 377487360 200 Yes No 79580160 ( 21%) 2919560 ( 1%) VD003 377487360 200 Yes No 143132672 ( 38%) 35764200 ( 9%) DMD_NSD_801 377487360 200 Yes No 79550208 ( 21%) 2915232 ( 1%) ------------- -------------------- ------------------- (pool total) 3774873600 1113164032 ( 29%) 193346024 ( 5%) and mmldisk shows that there is a problem with replication: ... Number of quorum disks: 5 Read quorum value: 3 Write quorum value: 3 Attention: Due to an earlier configuration change the file system is no longer properly replicated. I thought a 'mmrestripe -r' would fix this, not that I have to fix it first before restriping? jbh On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock wrote: > Assuming you are replicating data and metadata have you confirmed that all > failure groups have the same free space? That is could it be that one of > your failure groups has less space than the others? You can verify this > with the output of mmdf and look at the NSD sizes and space available. > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 <(720)%20430-8821> > stockf at us.ibm.com > > > > From: John Hanks > To: gpfsug main discussion list > Date: 11/02/2017 12:20 PM > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on > device" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Addendum to last message: > > We haven't upgraded recently as far as I know (I just inherited this a > couple of months ago.) but am planning an outage soon to upgrade from > 4.2.0-4 to 4.2.3-5. > > My growing collection of output files generally contain something like > > This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 > 08:34:22 2017 > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > 53506 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > With that inode varying slightly. > > jbh > > On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden <*sfadden at us.ibm.com* > > wrote: > Sorry just reread as I hit send and saw this was mmrestripe, in my case it > was mmdeledisk. > > Did you try running the command on just one pool. Or using -B instead? > > What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" > ? 
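A rough sketch of the failure-group comparison suggested a little further up, assuming the default mmdf column layout shown above (disk size in field 2, failure group in field 3, free KB in full blocks in field 6):

# free space per failure group in the system pool
mmdf gsfs0 -P system | awk '$2 ~ /^[0-9]+$/ && $3 ~ /^-?[0-9]+$/ { size[$3] += $2; free[$3] += $6 } END { for (fg in size) printf "failure group %s: %.0f%% free\n", fg, 100 * free[fg] / size[fg] }'

# same per-pool view for the data pools
mmdf gsfs0 -P sas0
mmdf gsfs0 -P sata0

With replication each copy of a block has to land in a different failure group, so one nearly full failure group can return ENOSPC even while the pool as a whole shows plenty of free space.
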
> > Looks like it could be related to the maxfeaturelevel of the cluster. Have > you recently upgraded? Is everything up to the same level? > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: *(503) 880-5833* <(503)%20880-5833> > *sfadden at us.ibm.com* > *http://www.ibm.com/systems/storage/spectrum/scale* > > > > ----- Original message ----- > From: Scott Fadden/Portland/IBM > To: *gpfsug-discuss at spectrumscale.org* > Cc: *gpfsug-discuss at spectrumscale.org* > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:44 AM > > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: *(503) 880-5833* <(503)%20880-5833> > *sfadden at us.ibm.com* > *http://www.ibm.com/systems/storage/spectrum/scale* > > > > ----- Original message ----- > From: John Hanks <*griznog at gmail.com* > > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... > Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... > Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson <*david_johnson at brown.edu* > > wrote: > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > On Nov 2, 2017, at 10:43 AM, John Hanks <*griznog at gmail.com* > > wrote: > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). 
> The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > *Robert.Oesterlin at nuance.com* > wrote: > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: *<*gpfsug-discuss-bounces at spectrumscale.org* > > on behalf of John Hanks < > *griznog at gmail.com* > > *Reply-To: *gpfsug main discussion list < > *gpfsug-discuss at spectrumscale.org* > > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug <*gpfsug-discuss at spectrumscale.org* > > > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? 
> > > > Thanks, > > > > jbh > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman* > > /listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman* > > /listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s=j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e=* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_ > iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m= > uLFESUsuxpmf07haYD3Sl-DpeYkm3t_r0WVV2AZ9Jk0&s=RGgSZEisfDpxsKl3PFUWh6DtzD_ > FF6spqHVpo_0joLY&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Thu Nov 2 17:57:45 2017 From: stockf at us.ibm.com (Frederick Stock) Date: Thu, 2 Nov 2017 12:57:45 -0500 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Did you run the tsfindinode command to see where that file is located? Also, what does the mmdf show for your other pools notably the sas0 storage pool? 
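When the tsfindinode sample is not built on a node, a one-off LIST policy is another way to turn a flagged inode into a path name. A sketch only, using the inode number 53506 from the scan output above; the RESERVED flag in that memo line suggests a reserved file, and reserved files may not show up in a normal policy scan at all, which is itself a hint:

# /tmp/findinode.pol
RULE EXTERNAL LIST 'flagged' EXEC ''
RULE 'byinode' LIST 'flagged' WHERE INODE = 53506

# list-only run; matches land in /tmp/flagged.list.flagged (naming varies slightly by release)
mmapplypolicy gsfs0 -P /tmp/findinode.pol -I defer -f /tmp/flagged

# replication settings and flags for whatever path comes back
mmlsattr -L /path/from/the/list
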
Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: John Hanks To: gpfsug main discussion list Date: 11/02/2017 01:17 PM Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org We do have different amounts of space in the system pool which had the changes applied: [root at scg4-hn01 ~]# mmdf gsfs0 -P system disk disk size failure holds holds free KB free KB name in KB group metadata data in full blocks in fragments --------------- ------------- -------- -------- ----- -------------------- ------------------- Disks in storage pool: system (Maximum disk size allowed is 3.6 TB) VD000 377487360 100 Yes No 143109120 ( 38%) 35708688 ( 9%) DMD_NSD_804 377487360 100 Yes No 79526144 ( 21%) 2924584 ( 1%) VD002 377487360 100 Yes No 143067136 ( 38%) 35713888 ( 9%) DMD_NSD_802 377487360 100 Yes No 79570432 ( 21%) 2926672 ( 1%) VD004 377487360 100 Yes No 143107584 ( 38%) 35727776 ( 9%) DMD_NSD_805 377487360 200 Yes No 79555584 ( 21%) 2940040 ( 1%) VD001 377487360 200 Yes No 142964992 ( 38%) 35805384 ( 9%) DMD_NSD_803 377487360 200 Yes No 79580160 ( 21%) 2919560 ( 1%) VD003 377487360 200 Yes No 143132672 ( 38%) 35764200 ( 9%) DMD_NSD_801 377487360 200 Yes No 79550208 ( 21%) 2915232 ( 1%) ------------- -------------------- ------------------- (pool total) 3774873600 1113164032 ( 29%) 193346024 ( 5%) and mmldisk shows that there is a problem with replication: ... Number of quorum disks: 5 Read quorum value: 3 Write quorum value: 3 Attention: Due to an earlier configuration change the file system is no longer properly replicated. I thought a 'mmrestripe -r' would fix this, not that I have to fix it first before restriping? jbh On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock wrote: Assuming you are replicating data and metadata have you confirmed that all failure groups have the same free space? That is could it be that one of your failure groups has less space than the others? You can verify this with the output of mmdf and look at the NSD sizes and space available. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: John Hanks To: gpfsug main discussion list Date: 11/02/2017 12:20 PM Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org Addendum to last message: We haven't upgraded recently as far as I know (I just inherited this a couple of months ago.) but am planning an outage soon to upgrade from 4.2.0-4 to 4.2.3-5. My growing collection of output files generally contain something like This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 08:34:22 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53506 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device With that inode varying slightly. jbh On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden wrote: Sorry just reread as I hit send and saw this was mmrestripe, in my case it was mmdeledisk. Did you try running the command on just one pool. Or using -B instead? What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" ? Looks like it could be related to the maxfeaturelevel of the cluster. Have you recently upgraded? Is everything up to the same level? 
Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Scott Fadden/Portland/IBM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:44 AM I opened a defect on this the other day, in my case it was an incorrect error message. What it meant to say was,"The pool is not empty." Are you trying to remove the last disk in a pool? If so did you empty the pool with a MIGRATE policy first? Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: John Hanks Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:34 AM We have no snapshots ( they were the first to go when we initially hit the full metadata NSDs). I've increased quotas so that no filesets have hit a space quota. Verified that there are no inode quotas anywhere. mmdf shows the least amount of free space on any nsd to be 9% free. Still getting this error: [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 Scanning file system metadata, phase 1 ... Scan completed successfully. Scanning file system metadata, phase 2 ... Scanning file system metadata for sas0 storage pool Scanning file system metadata for sata0 storage pool Scan completed successfully. Scanning file system metadata, phase 3 ... Scan completed successfully. Scanning file system metadata, phase 4 ... Scan completed successfully. Scanning user file metadata ... Error processing user file metadata. No space left on device Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. I should note too that this fails almost immediately, far to quickly to fill up any location it could be trying to write to. jbh On Thu, Nov 2, 2017 at 7:57 AM, David Johnson wrote: One thing that may be relevant is if you have snapshots, depending on your release level, inodes in the snapshot may considered immutable, and will not be migrated. Once the snapshots have been deleted, the inodes are freed up and you won?t see the (somewhat misleading) message about no space. ? ddj Dave Johnson Brown University On Nov 2, 2017, at 10:43 AM, John Hanks wrote: Thanks all for the suggestions. Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. We did have old .quota files laying around but removing them didn't have any impact. mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). 
Once I'm settled in this morning I'll try giving them a little extra space and see what happens. jbh On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < Robert.Oesterlin at nuance.com> wrote: One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hanks < griznog at gmail.com> Reply-To: gpfsug main discussion list Date: Wednesday, November 1, 2017 at 5:55 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? 
Thanks, jbh _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s=j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=uLFESUsuxpmf07haYD3Sl-DpeYkm3t_r0WVV2AZ9Jk0&s=RGgSZEisfDpxsKl3PFUWh6DtzD_FF6spqHVpo_0joLY&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=XPw1EyoosGN5bt3yLIT1JbUJ73B6iWH2gBaDJ2xHW8M&s=yDRpuvz3LOTwvP2pkIJEU7NWUxwMOcYHyXBRoWCPF-s&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 18:14:44 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 11:14:44 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: tsfindiconde tracked the file to user.quota, which somehow escaped my previous attempt to "mv *.quota /elsewhere/" I've moved that now and verified it is actually gone and will retry once the current restripe on the sata0 pool is wrapped up. jbh On Thu, Nov 2, 2017 at 10:57 AM, Frederick Stock wrote: > Did you run the tsfindinode command to see where that file is located? > Also, what does the mmdf show for your other pools notably the sas0 storage > pool? 
> > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 <(720)%20430-8821> > stockf at us.ibm.com > > > > From: John Hanks > To: gpfsug main discussion list > Date: 11/02/2017 01:17 PM > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on > device" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > We do have different amounts of space in the system pool which had the > changes applied: > > [root at scg4-hn01 ~]# mmdf gsfs0 -P system > disk disk size failure holds holds free > KB free KB > name in KB group metadata data in full > blocks in fragments > --------------- ------------- -------- -------- ----- -------------------- > ------------------- > Disks in storage pool: system (Maximum disk size allowed is 3.6 TB) > VD000 377487360 100 Yes No 143109120 ( > 38%) 35708688 ( 9%) > DMD_NSD_804 377487360 100 Yes No 79526144 ( > 21%) 2924584 ( 1%) > VD002 377487360 100 Yes No 143067136 ( > 38%) 35713888 ( 9%) > DMD_NSD_802 377487360 100 Yes No 79570432 ( > 21%) 2926672 ( 1%) > VD004 377487360 100 Yes No 143107584 ( > 38%) 35727776 ( 9%) > DMD_NSD_805 377487360 200 Yes No 79555584 ( > 21%) 2940040 ( 1%) > VD001 377487360 200 Yes No 142964992 ( > 38%) 35805384 ( 9%) > DMD_NSD_803 377487360 200 Yes No 79580160 ( > 21%) 2919560 ( 1%) > VD003 377487360 200 Yes No 143132672 ( > 38%) 35764200 ( 9%) > DMD_NSD_801 377487360 200 Yes No 79550208 ( > 21%) 2915232 ( 1%) > ------------- -------------------- > ------------------- > (pool total) 3774873600 1113164032 ( > 29%) 193346024 ( 5%) > > > and mmldisk shows that there is a problem with replication: > > ... > Number of quorum disks: 5 > Read quorum value: 3 > Write quorum value: 3 > Attention: Due to an earlier configuration change the file system > is no longer properly replicated. > > > I thought a 'mmrestripe -r' would fix this, not that I have to fix it > first before restriping? > > jbh > > > On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock <*stockf at us.ibm.com* > > wrote: > Assuming you are replicating data and metadata have you confirmed that all > failure groups have the same free space? That is could it be that one of > your failure groups has less space than the others? You can verify this > with the output of mmdf and look at the NSD sizes and space available. > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | *720-430-8821* <(720)%20430-8821> > *stockf at us.ibm.com* > > > > From: John Hanks <*griznog at gmail.com* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 11/02/2017 12:20 PM > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on > device" > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > Addendum to last message: > > We haven't upgraded recently as far as I know (I just inherited this a > couple of months ago.) but am planning an outage soon to upgrade from > 4.2.0-4 to 4.2.3-5. > > My growing collection of output files generally contain something like > > This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 > 08:34:22 2017 > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > 53506 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > With that inode varying slightly. 
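Since each failed run flags a slightly different inode, the accumulated scan output files can be boiled down to one list of inode numbers for the path lookup, relying only on the one-line-per-inode format shown above:

# gather every flagged inode number from the collected output files
awk '/Error: 28/ { print $1 }' /var/mmfs/tmp/gsfs0.pit.interestingInodes.* | sort -un
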
> > jbh > > On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden <*sfadden at us.ibm.com* > > wrote: > Sorry just reread as I hit send and saw this was mmrestripe, in my case it > was mmdeledisk. > > Did you try running the command on just one pool. Or using -B instead? > > What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" > ? > > Looks like it could be related to the maxfeaturelevel of the cluster. Have > you recently upgraded? Is everything up to the same level? > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: *(503) 880-5833* <(503)%20880-5833> > *sfadden at us.ibm.com* > *http://www.ibm.com/systems/storage/spectrum/scale* > > > > ----- Original message ----- > From: Scott Fadden/Portland/IBM > To: *gpfsug-discuss at spectrumscale.org* > Cc: *gpfsug-discuss at spectrumscale.org* > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:44 AM > > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: *(503) 880-5833* <(503)%20880-5833> > *sfadden at us.ibm.com* > *http://www.ibm.com/systems/storage/spectrum/scale* > > > > ----- Original message ----- > From: John Hanks <*griznog at gmail.com* > > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... > Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... > Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson <*david_johnson at brown.edu* > > wrote: > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? 
ddj > Dave Johnson > Brown University > > On Nov 2, 2017, at 10:43 AM, John Hanks <*griznog at gmail.com* > > wrote: > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > *Robert.Oesterlin at nuance.com* > wrote: > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: *<*gpfsug-discuss-bounces at spectrumscale.org* > > on behalf of John Hanks < > *griznog at gmail.com* > > *Reply-To: *gpfsug main discussion list < > *gpfsug-discuss at spectrumscale.org* > > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug <*gpfsug-discuss at spectrumscale.org* > > > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? 
> > > > Thanks, > > > > jbh > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman* > > /listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman* > > /listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s=j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e=* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=uLFESUsuxpmf07haYD3Sl-DpeYkm3t_r0WVV2AZ9Jk0&s=RGgSZEisfDpxsKl3PFUWh6DtzD_FF6spqHVpo_0joLY&e=* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_ > iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m= > XPw1EyoosGN5bt3yLIT1JbUJ73B6iWH2gBaDJ2xHW8M&s= > yDRpuvz3LOTwvP2pkIJEU7NWUxwMOcYHyXBRoWCPF-s&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 18:18:27 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 11:18:27 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Yep, looks like Robert Oesterlin was right, it was the old quota files causing the snag. Now sure how "mv *.quota" managed to move the group file and not the user file, but I'll let that remain a mystery of the universe. In any case I have a restripe running now and have learned a LOT about all the bits in the process. Many thanks to everyone who replied, I learn something from this list every time I get near it. Thank you, jbh On Thu, Nov 2, 2017 at 11:14 AM, John Hanks wrote: > tsfindiconde tracked the file to user.quota, which somehow escaped my > previous attempt to "mv *.quota /elsewhere/" I've moved that now and > verified it is actually gone and will retry once the current restripe on > the sata0 pool is wrapped up. > > jbh > > On Thu, Nov 2, 2017 at 10:57 AM, Frederick Stock > wrote: > >> Did you run the tsfindinode command to see where that file is located? >> Also, what does the mmdf show for your other pools notably the sas0 storage >> pool? 
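For anyone who trips over the same thing later, a recap of the fix as a sketch; /srv/gsfs0 stands in for the real mount point, which never appears in the thread:

# file systems created at older format versions keep external quota files in the root directory
mmlsfs gsfs0 -T -V -Q           # default mount point, format version, quota enforcement
ls -l /srv/gsfs0/*.quota        # user.quota, group.quota, fileset.quota if still present

# once confirmed they are the obsolete external copies, move them aside and rerun the restripe
mkdir -p /root/old-quota-files
mv /srv/gsfs0/*.quota /root/old-quota-files/
mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3
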
>> >> Fred >> __________________________________________________ >> Fred Stock | IBM Pittsburgh Lab | 720-430-8821 <(720)%20430-8821> >> stockf at us.ibm.com >> >> >> >> From: John Hanks >> To: gpfsug main discussion list >> Date: 11/02/2017 01:17 PM >> Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on >> device" >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> ------------------------------ >> >> >> >> We do have different amounts of space in the system pool which had the >> changes applied: >> >> [root at scg4-hn01 ~]# mmdf gsfs0 -P system >> disk disk size failure holds holds free >> KB free KB >> name in KB group metadata data in full >> blocks in fragments >> --------------- ------------- -------- -------- ----- >> -------------------- ------------------- >> Disks in storage pool: system (Maximum disk size allowed is 3.6 TB) >> VD000 377487360 100 Yes No 143109120 ( >> 38%) 35708688 ( 9%) >> DMD_NSD_804 377487360 100 Yes No 79526144 ( >> 21%) 2924584 ( 1%) >> VD002 377487360 100 Yes No 143067136 ( >> 38%) 35713888 ( 9%) >> DMD_NSD_802 377487360 100 Yes No 79570432 ( >> 21%) 2926672 ( 1%) >> VD004 377487360 100 Yes No 143107584 ( >> 38%) 35727776 ( 9%) >> DMD_NSD_805 377487360 200 Yes No 79555584 ( >> 21%) 2940040 ( 1%) >> VD001 377487360 200 Yes No 142964992 ( >> 38%) 35805384 ( 9%) >> DMD_NSD_803 377487360 200 Yes No 79580160 ( >> 21%) 2919560 ( 1%) >> VD003 377487360 200 Yes No 143132672 ( >> 38%) 35764200 ( 9%) >> DMD_NSD_801 377487360 200 Yes No 79550208 ( >> 21%) 2915232 ( 1%) >> ------------- >> -------------------- ------------------- >> (pool total) 3774873600 1113164032 ( >> 29%) 193346024 ( 5%) >> >> >> and mmldisk shows that there is a problem with replication: >> >> ... >> Number of quorum disks: 5 >> Read quorum value: 3 >> Write quorum value: 3 >> Attention: Due to an earlier configuration change the file system >> is no longer properly replicated. >> >> >> I thought a 'mmrestripe -r' would fix this, not that I have to fix it >> first before restriping? >> >> jbh >> >> >> On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock <*stockf at us.ibm.com* >> > wrote: >> Assuming you are replicating data and metadata have you confirmed that >> all failure groups have the same free space? That is could it be that one >> of your failure groups has less space than the others? You can verify this >> with the output of mmdf and look at the NSD sizes and space available. >> >> Fred >> __________________________________________________ >> Fred Stock | IBM Pittsburgh Lab | *720-430-8821* <(720)%20430-8821> >> *stockf at us.ibm.com* >> >> >> >> From: John Hanks <*griznog at gmail.com* > >> To: gpfsug main discussion list < >> *gpfsug-discuss at spectrumscale.org* > >> Date: 11/02/2017 12:20 PM >> Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on >> device" >> Sent by: *gpfsug-discuss-bounces at spectrumscale.org* >> >> ------------------------------ >> >> >> >> Addendum to last message: >> >> We haven't upgraded recently as far as I know (I just inherited this a >> couple of months ago.) but am planning an outage soon to upgrade from >> 4.2.0-4 to 4.2.3-5. 
>> >> My growing collection of output files generally contain something like >> >> This inode list was generated in the Parallel Inode Traverse on Thu Nov >> 2 08:34:22 2017 >> INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID >> MEMO(INODE_FLAGS FILE_TYPE [ERROR]) >> 53506 0:0 0 1 0 >> illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device >> >> With that inode varying slightly. >> >> jbh >> >> On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden <*sfadden at us.ibm.com* >> > wrote: >> Sorry just reread as I hit send and saw this was mmrestripe, in my case >> it was mmdeledisk. >> >> Did you try running the command on just one pool. Or using -B instead? >> >> What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" >> ? >> >> Looks like it could be related to the maxfeaturelevel of the cluster. >> Have you recently upgraded? Is everything up to the same level? >> >> Scott Fadden >> Spectrum Scale - Technical Marketing >> Phone: *(503) 880-5833* <(503)%20880-5833> >> *sfadden at us.ibm.com* >> *http://www.ibm.com/systems/storage/spectrum/scale* >> >> >> >> ----- Original message ----- >> From: Scott Fadden/Portland/IBM >> To: *gpfsug-discuss at spectrumscale.org* >> Cc: *gpfsug-discuss at spectrumscale.org* >> Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" >> Date: Thu, Nov 2, 2017 8:44 AM >> >> I opened a defect on this the other day, in my case it was an incorrect >> error message. What it meant to say was,"The pool is not empty." Are you >> trying to remove the last disk in a pool? If so did you empty the pool with >> a MIGRATE policy first? >> >> >> Scott Fadden >> Spectrum Scale - Technical Marketing >> Phone: *(503) 880-5833* <(503)%20880-5833> >> *sfadden at us.ibm.com* >> *http://www.ibm.com/systems/storage/spectrum/scale* >> >> >> >> ----- Original message ----- >> From: John Hanks <*griznog at gmail.com* > >> Sent by: *gpfsug-discuss-bounces at spectrumscale.org* >> >> To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* >> > >> Cc: >> Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" >> Date: Thu, Nov 2, 2017 8:34 AM >> >> We have no snapshots ( they were the first to go when we initially hit >> the full metadata NSDs). >> >> I've increased quotas so that no filesets have hit a space quota. >> >> Verified that there are no inode quotas anywhere. >> >> mmdf shows the least amount of free space on any nsd to be 9% free. >> >> Still getting this error: >> >> [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs >> 3 >> Scanning file system metadata, phase 1 ... >> Scan completed successfully. >> Scanning file system metadata, phase 2 ... >> Scanning file system metadata for sas0 storage pool >> Scanning file system metadata for sata0 storage pool >> Scan completed successfully. >> Scanning file system metadata, phase 3 ... >> Scan completed successfully. >> Scanning file system metadata, phase 4 ... >> Scan completed successfully. >> Scanning user file metadata ... >> Error processing user file metadata. >> No space left on device >> Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on >> scg-gs0 for inodes with broken disk addresses or failures. >> mmrestripefs: Command failed. Examine previous error messages to >> determine cause. >> >> I should note too that this fails almost immediately, far to quickly to >> fill up any location it could be trying to write to. 
>> >> jbh >> >> On Thu, Nov 2, 2017 at 7:57 AM, David Johnson <*david_johnson at brown.edu* >> > wrote: >> One thing that may be relevant is if you have snapshots, depending on >> your release level, >> inodes in the snapshot may considered immutable, and will not be >> migrated. Once the snapshots >> have been deleted, the inodes are freed up and you won?t see the >> (somewhat misleading) message >> about no space. >> >> ? ddj >> Dave Johnson >> Brown University >> >> On Nov 2, 2017, at 10:43 AM, John Hanks <*griznog at gmail.com* >> > wrote: >> Thanks all for the suggestions. >> >> Having our metadata NSDs fill up was what prompted this exercise, but >> space was previously feed up on those by switching them from metadata+data >> to metadataOnly and using a policy to migrate files out of that pool. So >> these now have about 30% free space (more if you include fragmented space). >> The restripe attempt is just to make a final move of any remaining data off >> those devices. All the NSDs now have free space on them. >> >> df -i shows inode usage at about 84%, so plenty of free inodes for the >> filesystem as a whole. >> >> We did have old .quota files laying around but removing them didn't have >> any impact. >> >> mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer >> while getting to work. >> >> mmrepquota does show about a half-dozen filesets that have hit their >> quota for space (we don't set quotas on inodes). Once I'm settled in this >> morning I'll try giving them a little extra space and see what happens. >> >> jbh >> >> >> On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < >> *Robert.Oesterlin at nuance.com* > wrote: >> One thing that I?ve run into before is that on older file systems you had >> the ?*.quota? files in the file system root. If you upgraded the file >> system to a newer version (so these files aren?t used) - There was a bug at >> one time where these didn?t get properly migrated during a restripe. >> Solution was to just remove them >> >> >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> *From: *<*gpfsug-discuss-bounces at spectrumscale.org* >> > on behalf of John Hanks < >> *griznog at gmail.com* > >> *Reply-To: *gpfsug main discussion list < >> *gpfsug-discuss at spectrumscale.org* > >> *Date: *Wednesday, November 1, 2017 at 5:55 PM >> *To: *gpfsug <*gpfsug-discuss at spectrumscale.org* >> > >> *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on >> device" >> >> >> >> Hi all, >> >> >> >> I'm trying to do a restripe after setting some nsds to metadataOnly and I >> keep running into this error: >> >> >> >> Scanning user file metadata ... >> >> 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with >> total 531689 MB data processed) >> >> Error processing user file metadata. >> >> Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on >> scg-gs0 for inodes with broken disk addresses or failures. >> >> mmrestripefs: Command failed. Examine previous error messages to >> determine cause. 
>> >> >> >> The file it points to says: >> >> >> >> This inode list was generated in the Parallel Inode Traverse on Wed Nov >> 1 15:36:06 2017 >> >> INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID >> MEMO(INODE_FLAGS FILE_TYPE [ERROR]) >> >> 53504 0:0 0 1 0 >> illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device >> >> >> >> >> >> /var on the node I am running this on has > 128 GB free, all the NSDs >> have plenty of free space, the filesystem being restriped has plenty of >> free space and if I watch the node while running this no filesystem on it >> even starts to get full. Could someone tell me where mmrestripefs is >> attempting to write and/or how to point it at a different location? >> >> >> >> Thanks, >> >> >> >> jbh >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman* >> >> /listinfo/gpfsug-discuss >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman* >> >> /listinfo/gpfsug-discuss >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> >> *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s=j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e=* >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> >> *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=uLFESUsuxpmf07haYD3Sl-DpeYkm3t_r0WVV2AZ9Jk0&s=RGgSZEisfDpxsKl3PFUWh6DtzD_FF6spqHVpo_0joLY&e=* >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.o >> rg_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObT >> bx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m= >> XPw1EyoosGN5bt3yLIT1JbUJ73B6iWH2gBaDJ2xHW8M&s=yDRpuvz3LOTwvP >> 2pkIJEU7NWUxwMOcYHyXBRoWCPF-s&e= >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Sat Nov 4 16:14:46 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Sat, 4 Nov 2017 12:14:46 -0400 Subject: [gpfsug-discuss] file layout API + file fragmentation Message-ID: <83ed4b5a-cf9e-12da-e460-e34a6492afcf@nasa.gov> I've got a question about the file layout API and how it reacts in the case of fragmented files. I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have some code based on tsGetDataBlk.C. 
I'm basing the block size based off of what's returned by filemapOut.blockSize but that only seems to return a value > 0 when filemapIn.startOffset is 0. In a case where a file were to be made up of a significant number of non-contiguous fragments (which... would be awful in of itself) how would this be reported by the file layout API? Does the interface technically just report the disk location information of the first block of the $blockSize range and assume that it's contiguous? Thanks! -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From makaplan at us.ibm.com Sun Nov 5 23:01:25 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Sun, 5 Nov 2017 18:01:25 -0500 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: I googled GPFS_FCNTL_GET_DATABLKDISKIDX and found this discussion: https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 In general, GPFS files ARE deliberately "fragmented" but we don't say that - we say they are "striped" over many disks -- and that is generally a good thing for parallel performance. Also, in GPFS, if the last would-be block of a file is less than a block, then it is stored in a "fragment" of a block. So you see we use "fragment" to mean something different than other file systems you may know. --marc From: Aaron Knister To: gpfsug main discussion list Date: 11/04/2017 12:22 PM Subject: [gpfsug-discuss] file layout API + file fragmentation Sent by: gpfsug-discuss-bounces at spectrumscale.org I've got a question about the file layout API and how it reacts in the case of fragmented files. I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have some code based on tsGetDataBlk.C. I'm basing the block size based off of what's returned by filemapOut.blockSize but that only seems to return a value > 0 when filemapIn.startOffset is 0. In a case where a file were to be made up of a significant number of non-contiguous fragments (which... would be awful in of itself) how would this be reported by the file layout API? Does the interface technically just report the disk location information of the first block of the $blockSize range and assume that it's contiguous? Thanks! -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Sun Nov 5 23:39:07 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Sun, 5 Nov 2017 18:39:07 -0500 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: <2c1a16ab-9be7-c019-8338-c1dc50d3e069@nasa.gov> Thanks Marc, that helps. I can't easily use tsdbfs for what I'm working on since it needs to be run as unprivileged users. Perhaps I'm not asking the right question. I'm wondering how the file layout api behaves if a given "block"-aligned offset in a file is made up of sub-blocks/fragments that are not all on the same NSD. 
The assumption based on how I've seen the API used so far is that all sub-blocks within a block at a given offset within a file are all on the same NSD. -Aaron On 11/5/17 6:01 PM, Marc A Kaplan wrote: > I googled GPFS_FCNTL_GET_DATABLKDISKIDX > > and found this discussion: > > ?https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 > > In general, GPFS files ARE deliberately "fragmented" but we don't say > that - we say they are "striped" over many disks -- and that is > generally a good thing for parallel performance. > > Also, in GPFS, if the last would-be block of a file is less than a > block, then it is stored in a "fragment" of a block. ? > So you see we use "fragment" to mean something different than other file > systems you may know. > > --marc > > > > From: ? ? ? ?Aaron Knister > To: ? ? ? ?gpfsug main discussion list > Date: ? ? ? ?11/04/2017 12:22 PM > Subject: ? ? ? ?[gpfsug-discuss] file layout API + file fragmentation > Sent by: ? ? ? ?gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > I've got a question about the file layout API and how it reacts in the > case of fragmented files. > > I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have some code > based on tsGetDataBlk.C. I'm basing the block size based off of what's > returned by filemapOut.blockSize but that only seems to return a value > > 0 when filemapIn.startOffset is 0. > > In a case where a file were to be made up of a significant number of > non-contiguous fragments (which... would be awful in of itself) how > would this be reported by the file layout API? Does the interface > technically just report the disk location information of the first block > of the $blockSize range and assume that it's contiguous? > > Thanks! > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From fschmuck at us.ibm.com Mon Nov 6 00:57:46 2017 From: fschmuck at us.ibm.com (Frank Schmuck) Date: Mon, 6 Nov 2017 00:57:46 +0000 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From mutantllama at gmail.com Mon Nov 6 03:35:58 2017 From: mutantllama at gmail.com (Carl) Date: Mon, 6 Nov 2017 14:35:58 +1100 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almost full Message-ID: Hi Folk, Does anyone have much experience with the performance of GPFS as it becomes close to full. In particular I am referring to split data/meta data, where the data pool goes over 80% utilisation. How much degradation do you see above 80% usage, 90% usage? Cheers, Carl. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aaron.s.knister at nasa.gov Mon Nov 6 05:10:30 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 6 Nov 2017 00:10:30 -0500 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: Thanks, Frank! That's truly fascinating and has some interesting implications that I hadn't thought of before. I just ran a test on an ~8G fs with a block size of 1M: for i in `seq 1 100000`; do dd if=/dev/zero of=foofile${i} bs=520K count=1 done The fs is "full" according to df/mmdf but there's 3.6G left in subblocks but yeah, I can't allocate any new files that wouldn't fit into the inode and I can't seem to allocate any new subblocks to existing files (e.g. append). What's interesting is if I do the same exercise but with a file size of 30K or even 260K I don't seem to run into the same issue. I'm not sure I understand that yet. I was curious about what this meant in the case of appending to a file where the last offset in the file was allocated to a fragment. By looking at "tsdbfs listda" and appending to a file I could see that the last DA would change (presumably to point to the DA of the start of a contiguous subblock) once the amount of data appended caused the file size to exceed the space available in the trailing subblocks. -Aaron On 11/5/17 7:57 PM, Frank Schmuck wrote: > In GPFS blocks within a file are never fragmented.? For example, if you > have a file of size 7.3 MB and your file system block size is 1MB, then > this file will be made up of 7 full blocks and one fragment of size 320k > (10 subblocks).? Each of the 7 full blocks will be contiguous on a singe > diks (LUN) behind a single NSD server.? The fragment that makes up the > last part of the file will also be contiguous on a single disk, just > shorter than a full block. > ? > Frank Schmuck > IBM Almaden Research Center > ? > ? > > ----- Original message ----- > From: Aaron Knister > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: > Cc: > Subject: Re: [gpfsug-discuss] file layout API + file fragmentation > Date: Sun, Nov 5, 2017 3:39 PM > ? > Thanks Marc, that helps. I can't easily use tsdbfs for what I'm working > on since it needs to be run as unprivileged users. > > Perhaps I'm not asking the right question. I'm wondering how the file > layout api behaves if a given "block"-aligned offset in a file is made > up of sub-blocks/fragments that are not all on the same NSD. The > assumption based on how I've seen the API used so far is that all > sub-blocks within a block at a given offset within a file are all on the > same NSD. > > -Aaron > > On 11/5/17 6:01 PM, Marc A Kaplan wrote: > > I googled GPFS_FCNTL_GET_DATABLKDISKIDX > > > > and found this discussion: > > > > > ??https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 > > > > In general, GPFS files ARE deliberately "fragmented" but we don't say > > that - we say they are "striped" over many disks -- and that is > > generally a good thing for parallel performance. > > > > Also, in GPFS, if the last would-be block of a file is less than a > > block, then it is stored in a "fragment" of a block. ?? > > So you see we use "fragment" to mean something different than > other file > > systems you may know. > > > > --marc > > > > > > > > From: ?? ?? ?? ??Aaron Knister > > To: ?? ?? ?? ??gpfsug main discussion list > > > Date: ?? ?? ?? ??11/04/2017 12:22 PM > > Subject: ?? ?? ?? ??[gpfsug-discuss] file layout API + file > fragmentation > > Sent by: ?? ?? ?? 
??gpfsug-discuss-bounces at spectrumscale.org > > > ------------------------------------------------------------------------ > > > > > > > > I've got a question about the file layout API and how it reacts in the > > case of fragmented files. > > > > I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have > some code > > based on tsGetDataBlk.C. I'm basing the block size based off of what's > > returned by filemapOut.blockSize but that only seems to return a > value > > > 0 when filemapIn.startOffset is 0. > > > > In a case where a file were to be made up of a significant number of > > non-contiguous fragments (which... would be awful in of itself) how > > would this be reported by the file layout API? Does the interface > > technically just report the disk location information of the first > block > > of the $blockSize range and assume that it's contiguous? > > > > Thanks! > > > > -Aaron > > > > -- > > Aaron Knister > > NASA Center for Climate Simulation (Code 606.2) > > Goddard Space Flight Center > > (301) 286-2776 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= > > > > > > > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > ? > > ? > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From peter.chase at metoffice.gov.uk Mon Nov 6 09:20:11 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Mon, 6 Nov 2017 09:20:11 +0000 Subject: [gpfsug-discuss] Introduction/Question Message-ID: Hello to all! I'm pleased to have joined the GPFS UG mailing list, I'm experimenting with GPFS on zLinux running in z/VM on a z13 mainframe. I work for the UK Met Office in the GPCS team (general purpose compute service/mainframe team) and I'm based in Exeter, Devon. I've joined with a specific question to ask, in short: how can I automate sending files to a cloud object store as they arrive in GPFS and keep a copy of the file in GPFS? The longer spiel is this: We have a HPC that throws out a lot of NetCDF files via FTP for use in forecasts. 
We're currently undergoing a change in working practice, so that data processing is beginning to be done in the cloud. At the same time we're also attempting to de-duplicate the data being sent from the HPC by creating one space to receive it and then have consumers use it or send it on as necessary from there. The data is in terabytes a day sizes, and the timeliness of it's arrival to systems is fairly important (forecasts cease to be forecasts if they're too late). We're using zLinux because the mainframe already receives much of the data from the HPC and has access to a SAN with SSD storage, has the right network connections it needs and generally seems the least amount of work to put something in place. Getting a supported clustered filesystem on zLinux is tricky, but GPFS fits the bill and having hardware, storage, OS and filesystem from one provider (IBM) should hopefully save some headaches. We're using Amazon as our cloud provider, and have 2x10GB direct links to their London data centre with a ping of about 15ms, so fairly low latency. The developers using the data want it in s3 so they can access it from server-less environments and won't need to have ec2 instances loitering to look after the data. We were initially interested in using mmcloudgateway/cloud data sharing to send the data, but it's not available for s390x (only x86_64), so I'm now looking at setting up a external storage pool for talking to s3 and then having some kind of ilm soft quota trigger to send the data once enough of it has arrived, but I'm still exploring options. Options such as asking the user group of experienced folks what they think is best! So, any help or advice would be greatly appreciated! Regards, Peter Chase GPCS Team Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Mon Nov 6 09:37:15 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Mon, 6 Nov 2017 09:37:15 +0000 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: Message-ID: Peter, Welcome to the mailing list! Can I summarise in saying that you are looking for a way for GPFS to recognise that a file has just arrived in the filesystem (via FTP) and so trigger an action, in this case to trigger to push to Amazon S3 ? I think that you also have a second question about coping with the restrictions on GPFS on zLinux? ie CES is not supported and hence TCT isn?t either. Looking at the docs, there appears to be many restrictions on TCT for MultiCluster, AFM, Heterogeneous setups, DMAPI tape tiers, etc. So my question to add is; what success have people had in using a TCT in more than the simplest use case of a single small isolated x86 cluster? Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales + 44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 6 Nov 2017, at 09:20, Chase, Peter wrote: > > Hello to all! > > I?m pleased to have joined the GPFS UG mailing list, I?m experimenting with GPFS on zLinux running in z/VM on a z13 mainframe. I work for the UK Met Office in the GPCS team (general purpose compute service/mainframe team) and I?m based in Exeter, Devon. > > I?ve joined with a specific question to ask, in short: how can I automate sending files to a cloud object store as they arrive in GPFS and keep a copy of the file in GPFS? 
> > The longer spiel is this: We have a HPC that throws out a lot of NetCDF files via FTP for use in forecasts. We?re currently undergoing a change in working practice, so that data processing is beginning to be done in the cloud. At the same time we?re also attempting to de-duplicate the data being sent from the HPC by creating one space to receive it and then have consumers use it or send it on as necessary from there. The data is in terabytes a day sizes, and the timeliness of it?s arrival to systems is fairly important (forecasts cease to be forecasts if they?re too late). > > We?re using zLinux because the mainframe already receives much of the data from the HPC and has access to a SAN with SSD storage, has the right network connections it needs and generally seems the least amount of work to put something in place. > > Getting a supported clustered filesystem on zLinux is tricky, but GPFS fits the bill and having hardware, storage, OS and filesystem from one provider (IBM) should hopefully save some headaches. > > We?re using Amazon as our cloud provider, and have 2x10GB direct links to their London data centre with a ping of about 15ms, so fairly low latency. The developers using the data want it in s3 so they can access it from server-less environments and won?t need to have ec2 instances loitering to look after the data. > > We were initially interested in using mmcloudgateway/cloud data sharing to send the data, but it?s not available for s390x (only x86_64), so I?m now looking at setting up a external storage pool for talking to s3 and then having some kind of ilm soft quota trigger to send the data once enough of it has arrived, but I?m still exploring options. Options such as asking the user group of experienced folks what they think is best! > > So, any help or advice would be greatly appreciated! > > Regards, > > Peter Chase > GPCS Team > Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom > Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Mon Nov 6 10:00:39 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Mon, 6 Nov 2017 10:00:39 +0000 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: Message-ID: Frank, For clarity in the understanding the underlying mechanism in GPFS, could you describe what happens in the case say of a particular file that is appended to every 24 hours? ie. as that file gets to 7MB, it then writes to a new sub-block (1/32 of the next 1MB block). I guess that sub block could be 10th in a a block that already has 9 used. Later on, the file grows to need an 11th subblock and so on. So at what point does this growing file at 8MB occupy all 32 sunblocks of 8 full blocks? Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales + 44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 6 Nov 2017, at 00:57, Frank Schmuck wrote: > > In GPFS blocks within a file are never fragmented. For example, if you have a file of size 7.3 MB and your file system block size is 1MB, then this file will be made up of 7 full blocks and one fragment of size 320k (10 subblocks). Each of the 7 full blocks will be contiguous on a singe diks (LUN) behind a single NSD server. 
The fragment that makes up the last part of the file will also be contiguous on a single disk, just shorter than a full block. > > Frank Schmuck > IBM Almaden Research Center > > > ----- Original message ----- > From: Aaron Knister > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: > Cc: > Subject: Re: [gpfsug-discuss] file layout API + file fragmentation > Date: Sun, Nov 5, 2017 3:39 PM > > Thanks Marc, that helps. I can't easily use tsdbfs for what I'm working > on since it needs to be run as unprivileged users. > > Perhaps I'm not asking the right question. I'm wondering how the file > layout api behaves if a given "block"-aligned offset in a file is made > up of sub-blocks/fragments that are not all on the same NSD. The > assumption based on how I've seen the API used so far is that all > sub-blocks within a block at a given offset within a file are all on the > same NSD. > > -Aaron > > On 11/5/17 6:01 PM, Marc A Kaplan wrote: > > I googled GPFS_FCNTL_GET_DATABLKDISKIDX > > > > and found this discussion: > > > > ? https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 > > > > In general, GPFS files ARE deliberately "fragmented" but we don't say > > that - we say they are "striped" over many disks -- and that is > > generally a good thing for parallel performance. > > > > Also, in GPFS, if the last would-be block of a file is less than a > > block, then it is stored in a "fragment" of a block. ? > > So you see we use "fragment" to mean something different than other file > > systems you may know. > > > > --marc > > > > > > > > From: ? ? ? ? Aaron Knister > > To: ? ? ? ? gpfsug main discussion list > > Date: ? ? ? ? 11/04/2017 12:22 PM > > Subject: ? ? ? ? [gpfsug-discuss] file layout API + file fragmentation > > Sent by: ? ? ? ? gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > > > > > I've got a question about the file layout API and how it reacts in the > > case of fragmented files. > > > > I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have some code > > based on tsGetDataBlk.C. I'm basing the block size based off of what's > > returned by filemapOut.blockSize but that only seems to return a value > > > 0 when filemapIn.startOffset is 0. > > > > In a case where a file were to be made up of a significant number of > > non-contiguous fragments (which... would be awful in of itself) how > > would this be reported by the file layout API? Does the interface > > technically just report the disk location information of the first block > > of the $blockSize range and assume that it's contiguous? > > > > Thanks! 
> > > > -Aaron > > > > -- > > Aaron Knister > > NASA Center for Climate Simulation (Code 606.2) > > Goddard Space Flight Center > > (301) 286-2776 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= > > > > > > > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=WH1GLDCza1Rvd9bzdVYoz2Pdzs7l90XNnhUb40FYCqQ&s=LOkUY79m5Ow2FeKqfCqc31cfXZVmqYlvBuQRPirGOFU&e= > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Nov 6 10:01:28 2017 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 06 Nov 2017 10:01:28 +0000 Subject: [gpfsug-discuss] ACLs on AFM Filesets Message-ID: Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround: 1. Read the GPFS ACL from the remote directory => store in some file acl.txt 2. Link the AFM fileset to the local filesystem, 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time? Thanks! Luke. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Mon Nov 6 10:22:18 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Mon, 6 Nov 2017 15:52:18 +0530 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Is this problem happens only for the fileset root directory ? Could you try accessing the fileset as privileged user after the fileset link and verify if ACLs are set properly ? 
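For anyone wanting the stop-gap in the meantime, a sketch of the three-step workaround Luke describes, with invented names and paths; it assumes the home file system is mounted on the node where this runs (otherwise run the first step at the home site and copy acl.txt across):

# 1. capture the ACL of the AFM home directory
mmgetacl -o /tmp/acl.txt /gpfs/homefs/projects/projA

# 2. link the cache fileset into the local file system
mmlinkfileset cachefs projA_cache -J /gpfs/cachefs/projA

# 3. replay the saved ACL onto the new junction directory
mmputacl -i /tmp/acl.txt /gpfs/cachefs/projA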
AFM reads the ACLs from home and sets in the cache automatically during the file/dir lookup. What is the Spectrum Scale version ? ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 03:32 PM Subject: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround: 1. Read the GPFS ACL from the remote directory => store in some file acl.txt 2. Link the AFM fileset to the local filesystem, 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time? Thanks! Luke._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Nov 6 12:25:43 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 6 Nov 2017 12:25:43 +0000 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almost full Message-ID: Hi Carl I don?t have any direct metrics, but we frequently run our file systems above the 80% level, run split data and metadata.I haven?t experienced any GPFS performance issues that I can attribute to high utilization. I know the documentation talks about this, and the lower values of blocks and sub-blocks will make the file system work harder, but so far I haven?t seen any issues. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Carl Reply-To: gpfsug main discussion list Date: Sunday, November 5, 2017 at 9:36 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] Performance of GPFS when filesystem is almost full Hi Folk, Does anyone have much experience with the performance of GPFS as it becomes close to full. In particular I am referring to split data/meta data, where the data pool goes over 80% utilisation. How much degradation do you see above 80% usage, 90% usage? Cheers, Carl. -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Nov 6 12:31:30 2017 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 06 Nov 2017 12:31:30 +0000 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Hi Venkat, This is only for the fileset root. All other files and directories pull the correct ACLs as expected when accessing the fileset as root user, or after setting the correct (missing) ACL on the fileset root. Multiple SS versions from around 4.1 to present. Thanks! Luke. On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, wrote: > Is this problem happens only for the fileset root directory ? Could you > try accessing the fileset as privileged user after the fileset link and > verify if ACLs are set properly ? 
AFM reads the ACLs from home and sets in > the cache automatically during the file/dir lookup. What is the Spectrum > Scale version ? > > ~Venkat (vpuvvada at in.ibm.com) > > > > From: Luke Raimbach > To: gpfsug main discussion list > Date: 11/06/2017 03:32 PM > Subject: [gpfsug-discuss] ACLs on AFM Filesets > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Dear SpectrumScale Experts, > > > > When creating an IW cache view of a directory in a remote GPFS filesystem, > I prepare the AFM "home" directory using 'mmafmconfig enable ' > command. > > I wish the cache fileset junction point to inherit the ACL for the home > directory when I link it to the filesystem. > > Currently I'm using a flimsy workaround: > > 1. Read the GPFS ACL from the remote directory => store in some file > acl.txt > > 2. Link the AFM fileset to the local filesystem, > > 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i > acl.txt > > Is there a way for the local cache fileset to automatically inherit/clone > the remote directory's ACL, e.g. at mmlinkfileset time? > > > > Thanks! > > Luke._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Mon Nov 6 13:39:20 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 6 Nov 2017 08:39:20 -0500 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: Aaron, brilliant! Your example is close to the worst case, where every file is 512K+1 bytes and the blocksize is 1024K. Yes, in the worse case 49.99999% of space is "lost" or wasted. Don't do that! One can construct such a worst case for any system that allocates by blocks or sectors or whatever you want to call it. Just fill the system with files that are each 0.5*Block_Size+1 bytes and argue that 1/2 the space is wasted. From: Aaron Knister To: Date: 11/06/2017 12:10 AM Subject: Re: [gpfsug-discuss] file layout API + file fragmentation Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks, Frank! That's truly fascinating and has some interesting implications that I hadn't thought of before. I just ran a test on an ~8G fs with a block size of 1M: for i in `seq 1 100000`; do dd if=/dev/zero of=foofile${i} bs=520K count=1 done The fs is "full" according to df/mmdf but there's 3.6G left in subblocks but yeah, I can't allocate any new files that wouldn't fit into the inode and I can't seem to allocate any new subblocks to existing files (e.g. append). What's interesting is if I do the same exercise but with a file size of 30K or even 260K I don't seem to run into the same issue. I'm not sure I understand that yet. I was curious about what this meant in the case of appending to a file where the last offset in the file was allocated to a fragment. 
By looking at "tsdbfs listda" and appending to a file I could see that the last DA would change (presumably to point to the DA of the start of a contiguous subblock) once the amount of data appended caused the file size to exceed the space available in the trailing subblocks. -Aaron On 11/5/17 7:57 PM, Frank Schmuck wrote: > In GPFS blocks within a file are never fragmented. For example, if you > have a file of size 7.3 MB and your file system block size is 1MB, then > this file will be made up of 7 full blocks and one fragment of size 320k > (10 subblocks). Each of the 7 full blocks will be contiguous on a singe > diks (LUN) behind a single NSD server. The fragment that makes up the > last part of the file will also be contiguous on a single disk, just > shorter than a full block. > > Frank Schmuck > IBM Almaden Research Center > > > > ----- Original message ----- > From: Aaron Knister > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: > Cc: > Subject: Re: [gpfsug-discuss] file layout API + file fragmentation > Date: Sun, Nov 5, 2017 3:39 PM > > Thanks Marc, that helps. I can't easily use tsdbfs for what I'm working > on since it needs to be run as unprivileged users. > > Perhaps I'm not asking the right question. I'm wondering how the file > layout api behaves if a given "block"-aligned offset in a file is made > up of sub-blocks/fragments that are not all on the same NSD. The > assumption based on how I've seen the API used so far is that all > sub-blocks within a block at a given offset within a file are all on the > same NSD. > > -Aaron > > On 11/5/17 6:01 PM, Marc A Kaplan wrote: > > I googled GPFS_FCNTL_GET_DATABLKDISKIDX > > > > and found this discussion: > > > > > ? https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 > > > > In general, GPFS files ARE deliberately "fragmented" but we don't say > > that - we say they are "striped" over many disks -- and that is > > generally a good thing for parallel performance. > > > > Also, in GPFS, if the last would-be block of a file is less than a > > block, then it is stored in a "fragment" of a block. ? > > So you see we use "fragment" to mean something different than > other file > > systems you may know. > > > > --marc > > > > > > > > From: ? ? ? ? Aaron Knister > > To: ? ? ? ? gpfsug main discussion list > > > Date: ? ? ? ? 11/04/2017 12:22 PM > > Subject: ? ? ? ? [gpfsug-discuss] file layout API + file > fragmentation > > Sent by: ? ? ? ? gpfsug-discuss-bounces at spectrumscale.org > > > ------------------------------------------------------------------------ > > > > > > > > I've got a question about the file layout API and how it reacts in the > > case of fragmented files. > > > > I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have > some code > > based on tsGetDataBlk.C. I'm basing the block size based off of what's > > returned by filemapOut.blockSize but that only seems to return a > value > > > 0 when filemapIn.startOffset is 0. > > > > In a case where a file were to be made up of a significant number of > > non-contiguous fragments (which... would be awful in of itself) how > > would this be reported by the file layout API? Does the interface > > technically just report the disk location information of the first > block > > of the $blockSize range and assume that it's contiguous? > > > > Thanks! 
> > > > -Aaron > > > > -- > > Aaron Knister > > NASA Center for Climate Simulation (Code 606.2) > > Goddard Space Flight Center > > (301) 286-2776 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= > > > > > > > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=_xM9xVsqOuNiCqn3ikx6ZaaIHChTPhz_8iDmEKoteX4&s=uy462L5sxX_3Mm3Dh824ptJIxtah9LVRPMmyKz1lAdg&e= > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=_xM9xVsqOuNiCqn3ikx6ZaaIHChTPhz_8iDmEKoteX4&s=uy462L5sxX_3Mm3Dh824ptJIxtah9LVRPMmyKz1lAdg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Nov 6 14:16:34 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 6 Nov 2017 14:16:34 +0000 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Message-ID: We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? 
Simon From peter.smith at framestore.com Mon Nov 6 14:16:42 2017 From: peter.smith at framestore.com (Peter Smith) Date: Mon, 6 Nov 2017 14:16:42 +0000 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almost full In-Reply-To: References: Message-ID: Hi Carl. When we commissioned our system we ran an NFS stress tool, and filled the system to the top. No performance degradation was seen until it was 99.7% full. I believe that after this point it takes longer to find free blocks to write to. YMMV. On 6 November 2017 at 03:35, Carl wrote: > Hi Folk, > > Does anyone have much experience with the performance of GPFS as it > becomes close to full. In particular I am referring to split data/meta > data, where the data pool goes over 80% utilisation. > > How much degradation do you see above 80% usage, 90% usage? > > Cheers, > > Carl. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- [image: Framestore] Peter Smith ? Senior Systems Engineer London ? New York ? Los Angeles ? Chicago ? Montr?al T +44 (0)20 7344 8000 ? M +44 (0)7816 123009 <+44%20%280%297816%20123009> 19-23 Wells Street, London W1T 3PQ Twitter ? Facebook ? framestore.com [image: https://www.framestore.com/] -------------- next part -------------- An HTML attachment was scrubbed... URL: From Achim.Rehor at de.ibm.com Mon Nov 6 16:18:39 2017 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Mon, 6 Nov 2017 11:18:39 -0500 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almostfull In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 7182 bytes Desc: not available URL: From robbyb at us.ibm.com Mon Nov 6 18:02:14 2017 From: robbyb at us.ibm.com (Rob Basham) Date: Mon, 6 Nov 2017 18:02:14 +0000 Subject: [gpfsug-discuss] Fw: Introduction/Question Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15099587293244.png Type: image/png Size: 481 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15099587293245.png Type: image/png Size: 2741 bytes Desc: not available URL: From ewahl at osc.edu Mon Nov 6 19:43:28 2017 From: ewahl at osc.edu (Edward Wahl) Date: Mon, 6 Nov 2017 14:43:28 -0500 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: References: Message-ID: <20171106144328.58a233f2@osc.edu> On Mon, 6 Nov 2017 09:20:11 +0000 "Chase, Peter" wrote: > how can I automate sending files to a cloud object store as they arrive in > GPFS and keep a copy of the file in GPFS? Sounds like you already have an idea how to do this by using ILM policies. Either quota based as you mention or 'placement' policies should work, though I cannot speak to placement in an S3 environment, the policy engine has a way to call external commands for that if necessary. Though if you create an external pool, a placement policy may be much simpler and possibly faster as well as data would be sent to S3 on write, rather than on a quota trigger. If an external storage pool works properly for S3, I'd probably use a placement policy myself. This also would depend on how/when I needed the data on S3 and your mention of timeliness tells me placement rather than quota may be best. 
Weighing the solutions for this may be better tested(and timed!) than anything. EVERYONE wants a timely weather forecast. ^_- Ed -- Ed Wahl Ohio Supercomputer Center 614-292-9302 From scale at us.ibm.com Mon Nov 6 19:51:40 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Mon, 6 Nov 2017 14:51:40 -0500 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" To: "gpfsug-discuss at spectrumscale.org" Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From daniel.kidger at uk.ibm.com Mon Nov 6 20:48:45 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Mon, 6 Nov 2017 20:48:45 +0000 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From fschmuck at us.ibm.com Mon Nov 6 20:59:02 2017 From: fschmuck at us.ibm.com (Frank Schmuck) Date: Mon, 6 Nov 2017 20:59:02 +0000 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Nov 6 20:59:32 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 6 Nov 2017 20:59:32 +0000 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: Thanks Eric, One other question, when it says it must run on a manager node, I'm assuming that means a manager node in a storage cluster (we multi-cluster clients clusters in). Thanks Simon From: Eric Agar > on behalf of "scale at us.ibm.com" > Date: Monday, 6 November 2017 at 19:51 To: "gpfsug-discuss at spectrumscale.org" >, Simon Thompson > Cc: IBM Spectrum Scale > Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" > To: "gpfsug-discuss at spectrumscale.org" > Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. 
In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Mon Nov 6 21:09:18 2017 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 6 Nov 2017 21:09:18 +0000 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: <7f4c1bf980514e39b2691b15f9b35083@jumptrading.com> Hi Simon, It will only trigger the callback on the currently appointed File System Manager, so you need to make sure your callback scripts are installed on all nodes that can occupy this role. HTH, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Monday, November 06, 2017 3:00 PM To: scale at us.ibm.com; gpfsug main discussion list Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Note: External Email ________________________________ Thanks Eric, One other question, when it says it must run on a manager node, I'm assuming that means a manager node in a storage cluster (we multi-cluster clients clusters in). Thanks Simon From: Eric Agar > on behalf of "scale at us.ibm.com" > Date: Monday, 6 November 2017 at 19:51 To: "gpfsug-discuss at spectrumscale.org" >, Simon Thompson > Cc: IBM Spectrum Scale > Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. 
Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" > To: "gpfsug-discuss at spectrumscale.org" > Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Mon Nov 6 22:18:12 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Mon, 6 Nov 2017 17:18:12 -0500 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: Right, Bryan. To expand on that a bit, I'll make two additional points. 
(1) Only a node in the cluster that owns the file system can be appointed a file system manager for the file system. Nodes that remote mount the file system from other clusters cannot be appointed the file system manager of the remote file system. (2) A node need not have the manager designation (as seen in mmlscluster output) to become a file system manager; nodes with the manager designation are preferred, but one could use mmchmgr to assign the role to a non-manager node (for instance). Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Bryan Banister To: gpfsug main discussion list , "scale at us.ibm.com" Date: 11/06/2017 04:09 PM Subject: RE: [gpfsug-discuss] Callbacks / softQuotaExceeded Hi Simon, It will only trigger the callback on the currently appointed File System Manager, so you need to make sure your callback scripts are installed on all nodes that can occupy this role. HTH, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Monday, November 06, 2017 3:00 PM To: scale at us.ibm.com; gpfsug main discussion list Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Note: External Email Thanks Eric, One other question, when it says it must run on a manager node, I'm assuming that means a manager node in a storage cluster (we multi-cluster clients clusters in). Thanks Simon From: Eric Agar on behalf of "scale at us.ibm.com" < scale at us.ibm.com> Date: Monday, 6 November 2017 at 19:51 To: "gpfsug-discuss at spectrumscale.org" , Simon Thompson Cc: IBM Spectrum Scale Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. 
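Relatedly, a quick sketch of checking or moving the file system manager role described in points (1) and (2) above -- the device and node names are placeholders:

mmlsmgr gpfs0              # show which node currently holds the file system manager role
mmchmgr gpfs0 nsdnode03    # hand the role to a specific node, per point (2)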
Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" < S.J.Thompson at bham.ac.uk> To: "gpfsug-discuss at spectrumscale.org" < gpfsug-discuss at spectrumscale.org> Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Mon Nov 6 23:49:39 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 6 Nov 2017 18:49:39 -0500 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: References: , Message-ID: Placement policy rules "SET POOL 'xyz'... " may only name GPFS data pools. NOT "EXTERNAL POOLs" -- EXTERNAL POOL is a concept only supported by MIGRATE rules. 
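In policy terms the distinction looks roughly like this; the pool names and the interface script path are placeholders:

/* placement rules may only target internal GPFS data pools */
RULE 'place' SET POOL 'fast'

/* an external pool is defined by an interface script and is reachable
   only through MIGRATE rules, which run under mmapplypolicy */
RULE EXTERNAL POOL 'cloud' EXEC '/usr/local/sbin/cloudpool.sh'
RULE 'push' MIGRATE FROM POOL 'fast' TO POOL 'cloud'
    WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30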
However you may be interested in "mmcloudgateway" & co, which is all about combining GPFS with Cloud storage. AKA IBM Transparent Cloud Tiering https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Transparent%20Cloud%20Tiering -------------- next part -------------- An HTML attachment was scrubbed... URL: From mutantllama at gmail.com Tue Nov 7 00:12:11 2017 From: mutantllama at gmail.com (Carl) Date: Tue, 7 Nov 2017 11:12:11 +1100 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almostfull In-Reply-To: References: Message-ID: Thanks to all for the information. Im happy to say that it is close to what I hoped would be the case. Interesting to see the effect of the -n value. Reinforces the need to think about it and not go with the defaults. Thanks again, Carl. On 7 November 2017 at 03:18, Achim Rehor wrote: > I have no practical experience on these numbers, however, Peters > experience below is matching what i learned from Dan years ago. > > As long as the -n setting of the FS (the number of nodes potentially > mounting the fs) is more or less matching the actual number of mounts, > this 99.x % before degradation is expected. If you are far off with that > -n estimate, like having it set to 32, but the actual number of mounts is > in the thousands, > then degradation happens earlier, since the distribution of free blocks in > the allocation maps is not matching the actual setup as good as it could > be. > > Naturally, this depends also on how you do filling of the FS. If it is > only a small percentage of the nodes, doing the creates, then the > distribution can > be 'wrong' as well, and single nodes run earlier out of allocation map > space, and need to look for free blocks elsewhere, costing RPC cycles and > thus performance. > > Putting this in numbers seems quite difficult ;) > > > Mit freundlichen Gr??en / Kind regards > > *Achim Rehor* > > ------------------------------ > > Software Technical Support Specialist AIX/ Emea HPC Support > IBM Certified Advanced Technical Expert - Power Systems with AIX > TSCC Software Service, Dept. 7922 > Global Technology Services > > ------------------------------ > Phone: +49-7034-274-7862 <+49%207034%202747862> IBM Deutschland > E-Mail: Achim.Rehor at de.ibm.com Am Weiher 24 > 65451 Kelsterbach > Germany > > > > ------------------------------ > > IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter > Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Reinhard Reschke, > Dieter Scholz, Gregor Pillen, Ivo Koerner, Christian Noll > Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, > HRB 14562 WEEE-Reg.-Nr. DE 99369940 > > > > > > From: Peter Smith > To: gpfsug main discussion list > Date: 11/06/2017 09:17 AM > Subject: Re: [gpfsug-discuss] Performance of GPFS when filesystem > is almost full > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi Carl. > > When we commissioned our system we ran an NFS stress tool, and filled the > system to the top. > > No performance degradation was seen until it was 99.7% full. > > I believe that after this point it takes longer to find free blocks to > write to. > > YMMV. > > On 6 November 2017 at 03:35, Carl <*mutantllama at gmail.com* > > wrote: > Hi Folk, > > Does anyone have much experience with the performance of GPFS as it > becomes close to full. 
In particular I am referring to split data/meta > data, where the data pool goes over 80% utilisation. > > How much degradation do you see above 80% usage, 90% usage? > > Cheers, > > Carl. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > > > -- > *Peter Smith* ? Senior Systems Engineer > *London* ? New York ? Los Angeles ? Chicago ? Montr?al > T +44 (0)20 7344 8000 <+44%2020%207344%208000> ? M +44 (0)7816 123009 > <+44%20%280%297816%20123009> > *19-23 Wells Street, London W1T 3PQ* > > Twitter ? Facebook > ? framestore.com > > ______________________________ > _________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 7182 bytes Desc: not available URL: From vpuvvada at in.ibm.com Tue Nov 7 07:45:37 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 7 Nov 2017 13:15:37 +0530 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Luke, This issue has been fixed. As a workaround you could you also try resetting the same ACLs at home (instead of cache) or change directory ctime at home and verify that ACLs are updated correctly on fileset root. You can contact customer support or open a PMR and request efix. ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 06:01 PM Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Venkat, This is only for the fileset root. All other files and directories pull the correct ACLs as expected when accessing the fileset as root user, or after setting the correct (missing) ACL on the fileset root. Multiple SS versions from around 4.1 to present. Thanks! Luke. On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, wrote: Is this problem happens only for the fileset root directory ? Could you try accessing the fileset as privileged user after the fileset link and verify if ACLs are set properly ? AFM reads the ACLs from home and sets in the cache automatically during the file/dir lookup. What is the Spectrum Scale version ? ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 03:32 PM Subject: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround: 1. Read the GPFS ACL from the remote directory => store in some file acl.txt 2. Link the AFM fileset to the local filesystem, 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time? Thanks! 
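For completeness, the three workaround steps above in command form -- paths, device and fileset names are placeholders:

mmgetacl -o /tmp/acl.txt /remote/home/projectX           # 1. save the ACL of the home directory
mmlinkfileset gpfs0 projectX -J /gpfs/cache/projectX     # 2. link the AFM cache fileset
mmputacl -i /tmp/acl.txt /gpfs/cache/projectX            # 3. re-apply the saved ACL on the junction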
Luke._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=DkfRGRFLq0tUIu2HH7jpjSmG3Uwh3U1dpU1pqQCcCEc&s=jjWH6js9EaYogD2z76C7uDwY94_2yiavn0fmd7iilKQ&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Tue Nov 7 07:57:46 2017 From: john.hearns at asml.com (John Hearns) Date: Tue, 7 Nov 2017 07:57:46 +0000 Subject: [gpfsug-discuss] Spectrum Scale with NVMe Message-ID: I am looking for anyone with experience of using Spectrum Scale with nvme devices. I could use an offline brain dump... The specific issue I have is with the nsd device discovery and the naming. Before anyone replies, I am gettign excellent support from IBM and have been directed to the correct documentation. I am just looking for any wrinkles or tips that anyone has. Thanks -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at spectrumscale.org Tue Nov 7 09:18:52 2017 From: chair at spectrumscale.org (Spectrum Scale UG Chair (Simon Thompson)) Date: Tue, 07 Nov 2017 09:18:52 +0000 Subject: [gpfsug-discuss] SSUG CIUK Call for Speakers Message-ID: The last Spectrum Scale user group meeting of the year will be taking place as part of the Computing Insights UK (CIUK) event in December. We are currently looking for user speakers to talk about their Spectrum Scale implementation. It doesn't have to be a huge deployment, even just a small couple of nodes cluster, we'd love to hear how you are using Scale and about any challenges and successes you've had with it. If you are interested in speaking, you must be registered to attend CIUK and the user group will be taking place on Tuesday 12th December in the afternoon. 
More details on CIUK and registration at: http://www.stfc.ac.uk/news-events-and-publications/events/general-interest- events/computing-insight-uk/ If you would like to speak, please drop me an email and we can find a slot. Simon From daniel.kidger at uk.ibm.com Tue Nov 7 09:19:24 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Tue, 7 Nov 2017 09:19:24 +0000 Subject: [gpfsug-discuss] Performance of GPFS when filesystem isalmostfull In-Reply-To: Message-ID: I understand that this near linear performance is one of the differentiators of Spectrum Scale. Others with more field experience than me might want to comment on how Lustre and other distributed filesystem perform as they approaches near full capacity. Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales + 44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 7 Nov 2017, at 00:12, Carl wrote: > > Thanks to all for the information. > > Im happy to say that it is close to what I hoped would be the case. > > Interesting to see the effect of the -n value. Reinforces the need to think about it and not go with the defaults. > > Thanks again, > > Carl. > > >> On 7 November 2017 at 03:18, Achim Rehor wrote: >> I have no practical experience on these numbers, however, Peters experience below is matching what i learned from Dan years ago. >> >> As long as the -n setting of the FS (the number of nodes potentially mounting the fs) is more or less matching the actual number of mounts, >> this 99.x % before degradation is expected. If you are far off with that -n estimate, like having it set to 32, but the actual number of mounts is in the thousands, >> then degradation happens earlier, since the distribution of free blocks in the allocation maps is not matching the actual setup as good as it could be. >> >> Naturally, this depends also on how you do filling of the FS. If it is only a small percentage of the nodes, doing the creates, then the distribution can >> be 'wrong' as well, and single nodes run earlier out of allocation map space, and need to look for free blocks elsewhere, costing RPC cycles and thus performance. >> >> Putting this in numbers seems quite difficult ;) >> >> >> Mit freundlichen Gr??en / Kind regards >> Achim Rehor >> >> >> Software Technical Support Specialist AIX/ Emea HPC Support >> <_1_D95FF418D95FEE980059980B852581D0.gif> >> IBM Certified Advanced Technical Expert - Power Systems with AIX >> TSCC Software Service, Dept. 7922 >> Global Technology Services >> Phone: +49-7034-274-7862 IBM Deutschland >> E-Mail: Achim.Rehor at de.ibm.com Am Weiher 24 >> 65451 Kelsterbach >> Germany >> >> >> IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter >> Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Reinhard Reschke, Dieter Scholz, Gregor Pillen, Ivo Koerner, Christian Noll >> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 WEEE-Reg.-Nr. DE 99369940 >> >> >> >> >> >> From: Peter Smith >> To: gpfsug main discussion list >> Date: 11/06/2017 09:17 AM >> Subject: Re: [gpfsug-discuss] Performance of GPFS when filesystem is almost full >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> >> >> >> Hi Carl. >> >> When we commissioned our system we ran an NFS stress tool, and filled the system to the top. >> >> No performance degradation was seen until it was 99.7% full. >> >> I believe that after this point it takes longer to find free blocks to write to. >> >> YMMV. 
>> >> On 6 November 2017 at 03:35, Carl wrote: >> Hi Folk, >> >> Does anyone have much experience with the performance of GPFS as it becomes close to full. In particular I am referring to split data/meta data, where the data pool goes over 80% utilisation. >> >> How much degradation do you see above 80% usage, 90% usage? >> >> Cheers, >> >> Carl. >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> -- >> Peter Smith ? Senior Systems Engineer >> London ? New York ? Los Angeles ? Chicago ? Montr?al >> T +44 (0)20 7344 8000 ? M +44 (0)7816 123009 >> 19-23 Wells Street, London W1T 3PQ >> Twitter? Facebook? framestore.com >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckerner at illinois.edu Tue Nov 7 13:04:41 2017 From: ckerner at illinois.edu (Chad Kerner) Date: Tue, 7 Nov 2017 07:04:41 -0600 Subject: [gpfsug-discuss] Spectrum Scale with NVMe In-Reply-To: <64b6afd8efb34551a319b5d6e311bbfb@CITESHT4.ad.uillinois.edu> References: <64b6afd8efb34551a319b5d6e311bbfb@CITESHT4.ad.uillinois.edu> Message-ID: Hey John, Once you get /var/mmfs/etc/nsddevices set up, it is all straight forward. We have seen times on reboot where the devices were not ready before gpfs started and the file system started with those disks in an offline state. But, that was just a timing issue with the startup. Chad -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign On 11/7/17, John Hearns wrote: > I am looking for anyone with experience of using Spectrum Scale with nvme > devices. > > I could use an offline brain dump... > > > The specific issue I have is with the nsd device discovery and the naming. > > Before anyone replies, I am gettign excellent support from IBM and have been > directed to the correct documentation. > > I am just looking for any wrinkles or tips that anyone has. > > > Thanks > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. 
Neither the sender nor the > company/group of companies he or she represents shall be liable for the > proper and complete transmission of the information contained in this > communication, or for any delay in its receipt. > -- -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign From luke.raimbach at googlemail.com Tue Nov 7 16:24:56 2017 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Tue, 07 Nov 2017 16:24:56 +0000 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Hello Venkat, Thanks for the information. When was the issue fixed? I tried this on the most recent 4.2.3.5 release and was still experiencing the same behaviour. Cheers, Luke. On Tue, 7 Nov 2017 at 08:45 Venkateswara R Puvvada wrote: > Luke, > > This issue has been fixed. As a workaround you could you also try > resetting the same ACLs at home (instead of cache) or change directory > ctime at home and verify that ACLs are updated correctly on fileset root. > You can contact customer support or open a PMR and request efix. > > ~Venkat (vpuvvada at in.ibm.com) > > > > From: Luke Raimbach > To: gpfsug main discussion list > Date: 11/06/2017 06:01 PM > Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi Venkat, > > This is only for the fileset root. All other files and directories pull > the correct ACLs as expected when accessing the fileset as root user, or > after setting the correct (missing) ACL on the fileset root. > > Multiple SS versions from around 4.1 to present. > > Thanks! > Luke. > > > On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, <*vpuvvada at in.ibm.com* > > wrote: > > Is this problem happens only for the fileset root directory ? Could you > try accessing the fileset as privileged user after the fileset link and > verify if ACLs are set properly ? AFM reads the ACLs from home and sets in > the cache automatically during the file/dir lookup. What is the Spectrum > Scale version ? > > ~Venkat (*vpuvvada at in.ibm.com* ) > > > > From: Luke Raimbach <*luke.raimbach at googlemail.com* > > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 11/06/2017 03:32 PM > Subject: [gpfsug-discuss] ACLs on AFM Filesets > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > Dear SpectrumScale Experts, > > > When creating an IW cache view of a directory in a remote GPFS filesystem, > I prepare the AFM "home" directory using 'mmafmconfig enable ' > command. > > I wish the cache fileset junction point to inherit the ACL for the home > directory when I link it to the filesystem. > > Currently I'm using a flimsy workaround: > > 1. Read the GPFS ACL from the remote directory => store in some file > acl.txt > > 2. Link the AFM fileset to the local filesystem, > > 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i > acl.txt > > Is there a way for the local cache fileset to automatically inherit/clone > the remote directory's ACL, e.g. at mmlinkfileset time? > > > > Thanks! 
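Returning to the /var/mmfs/etc/nsddevices point from the NVMe thread above, a rough sketch of such a user exit -- the device name pattern is an assumption, and /usr/lpp/mmfs/samples/nsddevices.sample is the better starting point:

#!/bin/ksh
# list NVMe namespaces as candidate NSD devices ("deviceName deviceType" pairs)
for dev in $( ls /dev | egrep 'nvme[0-9]+n[0-9]+' )
do
    echo $dev generic
done
# return 0 to use only the devices listed above and bypass the built-in discovery,
# return 1 to run the normal GPFS device discovery as well
return 1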
> > Luke._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e=* > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > _______________________________________________ > > > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=DkfRGRFLq0tUIu2HH7jpjSmG3Uwh3U1dpU1pqQCcCEc&s=jjWH6js9EaYogD2z76C7uDwY94_2yiavn0fmd7iilKQ&e= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex at calicolabs.com Tue Nov 7 17:50:54 2017 From: alex at calicolabs.com (Alex Chekholko) Date: Tue, 7 Nov 2017 09:50:54 -0800 Subject: [gpfsug-discuss] Performance of GPFS when filesystem isalmostfull In-Reply-To: References: Message-ID: One of the parameters that you need to choose at filesystem creation time is the block allocation type. -j {cluster|scatter} parameter to mmcrfs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1ins_blkalmap.htm#ballmap If you use "cluster", you will have quite high performance when the filesystem is close to empty. If you use "scatter", the performance will stay the same no matter the filesystem utilization because blocks for a given file will always be scattered randomly. Some vendors set up their GPFS filesystem using '-j cluster' and then show off their streaming write performance numbers. But the performance degrades considerably as the filesystem fills up. With "scatter", the filesystem performance is slower but stays consistent throughout its lifetime. On Tue, Nov 7, 2017 at 1:19 AM, Daniel Kidger wrote: > I understand that this near linear performance is one of the > differentiators of Spectrum Scale. > Others with more field experience than me might want to comment on how > Lustre and other distributed filesystem perform as they approaches near > full capacity. > > Daniel > [image: /spectrum_storage-banne] > > > [image: Spectrum Scale Logo] > > > *Dr Daniel Kidger* > IBM Technical Sales Specialist > Software Defined Solution Sales > > + <+%2044-7818%20522%20266> 44-(0)7818 522 266 <+%2044-7818%20522%20266> > daniel.kidger at uk.ibm.com > > On 7 Nov 2017, at 00:12, Carl wrote: > > Thanks to all for the information. > > Im happy to say that it is close to what I hoped would be the case. > > Interesting to see the effect of the -n value. Reinforces the need to > think about it and not go with the defaults. > > Thanks again, > > Carl. > > > On 7 November 2017 at 03:18, Achim Rehor wrote: > >> I have no practical experience on these numbers, however, Peters >> experience below is matching what i learned from Dan years ago. 
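Picking up the -j cluster|scatter and -n points in this thread: -j is fixed at file system creation time and -n is best estimated realistically up front. A sketch, with device, stanza file and values as placeholders:

mmcrfs gpfs0 -F nsd.stanza -j scatter -n 1200
mmlsfs gpfs0 -j -n    # check what an existing file system was created with

scatter trades peak streaming speed on an empty file system for performance that stays flat as it fills, and -n should reflect the realistic number of mounting nodes, including remote clusters.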
>> >> As long as the -n setting of the FS (the number of nodes potentially >> mounting the fs) is more or less matching the actual number of mounts, >> this 99.x % before degradation is expected. If you are far off with that >> -n estimate, like having it set to 32, but the actual number of mounts is >> in the thousands, >> then degradation happens earlier, since the distribution of free blocks >> in the allocation maps is not matching the actual setup as good as it could >> be. >> >> Naturally, this depends also on how you do filling of the FS. If it is >> only a small percentage of the nodes, doing the creates, then the >> distribution can >> be 'wrong' as well, and single nodes run earlier out of allocation map >> space, and need to look for free blocks elsewhere, costing RPC cycles and >> thus performance. >> >> Putting this in numbers seems quite difficult ;) >> >> >> Mit freundlichen Gr??en / Kind regards >> >> *Achim Rehor* >> >> ------------------------------ >> >> Software Technical Support Specialist AIX/ Emea HPC Support >> <_1_D95FF418D95FEE980059980B852581D0.gif> >> IBM Certified Advanced Technical Expert - Power Systems with AIX >> TSCC Software Service, Dept. 7922 >> Global Technology Services >> >> ------------------------------ >> Phone: +49-7034-274-7862 <+49%207034%202747862> IBM Deutschland >> E-Mail: Achim.Rehor at de.ibm.com Am Weiher 24 >> 65451 Kelsterbach >> Germany >> >> >> >> ------------------------------ >> >> IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter >> Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Reinhard Reschke, >> Dieter Scholz, Gregor Pillen, Ivo Koerner, Christian Noll >> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, >> HRB 14562 WEEE-Reg.-Nr. DE 99369940 >> >> >> >> >> >> From: Peter Smith >> To: gpfsug main discussion list >> Date: 11/06/2017 09:17 AM >> Subject: Re: [gpfsug-discuss] Performance of GPFS when filesystem >> is almost full >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> ------------------------------ >> >> >> >> Hi Carl. >> >> When we commissioned our system we ran an NFS stress tool, and filled the >> system to the top. >> >> No performance degradation was seen until it was 99.7% full. >> >> I believe that after this point it takes longer to find free blocks to >> write to. >> >> YMMV. >> >> On 6 November 2017 at 03:35, Carl <*mutantllama at gmail.com* >> > wrote: >> Hi Folk, >> >> Does anyone have much experience with the performance of GPFS as it >> becomes close to full. In particular I am referring to split data/meta >> data, where the data pool goes over 80% utilisation. >> >> How much degradation do you see above 80% usage, 90% usage? >> >> Cheers, >> >> Carl. >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* >> >> >> >> >> >> -- >> *Peter Smith* ? Senior Systems Engineer >> *London* ? New York ? Los Angeles ? Chicago ? Montr?al >> T +44 (0)20 7344 8000 <+44%2020%207344%208000> ? M +44 (0)7816 123009 >> <+44%20%280%297816%20123009> >> *19-23 Wells Street, London W1T 3PQ* >> >> Twitter >> ? >> Facebook >> ? 
>> framestore.com >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Wed Nov 8 05:16:02 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Wed, 8 Nov 2017 10:46:02 +0530 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Luke, There are two issues here. ACLs are not updated on fileset root and other one is that ACLs get updated only when the files/dirs are accessed as root user. Fix for the later one is already part of 4.2.3.5. First issue was fixed after your email, you could request efix on top of 4.2.3.5. First issue will get corrected automatically when ctime is changed on target path at home. ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/07/2017 09:55 PM Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello Venkat, Thanks for the information. When was the issue fixed? I tried this on the most recent 4.2.3.5 release and was still experiencing the same behaviour. Cheers, Luke. On Tue, 7 Nov 2017 at 08:45 Venkateswara R Puvvada wrote: Luke, This issue has been fixed. As a workaround you could you also try resetting the same ACLs at home (instead of cache) or change directory ctime at home and verify that ACLs are updated correctly on fileset root. You can contact customer support or open a PMR and request efix. ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 06:01 PM Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Venkat, This is only for the fileset root. All other files and directories pull the correct ACLs as expected when accessing the fileset as root user, or after setting the correct (missing) ACL on the fileset root. Multiple SS versions from around 4.1 to present. Thanks! Luke. On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, wrote: Is this problem happens only for the fileset root directory ? Could you try accessing the fileset as privileged user after the fileset link and verify if ACLs are set properly ? AFM reads the ACLs from home and sets in the cache automatically during the file/dir lookup. What is the Spectrum Scale version ? ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 03:32 PM Subject: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. 
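Until an efix is applied, the first workaround Venkat mentions (re-setting the same ACL at home) looks roughly like this -- the path is a placeholder:

mmgetacl -o /tmp/home_acl.txt /gpfs/home/projectX
mmputacl -i /tmp/home_acl.txt /gpfs/home/projectX    # same ACL re-applied at home; the cache should pick up the fileset-root ACL on the next revalidation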
I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround: 1. Read the GPFS ACL from the remote directory => store in some file acl.txt 2. Link the AFM fileset to the local filesystem, 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time? Thanks! Luke._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=DkfRGRFLq0tUIu2HH7jpjSmG3Uwh3U1dpU1pqQCcCEc&s=jjWH6js9EaYogD2z76C7uDwY94_2yiavn0fmd7iilKQ&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=cbhhdq1uD9_Nmxeh3mRCS0Ic8vc_ts_4uvqXce4DdVc&s=WdJzTgnFn-ApJUW579JhxBPfnVqJ2L3z4x2AJybiVto&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.chase at metoffice.gov.uk Wed Nov 8 15:50:52 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Wed, 8 Nov 2017 15:50:52 +0000 Subject: [gpfsug-discuss] Default placement/External Pool Message-ID: Hello! A follow up to my previous question about automatically sending files to Amazon s3 as they arrive in GPFS. I have created an interface script to manage Amazon s3 storage as an external pool, I have created a migration policy that pre-migrates all files to the external pool and I have set that as the default policy for the file system. All good so far, but the problem I'm now facing is: Only some of the cluster nodes have access to Amazon due to network constraints. I read the statement "The mmapplypolicy command invokes the external pool script on all nodes in the cluster that have installed the script in its designated location."[1] and thought, 'Great! I'll only install the script on nodes that have access to Amazon' but that appears not to work for a placement policy/default policy and instead, the script runs on precisely no nodes. I assumed this happened because running the script on a non-Amazon facing node resulted in a horrible error (i.e. file not found), so I edited my script to return a non-zero response if being run on a node that isn't in my cloudNode class, then installed the script every where. But this appears to have had no effect what-so-ever. 
The only thing I can think of now is to control where a migration policy runs based on node class. But I don't know how to do that, or if it's possible, or where the documentation might be as I can't find any. Any assistance would once again be greatly appreciated. [1]=https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_impstorepool.htm Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk From Robert.Oesterlin at nuance.com Wed Nov 8 16:02:04 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 8 Nov 2017 16:02:04 +0000 Subject: [gpfsug-discuss] Default placement/External Pool Message-ID: Hi Peter mmapplypolicy has a "-N" parameter that should restrict it to a subset of nodes or node class if you define that. -N {all | mount | Node[,Node...] | NodeFile | NodeClass} Specifies the list of nodes that will run parallel instances of policy code in the GPFS home cluster. This command supports all defined node classes. The default is to run on the node where the mmapplypolicy command is running or the current value of the defaultHelperNodes parameter of the mmchconfig command. Bob Oesterlin Sr Principal Storage Engineer, Nuance ?On 11/8/17, 9:55 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Chase, Peter" wrote: The only thing I can think of now is to control where a migration policy runs based on node class. But I don't know how to do that, or if it's possible, or where the documentation might be as I can't find any. Any assistance would once again be greatly appreciated. From makaplan at us.ibm.com Wed Nov 8 19:21:19 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 8 Nov 2017 14:21:19 -0500 Subject: [gpfsug-discuss] Default placement/External Pool In-Reply-To: References: Message-ID: Peter, 1. to best exploit and integrate both Spectrum Scale and Cloud Storage, please consider: https://www.ibm.com/blogs/systems/spectrum-scale-transparent-cloud-tiering/ 2. Yes, you can use mmapplypolicy to push copies of files to an "external" system. But you'll probably need a strategy or technique to avoid redundantly pushing the "next time" you run the command... 3. Regarding mmapplypolicy nitty-gritty: you can use the -N option to say exactly which nodes you want to run the command. And regarding using ... EXTERNAL ... EXEC 'myscript' You can further restrict which nodes will act as mmapplypolicy "helpers" -- If on a particular node x, 'myscript' does not exist OR myscript TEST returns a non-zero exit code then node x will be excluded.... You will see a message like this: [I] Messages tagged with <3> are from node n3. <3> [E:73] Error on system(/ghome/makaplan/policies/mynodes.sh TEST '/foo/bar5' 2>&1) <3> [W] EXEC '/ghome/makaplan/policies/mynodes.sh' of EXTERNAL POOL or LIST 'x' fails TEST with code 73 on this node. OR [I] Messages tagged with <5> are from node n4. <5> sh: /tmp/mynodes.sh: No such file or directory <5> [E:127] Error on system(/tmp/mynodes.sh TEST '/foo/bar5' 2>&1) <5> [W] EXEC '/tmp/mynodes.sh' of EXTERNAL POOL or LIST 'x' fails TEST with code 127 on this node. From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/08/2017 10:51 AM Subject: [gpfsug-discuss] Default placement/External Pool Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello! A follow up to my previous question about automatically sending files to Amazon s3 as they arrive in GPFS. 
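Putting Bob's -N suggestion and Marc's point 3 together, a sketch for this case -- the policy file name and the marker-file check are placeholders for whatever identifies the Amazon-facing nodes, while cloudNode is the class named above:

mmapplypolicy gpfs0 -P /var/mmfs/etc/s3policy.pol -N cloudNode

and inside the external-pool interface script, a TEST guard so that nodes without S3 access exclude themselves even when the script is installed everywhere:

#!/bin/bash
# hypothetical interface script fragment; $1 is the operation mmapplypolicy calls it with
if [ "$1" = "TEST" ]; then
    [ -f /etc/gpfs-cloudnode ] || exit 1    # no marker file: refuse to act as a helper on this node
    exit 0
fi
# ... handling for MIGRATE/PURGE/LIST operations follows ...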
I have created an interface script to manage Amazon s3 storage as an external pool, I have created a migration policy that pre-migrates all files to the external pool and I have set that as the default policy for the file system. All good so far, but the problem I'm now facing is: Only some of the cluster nodes have access to Amazon due to network constraints. I read the statement "The mmapplypolicy command invokes the external pool script on all nodes in the cluster that have installed the script in its designated location."[1] and thought, 'Great! I'll only install the script on nodes that have access to Amazon' but that appears not to work for a placement policy/default policy and instead, the script runs on precisely no nodes. I assumed this happened because running the script on a non-Amazon facing node resulted in a horrible error (i.e. file not found), so I edited my script to return a non-zero response if being run on a node that isn't in my cloudNode class, then installed the script every where. But this appears to have had no effect what-so-ever. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbyb at us.ibm.com Wed Nov 8 20:39:54 2017 From: robbyb at us.ibm.com (Rob Basham) Date: Wed, 8 Nov 2017 20:39:54 +0000 Subject: [gpfsug-discuss] Default placement/External Pool In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 06:22:46 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 07:22:46 +0100 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Message-ID: Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From andreas.koeninger at de.ibm.com Fri Nov 10 10:06:19 2017 From: andreas.koeninger at de.ibm.com (Andreas Koeninger) Date: Fri, 10 Nov 2017 10:06:19 +0000 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 10:21:01 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 11:21:01 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Hi Andreas, the version of the GUI and the other packages are the following: gpfs.gui-4.2.3-0.noarch Yes, the collector is running locally on the GUI-Node and it is only one collector configured. 
The oupt of your command: [root at tower-daemon ~]# echo "get metrics cpu_user last 10 bucket_size 60" | /opt/IBM/zimon/zc 127.0.0.1 1: resolve1|CPU|cpu_user 2: resolve2|CPU|cpu_user 3: sbc-162150007|CPU|cpu_user 4: sbc-162150069|CPU|cpu_user 5: sbc-162150071|CPU|cpu_user 6: sbtl-176173009|CPU|cpu_user 7: tower-daemon|CPU|cpu_user Row Timestamp cpu_user cpu_user cpu_user cpu_user cpu_user cpu_user cpu_user 1 2017-11-10 11:06:00 2.525333 0.151667 0.854333 0.826833 0.836333 0.273833 0.800167 2 2017-11-10 11:07:00 3.052000 0.156833 0.964833 0.946833 0.881833 0.308167 0.896667 3 2017-11-10 11:08:00 4.267167 0.150500 1.134833 1.224833 1.063167 0.300333 0.855333 4 2017-11-10 11:09:00 4.505333 0.149833 1.155333 1.127667 1.098167 0.324500 0.822000 5 2017-11-10 11:10:00 4.023167 0.145667 1.136500 1.079500 1.016000 0.269000 0.836667 6 2017-11-10 11:11:00 2.127167 0.150333 0.903167 0.854833 0.798500 0.280833 0.854500 7 2017-11-10 11:12:00 4.210000 0.151167 0.877833 0.847167 0.836000 0.312500 1.110333 8 2017-11-10 11:13:00 14.388333 0.151000 1.009667 0.986167 0.950333 0.277167 0.814333 9 2017-11-10 11:14:00 18.513167 0.153167 1.048000 0.941333 0.949667 0.282833 0.808333 10 2017-11-10 11:15:00 1.613571 0.149063 0.789630 0.650741 0.826296 0.273333 0.676296 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ [root at tower-daemon ~]# psql postgres postgres -c "select os_host_name from fscc.node;" os_host_name ---------------------- tower sbtl-176173009-admin sbc-162150071-admin sbc-162150069-admin sbc-162150007-admin resolve1-admin resolve2-admin (7rows) The output seems to be ok. Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 11:06 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, 1.) Which GUI version are you running? 2.) Is the Collector running locally on the GUI? 3.) Is there more than one collector configured? 4.) Run the following command on the collector node to verify that there's data in the collector: > echo "get metrics cpu_user last 10 bucket_size 60" | /opt/IBM/zimon/zc 127.0.0.1 5.) 
Run the following command on the GUI node to verify which host name the GUI uses to query the performance data: psql postgres postgres -c "select os_host_name from fscc.node;" Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: Matthias.Knigge at rohde-schwarz.com Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 7:23 AM Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=TAzwoRuPR6uYNk_NNemAQPqsxILnSGfc34j4dabTVC0&s=OR8cwq9jfa_GaqXM00kDYFvhoIqPrKR5LT2Anpas3XA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 10:54:17 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 11:54:17 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Some more information: Only the GUI-Node is running on CentOS 7. The Clients are running on CentOS 6.x and RHEL 6.x. Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 11:06 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, 1.) Which GUI version are you running? 2.) Is the Collector running locally on the GUI? 3.) Is there more than one collector configured? 4.) Run the following command on the collector node to verify that there's data in the collector: > echo "get metrics cpu_user last 10 bucket_size 60" | /opt/IBM/zimon/zc 127.0.0.1 5.) 
Run the following command on the GUI node to verify which host name the GUI uses to query the performance data: psql postgres postgres -c "select os_host_name from fscc.node;" Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: Matthias.Knigge at rohde-schwarz.com Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 7:23 AM Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=TAzwoRuPR6uYNk_NNemAQPqsxILnSGfc34j4dabTVC0&s=OR8cwq9jfa_GaqXM00kDYFvhoIqPrKR5LT2Anpas3XA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil.wilson at metoffice.gov.uk Fri Nov 10 11:19:55 2017 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Fri, 10 Nov 2017 11:19:55 +0000 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the "hostname" field (it's blank by default) of the pmsensors cfg file on that node and restarted pmsensors - the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It's probably not the same for you, but might be worth trying out. 
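A minimal sketch of that workaround, with an invented FQDN and assuming the sensor configuration is the usual /opt/IBM/zimon/ZIMonSensors.cfg (check your own install before copying anything):

# /opt/IBM/zimon/ZIMonSensors.cfg (excerpt) -- the field is empty by default
hostname = "client01.data.example.com"

# pick up the change on that node
systemctl restart pmsensors

# then confirm what the collector now reports for that node
echo "get metrics cpu_user last 5 bucket_size 60" | /opt/IBM/zimon/zc 127.0.0.1

The host names in the header of the zc output should then line up with the os_host_name values the GUI keeps in its fscc.node table.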
Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From andreas.koeninger at de.ibm.com Fri Nov 10 12:07:26 2017 From: andreas.koeninger at de.ibm.com (Andreas Koeninger) Date: Fri, 10 Nov 2017 12:07:26 +0000 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: Message-ID: An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 12:34:18 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 13:34:18 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Hi Andreas, hi Neil, the GUI-Node returned a hostname with a FQDN. The clients have no FQDN. Thanks for this tip. I will change the hostname in the first step. If this does not help then I will change the configuration files. I will give you feedback in the next week! Thanks, Matthias Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 13:07 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, what's "hostname" returning on your nodes? 1.) If it is not the one that the GUI has in it's database you can force a refresh by executing the below command on the GUI node: /usr/lpp/mmfs/gui/cli/runtask OS_DETECT --debug 2.) 
If it is not the one that's shown in the returned performance data you have to restart the pmsensor service on the nodes: systemctl restart pmsensors Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: "Wilson, Neil" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 12:20 PM Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the ?hostname? field (it?s blank by default) of the pmsensors cfg file on that node and restarted pmsensors ? the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It?s probably not the same for you, but might be worth trying out. Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=r2ldt2133nWuT-SD27LvI8nFqC4Kx7f47sYAeLaZH84&s=yDy0znk3CG9PZuuQ9yi81wOOwc48Aw8WbMvOjzW_uZI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Nov 10 13:25:40 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 10 Nov 2017 13:25:40 +0000 Subject: [gpfsug-discuss] Spectrum Scale with NVMe In-Reply-To: References: <64b6afd8efb34551a319b5d6e311bbfb@CITESHT4.ad.uillinois.edu> Message-ID: Chad, Thankyou for the reply. 
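For anyone setting this up from scratch, the /var/mmfs/etc/nsddevices user exit that comes up in the quoted reply below is a small shell script consulted during NSD device discovery; it prints one "device deviceType" pair per line. A rough sketch for NVMe namespaces follows, modeled on the shipped example under /usr/lpp/mmfs/samples (device names and the glob are illustrative, adjust to your hardware):

#!/bin/ksh
# /var/mmfs/etc/nsddevices -- consulted by the GPFS device discovery
osName=$(/bin/uname -s)
if [[ $osName = Linux ]]
then
    # report each NVMe namespace as "name type", names relative to /dev
    for dev in /dev/nvme*n1
    do
        [[ -b $dev ]] && echo "${dev#/dev/} generic"
    done
fi
# return 0 = use only the devices listed above;
# return 1 = also run the built-in GPFS device discovery
return 0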
Indded I had that issue - I only noticed because I looked at the utisation of the NSDs and a set of them were not being filled with data... A set which were coincidentally all connected to the same server (me whistles innocently....) -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Chad Kerner Sent: Tuesday, November 07, 2017 2:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Spectrum Scale with NVMe Hey John, Once you get /var/mmfs/etc/nsddevices set up, it is all straight forward. We have seen times on reboot where the devices were not ready before gpfs started and the file system started with those disks in an offline state. But, that was just a timing issue with the startup. Chad -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign On 11/7/17, John Hearns wrote: > I am looking for anyone with experience of using Spectrum Scale with > nvme devices. > > I could use an offline brain dump... > > > The specific issue I have is with the nsd device discovery and the naming. > > Before anyone replies, I am gettign excellent support from IBM and > have been directed to the correct documentation. > > I am just looking for any wrinkles or tips that anyone has. > > > Thanks > > -- The information contained in this communication and any attachments > is confidential and may be privileged, and is for the sole use of the > intended recipient(s). Any unauthorized review, use, disclosure or > distribution is prohibited. Unless explicitly stated otherwise in the > body of this communication or the attachment thereto (if any), the > information is provided on an AS-IS basis without any express or > implied warranties or liabilities. To the extent you are relying on > this information, you are doing so at your own risk. If you are not > the intended recipient, please notify the sender immediately by > replying to this message and destroy all copies of this message and > any attachments. Neither the sender nor the company/group of companies > he or she represents shall be liable for the proper and complete > transmission of the information contained in this communication, or for any delay in its receipt. > -- -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml.com%7Ce3875dc1def842e88ee308d525e01e80%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=pwti5NtVf7c4SClTUc1PWNz5YW4QHWjM5%2F%2BGLdYHoqQ%3D&reserved=0 -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. 
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. From peter.chase at metoffice.gov.uk Fri Nov 10 16:18:36 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Fri, 10 Nov 2017 16:18:36 +0000 Subject: [gpfsug-discuss] Specifying nodes in commands Message-ID: Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk From stockf at us.ibm.com Fri Nov 10 16:41:19 2017 From: stockf at us.ibm.com (Frederick Stock) Date: Fri, 10 Nov 2017 11:41:19 -0500 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: How do you determine if mmapplypolicy is running on a node? Normally mmapplypolicy as a process runs on a single node but its helper processes, policy-help or something similar, run on all the nodes which are referenced by the -N option. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/10/2017 11:18 AM Subject: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? 
United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=spXDnba2A_tVauiszV7sXhSkn6GeEljABN4lUEB4f8s&s=1Hd1SNkXtfLRcirmeRfg1JuAERuhbyiVqsLEdYlhFsM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Fri Nov 10 16:42:28 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 10 Nov 2017 11:42:28 -0500 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: mmapplypolicy ... -N nodeClass ... will use the nodes in nodeClass as helper nodes to get its work done. mmdsh -N nodeClass command ... will run the SAME command on each of the nodes -- probably not what you want to do with mmapplypolicy. To see more about what mmapplypolicy is doing use options -d 1 (debug info) If you are using -N because you have a lot of files to process, you should also use -g /some-gpfs-temp-directory (see doc) If you are running a small test case, it may happen that you don't see the helper nodes doing anything, because there's not enough time and work to get them going... For test purposes you can coax the helper nodes into action with: options -B 1 -m 1 so that each helper node only does one file at a time. From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/10/2017 11:18 AM Subject: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=WjhVVKkS23BlFGP2KHmkndM0AZ4yB2aC81UUHv8iIZs&s=-dPme1SlhBAqo45xVmtvVWNeAjumd7JrtEksW1U8o5w&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.chase at metoffice.gov.uk Fri Nov 10 17:15:55 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Fri, 10 Nov 2017 17:15:55 +0000 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: Hi Frederick, The ILM active policy (set by mmchpolicy) has an external list rule, the command for the external list runs the mmapplypolicy command. 
/gpfs1/s3upload/policies/migration.policy has external pool & a migration rule in it. The handler script for the external pool writes the hostname of the server running it out to a file, so that's how I'm trapping which server is running the policy, and that mmapplypolicy is being run. Hope that explains things, if not let me know and I'll have another try :) Regards, Peter -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of gpfsug-discuss-request at spectrumscale.org Sent: 10 November 2017 16:43 To: gpfsug-discuss at spectrumscale.org Subject: gpfsug-discuss Digest, Vol 70, Issue 32 Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: Spectrum Scale with NVMe (John Hearns) 2. Specifying nodes in commands (Chase, Peter) 3. Re: Specifying nodes in commands (Frederick Stock) 4. Re: Specifying nodes in commands (Marc A Kaplan) ---------------------------------------------------------------------- Message: 1 Date: Fri, 10 Nov 2017 13:25:40 +0000 From: John Hearns To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Spectrum Scale with NVMe Message-ID: Content-Type: text/plain; charset="us-ascii" Chad, Thankyou for the reply. Indded I had that issue - I only noticed because I looked at the utisation of the NSDs and a set of them were not being filled with data... A set which were coincidentally all connected to the same server (me whistles innocently....) -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Chad Kerner Sent: Tuesday, November 07, 2017 2:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Spectrum Scale with NVMe Hey John, Once you get /var/mmfs/etc/nsddevices set up, it is all straight forward. We have seen times on reboot where the devices were not ready before gpfs started and the file system started with those disks in an offline state. But, that was just a timing issue with the startup. Chad -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign On 11/7/17, John Hearns wrote: > I am looking for anyone with experience of using Spectrum Scale with > nvme devices. > > I could use an offline brain dump... > > > The specific issue I have is with the nsd device discovery and the naming. > > Before anyone replies, I am gettign excellent support from IBM and > have been directed to the correct documentation. > > I am just looking for any wrinkles or tips that anyone has. > > > Thanks > > -- The information contained in this communication and any attachments > is confidential and may be privileged, and is for the sole use of the > intended recipient(s). Any unauthorized review, use, disclosure or > distribution is prohibited. 
Unless explicitly stated otherwise in the > body of this communication or the attachment thereto (if any), the > information is provided on an AS-IS basis without any express or > implied warranties or liabilities. To the extent you are relying on > this information, you are doing so at your own risk. If you are not > the intended recipient, please notify the sender immediately by > replying to this message and destroy all copies of this message and > any attachments. Neither the sender nor the company/group of companies > he or she represents shall be liable for the proper and complete > transmission of the information contained in this communication, or for any delay in its receipt. > -- -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml.com%7Ce3875dc1def842e88ee308d525e01e80%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=pwti5NtVf7c4SClTUc1PWNz5YW4QHWjM5%2F%2BGLdYHoqQ%3D&reserved=0 -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ------------------------------ Message: 2 Date: Fri, 10 Nov 2017 16:18:36 +0000 From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Subject: [gpfsug-discuss] Specifying nodes in commands Message-ID: Content-Type: text/plain; charset="iso-8859-1" Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? 
United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk ------------------------------ Message: 3 Date: Fri, 10 Nov 2017 11:41:19 -0500 From: "Frederick Stock" To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Specifying nodes in commands Message-ID: Content-Type: text/plain; charset="iso-8859-1" How do you determine if mmapplypolicy is running on a node? Normally mmapplypolicy as a process runs on a single node but its helper processes, policy-help or something similar, run on all the nodes which are referenced by the -N option. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/10/2017 11:18 AM Subject: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=spXDnba2A_tVauiszV7sXhSkn6GeEljABN4lUEB4f8s&s=1Hd1SNkXtfLRcirmeRfg1JuAERuhbyiVqsLEdYlhFsM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 4 Date: Fri, 10 Nov 2017 11:42:28 -0500 From: "Marc A Kaplan" To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Specifying nodes in commands Message-ID: Content-Type: text/plain; charset="iso-8859-1" mmapplypolicy ... -N nodeClass ... will use the nodes in nodeClass as helper nodes to get its work done. mmdsh -N nodeClass command ... will run the SAME command on each of the nodes -- probably not what you want to do with mmapplypolicy. To see more about what mmapplypolicy is doing use options -d 1 (debug info) If you are using -N because you have a lot of files to process, you should also use -g /some-gpfs-temp-directory (see doc) If you are running a small test case, it may happen that you don't see the helper nodes doing anything, because there's not enough time and work to get them going... For test purposes you can coax the helper nodes into action with: options -B 1 -m 1 so that each helper node only does one file at a time. From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/10/2017 11:18 AM Subject: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, I'm running a script triggered from an ILM external list rule. 
The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=WjhVVKkS23BlFGP2KHmkndM0AZ4yB2aC81UUHv8iIZs&s=-dPme1SlhBAqo45xVmtvVWNeAjumd7JrtEksW1U8o5w&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 70, Issue 32 ********************************************** From peter.chase at metoffice.gov.uk Mon Nov 13 11:14:56 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Mon, 13 Nov 2017 11:14:56 +0000 Subject: [gpfsug-discuss] Specifying nodes in commands Message-ID: Hi Marc, Thanks for your response, there's some handy advice in there that I'll look at further. I'm still struggling a bit with mmapplypolicy and it's -N option. I've changed my external list command to point at a script, that script looks for "LIST" as the first argument, and runs "/usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -d 1 -N cloudNode -P /gpfs1/s3upload/policies/migration.policy >>/gpfs1/s3upload/external-list.log 2>&1". If the script is run from the command line on a node that's not in cloudNode class it works without issue and uses nodes in the cloudNode class as helpers, but if the script is called from the active policy, mmapplypolicy runs, but seems to ignore the -N and doesn't use the cloudNode nodes as helpers and instead seems to run locally (from which ever node started the active policy). So now my questions is: why does the -N option appear to be honoured when run from the command line, but not appear to be honoured when triggered by the active policy? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk From makaplan at us.ibm.com Mon Nov 13 17:44:23 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 13 Nov 2017 12:44:23 -0500 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: My guess is you have some expectation of how things "ought to be" that does not match how things actually are. 
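To make the moving parts concrete, here is a bare-bones sketch of the pattern being debugged in this thread -- one EXTERNAL LIST rule plus an interface script that only consumes the file lists mmapplypolicy hands it. All names, paths and thresholds below are invented for illustration:

/* e.g. /gpfs1/s3upload/policies/list.policy */
RULE EXTERNAL LIST 'toCloud' EXEC '/gpfs1/s3upload/scripts/handler.sh'
RULE 'pick' LIST 'toCloud' WHERE (DAYS(CURRENT_TIMESTAMP)-DAYS(ACCESS_TIME)) > 30

#!/bin/bash
# /gpfs1/s3upload/scripts/handler.sh -- invoked by mmapplypolicy itself, including on the -N helper nodes
op=$1        # TEST or LIST for an external list rule
filelist=$2  # file holding a batch of records, one selected file per line, ending in -- pathname
case $op in
    TEST) exit 0 ;;                                              # this node can do the work
    LIST) wc -l < "$filelist" >> /gpfs1/s3upload/handler.log ;;  # process the batch; never call mmapplypolicy from here
esac
exit 0

The whole thing is then driven by a single invocation, for example:

mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/list.policy -N cloudNode -g /gpfs1/tmp -L 1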
If you haven't already done so, put some diagnostics into your script, such as env hostname echo "my args are: $*" And run mmapplypolicy with an explicit node list: mmapplypolicy /some/small-set-of-files -P /mypolicyfile -N node1,node2,node3 -I test -L 1 -d 1 And see how things go Hmmm... reading your post again... It seems perhaps you've got some things out of order or again, incorrect expectations or model of how the this world works... mmapplypolicy reads your policy rules and scans the files and calls the script(s) you've named in the EXEC options of your EXTERNAL rules The scripts are expected to process file lists -- NOT call mmapplypolicy again... Refer to examples in the documentation, and in samples/ilm - and try them! --marc From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/13/2017 06:15 AM Subject: Re: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Marc, Thanks for your response, there's some handy advice in there that I'll look at further. I'm still struggling a bit with mmapplypolicy and it's -N option. I've changed my external list command to point at a script, that script looks for "LIST" as the first argument, and runs "/usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -d 1 -N cloudNode -P /gpfs1/s3upload/policies/migration.policy >>/gpfs1/s3upload/external-list.log 2>&1". If the script is run from the command line on a node that's not in cloudNode class it works without issue and uses nodes in the cloudNode class as helpers, but if the script is called from the active policy, mmapplypolicy runs, but seems to ignore the -N and doesn't use the cloudNode nodes as helpers and instead seems to run locally (from which ever node started the active policy). So now my questions is: why does the -N option appear to be honoured when run from the command line, but not appear to be honoured when triggered by the active policy? Regards, Peter Chase GPCS Team Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=tNW4WqkmstX3B3t1dvbenDx32bw3S1FQ4BrpLrs1r4o&s=CBzS6KRLe_hQhI4zpeeuvNaYdraGbc7cCV-JTvCgDcM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From damir.krstic at gmail.com Mon Nov 13 20:49:07 2017 From: damir.krstic at gmail.com (Damir Krstic) Date: Mon, 13 Nov 2017 20:49:07 +0000 Subject: [gpfsug-discuss] verbsRdmaSend yes or no Message-ID: I am missing out on SC17 this year because of some instability with our 2 ESS storage arrays. We have just recently upgraded our ESS to 5.2 and we have a question about verbRdmaSend setting. Per IBM and GPFS guidelines for a large cluster, we have this setting off on all compute nodes. We were able to turn it off on ESS 1 (IO1 and IO2). However, IBM was unable to turn it off on ESS 2 (IO3 and IO4). ESS 1 has following filesystem: projects (1PB) ESS 2 has following filesystems: home and hpc All our client nodes have this setting off. So the question is, should we push through and get it disabled on IO3 and IO4 so that we are consistent across the environment? I assume the answer is yes. 
But I would also like to know what the impact is of leaving it enabled on IO3 and IO4. Thank you. Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 14 10:16:44 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 14 Nov 2017 10:16:44 +0000 Subject: [gpfsug-discuss] Backing up GPFS config Message-ID: All, A few months ago someone posted to the list all the commands they run to back up their GPFS configuration. Including mmlsfileset -L, the output of mmlsconfig etc, so that in the event of a proper "crap your pants" moment you can not only restore your data, but also your whole configuration. I cannot seem to find this post... does the OP remember and could kindly forward it on to me, or the list again? Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Tue Nov 14 13:35:46 2017 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 14 Nov 2017 14:35:46 +0100 Subject: [gpfsug-discuss] Backing up GPFS config In-Reply-To: References: Message-ID: Plese see https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Back%20Up%20GPFS%20Configuration But also check ?mmcesdr primary backup?. I don't rememner if it included all of mmbackupconfig/mmccr, but I think it did, and it also includes CES config. You don't need to be using CES DR to use it. -jf tir. 14. nov. 2017 kl. 03:16 skrev Sobey, Richard A : > All, > > > > A few months ago someone posted to the list all the commands they run to > back up their GPFS configuration. Including mmlsfileset -L, the output of > mmlsconfig etc, so that in the event of a proper ?crap your pants? moment > you can not only restore your data, but also your whole configuration. > > > > I cannot seem to find this post? does the OP remember and could kindly > forward it on to me, or the list again? > > > > Thanks > > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at u.washington.edu Tue Nov 14 14:41:50 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Tue, 14 Nov 2017 14:41:50 +0000 Subject: [gpfsug-discuss] Backing up GPFS config In-Reply-To: References: Message-ID: <20171114144149.7lmc46poy24of4yi@utumno.gs.washington.edu> I can't remember if I replied to that post or a different one, but these are the commands we capture output for before running mmbackup: mmlsconfig mmlsnsd mmlscluster mmlscluster --cnfs mmlscluster --ces mmlsnode mmlsdisk ${FS_NAME} -L mmlspool ${FS_NAME} all -L mmlslicense -L mmlspolicy ${FS_NAME} -L mmbackupconfig ${FS_NAME} All the commands but mmbackupconfig produce human-readable output, while mmbackupconfig produces machine-readable output suitable for recovering the filesystem in a disaster. On Tue, Nov 14, 2017 at 10:16:44AM +0000, Sobey, Richard A wrote: > All, > > A few months ago someone posted to the list all the commands they run to back up their GPFS configuration. Including mmlsfileset -L, the output of mmlsconfig etc, so that in the event of a proper "crap your pants" moment you can not only restore your data, but also your whole configuration. > > I cannot seem to find this post... does the OP remember and could kindly forward it on to me, or the list again? 
> > Thanks > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From Matthias.Knigge at rohde-schwarz.com Tue Nov 14 15:15:58 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Tue, 14 Nov 2017 16:15:58 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Changing the hostname without FQDN does not help. When I change back that the admin-interface is in the same network as the daemon then it works again. Could it be that for the GUI a daemon-interface must set? If yes, where can I set this interface? Thanks, Matthias Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 13:07 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, what's "hostname" returning on your nodes? 1.) If it is not the one that the GUI has in it's database you can force a refresh by executing the below command on the GUI node: /usr/lpp/mmfs/gui/cli/runtask OS_DETECT --debug 2.) If it is not the one that's shown in the returned performance data you have to restart the pmsensor service on the nodes: systemctl restart pmsensors Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: "Wilson, Neil" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 12:20 PM Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the ?hostname? field (it?s blank by default) of the pmsensors cfg file on that node and restarted pmsensors ? the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It?s probably not the same for you, but might be worth trying out. 
Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=r2ldt2133nWuT-SD27LvI8nFqC4Kx7f47sYAeLaZH84&s=yDy0znk3CG9PZuuQ9yi81wOOwc48Aw8WbMvOjzW_uZI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Tue Nov 14 15:18:23 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Tue, 14 Nov 2017 16:18:23 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME In-Reply-To: References: Message-ID: mmfind or rather the convert-script is great! Thanks, Matthias Von: "Marc A Kaplan" An: gpfsug main discussion list Datum: 01.11.2017 15:43 Betreff: [Newsletter] Re: [gpfsug-discuss] Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Thanks Jonathan B for your comments and tips on experience using mmapplypolicy and policy rules. Good to see that some of the features we put into the product are actually useful. For those not quite as familiar, and have come somewhat later to the game, like Matthias K - I have a few remarks and tips that may be helpful: You can think of and use mmapplypolicy as a fast, parallelized version of the classic `find ... | xargs ... ` pipeline. In fact we've added some "sample" scripts with options that make this easy: samples/ilm/mmfind : "understands" the classic find search arguments as well as all the mmapplypolicy options and the recent versions also support an -xargs option so you can write the classic pipepline as one command: mmfind ... -xargs ... There are debug/diagnostic options so you can see the underlying GPFS commands and policy rules that are generated, so if mmfind doesn't do exactly what you were hoping, you can capture the commands and rules that it does do and tweak/hack those. Two of the most crucial and tricky parts of mmfind are available as separate scripts that can be used separately: tr_findToPol.pl : convert classic options to policy rules. mmxargs : 100% correctly deal with the problem of whitespace and/or "special" characters in the pathnames output as file lists by mmapplypolicy. This is somewhat tricky. EVEN IF you've already worked out your own policy rules and use policy RULE ... EXTERNAL ... 
EXEC 'myscript' you may want to use mmxargs or "lift" some of the code there-in -- because it is very likely your 'myscript' is not handling the problem of special characters correctly. FILESETs vs POOLs - yes these are "orthogonal" concepts in GPFS (Spectrum Scale!) BUT some customer/admins may choose to direct GPFS to assign to POOL based on FILESET using policy rules clauses like: FOR FILESET('a_fs', 'b_fs') /* handy to restrict a rule to one or a few filesets */ WHERE ... AND (FILESET_NAME LIKE 'xyz_%') AND ... /* restrict to filesets whose name matches a pattern */ -- marc of GPFS_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 14 16:30:18 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 14 Nov 2017 16:30:18 +0000 Subject: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server In-Reply-To: References: <20171016132932.g5j7vep2frxnsvpf@utumno.gs.washington.edu>, <4B32CB5C696F2849BDEF7DF9EACE884B633F4ACF@SDEB-EXC01.meteo.dz> Message-ID: Hi Scott This looks like what I?m after (thank you Skylar and all others who responded too!) For the uninitiated, what exactly is a User Exit in the context of the following line: ?One way to automate this collection of GPFS configuration data is to use a User Exit. ? Or to put it another way, what is calling the script to be run on the basis of running mmchconfig someparam=someval? I?d like to understand it more. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Scott Fadden Sent: 16 October 2017 16:35 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server There are some comments on this in the wiki: Backup Spectrum Scale configuration Let me know if anything is missing. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Skylar Thompson > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server Date: Mon, Oct 16, 2017 6:29 AM I'm not familiar with GSS, but we have a script that executes the following before backing up a GPFS filesystem so that we have human-readable configuration information: mmlsconfig mmlsnsd mmlscluster mmlsnode mmlsdisk ${FS_NAME} -L mmlsfileset ${FS_NAME} -L mmlspool ${FS_NAME} all -L mmlslicense -L mmlspolicy ${FS_NAME} -L And then executes this for the benefit of GPFS: mmbackupconfig Of course there's quite a bit of overlap for clusters that have more than one filesystem, and even more for filesystems that we backup at the fileset level, but disk is cheap and the hope is it'll make a DR scenario a little bit less harrowing. On Sun, Oct 15, 2017 at 12:44:42PM +0000, atmane khiredine wrote: > Dear All, > > Is there a way to save the GPS configuration? 
> > OR how backup all GSS > > no backup of data or metadata only configuration for disaster recovery > > for example: > stanza > vdisk > pdisk > RAID code > recovery group > array > > Thank you > > Atmane Khiredine > HPC System Administrator | Office National de la M??t??orologie > T??l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= Click here to report this email as spam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 14 16:57:52 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 14 Nov 2017 16:57:52 +0000 Subject: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server In-Reply-To: References: <20171016132932.g5j7vep2frxnsvpf@utumno.gs.washington.edu>, <4B32CB5C696F2849BDEF7DF9EACE884B633F4ACF@SDEB-EXC01.meteo.dz> Message-ID: To answer my own question: https://www.ibm.com/support/knowledgecenter/en/SSFKCN_3.5.0/com.ibm.cluster.gpfs.v3r5.gpfs100.doc/bl1adm_uxtsdrb.htm It?s built in. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 14 November 2017 16:30 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server Hi Scott This looks like what I?m after (thank you Skylar and all others who responded too!) For the uninitiated, what exactly is a User Exit in the context of the following line: ?One way to automate this collection of GPFS configuration data is to use a User Exit. ? Or to put it another way, what is calling the script to be run on the basis of running mmchconfig someparam=someval? I?d like to understand it more. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Scott Fadden Sent: 16 October 2017 16:35 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server There are some comments on this in the wiki: Backup Spectrum Scale configuration Let me know if anything is missing. 
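On the user exit question above, a minimal sketch of the mechanism (the destination directory is invented; the sample shipped under /usr/lpp/mmfs/samples is the authoritative starting point):

#!/bin/bash
# /var/mmfs/etc/mmsdrbackup -- if present and executable, GPFS calls it whenever the
# cluster configuration data changes (e.g. after an mmchconfig), passing the new
# configuration generation number as its argument
gen=$1
mkdir -p /root/mmsdrfs-backups
cp -p /var/mmfs/gen/mmsdrfs /root/mmsdrfs-backups/mmsdrfs.$gen
exit 0

The human-readable snapshots listed earlier in the thread (mmlsconfig, mmlscluster, mmlsdisk, mmbackupconfig and so on) are still better captured by a scheduled job, since this exit only fires when the configuration actually changes.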
Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Skylar Thompson > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server Date: Mon, Oct 16, 2017 6:29 AM I'm not familiar with GSS, but we have a script that executes the following before backing up a GPFS filesystem so that we have human-readable configuration information: mmlsconfig mmlsnsd mmlscluster mmlsnode mmlsdisk ${FS_NAME} -L mmlsfileset ${FS_NAME} -L mmlspool ${FS_NAME} all -L mmlslicense -L mmlspolicy ${FS_NAME} -L And then executes this for the benefit of GPFS: mmbackupconfig Of course there's quite a bit of overlap for clusters that have more than one filesystem, and even more for filesystems that we backup at the fileset level, but disk is cheap and the hope is it'll make a DR scenario a little bit less harrowing. On Sun, Oct 15, 2017 at 12:44:42PM +0000, atmane khiredine wrote: > Dear All, > > Is there a way to save the GPS configuration? > > OR how backup all GSS > > no backup of data or metadata only configuration for disaster recovery > > for example: > stanza > vdisk > pdisk > RAID code > recovery group > array > > Thank you > > Atmane Khiredine > HPC System Administrator | Office National de la M??t??orologie > T??l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= Click here to report this email as spam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Wed Nov 15 08:43:28 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 15 Nov 2017 09:43:28 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Strange... I think it is the order of configuration changes. Now it works with severed networks and FQDN. I configured the admin-interface with another network and back to the daemon-network. Then again to the admin-interface and it works fine. So the FQDN should be not the problem. Sometimes a linux system needs a reboot too. 
;-) Thanks, Matthias Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 13:07 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, what's "hostname" returning on your nodes? 1.) If it is not the one that the GUI has in it's database you can force a refresh by executing the below command on the GUI node: /usr/lpp/mmfs/gui/cli/runtask OS_DETECT --debug 2.) If it is not the one that's shown in the returned performance data you have to restart the pmsensor service on the nodes: systemctl restart pmsensors Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: "Wilson, Neil" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 12:20 PM Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the ?hostname? field (it?s blank by default) of the pmsensors cfg file on that node and restarted pmsensors ? the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It?s probably not the same for you, but might be worth trying out. Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! 
Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=r2ldt2133nWuT-SD27LvI8nFqC4Kx7f47sYAeLaZH84&s=yDy0znk3CG9PZuuQ9yi81wOOwc48Aw8WbMvOjzW_uZI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Wed Nov 15 16:24:52 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Wed, 15 Nov 2017 17:24:52 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size Message-ID: Hello everybody, together with my colleagues we are actually running some tests on a new DSS G220 system and we see some unexpected behaviour. What we actually see is that write performances (we did not test read yet) decreases with the decrease of filesystem size. I will not go into the details of the tests, but here are some numbers: - with a filesystem using the full 1.2 PB space we get 14 GB/s as the sum of the disk activity on the two IO servers; - with a filesystem using half of the space we get 10 GB/s; - with a filesystem using 1/4 of the space we get 5 GB/s. We also saw that performances are not affected by the vdisks layout, ie. taking the full space with one big vdisk or 2 half-size vdisks per RG gives the same performances. To our understanding the IO should be spread evenly across all the pdisks in the declustered array, and looking at iostat all disks seem to be accessed. But so there must be some other element that affects performances. Am I missing something? Is this an expected behaviour and someone has an explanation for this? Thank you, Ivano From kums at us.ibm.com Wed Nov 15 16:56:36 2017 From: kums at us.ibm.com (Kumaran Rajaram) Date: Wed, 15 Nov 2017 11:56:36 -0500 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Hi, >>Am I missing something? Is this an expected behaviour and someone has an explanation for this? Based on your scenario, write degradation as the file-system is populated is possible if you had formatted the file-system with "-j cluster". For consistent file-system performance, we recommend mmcrfs "-j scatter" layoutMap. Also, we need to ensure the mmcrfs "-n" is set properly. [snip from mmcrfs] # mmlsfs | egrep 'Block allocation| Estimated number' -j scatter Block allocation type -n 128 Estimated number of nodes that will mount file system [/snip] [snip from man mmcrfs] layoutMap={scatter | cluster} Specifies the block allocation map type. When allocating blocks for a given file, GPFS first uses a round?robin algorithm to spread the data across all disks in the storage pool. After a disk is selected, the location of the data block on the disk is determined by the block allocation map type. If cluster is specified, GPFS attempts to allocate blocks in clusters. Blocks that belong to a particular file are kept adjacent to each other within each cluster. If scatter is specified, the location of the block is chosen randomly. The cluster allocation method may provide better disk performance for some disk subsystems in relatively small installations. 
The benefits of clustered block allocation diminish when the number of nodes in the cluster or the number of disks in a file system increases, or when the file system?s free space becomes fragmented. The cluster allocation method is the default for GPFS clusters with eight or fewer nodes and for file systems with eight or fewer disks. The scatter allocation method provides more consistent file system performance by averaging out performance variations due to block location (for many disk subsystems, the location of the data relative to the disk edge has a substantial effect on performance). This allocation method is appropriate in most cases and is the default for GPFS clusters with more than eight nodes or file systems with more than eight disks. The block allocation map type cannot be changed after the storage pool has been created. -n NumNodes The estimated number of nodes that will mount the file system in the local cluster and all remote clusters. This is used as a best guess for the initial size of some file system data structures. The default is 32. This value can be changed after the file system has been created but it does not change the existing data structures. Only the newly created data structure is affected by the new value. For example, new storage pool. When you create a GPFS file system, you might want to overestimate the number of nodes that will mount the file system. GPFS uses this information for creating data structures that are essential for achieving maximum parallelism in file system operations (For more information, see GPFS architecture in IBM Spectrum Scale: Concepts, Planning, and Installation Guide ). If you are sure there will never be more than 64 nodes, allow the default value to be applied. If you are planning to add nodes to your system, you should specify a number larger than the default. [/snip from man mmcrfs] Regards, -Kums From: Ivano Talamo To: Date: 11/15/2017 11:25 AM Subject: [gpfsug-discuss] Write performances and filesystem size Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello everybody, together with my colleagues we are actually running some tests on a new DSS G220 system and we see some unexpected behaviour. What we actually see is that write performances (we did not test read yet) decreases with the decrease of filesystem size. I will not go into the details of the tests, but here are some numbers: - with a filesystem using the full 1.2 PB space we get 14 GB/s as the sum of the disk activity on the two IO servers; - with a filesystem using half of the space we get 10 GB/s; - with a filesystem using 1/4 of the space we get 5 GB/s. We also saw that performances are not affected by the vdisks layout, ie. taking the full space with one big vdisk or 2 half-size vdisks per RG gives the same performances. To our understanding the IO should be spread evenly across all the pdisks in the declustered array, and looking at iostat all disks seem to be accessed. But so there must be some other element that affects performances. Am I missing something? Is this an expected behaviour and someone has an explanation for this? 
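As a concrete illustration of the "-j scatter" / "-n" recommendation earlier in this message, creating a file system with those settings might look roughly like this; the device name, stanza file, block size and node count are placeholders, not values from this thread:

# Sketch only: placeholders throughout.
mmcrfs fs1 -F /tmp/vdisk.stanza -j scatter -n 128 -B 16M -T /gpfs/fs1

# Check the resulting settings, as in the snippet quoted above:
mmlsfs fs1 | egrep 'Block allocation| Estimated number'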
Thank you, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Nov 15 18:25:59 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 15 Nov 2017 13:25:59 -0500 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Wed Nov 15 23:48:18 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Wed, 15 Nov 2017 23:48:18 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: Message-ID: My 2c ... Be careful here about mixing up three different possible effects seen in filesystems 1. Performance degradation as the filesystem approaches 100% full, often due to the difficulty of finding the remaining unallocated blocks. GPFS doesn?t noticeably suffer from this effect compared to its competitors. 2. Performance degradation over time as files get fragmented and so cause extra movement of the actuator arm of a HDD. (hence defrag on Windows and the idea of short stroking drives). 3. Performance degradation as blocks are written further from the fastest part of a hard disk drive. SSDs do not show this effect. Benchmarks on newly formatted empty filesystems are often artificially high compared to performance after say 12 months whether or not the filesystem is near 90%+ capacity utilisation. The -j scatter option allows for more realistic performance measurement when designing for the long term usage of the filesystem. But this is due to the distributed location of the blocks not how full the filesystem is. Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales + 44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 15 Nov 2017, at 11:26, Olaf Weiser wrote: > > to add a comment ... .. very simply... depending on how you allocate the physical block storage .... if you - simply - using less physical resources when reducing the capacity (in the same ratio) .. you get , what you see.... > > so you need to tell us, how you allocate your block-storage .. (Do you using RAID controllers , where are your LUNs coming from, are then less RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the hardware can deliver.. if you reduce resource.. ... you'll get less , if you enhance your hardware .. you get more... almost regardless of the total capacity in #blocks .. > > > > > > > From: "Kumaran Rajaram" > To: gpfsug main discussion list > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Hi, > > >>Am I missing something? Is this an expected behaviour and someone has an explanation for this? > > Based on your scenario, write degradation as the file-system is populated is possible if you had formatted the file-system with "-j cluster". > > For consistent file-system performance, we recommend mmcrfs "-j scatter" layoutMap. Also, we need to ensure the mmcrfs "-n" is set properly. 
> > [snip from mmcrfs] > # mmlsfs | egrep 'Block allocation| Estimated number' > -j scatter Block allocation type > -n 128 Estimated number of nodes that will mount file system > [/snip] > > > [snip from man mmcrfs] > layoutMap={scatter| cluster} > Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first > uses a round?robin algorithm to spread the data > across all disks in the storage pool. After a > disk is selected, the location of the data > block on the disk is determined by the block > allocation map type. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular > file are kept adjacent to each other within > each cluster. If scatter is specified, > the location of the block is chosen randomly. > > The cluster allocation method may provide > better disk performance for some disk > subsystems in relatively small installations. > The benefits of clustered block allocation > diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. The cluster > allocation method is the default for GPFS > clusters with eight or fewer nodes and for file > systems with eight or fewer disks. > > The scatter allocation method provides > more consistent file system performance by > averaging out performance variations due to > block location (for many disk subsystems, the > location of the data relative to the disk edge > has a substantial effect on performance).This > allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than > eight disks. > > The block allocation map type cannot be changed > after the storage pool has been created. > > > -n NumNodes > The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. > This is used as a best guess for the initial size of > some file system data structures. The default is 32. > This value can be changed after the file system has been > created but it does not change the existing data > structures. Only the newly created data structure is > affected by the new value. For example, new storage > pool. > > When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the > file system. GPFS uses this information for creating > data structures that are essential for achieving maximum > parallelism in file system operations (For more > information, see GPFS architecture in IBM Spectrum > Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, > allow the default value to be applied. If you are > planning to add nodes to your system, you should specify > a number larger than the default. > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > To: > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. 
> > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone has an > explanation for this? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e= > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=Yu5Gt0RPmbb6KaS_emGivhq5C2A33w5DeecdU2aLViQ&s=K0Mz-y4oBH66YUf1syIXaQ3hxck6WjeEMsM-HNHhqAU&e= > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Thu Nov 16 02:34:57 2017 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 16 Nov 2017 02:34:57 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same number of spindles for any size filesystem, so I would also expect them to perform the same. -jf ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser : > to add a comment ... .. very simply... depending on how you allocate the > physical block storage .... if you - simply - using less physical resources > when reducing the capacity (in the same ratio) .. you get , what you > see.... > > so you need to tell us, how you allocate your block-storage .. (Do you > using RAID controllers , where are your LUNs coming from, are then less > RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the hardware can > deliver.. if you reduce resource.. ... you'll get less , if you enhance > your hardware .. you get more... almost regardless of the total capacity in > #blocks .. > > > > > > > From: "Kumaran Rajaram" > To: gpfsug main discussion list > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and filesystem > size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi, > > >>Am I missing something? 
Is this an expected behaviour and someone has an > explanation for this? > > Based on your scenario, write degradation as the file-system is populated > is possible if you had formatted the file-system with "-j cluster". > > For consistent file-system performance, we recommend *mmcrfs "-j scatter" > layoutMap.* Also, we need to ensure the mmcrfs "-n" is set properly. > > [snip from mmcrfs] > > > *# mmlsfs | egrep 'Block allocation| Estimated number' -j > scatter Block allocation type -n 128 > Estimated number of nodes that will mount file system* > [/snip] > > > [snip from man mmcrfs] > * layoutMap={scatter|** cluster}* > > > > > > > > > > > > * Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first uses > a round?robin algorithm to spread the data across all > disks in the storage pool. After a disk is selected, the > location of the data block on the disk is determined by > the block allocation map type. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular file are > kept adjacent to each other within each cluster. If > scatter is specified, the location of the block is chosen > randomly.* > > > > > > > > > * The cluster allocation method may provide > better disk performance for some disk subsystems in > relatively small installations. The benefits of clustered > block allocation diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. **The cluster* > > > * allocation method is the default for GPFS > clusters with eight or fewer nodes and for file systems > with eight or fewer disks.* > > > > > > > * The scatter allocation method provides > more consistent file system performance by averaging out > performance variations due to block location (for many > disk subsystems, the location of the data relative to the > disk edge has a substantial effect on performance).* > > > > *This allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than eight > disks.* > > > * The block allocation map type cannot be changed > after the storage pool has been created.* > > > *-n** NumNodes* > > > > > > > > > * The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. This is used > as a best guess for the initial size of some file system data > structures. The default is 32. This value can be changed after the > file system has been created but it does not change the existing > data structures. Only the newly created data structure is > affected by the new value. For example, new storage pool.* > > > > > > > > > > > > * When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the file system. > GPFS uses this information for creating data structures that are > essential for achieving maximum parallelism in file system > operations (For more information, see GPFS architecture in IBM > Spectrum Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, allow > the default value to be applied. 
If you are planning to add nodes > to your system, you should specify a number larger than the > default.* > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > To: > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. > > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone has an > explanation for this? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=* > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Thu Nov 16 03:42:05 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 03:42:05 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: Message-ID: Sure... as long we assume that really all physical disk are used .. the fact that was told 1/2 or 1/4 might turn out that one / two complet enclosures 're eliminated ... ? ..that s why I was asking for more details .. I dont see this degration in my environments. . as long the vdisks are big enough to span over all pdisks ( which should be the case for capacity in a range of TB ) ... the performance stays the same Gesendet von IBM Verse Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and filesystem size --- Von:"Jan-Frode Myklebust" An:"gpfsug main discussion list" Datum:Mi. 15.11.2017 21:35Betreff:Re: [gpfsug-discuss] Write performances and filesystem size Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same number of spindles for any size filesystem, so I would also expect them to perform the same. -jf ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser : to add a comment ... .. very simply... 
depending on how you allocate the physical block storage .... if you - simply - using less physical resources when reducing the capacity (in the same ratio) .. you get , what you see.... so you need to tell us, how you allocate your block-storage .. (Do you using RAID controllers , where are your LUNs coming from, are then less RAID groups involved, when reducing the capacity ?...) GPFS can be configured to give you pretty as much as what the hardware can deliver.. if you reduce resource.. ... you'll get less , if you enhance your hardware .. you get more... almost regardless of the total capacity in #blocks .. From: "Kumaran Rajaram" To: gpfsug main discussion list Date: 11/15/2017 11:56 AM Subject: Re: [gpfsug-discuss] Write performances and filesystem size Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, >>Am I missing something? Is this an expected behaviour and someone has an explanation for this? Based on your scenario, write degradation as the file-system is populated is possible if you had formatted the file-system with "-j cluster". For consistent file-system performance, we recommend mmcrfs "-j scatter" layoutMap. Also, we need to ensure the mmcrfs "-n" is set properly. [snip from mmcrfs] # mmlsfs | egrep 'Block allocation| Estimated number' -j scatter Block allocation type -n 128 Estimated number of nodes that will mount file system [/snip] [snip from man mmcrfs] layoutMap={scatter| cluster} Specifies the block allocation map type. When allocating blocks for a given file, GPFS first uses a round?robin algorithm to spread the data across all disks in the storage pool. After a disk is selected, the location of the data block on the disk is determined by the block allocation map type. If cluster is specified, GPFS attempts to allocate blocks in clusters. Blocks that belong to a particular file are kept adjacent to each other within each cluster. If scatter is specified, the location of the block is chosen randomly. The cluster allocation method may provide better disk performance for some disk subsystems in relatively small installations. The benefits of clustered block allocation diminish when the number of nodes in the cluster or the number of disks in a file system increases, or when the file system?s free space becomes fragmented. The cluster allocation method is the default for GPFS clusters with eight or fewer nodes and for file systems with eight or fewer disks. The scatter allocation method provides more consistent file system performance by averaging out performance variations due to block location (for many disk subsystems, the location of the data relative to the disk edge has a substantial effect on performance).This allocation method is appropriate in most cases and is the default for GPFS clusters with more than eight nodes or file systems with more than eight disks. The block allocation map type cannot be changed after the storage pool has been created. -n NumNodes The estimated number of nodes that will mount the file system in the local cluster and all remote clusters. This is used as a best guess for the initial size of some file system data structures. The default is 32. This value can be changed after the file system has been created but it does not change the existing data structures. Only the newly created data structure is affected by the new value. For example, new storage pool. When you create a GPFS file system, you might want to overestimate the number of nodes that will mount the file system. 
GPFS uses this information for creating data structures that are essential for achieving maximum parallelism in file system operations (For more information, see GPFS architecture in IBM Spectrum Scale: Concepts, Planning, and Installation Guide ). If you are sure there will never be more than 64 nodes, allow the default value to be applied. If you are planning to add nodes to your system, you should specify a number larger than the default. [/snip from man mmcrfs] Regards, -Kums From: Ivano Talamo To: Date: 11/15/2017 11:25 AM Subject: [gpfsug-discuss] Write performances and filesystem size Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello everybody, together with my colleagues we are actually running some tests on a new DSS G220 system and we see some unexpected behaviour. What we actually see is that write performances (we did not test read yet) decreases with the decrease of filesystem size. I will not go into the details of the tests, but here are some numbers: - with a filesystem using the full 1.2 PB space we get 14 GB/s as the sum of the disk activity on the two IO servers; - with a filesystem using half of the space we get 10 GB/s; - with a filesystem using 1/4 of the space we get 5 GB/s. We also saw that performances are not affected by the vdisks layout, ie. taking the full space with one big vdisk or 2 half-size vdisks per RG gives the same performances. To our understanding the IO should be spread evenly across all the pdisks in the declustered array, and looking at iostat all disks seem to be accessed. But so there must be some other element that affects performances. Am I missing something? Is this an expected behaviour and someone has an explanation for this? Thank you, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Thu Nov 16 08:44:06 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Thu, 16 Nov 2017 09:44:06 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: <658ae385-ef78-2303-2eef-1b5ac8824c42@psi.ch> Hello Olaf, yes, I confirm that is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total. Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size. Regarding the layout allocation we used scatter. The tests were done on the just created filesystem, so no close-to-full effect. And we run gpfsperf write seq. Thanks, Ivano Il 16/11/17 04:42, Olaf Weiser ha scritto: > Sure... as long we assume that really all physical disk are used .. the > fact that was told 1/2 or 1/4 might turn out that one / two complet > enclosures 're eliminated ... ? ..that s why I was asking for more > details .. > > I dont see this degration in my environments. 
. as long the vdisks are > big enough to span over all pdisks ( which should be the case for > capacity in a range of TB ) ... the performance stays the same > > Gesendet von IBM Verse > > Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and > filesystem size --- > > Von: "Jan-Frode Myklebust" > An: "gpfsug main discussion list" > Datum: Mi. 15.11.2017 21:35 > Betreff: Re: [gpfsug-discuss] Write performances and filesystem size > > ------------------------------------------------------------------------ > > Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same > number of spindles for any size filesystem, so I would also expect them > to perform the same. > > > > -jf > > > ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >: > > to add a comment ... .. very simply... depending on how you > allocate the physical block storage .... if you - simply - using > less physical resources when reducing the capacity (in the same > ratio) .. you get , what you see.... > > so you need to tell us, how you allocate your block-storage .. (Do > you using RAID controllers , where are your LUNs coming from, are > then less RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the > hardware can deliver.. if you reduce resource.. ... you'll get less > , if you enhance your hardware .. you get more... almost regardless > of the total capacity in #blocks .. > > > > > > > From: "Kumaran Rajaram" > > To: gpfsug main discussion list > > > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and > filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hi, > > >>Am I missing something? Is this an expected behaviour and someone > has an explanation for this? > > Based on your scenario, write degradation as the file-system is > populated is possible if you had formatted the file-system with "-j > cluster". > > For consistent file-system performance, we recommend *mmcrfs "-j > scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is > set properly. > > [snip from mmcrfs]/ > # mmlsfs | egrep 'Block allocation| Estimated number' > -j scatter Block allocation type > -n 128 Estimated number of > nodes that will mount file system/ > [/snip] > > > [snip from man mmcrfs]/ > *layoutMap={scatter|*//*cluster}*// > Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first > uses a round?robin algorithm to spread the data > across all disks in the storage pool. After a > disk is selected, the location of the data > block on the disk is determined by the block > allocation map type*. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular > file are kept adjacent to each other within > each cluster. If scatter is specified, > the location of the block is chosen randomly.*/ > / > * The cluster allocation method may provide > better disk performance for some disk > subsystems in relatively small installations. > The benefits of clustered block allocation > diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. 
*//The *cluster*// > allocation method is the default for GPFS > clusters with eight or fewer nodes and for file > systems with eight or fewer disks./ > / > *The scatter allocation method provides > more consistent file system performance by > averaging out performance variations due to > block location (for many disk subsystems, the > location of the data relative to the disk edge > has a substantial effect on performance).*//This > allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than > eight disks./ > / > The block allocation map type cannot be changed > after the storage pool has been created./ > > */ > -n/*/*NumNodes*// > The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. > This is used as a best guess for the initial size of > some file system data structures. The default is 32. > This value can be changed after the file system has been > created but it does not change the existing data > structures. Only the newly created data structure is > affected by the new value. For example, new storage > pool./ > / > When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the > file system. GPFS uses this information for creating > data structures that are essential for achieving maximum > parallelism in file system operations (For more > information, see GPFS architecture in IBM Spectrum > Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, > allow the default value to be applied. If you are > planning to add nodes to your system, you should specify > a number larger than the default./ > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > > To: > > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. > > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, > ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks > seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone > has an > explanation for this? 
> > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org _ > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From olaf.weiser at de.ibm.com Thu Nov 16 12:03:16 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 12:03:16 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: Message-ID: Rjx, that makes it a bit clearer.. as your vdisk is big enough to span over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... should bring the same performance. .. You mean something about vdisk Layout. .. So in your test, for the full capacity test, you use just one vdisk per RG - so 2 in total for 'data' - right? What about Md .. did you create separate vdisk for MD / what size then ? Gesendet von IBM Verse Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem size --- Von:"Ivano Talamo" An:"gpfsug main discussion list" Datum:Do. 16.11.2017 03:49Betreff:Re: [gpfsug-discuss] Write performances and filesystem size Hello Olaf,yes, I confirm that is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total.Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size.Regarding the layout allocation we used scatter.The tests were done on the just created filesystem, so no close-to-full effect. And we run gpfsperf write seq.Thanks,IvanoIl 16/11/17 04:42, Olaf Weiser ha scritto:> Sure... as long we assume that really all physical disk are used .. the> fact that was told 1/2 or 1/4 might turn out that one / two complet> enclosures 're eliminated ... ? ..that s why I was asking for more> details ..>> I dont see this degration in my environments. . as long the vdisks are> big enough to span over all pdisks ( which should be the case for> capacity in a range of TB ) ... the performance stays the same>> Gesendet von IBM Verse>> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and> filesystem size --->> Von: "Jan-Frode Myklebust" > An: "gpfsug main discussion list" > Datum: Mi. 15.11.2017 21:35> Betreff: Re: [gpfsug-discuss] Write performances and filesystem size>> ------------------------------------------------------------------------>> Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same> number of spindles for any size filesystem, so I would also expect them> to perform the same.>>>> -jf>>> ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >:>> to add a comment ... .. very simply... depending on how you> allocate the physical block storage .... if you - simply - using> less physical resources when reducing the capacity (in the same> ratio) .. 
you get , what you see....>> so you need to tell us, how you allocate your block-storage .. (Do> you using RAID controllers , where are your LUNs coming from, are> then less RAID groups involved, when reducing the capacity ?...)>> GPFS can be configured to give you pretty as much as what the> hardware can deliver.. if you reduce resource.. ... you'll get less> , if you enhance your hardware .. you get more... almost regardless> of the total capacity in #blocks ..>>>>>>> From: "Kumaran Rajaram" >> To: gpfsug main discussion list> >> Date: 11/15/2017 11:56 AM> Subject: Re: [gpfsug-discuss] Write performances and> filesystem size> Sent by: gpfsug-discuss-bounces at spectrumscale.org> > ------------------------------------------------------------------------>>>> Hi,>> >>Am I missing something? Is this an expected behaviour and someone> has an explanation for this?>> Based on your scenario, write degradation as the file-system is> populated is possible if you had formatted the file-system with "-j> cluster".>> For consistent file-system performance, we recommend *mmcrfs "-j> scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is> set properly.>> [snip from mmcrfs]/> # mmlsfs | egrep 'Block allocation| Estimated number'> -j scatter Block allocation type> -n 128 Estimated number of> nodes that will mount file system/> [/snip]>>> [snip from man mmcrfs]/> *layoutMap={scatter|*//*cluster}*//> Specifies the block allocation map type. When> allocating blocks for a given file, GPFS first> uses a round?robin algorithm to spread the data> across all disks in the storage pool. After a> disk is selected, the location of the data> block on the disk is determined by the block> allocation map type*. If cluster is> specified, GPFS attempts to allocate blocks in> clusters. Blocks that belong to a particular> file are kept adjacent to each other within> each cluster. If scatter is specified,> the location of the block is chosen randomly.*/> /> * The cluster allocation method may provide> better disk performance for some disk> subsystems in relatively small installations.> The benefits of clustered block allocation> diminish when the number of nodes in the> cluster or the number of disks in a file system> increases, or when the file system?s free space> becomes fragmented. *//The *cluster*//> allocation method is the default for GPFS> clusters with eight or fewer nodes and for file> systems with eight or fewer disks./> /> *The scatter allocation method provides> more consistent file system performance by> averaging out performance variations due to> block location (for many disk subsystems, the> location of the data relative to the disk edge> has a substantial effect on performance).*//This> allocation method is appropriate in most cases> and is the default for GPFS clusters with more> than eight nodes or file systems with more than> eight disks./> /> The block allocation map type cannot be changed> after the storage pool has been created./>> */> -n/*/*NumNodes*//> The estimated number of nodes that will mount the file> system in the local cluster and all remote clusters.> This is used as a best guess for the initial size of> some file system data structures. The default is 32.> This value can be changed after the file system has been> created but it does not change the existing data> structures. Only the newly created data structure is> affected by the new value. 
For example, new storage> pool./> /> When you create a GPFS file system, you might want to> overestimate the number of nodes that will mount the> file system. GPFS uses this information for creating> data structures that are essential for achieving maximum> parallelism in file system operations (For more> information, see GPFS architecture in IBM Spectrum> Scale: Concepts, Planning, and Installation Guide ). If> you are sure there will never be more than 64 nodes,> allow the default value to be applied. If you are> planning to add nodes to your system, you should specify> a number larger than the default./>> [/snip from man mmcrfs]>> Regards,> -Kums>>>>>> From: Ivano Talamo >> To: >> Date: 11/15/2017 11:25 AM> Subject: [gpfsug-discuss] Write performances and filesystem size> Sent by: gpfsug-discuss-bounces at spectrumscale.org> > ------------------------------------------------------------------------>>>> Hello everybody,>> together with my colleagues we are actually running some tests on a new> DSS G220 system and we see some unexpected behaviour.>> What we actually see is that write performances (we did not test read> yet) decreases with the decrease of filesystem size.>> I will not go into the details of the tests, but here are some numbers:>> - with a filesystem using the full 1.2 PB space we get 14 GB/s as the> sum of the disk activity on the two IO servers;> - with a filesystem using half of the space we get 10 GB/s;> - with a filesystem using 1/4 of the space we get 5 GB/s.>> We also saw that performances are not affected by the vdisks layout,> ie.> taking the full space with one big vdisk or 2 half-size vdisks per RG> gives the same performances.>> To our understanding the IO should be spread evenly across all the> pdisks in the declustered array, and looking at iostat all disks> seem to> be accessed. But so there must be some other element that affects> performances.>> Am I missing something? Is this an expected behaviour and someone> has an> explanation for this?>> Thank you,> Ivano> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org _> __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_>>> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss>>>> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss>>>>> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org> http://gpfsug.org/mailman/listinfo/gpfsug-discuss>_______________________________________________gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alvise.dorigo at psi.ch Thu Nov 16 12:37:41 2017 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Thu, 16 Nov 2017 12:37:41 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: , Message-ID: <83A6EEB0EC738F459A39439733AE80451BB738BC@MBX214.d.ethz.ch> Hi Olaf, yes we have separate vdisks for MD: 2 vdisks, each is 100GBytes large, 1MBytes blocksize, 3WayReplication. A ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Olaf Weiser [olaf.weiser at de.ibm.com] Sent: Thursday, November 16, 2017 1:03 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Write performances and filesystem size Rjx, that makes it a bit clearer.. as your vdisk is big enough to span over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... should bring the same performance. .. You mean something about vdisk Layout. .. So in your test, for the full capacity test, you use just one vdisk per RG - so 2 in total for 'data' - right? What about Md .. did you create separate vdisk for MD / what size then ? Gesendet von IBM Verse Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem size --- Von: "Ivano Talamo" An: "gpfsug main discussion list" Datum: Do. 16.11.2017 03:49 Betreff: Re: [gpfsug-discuss] Write performances and filesystem size ________________________________ Hello Olaf, yes, I confirm that is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total. Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size. Regarding the layout allocation we used scatter. The tests were done on the just created filesystem, so no close-to-full effect. And we run gpfsperf write seq. Thanks, Ivano Il 16/11/17 04:42, Olaf Weiser ha scritto: > Sure... as long we assume that really all physical disk are used .. the > fact that was told 1/2 or 1/4 might turn out that one / two complet > enclosures 're eliminated ... ? ..that s why I was asking for more > details .. > > I dont see this degration in my environments. . as long the vdisks are > big enough to span over all pdisks ( which should be the case for > capacity in a range of TB ) ... the performance stays the same > > Gesendet von IBM Verse > > Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and > filesystem size --- > > Von: "Jan-Frode Myklebust" > An: "gpfsug main discussion list" > Datum: Mi. 15.11.2017 21:35 > Betreff: Re: [gpfsug-discuss] Write performances and filesystem size > > ------------------------------------------------------------------------ > > Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same > number of spindles for any size filesystem, so I would also expect them > to perform the same. > > > > -jf > > > ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >: > > to add a comment ... .. very simply... depending on how you > allocate the physical block storage .... if you - simply - using > less physical resources when reducing the capacity (in the same > ratio) .. you get , what you see.... > > so you need to tell us, how you allocate your block-storage .. (Do > you using RAID controllers , where are your LUNs coming from, are > then less RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the > hardware can deliver.. if you reduce resource.. ... 
you'll get less > , if you enhance your hardware .. you get more... almost regardless > of the total capacity in #blocks .. > > > > > > > From: "Kumaran Rajaram" > > To: gpfsug main discussion list > > > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and > filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hi, > > >>Am I missing something? Is this an expected behaviour and someone > has an explanation for this? > > Based on your scenario, write degradation as the file-system is > populated is possible if you had formatted the file-system with "-j > cluster". > > For consistent file-system performance, we recommend *mmcrfs "-j > scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is > set properly. > > [snip from mmcrfs]/ > # mmlsfs | egrep 'Block allocation| Estimated number' > -j scatter Block allocation type > -n 128 Estimated number of > nodes that will mount file system/ > [/snip] > > > [snip from man mmcrfs]/ > *layoutMap={scatter|*//*cluster}*// > Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first > uses a round?robin algorithm to spread the data > across all disks in the storage pool. After a > disk is selected, the location of the data > block on the disk is determined by the block > allocation map type*. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular > file are kept adjacent to each other within > each cluster. If scatter is specified, > the location of the block is chosen randomly.*/ > / > * The cluster allocation method may provide > better disk performance for some disk > subsystems in relatively small installations. > The benefits of clustered block allocation > diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. *//The *cluster*// > allocation method is the default for GPFS > clusters with eight or fewer nodes and for file > systems with eight or fewer disks./ > / > *The scatter allocation method provides > more consistent file system performance by > averaging out performance variations due to > block location (for many disk subsystems, the > location of the data relative to the disk edge > has a substantial effect on performance).*//This > allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than > eight disks./ > / > The block allocation map type cannot be changed > after the storage pool has been created./ > > */ > -n/*/*NumNodes*// > The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. > This is used as a best guess for the initial size of > some file system data structures. The default is 32. > This value can be changed after the file system has been > created but it does not change the existing data > structures. Only the newly created data structure is > affected by the new value. For example, new storage > pool./ > / > When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the > file system. 
GPFS uses this information for creating > data structures that are essential for achieving maximum > parallelism in file system operations (For more > information, see GPFS architecture in IBM Spectrum > Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, > allow the default value to be applied. If you are > planning to add nodes to your system, you should specify > a number larger than the default./ > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > > To: > > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. > > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, > ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks > seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone > has an > explanation for this? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org _ > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Thu Nov 16 13:51:51 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Thu, 16 Nov 2017 14:51:51 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Hi, as additional information I past the recovery group information in the full and half size cases. In both cases: - data is on sf_g_01_vdisk01 - metadata on sf_g_01_vdisk02 - sf_g_01_vdisk07 is not used in the filesystem. 
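The listings that follow look like per-recovery-group detail output; assuming they were taken with the standard GNR reporting command (an assumption, since the original message does not say how they were produced), something like the following would reproduce them:

mmlsrecoverygroup sf-g-01 -L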
This is with the full-space filesystem: declustered current allowable recovery group arrays vdisks pdisks format version format version ----------------- ----------- ------ ------ -------------- -------------- sf-g-01 3 6 86 4.2.2.0 4.2.2.0 declustered needs replace scrub background activity array service vdisks pdisks spares threshold free space duration task progress priority ----------- ------- ------ ------ ------ --------- ---------- -------- ------------------------- NVR no 1 2 0,0 1 3632 MiB 14 days scrub 95% low DA1 no 4 83 2,44 1 57 TiB 14 days scrub 0% low SSD no 1 1 0,0 1 372 GiB 14 days scrub 79% low declustered checksum vdisk RAID code array vdisk size block size granularity state remarks ------------------ ------------------ ----------- ---------- ---------- ----------- ----- ------- sf_g_01_logTip 2WayReplication NVR 48 MiB 2 MiB 4096 ok logTip sf_g_01_logTipBackup Unreplicated SSD 48 MiB 2 MiB 4096 ok logTipBackup sf_g_01_logHome 4WayReplication DA1 144 GiB 2 MiB 4096 ok log sf_g_01_vdisk02 3WayReplication DA1 103 GiB 1 MiB 32 KiB ok sf_g_01_vdisk07 3WayReplication DA1 103 GiB 1 MiB 32 KiB ok sf_g_01_vdisk01 8+2p DA1 540 TiB 16 MiB 32 KiB ok config data declustered array spare space remarks ------------------ ------------------ ------------- ------- rebuild space DA1 53 pdisk increasing VCD spares is suggested config data disk group fault tolerance remarks ------------------ --------------------------------- ------- rg descriptor 1 enclosure + 1 drawer + 2 pdisk limited by rebuild space system index 1 enclosure + 1 drawer + 2 pdisk limited by rebuild space vdisk disk group fault tolerance remarks ------------------ --------------------------------- ------- sf_g_01_logTip 1 pdisk sf_g_01_logTipBackup 0 pdisk sf_g_01_logHome 1 enclosure + 1 drawer + 1 pdisk limited by rebuild space sf_g_01_vdisk02 1 enclosure + 1 drawer limited by rebuild space sf_g_01_vdisk07 1 enclosure + 1 drawer limited by rebuild space sf_g_01_vdisk01 2 pdisk This is with the half-space filesystem: declustered current allowable recovery group arrays vdisks pdisks format version format version ----------------- ----------- ------ ------ -------------- -------------- sf-g-01 3 6 86 4.2.2.0 4.2.2.0 declustered needs replace scrub background activity array service vdisks pdisks spares threshold free space duration task progress priority ----------- ------- ------ ------ ------ --------- ---------- -------- ------------------------- NVR no 1 2 0,0 1 3632 MiB 14 days scrub 4% low DA1 no 4 83 2,44 1 395 TiB 14 days scrub 0% low SSD no 1 1 0,0 1 372 GiB 14 days scrub 79% low declustered checksum vdisk RAID code array vdisk size block size granularity state remarks ------------------ ------------------ ----------- ---------- ---------- ----------- ----- ------- sf_g_01_logTip 2WayReplication NVR 48 MiB 2 MiB 4096 ok logTip sf_g_01_logTipBackup Unreplicated SSD 48 MiB 2 MiB 4096 ok logTipBackup sf_g_01_logHome 4WayReplication DA1 144 GiB 2 MiB 4096 ok log sf_g_01_vdisk02 3WayReplication DA1 103 GiB 1 MiB 32 KiB ok sf_g_01_vdisk07 3WayReplication DA1 103 GiB 1 MiB 32 KiB ok sf_g_01_vdisk01 8+2p DA1 270 TiB 16 MiB 32 KiB ok config data declustered array spare space remarks ------------------ ------------------ ------------- ------- rebuild space DA1 68 pdisk increasing VCD spares is suggested config data disk group fault tolerance remarks ------------------ --------------------------------- ------- rg descriptor 1 node + 3 pdisk limited by rebuild space system index 1 node + 3 pdisk limited by rebuild space vdisk 
disk group fault tolerance remarks ------------------ --------------------------------- ------- sf_g_01_logTip 1 pdisk sf_g_01_logTipBackup 0 pdisk sf_g_01_logHome 1 node + 2 pdisk limited by rebuild space sf_g_01_vdisk02 1 node + 1 pdisk limited by rebuild space sf_g_01_vdisk07 1 node + 1 pdisk limited by rebuild space sf_g_01_vdisk01 2 pdisk Thanks, Ivano Il 16/11/17 13:03, Olaf Weiser ha scritto: > Rjx, that makes it a bit clearer.. as your vdisk is big enough to span > over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... > should bring the same performance. .. > > You mean something about vdisk Layout. .. > So in your test, for the full capacity test, you use just one vdisk per > RG - so 2 in total for 'data' - right? > > What about Md .. did you create separate vdisk for MD / what size then > ? > > Gesendet von IBM Verse > > Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem > size --- > > Von: "Ivano Talamo" > An: "gpfsug main discussion list" > Datum: Do. 16.11.2017 03:49 > Betreff: Re: [gpfsug-discuss] Write performances and filesystem size > > ------------------------------------------------------------------------ > > Hello Olaf, > > yes, I confirm that is the Lenovo version of the ESS GL2, so 2 > enclosures/4 drawers/166 disks in total. > > Each recovery group has one declustered array with all disks inside, so > vdisks use all the physical ones, even in the case of a vdisk that is > 1/4 of the total size. > > Regarding the layout allocation we used scatter. > > The tests were done on the just created filesystem, so no close-to-full > effect. And we run gpfsperf write seq. > > Thanks, > Ivano > > > Il 16/11/17 04:42, Olaf Weiser ha scritto: >> Sure... as long we assume that really all physical disk are used .. the >> fact that was told 1/2 or 1/4 might turn out that one / two complet >> enclosures 're eliminated ... ? ..that s why I was asking for more >> details .. >> >> I dont see this degration in my environments. . as long the vdisks are >> big enough to span over all pdisks ( which should be the case for >> capacity in a range of TB ) ... the performance stays the same >> >> Gesendet von IBM Verse >> >> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and >> filesystem size --- >> >> Von: "Jan-Frode Myklebust" >> An: "gpfsug main discussion list" >> Datum: Mi. 15.11.2017 21:35 >> Betreff: Re: [gpfsug-discuss] Write performances and filesystem size >> >> ------------------------------------------------------------------------ >> >> Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same >> number of spindles for any size filesystem, so I would also expect them >> to perform the same. >> >> >> >> -jf >> >> >> ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser > >: >> >> to add a comment ... .. very simply... depending on how you >> allocate the physical block storage .... if you - simply - using >> less physical resources when reducing the capacity (in the same >> ratio) .. you get , what you see.... >> >> so you need to tell us, how you allocate your block-storage .. (Do >> you using RAID controllers , where are your LUNs coming from, are >> then less RAID groups involved, when reducing the capacity ?...) >> >> GPFS can be configured to give you pretty as much as what the >> hardware can deliver.. if you reduce resource.. ... you'll get less >> , if you enhance your hardware .. you get more... almost regardless >> of the total capacity in #blocks .. 
>> >> Thank you, >> Ivano >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org _ >> > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From sandeep.patil at in.ibm.com Thu Nov 16 14:45:18 2017 From: sandeep.patil at in.ibm.com (Sandeep Ramesh) Date: Thu, 16 Nov 2017 20:15:18 +0530 Subject: [gpfsug-discuss] Latest Technical Blogs on Spectrum Scale Message-ID: Dear User Group members, Here are the Development Blogs in last 3 months on Spectrum Scale Technical Topics. Spectrum Scale Monitoring ? Know More ? https://developer.ibm.com/storage/2017/11/16/spectrum-scale-monitoring-know/ IBM Spectrum Scale 5.0 Release ? What?s coming ! https://developer.ibm.com/storage/2017/11/14/ibm-spectrum-scale-5-0-release-whats-coming/ Four Essentials things to know for managing data ACLs on IBM Spectrum Scale? from Windows https://developer.ibm.com/storage/2017/11/13/four-essentials-things-know-managing-data-acls-ibm-spectrum-scale-windows/ GSSUTILS: A new way of running SSR, Deploying or Upgrading ESS Server https://developer.ibm.com/storage/2017/11/13/gssutils/ IBM Spectrum Scale Object Authentication https://developer.ibm.com/storage/2017/11/02/spectrum-scale-object-authentication/ Video Surveillance ? Choosing the right storage https://developer.ibm.com/storage/2017/11/02/video-surveillance-choosing-right-storage/ IBM Spectrum scale object deep dive training with problem determination https://www.slideshare.net/SmitaRaut/ibm-spectrum-scale-object-deep-dive-training Spectrum Scale as preferred software defined storage for Ubuntu OpenStack https://developer.ibm.com/storage/2017/09/29/spectrum-scale-preferred-software-defined-storage-ubuntu-openstack/ IBM Elastic Storage Server 2U24 Storage ? an All-Flash offering, a performance workhorse https://developer.ibm.com/storage/2017/10/06/ess-5-2-flash-storage/ A Complete Guide to Configure LDAP-based authentication with IBM Spectrum Scale? 
for File Access https://developer.ibm.com/storage/2017/09/21/complete-guide-configure-ldap-based-authentication-ibm-spectrum-scale-file-access/ Deploying IBM Spectrum Scale on AWS Quick Start https://developer.ibm.com/storage/2017/09/18/deploy-ibm-spectrum-scale-on-aws-quick-start/ Monitoring Spectrum Scale Object metrics https://developer.ibm.com/storage/2017/09/14/monitoring-spectrum-scale-object-metrics/ Tier your data with ease to Spectrum Scale Private Cloud(s) using Moonwalk Universal https://developer.ibm.com/storage/2017/09/14/tier-data-ease-spectrum-scale-private-clouds-using-moonwalk-universal/ Why do I see owner as ?Nobody? for my export mounted using NFSV4 Protocol on IBM Spectrum Scale?? https://developer.ibm.com/storage/2017/09/08/see-owner-nobody-export-mounted-using-nfsv4-protocol-ibm-spectrum-scale/ IBM Spectrum Scale? Authentication using Active Directory and LDAP https://developer.ibm.com/storage/2017/09/01/ibm-spectrum-scale-authentication-using-active-directory-ldap/ IBM Spectrum Scale? Authentication using Active Directory and RFC2307 https://developer.ibm.com/storage/2017/09/01/ibm-spectrum-scale-authentication-using-active-directory-rfc2307/ High Availability Implementation with IBM Spectrum Virtualize and IBM Spectrum Scale https://developer.ibm.com/storage/2017/08/30/high-availability-implementation-ibm-spectrum-virtualize-ibm-spectrum-scale/ 10 Frequently asked Questions on configuring Authentication using AD + AUTO ID mapping on IBM Spectrum Scale?. https://developer.ibm.com/storage/2017/08/04/10-frequently-asked-questions-configuring-authentication-using-ad-auto-id-mapping-ibm-spectrum-scale/ IBM Spectrum Scale? Authentication using Active Directory https://developer.ibm.com/storage/2017/07/30/ibm-spectrum-scale-auth-using-active-directory/ Five cool things that you didn?t know Transparent Cloud Tiering on Spectrum Scale can do https://developer.ibm.com/storage/2017/07/29/five-cool-things-didnt-know-transparent-cloud-tiering-spectrum-scale-can/ IBM Spectrum Scale GUI videos https://developer.ibm.com/storage/2017/07/25/ibm-spectrum-scale-gui-videos/ IBM Spectrum Scale? Authentication ? Planning for NFS Access https://developer.ibm.com/storage/2017/07/24/ibm-spectrum-scale-planning-nfs-access/ For more : Search /browse here: https://developer.ibm.com/storage/blog Consolidation list: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/White%20Papers%20%26%20Media -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Thu Nov 16 16:08:18 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 11:08:18 -0500 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From aelkhouly at sidra.org Thu Nov 16 18:40:51 2017 From: aelkhouly at sidra.org (Ahmad El Khouly) Date: Thu, 16 Nov 2017 18:40:51 +0000 Subject: [gpfsug-discuss] GPFS long waiter Message-ID: <66C328F7-94E9-474F-8AE4-7A4A50DF70E7@sidra.org> Hello all I?m facing long waiter issue and I could not find any way to clear it, I can see all filesystems are responsive and look normal but I can not perform any GPFS commands like mmdf or adding or removing any vdisk, could you please advise how to show more details about this waiter and which pool it is talking about? and any workaround to clear it. 
0x7FA0446BF1A0 ( 27706) waiting 20634.654553503 seconds, TSDFCmdThread: on ThCond 0x1803173EE10 (0xFFFFC9003173EE10) (AllocManagerCond), reason 'waiting for pool freeSpace recovery' Ahmed M. Elkhouly Systems Administrator, Scientific Computing Bioinformatics Division Disclaimer: This email and its attachments may be confidential and are intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient, any reading, printing, storage, disclosure, copying or any other action taken in respect of this e-mail is prohibited and may be unlawful. If you are not the intended recipient, please notify the sender immediately by using the reply function and then permanently delete what you have received. Any views or opinions expressed are solely those of the author and do not necessarily represent those of Sidra Medical and Research Center. -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Thu Nov 16 23:51:39 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 18:51:39 -0500 Subject: [gpfsug-discuss] GPFS long waiter In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Fri Nov 17 13:03:48 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Fri, 17 Nov 2017 13:03:48 +0000 Subject: [gpfsug-discuss] GPFS long waiter Message-ID: Hi Ahmed You might take a look at the file system manager nodes (mmlsmgr) and see if any of them are having problems. It looks like some previous ?mmdf? command was launched and got hung up (and perhaps was terminated by ctrl-c) and the helper process is still running. I have seen mmdf get hung up before, and it?s (almost always) associated with the file system manager node in some way. And I?ve had a few PMRs open on this (vers 4.1, early 4.2) ? I have not seen this on any of the latest code levels) But, as Olaf states, getting a mmsnap and opening a PMR might be worthwhile ? what level of GPFS are you running on? Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Ahmad El Khouly Reply-To: gpfsug main discussion list Date: Thursday, November 16, 2017 at 12:41 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] GPFS long waiter I?m facing long waiter issue and I could not find any way to clear it, I can see all filesystems are responsive and look normal but I can not perform any GPFS commands like mmdf or adding or removing any vdisk, could you please advise how to show more details about this waiter and which pool it is talking about? and any workaround to clear it. 0x7FA0446BF1A0 ( 27706) waiting 20634.654553503 seconds, TSDFCmdThread: on ThCond 0x1803173EE10 (0xFFFFC9003173EE10) (AllocManagerCond), reason 'waiting for pool freeSpace recovery' -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 17 13:39:47 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 17 Nov 2017 14:39:47 +0100 Subject: [gpfsug-discuss] gpfs.so vfs samba module is missing Message-ID: Hello at all, anyone know in which package I can find the gpfs vfs module? Currently I am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module is still missing. Any ideas for me? 
Thanks, Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Fri Nov 17 16:51:02 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Fri, 17 Nov 2017 11:51:02 -0500 Subject: [gpfsug-discuss] gpfs.so vfs samba module is missing In-Reply-To: References: Message-ID: <8805.1510937462@turing-police.cc.vt.edu> On Fri, 17 Nov 2017 14:39:47 +0100, Matthias.Knigge at rohde-schwarz.com said: > anyone know in which package I can find the gpfs vfs module? Currently I > am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package > provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module > is still missing. If you're running the IBM-supported protocols server config, you want the rpm 'gpfs.smb'. If you're trying to build your own, your best bet is to punt and install the IBM code. ;) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 17 19:04:03 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 17 Nov 2017 20:04:03 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** In-Reply-To: <8805.1510937462@turing-police.cc.vt.edu> References: <8805.1510937462@turing-police.cc.vt.edu> Message-ID: https://manpages.debian.org/testing/samba-vfs-modules/vfs_gpfs.8.en.html I do not think so, the module is a part of samba. I installed the package gpfs.smb too but with the same result. Before I use the normal version of samba I used the version of sernet. There was the module available. Now I am working with CentOS 7.3 and samba of the offical repository of CentOS. Thanks, Matthias Von: valdis.kletnieks at vt.edu An: gpfsug main discussion list Datum: 17.11.2017 17:51 Betreff: [Newsletter] Re: [gpfsug-discuss] gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** Gesendet von: gpfsug-discuss-bounces at spectrumscale.org On Fri, 17 Nov 2017 14:39:47 +0100, Matthias.Knigge at rohde-schwarz.com said: > anyone know in which package I can find the gpfs vfs module? Currently I > am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package > provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module > is still missing. If you're running the IBM-supported protocols server config, you want the rpm 'gpfs.smb'. If you're trying to build your own, your best bet is to punt and install the IBM code. ;) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss [Anhang "RohdeSchwarzSecure_E-Mail.html" gel?scht von Matthias Knigge/DVS] -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Nov 17 19:45:30 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 17 Nov 2017 19:45:30 +0000 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** In-Reply-To: References: , <8805.1510937462@turing-police.cc.vt.edu> Message-ID: An HTML attachment was scrubbed... 
URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 17 19:50:27 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 17 Nov 2017 20:50:27 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing In-Reply-To: References: , <8805.1510937462@turing-police.cc.vt.edu> Message-ID: That helps me! Thanks! Von: "Christof Schmitt" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 17.11.2017 20:45 Betreff: [Newsletter] Re: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Whether the gpfs.so module is included depends on each Samba build. Samba provided by Linux distributions typically does not include the gpfs.so module. Sernet package include it. The gpfs.smb Samba build we use in Spectrum Scale also obviously includes the gpfs.so module: # rpm -ql gpfs.smb | grep gpfs.so /usr/lpp/mmfs/lib64/samba/vfs/gpfs.so The main point from a Spectrum Scale point of view: Spectrum Scale only supports the Samba from the gpfs.smb package that was provided with the product. Using any other Samba version is outside of the scope of Spectrum Scale support. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: Matthias.Knigge at rohde-schwarz.com Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** Date: Fri, Nov 17, 2017 12:04 PM https://manpages.debian.org/testing/samba-vfs-modules/vfs_gpfs.8.en.html I do not think so, the module is a part of samba. I installed the package gpfs.smb too but with the same result. Before I use the normal version of samba I used the version of sernet. There was the module available. Now I am working with CentOS 7.3 and samba of the offical repository of CentOS. Thanks, Matthias Von: valdis.kletnieks at vt.edu An: gpfsug main discussion list Datum: 17.11.2017 17:51 Betreff: [Newsletter] Re: [gpfsug-discuss] gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** Gesendet von: gpfsug-discuss-bounces at spectrumscale.org On Fri, 17 Nov 2017 14:39:47 +0100, Matthias.Knigge at rohde-schwarz.com said: > anyone know in which package I can find the gpfs vfs module? Currently I > am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package > provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module > is still missing. If you're running the IBM-supported protocols server config, you want the rpm 'gpfs.smb'. If you're trying to build your own, your best bet is to punt and install the IBM code. 
;) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss [Anhang "RohdeSchwarzSecure_E-Mail.html" gel?scht von Matthias Knigge/DVS] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=M1Ebd4GVVmaCFs3t0xgGUpgZUM9CzrxWR9I6cvzUqns&s=ONPhff8MP60AoglpZvh9xBAPlV98nW-SmuWoN4EVzUk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcus at koenighome.de Wed Nov 22 03:13:18 2017 From: marcus at koenighome.de (Marcus Koenig) Date: Wed, 22 Nov 2017 16:13:18 +1300 Subject: [gpfsug-discuss] setxattr via policy Message-ID: Hi there, I've got a question around setting userdefined extended attributes. I have played around a bit with setting certain attributes via mmchattr - but now I want to run a policy to do this for me for certain filesets or file sizes. How would I write my policy to set an attribute like user.testflag1=projectX on a number of files in a fileset that are bigger than 1G for example? Thanks folks. Cheers, Marcus -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Wed Nov 22 06:23:08 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 22 Nov 2017 07:23:08 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] setxattr via policy In-Reply-To: References: Message-ID: Good morning, take a look in this directory: cd /usr/lpp/mmfs/samples/ilm/ mmfind or rather tr_findToPol.pl could help you to create a rule/policy. Regards, Matthias Von: Marcus Koenig An: gpfsug-discuss at spectrumscale.org Datum: 22.11.2017 04:13 Betreff: [Newsletter] [gpfsug-discuss] setxattr via policy Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi there, I've got a question around setting userdefined extended attributes. I have played around a bit with setting certain attributes via mmchattr - but now I want to run a policy to do this for me for certain filesets or file sizes. How would I write my policy to set an attribute like user.testflag1=projectX on a number of files in a fileset that are bigger than 1G for example? Thanks folks. Cheers, Marcus_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Wed Nov 22 08:23:22 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Wed, 22 Nov 2017 09:23:22 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Hello Olaf, thank you for your reply and for confirming that this is not expected, as we also thought. We did repeat the test with 2 vdisks only without dedicated ones for metadata but the result did not change. We now opened a PMR. Thanks, Ivano Il 16/11/17 17:08, Olaf Weiser ha scritto: > Hi Ivano, > so from this output, the performance degradation is not explainable .. > in my current environments.. 
, having multiple file systems (so vdisks > on one BB) .. and it works fine .. > > as said .. just open a PMR.. I would'nt consider this as the "expected > behavior" > the only thing is.. the MD disks are a bit small.. so maybe redo your > tests and for a simple compare between 1/2 1/1 or 1/4 capacity test > with 2 vdisks only and /dataAndMetadata/ > cheers > > > > > > From: Ivano Talamo > To: gpfsug main discussion list > Date: 11/16/2017 08:52 AM > Subject: Re: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > Hi, > > as additional information I past the recovery group information in the > full and half size cases. > In both cases: > - data is on sf_g_01_vdisk01 > - metadata on sf_g_01_vdisk02 > - sf_g_01_vdisk07 is not used in the filesystem. > > This is with the full-space filesystem: > > declustered current allowable > recovery group arrays vdisks pdisks format version format > version > ----------------- ----------- ------ ------ -------------- > -------------- > sf-g-01 3 6 86 4.2.2.0 4.2.2.0 > > > declustered needs replace > scrub background activity > array service vdisks pdisks spares threshold free space > duration task progress priority > ----------- ------- ------ ------ ------ --------- ---------- > -------- ------------------------- > NVR no 1 2 0,0 1 3632 MiB > 14 days scrub 95% low > DA1 no 4 83 2,44 1 57 TiB > 14 days scrub 0% low > SSD no 1 1 0,0 1 372 GiB > 14 days scrub 79% low > > declustered > checksum > vdisk RAID code array vdisk size block > size granularity state remarks > ------------------ ------------------ ----------- ---------- > ---------- ----------- ----- ------- > sf_g_01_logTip 2WayReplication NVR 48 MiB 2 > MiB 4096 ok logTip > sf_g_01_logTipBackup Unreplicated SSD 48 MiB > 2 MiB 4096 ok logTipBackup > sf_g_01_logHome 4WayReplication DA1 144 GiB 2 > MiB 4096 ok log > sf_g_01_vdisk02 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk07 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk01 8+2p DA1 540 TiB 16 > MiB 32 KiB ok > > config data declustered array spare space remarks > ------------------ ------------------ ------------- ------- > rebuild space DA1 53 pdisk > increasing VCD spares is suggested > > config data disk group fault tolerance remarks > ------------------ --------------------------------- ------- > rg descriptor 1 enclosure + 1 drawer + 2 pdisk limited by > rebuild space > system index 1 enclosure + 1 drawer + 2 pdisk limited by > rebuild space > > vdisk disk group fault tolerance remarks > ------------------ --------------------------------- ------- > sf_g_01_logTip 1 pdisk > sf_g_01_logTipBackup 0 pdisk > sf_g_01_logHome 1 enclosure + 1 drawer + 1 pdisk limited by > rebuild space > sf_g_01_vdisk02 1 enclosure + 1 drawer limited by > rebuild space > sf_g_01_vdisk07 1 enclosure + 1 drawer limited by > rebuild space > sf_g_01_vdisk01 2 pdisk > > > This is with the half-space filesystem: > > declustered current allowable > recovery group arrays vdisks pdisks format version format > version > ----------------- ----------- ------ ------ -------------- > -------------- > sf-g-01 3 6 86 4.2.2.0 4.2.2.0 > > > declustered needs replace > scrub background activity > array service vdisks pdisks spares threshold free space > duration task progress priority > ----------- ------- ------ ------ ------ --------- ---------- > -------- ------------------------- > NVR no 1 2 0,0 1 3632 MiB > 
14 days scrub 4% low > DA1 no 4 83 2,44 1 395 TiB > 14 days scrub 0% low > SSD no 1 1 0,0 1 372 GiB > 14 days scrub 79% low > > declustered > checksum > vdisk RAID code array vdisk size block > size granularity state remarks > ------------------ ------------------ ----------- ---------- > ---------- ----------- ----- ------- > sf_g_01_logTip 2WayReplication NVR 48 MiB 2 > MiB 4096 ok logTip > sf_g_01_logTipBackup Unreplicated SSD 48 MiB > 2 MiB 4096 ok logTipBackup > sf_g_01_logHome 4WayReplication DA1 144 GiB 2 > MiB 4096 ok log > sf_g_01_vdisk02 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk07 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk01 8+2p DA1 270 TiB 16 > MiB 32 KiB ok > > config data declustered array spare space remarks > ------------------ ------------------ ------------- ------- > rebuild space DA1 68 pdisk > increasing VCD spares is suggested > > config data disk group fault tolerance remarks > ------------------ --------------------------------- ------- > rg descriptor 1 node + 3 pdisk limited by > rebuild space > system index 1 node + 3 pdisk limited by > rebuild space > > vdisk disk group fault tolerance remarks > ------------------ --------------------------------- ------- > sf_g_01_logTip 1 pdisk > sf_g_01_logTipBackup 0 pdisk > sf_g_01_logHome 1 node + 2 pdisk limited by > rebuild space > sf_g_01_vdisk02 1 node + 1 pdisk limited by > rebuild space > sf_g_01_vdisk07 1 node + 1 pdisk limited by > rebuild space > sf_g_01_vdisk01 2 pdisk > > > Thanks, > Ivano > > > > > Il 16/11/17 13:03, Olaf Weiser ha scritto: >> Rjx, that makes it a bit clearer.. as your vdisk is big enough to span >> over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... >> should bring the same performance. .. >> >> You mean something about vdisk Layout. .. >> So in your test, for the full capacity test, you use just one vdisk per >> RG - so 2 in total for 'data' - right? >> >> What about Md .. did you create separate vdisk for MD / what size then >> ? >> >> Gesendet von IBM Verse >> >> Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem >> size --- >> >> Von: "Ivano Talamo" >> An: "gpfsug main discussion list" > >> Datum: Do. 16.11.2017 03:49 >> Betreff: Re: [gpfsug-discuss] Write performances and > filesystem size >> >> ------------------------------------------------------------------------ >> >> Hello Olaf, >> >> yes, I confirm that is the Lenovo version of the ESS GL2, so 2 >> enclosures/4 drawers/166 disks in total. >> >> Each recovery group has one declustered array with all disks inside, so >> vdisks use all the physical ones, even in the case of a vdisk that is >> 1/4 of the total size. >> >> Regarding the layout allocation we used scatter. >> >> The tests were done on the just created filesystem, so no close-to-full >> effect. And we run gpfsperf write seq. >> >> Thanks, >> Ivano >> >> >> Il 16/11/17 04:42, Olaf Weiser ha scritto: >>> Sure... as long we assume that really all physical disk are used .. the >>> fact that was told 1/2 or 1/4 might turn out that one / two complet >>> enclosures 're eliminated ... ? ..that s why I was asking for more >>> details .. >>> >>> I dont see this degration in my environments. . as long the vdisks are >>> big enough to span over all pdisks ( which should be the case for >>> capacity in a range of TB ) ... 
the performance stays the same >>> >>> Gesendet von IBM Verse >>> >>> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and >>> filesystem size --- >>> >>> Von: "Jan-Frode Myklebust" >>> An: "gpfsug main discussion list" >>> Datum: Mi. 15.11.2017 21:35 >>> Betreff: Re: [gpfsug-discuss] Write performances and filesystem size >>> >>> ------------------------------------------------------------------------ >>> >>> Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same >>> number of spindles for any size filesystem, so I would also expect them >>> to perform the same. >>> >>> >>> >>> -jf >>> >>> >>> ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >> >: >>> >>> to add a comment ... .. very simply... depending on how you >>> allocate the physical block storage .... if you - simply - using >>> less physical resources when reducing the capacity (in the same >>> ratio) .. you get , what you see.... >>> >>> so you need to tell us, how you allocate your block-storage .. (Do >>> you using RAID controllers , where are your LUNs coming from, are >>> then less RAID groups involved, when reducing the capacity ?...) >>> >>> GPFS can be configured to give you pretty as much as what the >>> hardware can deliver.. if you reduce resource.. ... you'll get less >>> , if you enhance your hardware .. you get more... almost regardless >>> of the total capacity in #blocks .. >>> >>> >>> >>> >>> >>> >>> From: "Kumaran Rajaram" >> > >>> To: gpfsug main discussion list >>> >> > >>> Date: 11/15/2017 11:56 AM >>> Subject: Re: [gpfsug-discuss] Write performances and >>> filesystem size >>> Sent by: gpfsug-discuss-bounces at spectrumscale.org >>> >>> >> ------------------------------------------------------------------------ >>> >>> >>> >>> Hi, >>> >>> >>Am I missing something? Is this an expected behaviour and someone >>> has an explanation for this? >>> >>> Based on your scenario, write degradation as the file-system is >>> populated is possible if you had formatted the file-system with "-j >>> cluster". >>> >>> For consistent file-system performance, we recommend *mmcrfs "-j >>> scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is >>> set properly. >>> >>> [snip from mmcrfs]/ >>> # mmlsfs | egrep 'Block allocation| Estimated number' >>> -j scatter Block allocation type >>> -n 128 Estimated number of >>> nodes that will mount file system/ >>> [/snip] >>> >>> >>> [snip from man mmcrfs]/ >>> *layoutMap={scatter|*//*cluster}*// >>> Specifies the block allocation map type. When >>> allocating blocks for a given file, GPFS first >>> uses a round?robin algorithm to spread the data >>> across all disks in the storage pool. After a >>> disk is selected, the location of the data >>> block on the disk is determined by the block >>> allocation map type*. If cluster is >>> specified, GPFS attempts to allocate blocks in >>> clusters. Blocks that belong to a particular >>> file are kept adjacent to each other within >>> each cluster. If scatter is specified, >>> the location of the block is chosen randomly.*/ >>> / >>> * The cluster allocation method may provide >>> better disk performance for some disk >>> subsystems in relatively small installations. >>> The benefits of clustered block allocation >>> diminish when the number of nodes in the >>> cluster or the number of disks in a file system >>> increases, or when the file system?s free space >>> becomes fragmented. 
*//The *cluster*// >>> allocation method is the default for GPFS >>> clusters with eight or fewer nodes and for file >>> systems with eight or fewer disks./ >>> / >>> *The scatter allocation method provides >>> more consistent file system performance by >>> averaging out performance variations due to >>> block location (for many disk subsystems, the >>> location of the data relative to the disk edge >>> has a substantial effect on performance).*//This >>> allocation method is appropriate in most cases >>> and is the default for GPFS clusters with more >>> than eight nodes or file systems with more than >>> eight disks./ >>> / >>> The block allocation map type cannot be changed >>> after the storage pool has been created./ >>> >>> */ >>> -n/*/*NumNodes*// >>> The estimated number of nodes that will mount the file >>> system in the local cluster and all remote clusters. >>> This is used as a best guess for the initial size of >>> some file system data structures. The default is 32. >>> This value can be changed after the file system has been >>> created but it does not change the existing data >>> structures. Only the newly created data structure is >>> affected by the new value. For example, new storage >>> pool./ >>> / >>> When you create a GPFS file system, you might want to >>> overestimate the number of nodes that will mount the >>> file system. GPFS uses this information for creating >>> data structures that are essential for achieving maximum >>> parallelism in file system operations (For more >>> information, see GPFS architecture in IBM Spectrum >>> Scale: Concepts, Planning, and Installation Guide ). If >>> you are sure there will never be more than 64 nodes, >>> allow the default value to be applied. If you are >>> planning to add nodes to your system, you should specify >>> a number larger than the default./ >>> >>> [/snip from man mmcrfs] >>> >>> Regards, >>> -Kums >>> >>> >>> >>> >>> >>> From: Ivano Talamo >> > >>> To: >> > >>> Date: 11/15/2017 11:25 AM >>> Subject: [gpfsug-discuss] Write performances and filesystem >> size >>> Sent by: gpfsug-discuss-bounces at spectrumscale.org >>> >>> >> ------------------------------------------------------------------------ >>> >>> >>> >>> Hello everybody, >>> >>> together with my colleagues we are actually running some tests on >> a new >>> DSS G220 system and we see some unexpected behaviour. >>> >>> What we actually see is that write performances (we did not test read >>> yet) decreases with the decrease of filesystem size. >>> >>> I will not go into the details of the tests, but here are some >> numbers: >>> >>> - with a filesystem using the full 1.2 PB space we get 14 GB/s as the >>> sum of the disk activity on the two IO servers; >>> - with a filesystem using half of the space we get 10 GB/s; >>> - with a filesystem using 1/4 of the space we get 5 GB/s. >>> >>> We also saw that performances are not affected by the vdisks layout, >>> ie. >>> taking the full space with one big vdisk or 2 half-size vdisks per RG >>> gives the same performances. >>> >>> To our understanding the IO should be spread evenly across all the >>> pdisks in the declustered array, and looking at iostat all disks >>> seem to >>> be accessed. But so there must be some other element that affects >>> performances. >>> >>> Am I missing something? Is this an expected behaviour and someone >>> has an >>> explanation for this? 
>>> >>> Thank you, >>> Ivano >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >_ >>> >> > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org > >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org > >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From jtucker at pixitmedia.com Wed Nov 22 09:20:55 2017 From: jtucker at pixitmedia.com (Jez Tucker) Date: Wed, 22 Nov 2017 09:20:55 +0000 Subject: [gpfsug-discuss] setxattr via policy In-Reply-To: References: Message-ID: <7b426e0a-2096-ff6a-f9f1-8eeda7114f11@pixitmedia.com> Hi Marcus, ? Something like this should do you: RULE 'setxattr' LIST 'do_setxattr' FOR FILESET ('xattrfileset') WEIGHT(DIRECTORY_HASH) ACTION(SETXATTR('user.testflag1','projectX')) WHERE ??? KB_ALLOCATED >? [insert required file size limit] Then with one file larger and another file smaller than the limit: root at elmo:/mmfs1/policies# getfattr -n user.testflag1 /mmfs1/data/xattrfileset/* getfattr: Removing leading '/' from absolute path names # file: mmfs1/data/xattrfileset/file.1 user.testflag1="projectX" /mmfs1/data/xattrfileset/file.2: user.testflag1: No such attribute As xattrs are a superb way of automating data operations, for those of you with our Python API have a look over the xattr examples in the git repo: https://github.com/arcapix/gpfsapi-examples as an alternative Pythonic way to achieve this. Cheers, Jez On 22/11/17 03:13, Marcus Koenig wrote: > Hi there, > > I've got a question around setting userdefined extended attributes. I > have played around a bit with setting certain attributes via mmchattr > - but now I want to run a policy to do this for me for certain > filesets or file sizes. > > How would I write my policy to set an attribute like > user.testflag1=projectX on a number of files in a fileset that are > bigger than 1G for example? > > Thanks folks. 
> > Cheers, > Marcus > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- *Jez Tucker* Head of Research and Development, Pixit Media 07764193820 | jtucker at pixitmedia.com www.pixitmedia.com | Tw:@pixitmedia.com -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcus at koenighome.de Wed Nov 22 09:28:56 2017 From: marcus at koenighome.de (Marcus Koenig) Date: Wed, 22 Nov 2017 22:28:56 +1300 Subject: [gpfsug-discuss] setxattr via policy In-Reply-To: <7b426e0a-2096-ff6a-f9f1-8eeda7114f11@pixitmedia.com> References: <7b426e0a-2096-ff6a-f9f1-8eeda7114f11@pixitmedia.com> Message-ID: Thanks guys - will test it now - much appreciated. On Wed, Nov 22, 2017 at 10:20 PM, Jez Tucker wrote: > Hi Marcus, > > Something like this should do you: > > RULE 'setxattr' LIST 'do_setxattr' > FOR FILESET ('xattrfileset') > WEIGHT(DIRECTORY_HASH) > ACTION(SETXATTR('user.testflag1','projectX')) > WHERE > KB_ALLOCATED > [insert required file size limit] > > > Then with one file larger and another file smaller than the limit: > > root at elmo:/mmfs1/policies# getfattr -n user.testflag1 > /mmfs1/data/xattrfileset/* > getfattr: Removing leading '/' from absolute path names > # file: mmfs1/data/xattrfileset/file.1 > user.testflag1="projectX" > > /mmfs1/data/xattrfileset/file.2: user.testflag1: No such attribute > > > As xattrs are a superb way of automating data operations, for those of you > with our Python API have a look over the xattr examples in the git repo: > https://github.com/arcapix/gpfsapi-examples as an alternative Pythonic > way to achieve this. > > Cheers, > > Jez > > > > > On 22/11/17 03:13, Marcus Koenig wrote: > > Hi there, > > I've got a question around setting userdefined extended attributes. I have > played around a bit with setting certain attributes via mmchattr - but now > I want to run a policy to do this for me for certain filesets or file sizes. > > How would I write my policy to set an attribute like > user.testflag1=projectX on a number of files in a fileset that are bigger > than 1G for example? > > Thanks folks. > > Cheers, > Marcus > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > -- > *Jez Tucker* > Head of Research and Development, Pixit Media > 07764193820 <07764%20193820> | jtucker at pixitmedia.com > www.pixitmedia.com | Tw:@pixitmedia.com > > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. 
Any opinions expressed are not necessarily those of the
> company from which this email was sent and, whilst to the best of our
> knowledge no viruses or defects exist, no responsibility can be accepted
> for any loss or damage arising from its receipt or subsequent use of this
> email.
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From makaplan at us.ibm.com  Wed Nov 22 16:51:27 2017
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Wed, 22 Nov 2017 11:51:27 -0500
Subject: [gpfsug-discuss] setxattr via policy - extended attributes - tips and hints
In-Reply-To: 
References: 
Message-ID: 

Assuming you have a recent version of Spectrum Scale...

You can use ACTION(SetXattr(...)) in mmapplypolicy {MIGRATE,LIST} rules and/or in
{SET POOL} rules that are evaluated at file creation time.

Later... You can use WHERE .... Xattr(...) in any policy rules to test/compare an
extended attribute. But watch out for NULL! See the "tips" section of the ILM chapter
of the admin guide for some ways to deal with NULL (hints: COALESCE , expr IS NULL,
expr IS NOT NULL, CASE ... )

See also mm{ch|ls}attr -d -X --hex-attr and so forth. Also can be used compatibly
with {set|get}fattr on Linux

--marc

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From aaron.s.knister at nasa.gov  Thu Nov 23 06:21:10 2017
From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP])
Date: Thu, 23 Nov 2017 06:21:10 +0000
Subject: [gpfsug-discuss] tar sparse file data loss
Message-ID: 

Somehow this nugget of joy (that's most definitely sarcasm, this really sucks) slipped past my radar: http://www-01.ibm.com/support/docview.wss?uid=isg1IV96475

Anyone know if there's a fix in the 4.1 stream? In my opinion this is 100% a tar bug as the APAR suggests but GPFS has implemented a workaround.
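As an aside, the sparse test at issue essentially compares allocated blocks with the
reported file size. A rough shell illustration of that heuristic (not the actual tar
code; the stat format flags assume GNU coreutils and the path is only a placeholder):

f=/path/to/somefile
blocks=$(stat -c %b "$f")   # blocks actually allocated (normally 512-byte units)
size=$(stat -c %s "$f")     # apparent size in bytes
if [ $(( blocks * 512 )) -lt "$size" ]; then
    echo "$f looks sparse to this kind of check"
fi

If a filesystem momentarily under-reports allocated blocks, a fully written file can
be mis-detected as sparse by a check of this kind.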
See this post from the tar mailing list: https://www.mail-archive.com/bug-tar at gnu.org/msg04209.html

It looks like the troublesome code may still exist upstream: http://git.savannah.gnu.org/cgit/tar.git/tree/src/sparse.c#n273

No better way to ensure you'll hit a problem than to assume you won't :)

-Aaron

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Greg.Lehmann at csiro.au  Thu Nov 23 23:02:46 2017
From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au)
Date: Thu, 23 Nov 2017 23:02:46 +0000
Subject: [gpfsug-discuss] tar sparse file data loss
In-Reply-To: 
References: 
Message-ID: <61aa823e50ad4cf3a59de063528e6d12@exch1-cdc.nexus.csiro.au>

I logged perhaps the original service request on this but must admit we haven't tried it of late as we have worked around the issue.

From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]
Sent: Thursday, 23 November 2017 4:21 PM
To: gpfsug-discuss at spectrumscale.org
Subject: [gpfsug-discuss] tar sparse file data loss

Somehow this nugget of joy (that's most definitely sarcasm, this really sucks) slipped past my radar: http://www-01.ibm.com/support/docview.wss?uid=isg1IV96475

Anyone know if there's a fix in the 4.1 stream? In my opinion this is 100% a tar bug as the APAR suggests but GPFS has implemented a workaround.
Greg -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, 27 November 2017 4:01 AM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Online data migration tool With the release of Scale 5.0 it?s no secret that some of the performance features of 5.0 require a new disk format and existing filesystems cannot be migrated in place to get these features. There?s also an issue for long time customers who have had scale since before the 4.1 days where filesystems crested prior to I think 4.1 are not 4K aligned and thus cannot use 4K sector LUNs to hold metadata. At some point we?re not going to be able to buy storage that?s not got 4K sectors. In both situations IBM has hamstrung its customer base with large filesystems by requiring them to undergo extremely disruptive and expensive filesystem migrations to either keep using their filesystem with new hardware or take advantage of new features. The expensive part comes from having to purchase new storage hardware in order migrate the data. My question is this? I know filesystem migration tools are complicated (I believe that?s why customers purchase support) but why on earth are there no migration tools for these features? How are customers supposed to take the product seriously as a platform for long term storage when IBM is so willing to break the on disk format and leave customers stuck unable to replacing aging storage hardware or leverage new features? What message does this send to customers who have had the product on site for over a decade? There is at least one open RFE on this issue and has been for some time that has seen no movement. That speaks volumes. Frankly I?m a little tired of bringing problems to the mailing list, being told to open RFEs then having the RFEs denied or just sit there stagnant. -Aaron _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Sun Nov 26 22:39:48 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Sun, 26 Nov 2017 22:39:48 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: I agree that migration is not easy. We thought we might be able to accomplish it using SOBAR, but the block size has to match in the old and new file-systems. In fact mmfsd asserts if you try. I had a PMR open on this and was told SoBAR can only be used to restore to the same block size and they aren't going to fix it. (Seriously how many people using SOBAR for DR are likely to be able to restore to identical hardware?). Second we thought maybe AFM would help, but we use IFS and child dependent filesets and we can't replicate the structure in the AFM cache. Given there is no other supported way of moving data or converting file-systems, like you we are stuck with significant disruption when we want to replace some aging hardware next year. 
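For anyone who has not looked at SOBAR, the overall shape of the workflow is
roughly as below. This is a sketch rather than a tested recipe: it assumes the
source file system is space-managed (data premigrated to Spectrum Protect/HSM),
the exact options vary by release, and, as noted above, the image can currently
only be restored into a file system with a matching block size.

  # source side: save the file system configuration and a metadata image
  mmbackupconfig oldfs -o /backup/oldfs.config
  mmimgbackup oldfs                # image/work-directory options vary by release

  # target side: recreate an empty file system from the saved configuration,
  # then restore the metadata image into it
  mmrestoreconfig newfs -i /backup/oldfs.config
  mmimgrestore newfs               # point it at the image produced above
  # file contents are recalled from the HSM backend afterwards, on access or in bulk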
Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of aaron.knister at gmail.com [aaron.knister at gmail.com] Sent: 26 November 2017 18:00 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Online data migration tool With the release of Scale 5.0 it?s no secret that some of the performance features of 5.0 require a new disk format and existing filesystems cannot be migrated in place to get these features. There?s also an issue for long time customers who have had scale since before the 4.1 days where filesystems crested prior to I think 4.1 are not 4K aligned and thus cannot use 4K sector LUNs to hold metadata. At some point we?re not going to be able to buy storage that?s not got 4K sectors. In both situations IBM has hamstrung its customer base with large filesystems by requiring them to undergo extremely disruptive and expensive filesystem migrations to either keep using their filesystem with new hardware or take advantage of new features. The expensive part comes from having to purchase new storage hardware in order migrate the data. My question is this? I know filesystem migration tools are complicated (I believe that?s why customers purchase support) but why on earth are there no migration tools for these features? How are customers supposed to take the product seriously as a platform for long term storage when IBM is so willing to break the on disk format and leave customers stuck unable to replacing aging storage hardware or leverage new features? What message does this send to customers who have had the product on site for over a decade? There is at least one open RFE on this issue and has been for some time that has seen no movement. That speaks volumes. Frankly I?m a little tired of bringing problems to the mailing list, being told to open RFEs then having the RFEs denied or just sit there stagnant. -Aaron _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From abeattie at au1.ibm.com Sun Nov 26 22:46:13 2017 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Sun, 26 Nov 2017 22:46:13 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Mon Nov 27 14:56:56 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 27 Nov 2017 14:56:56 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: <1511794616.18554.121.camel@strath.ac.uk> On Sun, 2017-11-26 at 13:00 -0500, Aaron Knister wrote: > With the release of Scale 5.0 it?s no secret that some of the > performance features of 5.0 require a new disk format and existing > filesystems cannot be migrated in place to get these features.? > > There?s also an issue for long time customers who have had scale > since before the 4.1 days where filesystems crested prior to I think > 4.1 are not 4K aligned and thus cannot use 4K sector LUNs to hold > metadata. At some point we?re not going to be able to buy storage > that?s not got 4K sectors.? This has been going on since forever. We have had change to 64bit inodes for more than 2 billion files and the ability to mount on Windows. They are like 2.3 and 3.0 changes from memory going back around a decade now. 
I have a feeling there was another change for mounting HSM'ed file systems on Windows too. I just don't think IBM care. The answer has always been well just start again. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From yguvvala at cambridgecomputer.com Wed Nov 29 16:00:33 2017 From: yguvvala at cambridgecomputer.com (Yugendra Guvvala) Date: Wed, 29 Nov 2017 11:00:33 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1511794616.18554.121.camel@strath.ac.uk> References: <1511794616.18554.121.camel@strath.ac.uk> Message-ID: Hi, I am trying to understand the technical challenges to migrate to GPFS 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to see 5.0 release and hear about some promising features available. But not sure about complexity involved to migrate. ? Thanks, Yugi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 29 16:35:04 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 29 Nov 2017 16:35:04 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> Message-ID: <1511973304.18554.133.camel@strath.ac.uk> On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: > Hi,? > > I am trying to understand the technical challenges to migrate to GPFS > 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to > see 5.0 release and hear about some promising features available. But > not sure about complexity involved to migrate.? > Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more) then reformat your files system with the new disk format then restore all your data to your shiny new file system. Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From makaplan at us.ibm.com Wed Nov 29 16:37:02 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 29 Nov 2017 11:37:02 -0500 Subject: [gpfsug-discuss] SOBAR restore with new blocksize and/or inodesize In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> Message-ID: This redbook http://w3-03.ibm.com/support/techdocs/atsmastr.nsf/3af3af29ce1f19cf86256c7100727a9f/335d8a48048ea78d85258059006dad33/$FILE/SOBAR_Migration_SpectrumScale_v1.0.pdf has these and other hints: -B blocksize, should match the file system block size of the source system, but can also be larger (not smaller). To obtain the file system block size in the source system use the command: mmlsfs -B -i inodesize, should match the file system inode size of the source system, but can also be larger (not smaller). To obtain the inode size in the source system use the following command: mmlsfs -i. Note, in Spectrum Scale it is recommended to use a inodesize of 4K because this well aligns to disk I/O. Our tests have shown that having a greater inode size on the target than on the source works as well. If you really want to shrink the blocksize, some internal testing indicates that works also. 
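A quick way to record the source-side values those hints refer to, before
attempting the restore ('fs1' is a placeholder device name):

  mmlsfs fs1 -B        # file system block size
  mmlsfs fs1 -i        # inode size
  mmlsfs fs1 -f        # minimum fragment (subblock) size
  mmlsfs fs1 -V        # current file system format version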
Shrinking the inodesize also works, although this will impact the efficiency of small file and extended attributes in-the-inode support. -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Wed Nov 29 16:39:25 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 29 Nov 2017 16:39:25 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1511973304.18554.133.camel@strath.ac.uk> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: Could we utilise free capacity in the existing filesystem and empty NSDs, create a new FS and AFM migrate data in stages? Terribly long winded and frought with danger and peril... do not pass go... ah, answered my own question. ? Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 29 November 2017 16:35 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Online data migration tool On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: > Hi, > > I am trying to understand the technical challenges to migrate to GPFS > 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to > see 5.0 release and hear about some promising features available. But > not sure about complexity involved to migrate. > Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more) then reformat your files system with the new disk format then restore all your data to your shiny new file system. Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From scottg at emailhosting.com Wed Nov 29 16:38:07 2017 From: scottg at emailhosting.com (scott) Date: Wed, 29 Nov 2017 11:38:07 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1511973304.18554.133.camel@strath.ac.uk> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: Question: Who at IBM is going to reach out to ESPN - a 24/7 online user - with >15PETABYTES of content? Asking customers to copy, reformat, copy back will just cause IBM to have to support the older version for a longer period of time Just my $.03 (adjusted for inflation) On 11/29/2017 11:35 AM, Jonathan Buzzard wrote: > On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: >> Hi, >> >> I am trying to understand the technical challenges to migrate to GPFS >> 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to >> see 5.0 release and hear about some promising features available. But >> not sure about complexity involved to migrate. >> > Oh that's simple. You copy all your data somewhere else (good luck if > you happen to have a few hundred TB or maybe a PB or more) then > reformat your files system with the new disk format then restore all > your data to your shiny new file system. 
> > Over the years there have been a number of these "reformats" to get all > the new shiny features, which is the cause of the grumbles because it > is not funny and most people don't have the disk space to just hold > another copy of the data, and even if they did it is extremely > disruptive. > > JAB. > From jonathan.buzzard at strath.ac.uk Wed Nov 29 16:47:27 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 29 Nov 2017 16:47:27 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: <1511974047.18554.135.camel@strath.ac.uk> On Wed, 2017-11-29 at 11:38 -0500, scott wrote: > Question: Who at IBM is going to reach out to ESPN - a 24/7 online > user? > - with >15PETABYTES of content? > > Asking customers to copy, reformat, copy back will just cause IBM to? > have to support the older version for a longer period of time > > Just my $.03 (adjusted for inflation) > Oh you can upgrade to 5.0, it's just if your file system was created with a previous version then you won't get to use all the new features.? I would imagine if you still had a file system created under 2.3 you could mount it on 5.0. Just you would be missing a bunch of features like support for more than 2 billion files, or the ability to mount in on Windows or ... JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From Kevin.Buterbaugh at Vanderbilt.Edu Wed Nov 29 16:51:51 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 29 Nov 2017 16:51:51 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> Hi All, Well, actually a year ago we started the process of doing pretty much what Richard describes below ? the exception being that we rsync?d data over to the new filesystem group by group. It was no fun but it worked. And now GPFS (and it will always be GPFS ? it will never be Spectrum Scale) version 5 is coming and there are compelling reasons to want to do the same thing over again ? despite the pain. Having said all that, I think it would be interesting to have someone from IBM give an explanation of why Apple can migrate millions of devices to a new filesystem with 99.999999% of the users never even knowing they did it ? but IBM can?t provide a way to migrate to a new filesystem ?in place.? And to be fair to IBM, they do ship AIX with root having a password and Apple doesn?t, so we all have our strengths and weaknesses! ;-) Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Nov 29, 2017, at 10:39 AM, Sobey, Richard A > wrote: Could we utilise free capacity in the existing filesystem and empty NSDs, create a new FS and AFM migrate data in stages? Terribly long winded and frought with danger and peril... do not pass go... ah, answered my own question. ? 
Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 29 November 2017 16:35 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Online data migration tool On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: Hi, I am trying to understand the technical challenges to migrate to GPFS 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to see 5.0 release and hear about some promising features available. But not sure about complexity involved to migrate. Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more) then reformat your files system with the new disk format then restore all your data to your shiny new file system. Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 29 16:55:46 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 29 Nov 2017 16:55:46 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> Message-ID: <1511974546.18554.138.camel@strath.ac.uk> On Wed, 2017-11-29 at 16:51 +0000, Buterbaugh, Kevin L wrote: [SNIP] > And now GPFS (and it will always be GPFS ? it will never be > Spectrum Scale) Splitter, its Tiger Shark forever ;-) JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From makaplan at us.ibm.com Wed Nov 29 17:37:29 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 29 Nov 2017 12:37:29 -0500 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk><1511973304.18554.133.camel@strath.ac.uk> Message-ID: Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfosburg at mdanderson.org Wed Nov 29 17:40:51 2017 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Wed, 29 Nov 2017 17:40:51 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. From: on behalf of Marc A Kaplan Reply-To: gpfsug main discussion list Date: Wednesday, November 29, 2017 at 11:38 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 5.0 features? Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. 
This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Wed Nov 29 17:43:11 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Wed, 29 Nov 2017 17:43:11 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> , Message-ID: You can in place upgrade. I think what people are referring to is likely things like the new sub block sizing for **new** filesystems. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of jfosburg at mdanderson.org [jfosburg at mdanderson.org] Sent: 29 November 2017 17:40 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 5.0 features? I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. From: on behalf of Marc A Kaplan Reply-To: gpfsug main discussion list Date: Wednesday, November 29, 2017 at 11:38 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 5.0 features? Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Nov 29 17:50:50 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 29 Nov 2017 17:50:50 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: <4FB50580-B5E2-45AD-BABB-C2BE9E99012F@vanderbilt.edu> Simon in correct ? I?d love to be able to support a larger block size for my users who have sane workflows while still not wasting a ton of space for the biomedical folks?. ;-) A question ? will the new, much improved, much faster mmrestripefs that was touted at SC17 require a filesystem that was created with GPFS / Tiger Shark / Spectrum Scale / Multi-media filesystem () version 5 or simply one that has been ?upgraded? to that format? Thanks? Kevin > On Nov 29, 2017, at 11:43 AM, Simon Thompson (IT Research Support) wrote: > > You can in place upgrade. 
> > I think what people are referring to is likely things like the new sub block sizing for **new** filesystems. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of jfosburg at mdanderson.org [jfosburg at mdanderson.org] > Sent: 29 November 2017 17:40 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. > > From: on behalf of Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Wednesday, November 29, 2017 at 11:38 AM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? > > > The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C755e8b13215f48e4e21508d53750ac45%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636475741979446614&sdata=RpfsLbGTRtlZQ06Winrn65jXQlDYjFHdWuKMvEyZwBI%3D&reserved=0 From knop at us.ibm.com Wed Nov 29 18:27:40 2017 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 29 Nov 2017 13:27:40 -0500 Subject: [gpfsug-discuss] 5.0 features? -- mmrestripefs -b In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk><1511973304.18554.133.camel@strath.ac.uk> Message-ID: Kevin, The improved rebalance function (mmrestripefs -b) only depends on the cluster level being (at least) 5.0.0, and will work with older file system formats as well. This particular improvement did not require a change in the format/structure of the file system. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 (845) 433-9314 T/L 293-9314 From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 11/29/2017 12:51 PM Subject: Re: [gpfsug-discuss] 5.0 features? Sent by: gpfsug-discuss-bounces at spectrumscale.org Simon in correct ? I?d love to be able to support a larger block size for my users who have sane workflows while still not wasting a ton of space for the biomedical folks?. ;-) A question ? will the new, much improved, much faster mmrestripefs that was touted at SC17 require a filesystem that was created with GPFS / Tiger Shark / Spectrum Scale / Multi-media filesystem () version 5 or simply one that has been ?upgraded? to that format? Thanks? Kevin > On Nov 29, 2017, at 11:43 AM, Simon Thompson (IT Research Support) wrote: > > You can in place upgrade. 
> > I think what people are referring to is likely things like the new sub block sizing for **new** filesystems. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of jfosburg at mdanderson.org [jfosburg at mdanderson.org] > Sent: 29 November 2017 17:40 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. > > From: on behalf of Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Wednesday, November 29, 2017 at 11:38 AM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? > > > The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=https-3A__na01.safelinks.protection.outlook.com_-3Furl-3Dhttp-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss-26data-3D02-257C01-257CKevin.Buterbaugh-2540vanderbilt.edu-257C755e8b13215f48e4e21508d53750ac45-257Cba5a7f39e3be4ab3b45067fa80faecad-257C0-257C0-257C636475741979446614-26sdata-3DRpfsLbGTRtlZQ06Winrn65jXQlDYjFHdWuKMvEyZwBI-253D-26reserved-3D0&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=oNT2koCZX0xmWlSlLblR9Q&m=T_wlNQsuQkBDoQhdS2fe4nbIoDOo5oywJRYfJ6849M8&s=C6m8yyvkVEqEmpozrpgGHNidk4SwpbgpCWO1fvYKffA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=oNT2koCZX0xmWlSlLblR9Q&m=T_wlNQsuQkBDoQhdS2fe4nbIoDOo5oywJRYfJ6849M8&s=JFaXBwXQ8aaDrZ1mdCvsZ6siAktHtOVvZr7vqiy_Tp4&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From nikhilk at us.ibm.com Wed Nov 29 19:08:11 2017 From: nikhilk at us.ibm.com (Nikhil Khandelwal) Date: Wed, 29 Nov 2017 12:08:11 -0700 Subject: [gpfsug-discuss] Online data migration tool Message-ID: Hi, I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. 
That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems. Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline. I hope that clarifies things a little and makes the upgrade path more accessible. Please let me know if there are any other questions or concerns. Thank you, Nikhil Khandelwal Spectrum Scale Development Client Adoption -------------- next part -------------- An HTML attachment was scrubbed... URL: From ulmer at ulmer.org Wed Nov 29 19:19:11 2017 From: ulmer at ulmer.org (Stephen Ulmer) Date: Wed, 29 Nov 2017 14:19:11 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> Message-ID: <49425FCD-D1CA-46FE-B1F1-98E5F464707C@ulmer.org> About five years ago (I think) Apple slipped a volume manager[1] in on the unsuspecting. :) If you have a Mac, you might have noticed that the mount type/pattern changed with Lion. CoreStorage was the beginning of building the infrastructure to change a million(?) Macs and several hundred million iPhones and iPads under the users? noses. :) Has anyone seen list of the features that would require the on-disk upgrade? If there isn?t one yet, I think that the biggest failing is not not publishing it ? the natives are restless and it?s not like IBM wouldn?t know... [1] This is what Apple calls it. If you?ve ever used AIX or Linux you?ll just chuckle when you look at the limitations. -- Stephen > On Nov 29, 2017, at 11:51 AM, Buterbaugh, Kevin L wrote: > > Hi All, > > Well, actually a year ago we started the process of doing pretty much what Richard describes below ? the exception being that we rsync?d data over to the new filesystem group by group. It was no fun but it worked. And now GPFS (and it will always be GPFS ? it will never be Spectrum Scale) version 5 is coming and there are compelling reasons to want to do the same thing over again ? despite the pain. > > Having said all that, I think it would be interesting to have someone from IBM give an explanation of why Apple can migrate millions of devices to a new filesystem with 99.999999% of the users never even knowing they did it ? but IBM can?t provide a way to migrate to a new filesystem ?in place.? > > And to be fair to IBM, they do ship AIX with root having a password and Apple doesn?t, so we all have our strengths and weaknesses! ;-) > > Kevin > ? 
> Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > >> On Nov 29, 2017, at 10:39 AM, Sobey, Richard A wrote: >> >> Could we utilise free capacity in the existing filesystem and empty NSDs, create a new FS and AFM migrate data in stages? Terribly long winded and frought with danger and peril... do not pass go... ah, answered my own question. >> >> ? >> >> Richard >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard >> Sent: 29 November 2017 16:35 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Online data migration tool >> >> On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: >>> Hi, >>> >>> I am trying to understand the technical challenges to migrate to GPFS >>> 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to >>> see 5.0 release and hear about some promising features available. But >>> not sure about complexity involved to migrate. >>> >> >> Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more) then reformat your files system with the new disk format then restore all your data to your shiny new file system. >> >> Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Tel: +44141-5483420 >> HPC System Administrator, ARCHIE-WeSt. >> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From ulmer at ulmer.org Wed Nov 29 19:21:00 2017 From: ulmer at ulmer.org (Stephen Ulmer) Date: Wed, 29 Nov 2017 14:21:00 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: Thank you. -- Stephen > On Nov 29, 2017, at 2:08 PM, Nikhil Khandelwal > wrote: > > Hi, > > I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. > > That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems. 
> > Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline. > > I hope that clarifies things a little and makes the upgrade path more accessible. > > Please let me know if there are any other questions or concerns. > > Thank you, > Nikhil Khandelwal > Spectrum Scale Development > Client Adoption > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.knister at gmail.com Wed Nov 29 22:41:48 2017 From: aaron.knister at gmail.com (Aaron Knister) Date: Wed, 29 Nov 2017 17:41:48 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: Thanks, Nikhil. Most of that was consistent with my understnading, however I was under the impression that the >32 subblocks code is required to achieve the touted 50k file creates/second that Sven has talked about a bunch of times: http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf http://files.gpfsug.org/presentations/2017/Ehningen/31_-_SSUG17DE_-_Sven_Oehme_-_News_from_Research.pdf http://files.gpfsug.org/presentations/2016/SC16/12_-_Sven_Oehme_Dean_Hildebrand_-_News_from_IBM_Research.pdf from those presentations regarding 32 subblocks: "It has a significant performance penalty for small files in large block size filesystems" although I'm not clear on the specific definition of "large". Many filesystems I encounter only have a 1M block size so it may not matter there, although that same presentation clearly shows the benefit of larger block sizes which is yet *another* thing for which a migration tool would be helpful. -Aaron On Wed, Nov 29, 2017 at 2:08 PM, Nikhil Khandelwal wrote: > Hi, > > I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For > all Spectrum Scale clusters that are currently at 4.X.X, it is possible to > migrate to 5.0.0 with no offline data migration and no need to move data. > Once these clusters are at 5.0.0, they will benefit from the performance > improvements, new features (such as file audit logging), and various > enhancements that are included in 5.0.0. > > That being said, there is one enhancement that will not be applied to > these clusters, and that is the increased number of sub-blocks per block > for small file allocation. This means that for file systems with a large > block size and a lot of small files, the overall space utilization will be > the same it currently is in 4.X.X. Since file systems created at 4.X.X and > earlier used a block size that kept this allocation in mind, there should > be very little impact on existing file systems. > > Outside of that one particular function, the remainder of the performance > improvements, metadata improvements, updated compatibility, new > functionality, and all of the other enhancements will be immediately > available to you once you complete the upgrade to 5.0.0 -- with no need to > reformat, move data, or take your data offline. > > I hope that clarifies things a little and makes the upgrade path more > accessible. > > Please let me know if there are any other questions or concerns. 
> > Thank you, > Nikhil Khandelwal > Spectrum Scale Development > Client Adoption > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nikhilk at us.ibm.com Thu Nov 30 00:00:23 2017 From: nikhilk at us.ibm.com (Nikhil Khandelwal) Date: Wed, 29 Nov 2017 17:00:23 -0700 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: Hi Aaron, By large block size we are primarily talking about block sizes 4 MB and greater. You are correct, in my previous message I neglected to mention the file create performance for small files on these larger block sizes due to the subblock change. In addition to the added space efficiency, small file creation (for example 32kB files) on large block size filesystems will improve. In the case of a 1 MB block size, there would be no real difference in file creates. For a 16 MB block size, however there will be a performance improvement for small file creation as a part of the subblock change for new filesystems. For users who are upgrading from 4.X.X to 5.0.0, the file creation speed will remain the same after the upgrade. I hope that helps, sorry for the confusion. Thank you, Nikhil Khandelwal Spectrum Scale Development Client Adoption From: Aaron Knister To: gpfsug main discussion list Date: 11/29/2017 03:42 PM Subject: Re: [gpfsug-discuss] Online data migration tool Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks, Nikhil. Most of that was consistent with my understnading, however I was under the impression that the >32 subblocks code is required to achieve the touted 50k file creates/second that Sven has talked about a bunch of times: http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf http://files.gpfsug.org/presentations/2017/Ehningen/31_-_SSUG17DE_-_Sven_Oehme_-_News_from_Research.pdf http://files.gpfsug.org/presentations/2016/SC16/12_-_Sven_Oehme_Dean_Hildebrand_-_News_from_IBM_Research.pdf from those presentations regarding 32 subblocks: "It has a significant performance penalty for small files in large block size filesystems" although I'm not clear on the specific definition of "large". Many filesystems I encounter only have a 1M block size so it may not matter there, although that same presentation clearly shows the benefit of larger block sizes which is yet *another* thing for which a migration tool would be helpful. -Aaron On Wed, Nov 29, 2017 at 2:08 PM, Nikhil Khandelwal wrote: Hi, I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems. 
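To put rough numbers on that, here is the arithmetic behind the "large block
size" caveat. The 1/32 rule for the older format is long-standing behaviour;
the 8 KiB figure for the new format is an assumption used purely to illustrate
the point (check mmlsfs -f on a file system created at the 5.0.0 format):

  4.X.X format:  subblock = blocksize / 32
     16 MiB block -> 512 KiB subblock -> a 32 KiB file occupies 512 KiB (~6% efficient)
      1 MiB block ->  32 KiB subblock -> a 32 KiB file occupies  32 KiB
  5.0.0 format:  more, smaller subblocks per block at large block sizes
     16 MiB block -> e.g. 8 KiB subblock (assumed) -> a 32 KiB file occupies 32 KiB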
Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline. I hope that clarifies things a little and makes the upgrade path more accessible. Please let me know if there are any other questions or concerns. Thank you, Nikhil Khandelwal Spectrum Scale Development Client Adoption _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WUJ15T9xHCCIfLm1wqC74jhfu28fXGLotYoHQvJlMCg&m=GNrHjCLvQL1u_WHVimX2lAlYOGPzciCFrYHGlae3h_E&s=VtVgCRl7kxNRgcl5QeHdZJ0Rz6jCA-jfQXyLztbr5TY&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From abeattie at au1.ibm.com Thu Nov 30 01:55:54 2017 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Thu, 30 Nov 2017 01:55:54 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: , <1511794616.18554.121.camel@strath.ac.uk><1511973304.18554.133.camel@strath.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From aaron.knister at gmail.com Thu Nov 30 15:35:32 2017 From: aaron.knister at gmail.com (Aaron Knister) Date: Thu, 30 Nov 2017 10:35:32 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: Oh? I specifically remember Sven talking about the >32 subblocks on the context of file creation speed in addition to space efficiency. If what you?re saying is true, then why do those charts show that feature in the context of file creation performance and specifically mention it as a performance bottleneck? Are the slides incorrect or am I just reading them wrong? Sent from my iPhone > On Nov 30, 2017, at 10:05, Lyle Gayne wrote: > > Aaron, > that is a misunderstanding. The new feature for larger numbers of sub-blocks (varying by block size) has nothing to do with the 50K creates per second or many other performance patterns in GPFS. > > The improved create (and other metadata ops) rates came from identifying and mitigating various locking bottlenecks and optimizing the code paths specifically involved in those ops. > > Thanks > Lyle > > > Aaron Knister ---11/29/2017 05:42:26 PM---Thanks, Nikhil. Most of that was consistent with my understnading, however I was under the impressio > > From: Aaron Knister > To: gpfsug main discussion list > Date: 11/29/2017 05:42 PM > Subject: Re: [gpfsug-discuss] Online data migration tool > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > Thanks, Nikhil. 
Most of that was consistent with my understnading, however I was under the impression that the >32 subblocks code is required to achieve the touted 50k file creates/second that Sven has talked about a bunch of times: > > http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf > http://files.gpfsug.org/presentations/2017/Ehningen/31_-_SSUG17DE_-_Sven_Oehme_-_News_from_Research.pdf > http://files.gpfsug.org/presentations/2016/SC16/12_-_Sven_Oehme_Dean_Hildebrand_-_News_from_IBM_Research.pdf > > from those presentations regarding 32 subblocks: > > "It has a significant performance penalty for small files in large block size filesystems" > > although I'm not clear on the specific definition of "large". Many filesystems I encounter only have a 1M block size so it may not matter there, although that same presentation clearly shows the benefit of larger block sizes which is yet *another* thing for which a migration tool would be helpful. > > -Aaron > > > On Wed, Nov 29, 2017 at 2:08 PM, Nikhil Khandelwal wrote: > Hi, > > I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. > > That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems. > > Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline. > > I hope that clarifies things a little and makes the upgrade path more accessible. > > Please let me know if there are any other questions or concerns. > > Thank you, > Nikhil Khandelwal > Spectrum Scale Development > Client Adoption > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=irBRNHjLNBazoPW27vuMTJGyZjdo_8yqZZNkY7RRh5I&s=8nZVi2Wp8LPbXo0Pg6ItJv6GEOk5jINHR05MY_H7a4w&e= > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jonathan.buzzard at strath.ac.uk Thu Nov 30 16:13:30 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 30 Nov 2017 16:13:30 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: <1512058410.18554.151.camel@strath.ac.uk> On Wed, 2017-11-29 at 12:08 -0700, Nikhil Khandelwal wrote: [SNIP] > Since file systems created at 4.X.X and earlier used a block size > that kept this allocation in mind, there should be very little impact > on existing file systems. That is quite a presumption. I would say that file systems created at 4.X.X and earlier potentially used a block size that was the best *compromise*, and the new options would work a lot better. So for example supporting a larger block size for users who have sane workflows while still not wasting a ton of space for the biomedical folks who abuse the file system as a database. Though I have come to the conclusion to stop them using the file system as a database (no don't do ls in that directory there is 200,000 files and takes minutes to come back) is to put your BOFH hat on quota them on maximum file numbers and suggest to them that they use a database even if it is just sticking it all in SQLite :-D JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From valdis.kletnieks at vt.edu Thu Nov 30 16:27:39 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Thu, 30 Nov 2017 11:27:39 -0500 Subject: [gpfsug-discuss] mmauth/mmremotecluster wonkyness? Message-ID: <20014.1512059259@turing-police.cc.vt.edu> We have a 10-node cluster running gpfs 4.2.2.3, where 8 nodes are GPFS contact nodes for 2 filesystems, and 2 are protocol nodes doingNFS exports of the filesystems. But we see some nodes in remote clusters trying to GPFS connect to the 2 protocol nodes anyhow. My reading of the manpages is that the remote cluster is responsible for setting '-n contactNodes' when they do the 'mmremotecluster add', and there's no way to sanity check or enforce that at the local end, and fail/flag connections to unintended non-contact nodes if the remote admin forgets/botches the -n. Is that actually correct? If so, is it time for an RFE? -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From S.J.Thompson at bham.ac.uk Thu Nov 30 16:31:48 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 30 Nov 2017 16:31:48 +0000 Subject: [gpfsug-discuss] mmauth/mmremotecluster wonkyness? In-Reply-To: <20014.1512059259@turing-police.cc.vt.edu> References: <20014.1512059259@turing-police.cc.vt.edu> Message-ID: Um no, you are talking GPFS protocol between cluster nodes still in multicluster. Contact nodes are where the remote cluster goes to start with, but after that it's just normal node to node gpfs traffic (not just the contact nodes). At least that is my understanding. If you want traffic separation, you need something like AFM. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of valdis.kletnieks at vt.edu [valdis.kletnieks at vt.edu] Sent: 30 November 2017 16:27 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] mmauth/mmremotecluster wonkyness? 
We have a 10-node cluster running gpfs 4.2.2.3, where 8 nodes are GPFS contact nodes for 2 filesystems, and 2 are protocol nodes doingNFS exports of the filesystems. But we see some nodes in remote clusters trying to GPFS connect to the 2 protocol nodes anyhow. My reading of the manpages is that the remote cluster is responsible for setting '-n contactNodes' when they do the 'mmremotecluster add', and there's no way to sanity check or enforce that at the local end, and fail/flag connections to unintended non-contact nodes if the remote admin forgets/botches the -n. Is that actually correct? If so, is it time for an RFE? From aaron.s.knister at nasa.gov Thu Nov 30 16:35:04 2017 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Thu, 30 Nov 2017 16:35:04 +0000 Subject: [gpfsug-discuss] mmauth/mmremotecluster wonkyness? In-Reply-To: <20014.1512059259@turing-police.cc.vt.edu> References: <20014.1512059259@turing-police.cc.vt.edu> Message-ID: It?s my understanding and experience that all member nodes of two clusters that are multi-clustered must be able to (and will eventually given enough time/activity) make connections to any and all nodes in both clusters. Even if you don?t designate the 2 protocol nodes as contact nodes I would expect to see connections from remote clusters to the protocol nodes just because of the nature of the beast. If you don?t want remote nodes to make connections to the protocol nodes then I believe you would need to put the protocol nodes in their own cluster. CES/CNFS hasn?t always supported this but I think it is now supported, at least with NFS. On November 30, 2017 at 11:28:03 EST, valdis.kletnieks at vt.edu wrote: We have a 10-node cluster running gpfs 4.2.2.3, where 8 nodes are GPFS contact nodes for 2 filesystems, and 2 are protocol nodes doingNFS exports of the filesystems. But we see some nodes in remote clusters trying to GPFS connect to the 2 protocol nodes anyhow. My reading of the manpages is that the remote cluster is responsible for setting '-n contactNodes' when they do the 'mmremotecluster add', and there's no way to sanity check or enforce that at the local end, and fail/flag connections to unintended non-contact nodes if the remote admin forgets/botches the -n. Is that actually correct? If so, is it time for an RFE? _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From nikhilk at us.ibm.com Thu Nov 30 17:00:08 2017 From: nikhilk at us.ibm.com (Nikhil Khandelwal) Date: Thu, 30 Nov 2017 10:00:08 -0700 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: That is fair, there certainly are compromises that have to be made with regards to file space/size/performance when choosing a block size, especially with varied workloads or users who may create 200,000 files at a time :). With an increased the number of subblocks, the compromises and parameters going into this choice change. However, I just didn't want to lose sight of the fact that the remainder of the 5.0.0 features and enhancements (and there are a lot :-) ) are available to all systems, with no need to go through painful data movement or recreating of filesystems. 
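Back on the mmauth/mmremotecluster question above: the contact-node list is
indeed something only the accessing (remote) cluster sets, along these lines
(cluster, node and key-file names are placeholders):

  # run on the accessing cluster; only these nodes are used for the initial
  # contact, but as noted above GPFS traffic can subsequently flow between
  # any pair of nodes in the two clusters
  mmremotecluster add storage.example.com \
      -n nsd01.example.com,nsd02.example.com,nsd03.example.com \
      -k /var/mmfs/ssl/storage.example.com.pub
  mmremotecluster show all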
Thanks, Nikhil Khandelwal Spectrum Scale Development Client Adoption From: Jonathan Buzzard To: gpfsug main discussion list Date: 11/30/2017 09:13 AM Subject: Re: [gpfsug-discuss] Online data migration tool Sent by: gpfsug-discuss-bounces at spectrumscale.org On Wed, 2017-11-29 at 12:08 -0700, Nikhil Khandelwal wrote: [SNIP] > Since file systems created at 4.X.X and earlier used a block size > that kept this allocation in mind, there should be very little impact > on existing file systems. That is quite a presumption. I would say that file systems created at 4.X.X and earlier potentially used a block size that was the best *compromise*, and the new options would work a lot better. So for example supporting a larger block size for users who have sane workflows while still not wasting a ton of space for the biomedical folks who abuse the file system as a database. Though I have come to the conclusion to stop them using the file system as a database (no don't do ls in that directory there is 200,000 files and takes minutes to come back) is to put your BOFH hat on quota them on maximum file numbers and suggest to them that they use a database even if it is just sticking it all in SQLite :-D JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WUJ15T9xHCCIfLm1wqC74jhfu28fXGLotYoHQvJlMCg&m=RrwCj4KWyu_ykACVG1SYu8EJiDZnH6edu-2rnoalOg4&s=p7xlojuTYL5csXYA94NyL-R5hk7OgLH0qKGTN0peGFk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From skylar2 at u.washington.edu Thu Nov 30 18:01:48 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Thu, 30 Nov 2017 18:01:48 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1512058410.18554.151.camel@strath.ac.uk> References: <1512058410.18554.151.camel@strath.ac.uk> Message-ID: <20171130180148.jlarxyjgoc4mvre3@utumno.gs.washington.edu> On Thu, Nov 30, 2017 at 04:13:30PM +0000, Jonathan Buzzard wrote: > On Wed, 2017-11-29 at 12:08 -0700, Nikhil Khandelwal wrote: > > [SNIP] > > > Since file systems created at 4.X.X and earlier used a block size > > that kept this allocation in mind, there should be very little impact > > on existing file systems. > > That is quite a presumption. I would say that file systems created at > 4.X.X and earlier potentially used a block size that was the best > *compromise*, and the new options would work a lot better. > > So for example supporting a larger block size for users who have sane > workflows while still not wasting a ton of space for the biomedical > folks who abuse the file system as a database. 
> > Though I have come to the conclusion to stop them using the file system > as a database (no don't do ls in that directory there is 200,000 files > and takes minutes to come back) is to put your BOFH hat on quota them > on maximum file numbers and suggest to them that they use a database > even if it is just sticking it all in SQLite :-D To be fair, a lot of our biomedical/informatics folks have no choice in the matter because the vendors are imposing a filesystem-as-a-database paradigm on them. Each of our Illumina sequencers, for instance, generates a few million files per run, many of which are images containing raw data from the sequencers that are used to justify refunds for defective reagents. Sure, we could turn them off, but then we're eating $$$ we could be getting back from the vendor. At least SSD prices have come down far enough that we can put our metadata on fast disks now, even if we can't take advantage of the more efficient small file allocation yet. -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From makaplan at us.ibm.com Thu Nov 30 18:34:05 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 30 Nov 2017 13:34:05 -0500 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: It would be interesting to know how well Spectrum Scale large directory and small file features work in these sort of DB-ish applications. You might want to optimize by creating a file system provisioned and tuned for such application... Regardless of file system, `ls -1 | grep ...` in a huge directory is not going to be a good idea. But stats and/or opens on a huge directory to look for a particular file should work pretty well... -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at u.washington.edu Thu Nov 30 18:41:52 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Thu, 30 Nov 2017 18:41:52 +0000 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: <20171130184152.ivvduyzjlp7etys2@utumno.gs.washington.edu> On Thu, Nov 30, 2017 at 01:34:05PM -0500, Marc A Kaplan wrote: > It would be interesting to know how well Spectrum Scale large directory > and small file features work in these sort of DB-ish applications. > > You might want to optimize by creating a file system provisioned and tuned > for such application... > > Regardless of file system, `ls -1 | grep ...` in a huge directory is not > going to be a good idea. But stats and/or opens on a huge directory to > look for a particular file should work pretty well... I've wondered if it would be worthwhile having POSIX look-alike commands like ls and find that plug into the GPFS API rather than making VFS calls. That's of course a project for my Copious Free Time... -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From makaplan at us.ibm.com Thu Nov 30 20:52:09 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 30 Nov 2017 15:52:09 -0500 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: Generally the GPFS API will give you access to some information and functionality that are not available via the Posix API. 
But I don't think you'll find significant performance difference in cases where there is functional overlap. Going either way (Posix or GPFS-specific) - for each API call the execution path drops into the kernel - and then if required - an inter-process call to the mmfsd daemon process. From: Skylar Thompson To: gpfsug-discuss at spectrumscale.org Date: 11/30/2017 01:42 PM Subject: Re: [gpfsug-discuss] FIle system vs Database Sent by: gpfsug-discuss-bounces at spectrumscale.org On Thu, Nov 30, 2017 at 01:34:05PM -0500, Marc A Kaplan wrote: > It would be interesting to know how well Spectrum Scale large directory > and small file features work in these sort of DB-ish applications. > > You might want to optimize by creating a file system provisioned and tuned > for such application... > > Regardless of file system, `ls -1 | grep ...` in a huge directory is not > going to be a good idea. But stats and/or opens on a huge directory to > look for a particular file should work pretty well... I've wondered if it would be worthwhile having POSIX look-alike commands like ls and find that plug into the GPFS API rather than making VFS calls. That's of course a project for my Copious Free Time... -- -- Skylar Thompson (skylar2 at u.washington.edu) -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at u.washington.edu Thu Nov 30 21:42:21 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Thu, 30 Nov 2017 21:42:21 +0000 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: <20171130214220.pqtizt2q6ysu6cds@utumno.gs.washington.edu> Interesting, thanks for the information Marc. Could there be an improvement for something like "ls -l some-dir" using the API, though? Instead of getdents + stat for every file (entering and leaving kernel mode many times), could it be done in one operation with one context switch? On Thu, Nov 30, 2017 at 03:52:09PM -0500, Marc A Kaplan wrote: > Generally the GPFS API will give you access to some information and > functionality that are not available via the Posix API. > > But I don't think you'll find significant performance difference in cases > where there is functional overlap. > > Going either way (Posix or GPFS-specific) - for each API call the > execution path drops into the kernel - and then if required - an > inter-process call to the mmfsd daemon process. -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From jonathan.buzzard at strath.ac.uk Thu Nov 30 22:02:35 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 30 Nov 2017 22:02:35 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <20171130180148.jlarxyjgoc4mvre3@utumno.gs.washington.edu> References: <1512058410.18554.151.camel@strath.ac.uk> <20171130180148.jlarxyjgoc4mvre3@utumno.gs.washington.edu> Message-ID: <17e108bf-67af-78af-3e2d-e4a4b99c178d@strath.ac.uk> On 30/11/17 18:01, Skylar Thompson wrote: [SNIP] > To be fair, a lot of our biomedical/informatics folks have no choice in the > matter because the vendors are imposing a filesystem-as-a-database paradigm > on them. Each of our Illumina sequencers, for instance, generates a few > million files per run, many of which are images containing raw data from > the sequencers that are used to justify refunds for defective reagents. 
> Sure, we could turn them off, but then we're eating $$$ we could be getting > back from the vendor. > Been there too. What worked was having a find script that ran through their files, found directories that had not been accessed for a week and zipped them all up, before nuking the original files. The other thing I would suggest is if they want to buy sequencers from vendors who are brain dead, then that's fine but they are going to have to pay extra for the storage because they are costing way more than the average to store their files. Far to much buying of kit goes on without any thought of the consequences of how to deal with the data it generates. Then there where the proteomics bunch who basically just needed a good thrashing with a very large clue stick, because the zillions of files where the result of their own Perl scripts. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From Matthias.Knigge at rohde-schwarz.com Wed Nov 1 10:55:31 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 1 Nov 2017 11:55:31 +0100 Subject: [gpfsug-discuss] Combine different rules Message-ID: Hi at all, I configured a tiered storage with two pools. pool1 >> fast >> ssd pool2 >> slow >> sata First I created a fileset and a placement rule to copy the files to the fast storage. After a time of no access the files and folders should be moved to the slower storage. This could be done by a migration rule. I want to move the whole project folder to the slower storage. If a file in a project folder on the slower storage will be accessed this whole folder should be moved back to the faster storage. The rules must not run automatically. It is ok when this could be done by a cronjob over night. I am a beginner in writing rules. My idea is to write rules which listed files by date and by access and put the output into a file. After that a bash script can change the attributes of these files or rather folders. This could be done by the mmchattr command. If it is possible the mmapplypolicy command could be useful. Someone experiences in those cases? Many thanks in advance! Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 1 12:17:45 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 01 Nov 2017 12:17:45 +0000 Subject: [gpfsug-discuss] Combine different rules In-Reply-To: References: Message-ID: <1509538665.18554.1.camel@strath.ac.uk> On Wed, 2017-11-01 at 11:55 +0100, Matthias.Knigge at rohde-schwarz.com wrote: > Hi at all,? > > I configured a tiered storage with two pools.? > > pool1 ? ? ? ?>> ? ? ? ?fast ? ? ? ?>> ? ? ? ?ssd? > pool2 ? ? ? ?>> ? ? ? ?slow ? ? ? ?>> ? ? ? ?sata? > > First I created a fileset and a placement rule to copy the files to > the fast storage.? > > After a time of no access the files and folders should be moved to > the slower storage. This could be done by a migration rule. I want to > move the whole project folder to the slower storage.? Why move the whole project? Just wait if the files are not been accessed they will get moved in short order. You are really making it more complicated for no useful or practical gain. This is a basic policy to move old stuff from fast to slow disks. 
define(age,(DAYS(CURRENT_TIMESTAMP)-DAYS(ACCESS_TIME))) define(weighting, CASE ????????WHEN age>365 ????????????THEN age*KB_ALLOCATED ????????WHEN age<30 ????????????THEN 0 ????????ELSE ????????????KB_ALLOCATED ???????END ) RULE 'ilm' MIGRATE FROM POOL 'fast' THRESHOLD(90,70) WEIGHT(weighting) TO POOL 'slow' RULE 'new' SET POOL 'fast' LIMIT(95) RULE 'spillover' SET POOL 'slow' Basically it says when fast pool is 90% full, flush it down to 70% full, based on a weighting of the size and age. Basically older bigger files go first. The last two are critical. Allocate new files to the fast pool till it gets 95% full then start using the slow pool. Basically you have to stop allocating files to the fast pool long before it gets full otherwise you will end up with problems. Basically imagine there is 100KB left in the fast pool. I create a file which succeeds because there is space and start writing. When I get to 100KB the write fails because there is no space left in the pool, and a file can only be in one pool at a time. Generally programs will cleanup deleting the failed write at which point there will be space left and so the cycle goes on. You might want to force some file types onto slower disk. For example ISO images?don't really benefit from ever being on the fast disk. /* force ISO images onto nearline storage */ RULE 'iso' SET POOL 'slow' WHERE LOWER(NAME) LIKE '%.iso' You also might want to punish people storing inappropriate files on your server so /* force MP3's and the like onto nearline storage forever */ RULE 'mp3' SET POOL 'slow' ????WHERE LOWER(NAME) LIKE '%.mp3' OR LOWER(NAME) LIKE '%.m4a' OR LOWER(NAME) LIKE '%.wma' Another rule I used was to migrate files over to a certain size to the slow pool too. > > If a file in a project folder on the slower storage will be accessed > this whole folder should be moved back to the faster storage.? > Waste of time. In my experience the slow disks when not actually taking new files from a flush of the fast pools will be doing jack all. That is under 10 IOPS per second. That's because if you have everything sized correctly and the right rules people rarely go back to old files. As such the penalty for being on the slower disks is most none existent because there is loads of spare IO capacity on those disks. Secondly by the time you have spotted the files need moving the chances are your users have finished with them so moving them gains nothing. Thirdly if the users start working with those files any change to the file will result in a new file being written which will automatically go to the fast disks. It's the standard dance when you save a file; create new temporary file, write the contents, then do some renaming before deleting the old one. If you are insistent then something like the following would be a start, but moving a whole project would be a *lot* more complicated. I disabled the rule because it was a waste of time. I suggest running a similar rule that prints the files out so you can see how pointless it is. /* migrate recently accessed files back the fast disks */ RULE 'restore' MIGRATE FROM POOL 'slow' WEIGHT(KB_ALLOCATED) TO POOL 'fast' WHERE age < 1 Depending on the number of "projects" you anticipate you could allocate a project to a fileset and then move whole filesets about but I really think the idea is one of those that looks sensible at a high level but in practice is not sensible. > The rules must ?not run automatically. It is ok when this could be > done by a cronjob over night.? 
> I would argue strongly, very strongly that while you might want to flush the fast pool down every night to a certain amount free, you must have it set so that should it become full during the day an automatic flush is triggered. Failure to do so is guaranteed to bite you in the backside some time down the line. > I am a beginner in writing rules. My idea is to write rules which > listed files by date and by access and put the output into a file. > After that a bash script can change the attributes of these files or > rather folders.? Eh, you apply the policy and it does the work!!! More reading required on the subject I think. A bash script would be horribly slow. IBM have put a lot of work into making the policy engine really really fast. Messing about changing thousands if not millions of files with a bash script will be much much slower and is a recipe for disaster. Your users will put all sorts of random crap into file and directory names; backtick's, asterix's, question marks, newlines, UTF-8 characters etc. that will invariably break your bash script unless carefully escaped. There is no way for you to prevent this. It's the reason find/xargs have the -print0/-0 options, otherwise stuff will just mysteriously break on you. It's really better to just sidestep the whole issue and not process the files with scripts. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From david_johnson at brown.edu Wed Nov 1 12:21:05 2017 From: david_johnson at brown.edu (david_johnson at brown.edu) Date: Wed, 1 Nov 2017 08:21:05 -0400 Subject: [gpfsug-discuss] Combine different rules In-Reply-To: References: Message-ID: <3D17430A-B572-4E8E-8CA3-0C308D38AE7B@brown.edu> Filesets and storage pools are for the most part orthogonal concepts. You would sort your users and apply quotas with filesets. You would use storage pools underneath filesets and the filesystem to migrate between faster and slower media. Migration between storage pools is done well by the policy engine with mmapplypolicy. Moving between filesets is entirely up to you, but the path names will change. Migration within a filesystem using storage pools preserves path names. -- ddj Dave Johnson > On Nov 1, 2017, at 6:55 AM, Matthias.Knigge at rohde-schwarz.com wrote: > > Hi at all, > > I configured a tiered storage with two pools. > > pool1 >> fast >> ssd > pool2 >> slow >> sata > > First I created a fileset and a placement rule to copy the files to the fast storage. > > After a time of no access the files and folders should be moved to the slower storage. This could be done by a migration rule. I want to move the whole project folder to the slower storage. > > If a file in a project folder on the slower storage will be accessed this whole folder should be moved back to the faster storage. > > The rules must not run automatically. It is ok when this could be done by a cronjob over night. > > I am a beginner in writing rules. My idea is to write rules which listed files by date and by access and put the output into a file. After that a bash script can change the attributes of these files or rather folders. > > This could be done by the mmchattr command. If it is possible the mmapplypolicy command could be useful. > > Someone experiences in those cases? > > Many thanks in advance! 
> > Matthias > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From douglasof at us.ibm.com Wed Nov 1 12:36:18 2017 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Wed, 1 Nov 2017 07:36:18 -0500 Subject: [gpfsug-discuss] SC17 Spectrum Scale U/G Message-ID: Reminder: Please sign up so we have numbers for planning the happy hour. http://www.spectrumscale.org/ssug-at-sc17/ Douglas O'Flaherty IBM Spectrum Solutions -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Wed Nov 1 14:01:35 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 1 Nov 2017 15:01:35 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules In-Reply-To: <1509538665.18554.1.camel@strath.ac.uk> References: <1509538665.18554.1.camel@strath.ac.uk> Message-ID: Hi JAB, many thanks for your answer. Ok, some more background information: We are working with video realtime applications and uncompressed files. So one project is one folder and some subfolders. The size of one project could be more than 1TB. That is the reason why I want to move the whole folder tree. Moving old stuff to the slower storage is not the problem but moving the files back for working with the realtime applications. Not every file will be accessed when you open a project. The Clients get access via GPFS-Client (Windows) and over Samba. Another tool on storage side scan the files for creating playlists etc. While the migration the playout of the video files may not dropped. So I think the best way is to find a solution with mmapplypolicy manually or via crontab. Im must check the access time and the types of files. If I do not do this never a file will be moved the slower storage because the special tool always have access to the files. I will try some concepts and give feedback which solution is working for me. Matthias Von: Jonathan Buzzard An: gpfsug main discussion list Datum: 01.11.2017 13:18 Betreff: [Newsletter] Re: [gpfsug-discuss] Combine different rules Gesendet von: gpfsug-discuss-bounces at spectrumscale.org On Wed, 2017-11-01 at 11:55 +0100, Matthias.Knigge at rohde-schwarz.com wrote: > Hi at all, > > I configured a tiered storage with two pools. > > pool1 >> fast >> ssd > pool2 >> slow >> sata > > First I created a fileset and a placement rule to copy the files to > the fast storage. > > After a time of no access the files and folders should be moved to > the slower storage. This could be done by a migration rule. I want to > move the whole project folder to the slower storage. Why move the whole project? Just wait if the files are not been accessed they will get moved in short order. You are really making it more complicated for no useful or practical gain. This is a basic policy to move old stuff from fast to slow disks. define(age,(DAYS(CURRENT_TIMESTAMP)-DAYS(ACCESS_TIME))) define(weighting, CASE WHEN age>365 THEN age*KB_ALLOCATED WHEN age<30 THEN 0 ELSE KB_ALLOCATED END ) RULE 'ilm' MIGRATE FROM POOL 'fast' THRESHOLD(90,70) WEIGHT(weighting) TO POOL 'slow' RULE 'new' SET POOL 'fast' LIMIT(95) RULE 'spillover' SET POOL 'slow' Basically it says when fast pool is 90% full, flush it down to 70% full, based on a weighting of the size and age. Basically older bigger files go first. 
The last two are critical. Allocate new files to the fast pool till it gets 95% full then start using the slow pool. Basically you have to stop allocating files to the fast pool long before it gets full otherwise you will end up with problems. Basically imagine there is 100KB left in the fast pool. I create a file which succeeds because there is space and start writing. When I get to 100KB the write fails because there is no space left in the pool, and a file can only be in one pool at a time. Generally programs will cleanup deleting the failed write at which point there will be space left and so the cycle goes on. You might want to force some file types onto slower disk. For example ISO images don't really benefit from ever being on the fast disk. /* force ISO images onto nearline storage */ RULE 'iso' SET POOL 'slow' WHERE LOWER(NAME) LIKE '%.iso' You also might want to punish people storing inappropriate files on your server so /* force MP3's and the like onto nearline storage forever */ RULE 'mp3' SET POOL 'slow' WHERE LOWER(NAME) LIKE '%.mp3' OR LOWER(NAME) LIKE '%.m4a' OR LOWER(NAME) LIKE '%.wma' Another rule I used was to migrate files over to a certain size to the slow pool too. > > If a file in a project folder on the slower storage will be accessed > this whole folder should be moved back to the faster storage. > Waste of time. In my experience the slow disks when not actually taking new files from a flush of the fast pools will be doing jack all. That is under 10 IOPS per second. That's because if you have everything sized correctly and the right rules people rarely go back to old files. As such the penalty for being on the slower disks is most none existent because there is loads of spare IO capacity on those disks. Secondly by the time you have spotted the files need moving the chances are your users have finished with them so moving them gains nothing. Thirdly if the users start working with those files any change to the file will result in a new file being written which will automatically go to the fast disks. It's the standard dance when you save a file; create new temporary file, write the contents, then do some renaming before deleting the old one. If you are insistent then something like the following would be a start, but moving a whole project would be a *lot* more complicated. I disabled the rule because it was a waste of time. I suggest running a similar rule that prints the files out so you can see how pointless it is. /* migrate recently accessed files back the fast disks */ RULE 'restore' MIGRATE FROM POOL 'slow' WEIGHT(KB_ALLOCATED) TO POOL 'fast' WHERE age < 1 Depending on the number of "projects" you anticipate you could allocate a project to a fileset and then move whole filesets about but I really think the idea is one of those that looks sensible at a high level but in practice is not sensible. > The rules must not run automatically. It is ok when this could be > done by a cronjob over night. > I would argue strongly, very strongly that while you might want to flush the fast pool down every night to a certain amount free, you must have it set so that should it become full during the day an automatic flush is triggered. Failure to do so is guaranteed to bite you in the backside some time down the line. > I am a beginner in writing rules. My idea is to write rules which > listed files by date and by access and put the output into a file. > After that a bash script can change the attributes of these files or > rather folders. 
Eh, you apply the policy and it does the work!!! More reading required on the subject I think. A bash script would be horribly slow. IBM have put a lot of work into making the policy engine really really fast. Messing about changing thousands if not millions of files with a bash script will be much much slower and is a recipe for disaster. Your users will put all sorts of random crap into file and directory names; backtick's, asterix's, question marks, newlines, UTF-8 characters etc. that will invariably break your bash script unless carefully escaped. There is no way for you to prevent this. It's the reason find/xargs have the -print0/-0 options, otherwise stuff will just mysteriously break on you. It's really better to just sidestep the whole issue and not process the files with scripts. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 1 14:12:43 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 01 Nov 2017 14:12:43 +0000 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules In-Reply-To: References: <1509538665.18554.1.camel@strath.ac.uk> Message-ID: <1509545563.18554.3.camel@strath.ac.uk> On Wed, 2017-11-01 at 15:01 +0100, Matthias.Knigge at rohde-schwarz.com wrote: > Hi JAB,? > > many thanks for your answer.? > > Ok, some more background information:? > > We are working with video realtime applications and uncompressed > files. So one project is one folder and some subfolders. The size of > one project could be more than 1TB. That is the reason why I want to > move the whole folder tree.? > That is not a reason to move the whole folder tree. If the "project" is inactive then the files in it are inactive and the normal "this file has not been accessed" type rules will in due course move the whole lot over to the slower storage. > Moving old stuff to the slower storage is not the problem but moving > the files back for working with the realtime applications. Not every > file will be accessed when you open a project.? > Yeah but you don't want these sorts of policies kicking in automatically. Further if someone where just to check or update a summary document stored with the videos, the whole lot would get moved back to fast disk. By the sounds of it you are going to have to run manual mmapplypolicies to move the groups of files around. Automating what you want is going to be next to impossible. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From makaplan at us.ibm.com Wed Nov 1 14:43:27 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 1 Nov 2017 09:43:27 -0500 Subject: [gpfsug-discuss] Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME In-Reply-To: References: Message-ID: Thanks Jonathan B for your comments and tips on experience using mmapplypolicy and policy rules. Good to see that some of the features we put into the product are actually useful. 
For those not quite as familiar, and have come somewhat later to the game, like Matthias K - I have a few remarks and tips that may be helpful: You can think of and use mmapplypolicy as a fast, parallelized version of the classic `find ... | xargs ... ` pipeline. In fact we've added some "sample" scripts with options that make this easy: samples/ilm/mmfind : "understands" the classic find search arguments as well as all the mmapplypolicy options and the recent versions also support an -xargs option so you can write the classic pipepline as one command: mmfind ... -xargs ... There are debug/diagnostic options so you can see the underlying GPFS commands and policy rules that are generated, so if mmfind doesn't do exactly what you were hoping, you can capture the commands and rules that it does do and tweak/hack those. Two of the most crucial and tricky parts of mmfind are available as separate scripts that can be used separately: tr_findToPol.pl : convert classic options to policy rules. mmxargs : 100% correctly deal with the problem of whitespace and/or "special" characters in the pathnames output as file lists by mmapplypolicy. This is somewhat tricky. EVEN IF you've already worked out your own policy rules and use policy RULE ... EXTERNAL ... EXEC 'myscript' you may want to use mmxargs or "lift" some of the code there-in -- because it is very likely your 'myscript' is not handling the problem of special characters correctly. FILESETs vs POOLs - yes these are "orthogonal" concepts in GPFS (Spectrum Scale!) BUT some customer/admins may choose to direct GPFS to assign to POOL based on FILESET using policy rules clauses like: FOR FILESET('a_fs', 'b_fs') /* handy to restrict a rule to one or a few filesets */ WHERE ... AND (FILESET_NAME LIKE 'xyz_%') AND ... /* restrict to filesets whose name matches a pattern */ -- marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Nov 1 14:59:22 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 1 Nov 2017 09:59:22 -0500 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules - STAGING a fileset to a particular POOL In-Reply-To: References: <1509538665.18554.1.camel@strath.ac.uk> Message-ID: Not withstanding JAB's remark that this may not necessary: Some customers/admins will want to "stage" a fileset in anticipation of using the data therein. Conversely you can "destage" - just set the TO POOL accordingly. This can be accomplished with a policy rule like: RULE 'stage' MIGRATE FOR FILESET('myfileset') TO POOL 'mypool' /* no FROM POOL clause is required, files will come from any pool - for files already in mypool, no work is done */ And running a command like: mmapplypolicy /path-to/myfileset -P file-with-the-above-policy-rule -g /path-to/shared-temp -N nodelist-to-do-the-work ... (Specifying the path-to/myfileset on the command line will restrict the directory scan, making it go faster.) As JAB remarked, for GPFS POOL to GPFS POOL this may be overkill, but if the files have been "HSMed" migrated or archived to some really slow storage like TAPE ... they an analyst who want to explore the data interactively, might request a migration back to "real" disks (or SSDs) then go to lunch or go to bed ... --marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... 
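To make the "build a file list first, act on it afterwards" approach described above concrete, here is a minimal sketch; the rule names, the fileset name 'home.alice', the policy file cold.pol, the device fs0 and the /tmp prefix are all made up for illustration, and the clause and option spellings are from memory of the mmapplypolicy documentation rather than copied from a working system:

/* cold.pol -- select old files and write them to a list instead of migrating them */
RULE EXTERNAL LIST 'cold' EXEC ''   /* empty EXEC: the list is written to the -f prefix */
RULE 'findcold' LIST 'cold'
     FOR FILESET('home.alice')
     WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30

# dry run first to see what would be selected, then write the list under /tmp
mmapplypolicy fs0 -P cold.pol -I test -L 2
mmapplypolicy fs0 -P cold.pol -I defer -f /tmp/cold

Feeding the resulting list back through mmxargs (or an EXEC script of your own) sidesteps the shell-quoting problems with odd characters in path names that were raised earlier in the thread.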
URL: From griznog at gmail.com Wed Nov 1 22:54:04 2017 From: griznog at gmail.com (John Hanks) Date: Wed, 1 Nov 2017 15:54:04 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" Message-ID: Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? Thanks, jbh -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Thu Nov 2 07:11:58 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Thu, 02 Nov 2017 03:11:58 -0400 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: <44655.1509606718@turing-police.cc.vt.edu> On Wed, 01 Nov 2017 15:54:04 -0700, John Hanks said: > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device Check 'df -i' to make sure no file systems are out of inodes. That's From YARD at il.ibm.com Thu Nov 2 07:28:06 2017 From: YARD at il.ibm.com (Yaron Daniel) Date: Thu, 2 Nov 2017 09:28:06 +0200 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: Hi Please check mmdf output to see that MetaData disks are not full, or you have i-nodes issue. In case you have Independent File-Sets , please run : mmlsfileset -L -i to get the status of each fileset inodes. Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services - Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: John Hanks To: gpfsug Date: 11/02/2017 12:54 AM Subject: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. 
The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? Thanks, jbh_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=Bn1XE9uK2a9CZQ8qKnJE3Q&m=WTfQpWOsmp-BdHZ0PWDbaYsxq-5Q1ZH26IyfrBRe3_c&s=SJg8NrUXWEpaxDhqECkwkbJ71jtxjLZz5jX7FxmYMBk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4376 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: From Matthias.Knigge at rohde-schwarz.com Thu Nov 2 09:07:48 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Thu, 2 Nov 2017 10:07:48 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME In-Reply-To: References: Message-ID: Thanks for this tip. I will try these commands and give feedback in the next week. Matthias Von: "Marc A Kaplan" An: gpfsug main discussion list Datum: 01.11.2017 15:43 Betreff: [Newsletter] Re: [gpfsug-discuss] Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Thanks Jonathan B for your comments and tips on experience using mmapplypolicy and policy rules. Good to see that some of the features we put into the product are actually useful. For those not quite as familiar, and have come somewhat later to the game, like Matthias K - I have a few remarks and tips that may be helpful: You can think of and use mmapplypolicy as a fast, parallelized version of the classic `find ... | xargs ... ` pipeline. In fact we've added some "sample" scripts with options that make this easy: samples/ilm/mmfind : "understands" the classic find search arguments as well as all the mmapplypolicy options and the recent versions also support an -xargs option so you can write the classic pipepline as one command: mmfind ... -xargs ... There are debug/diagnostic options so you can see the underlying GPFS commands and policy rules that are generated, so if mmfind doesn't do exactly what you were hoping, you can capture the commands and rules that it does do and tweak/hack those. Two of the most crucial and tricky parts of mmfind are available as separate scripts that can be used separately: tr_findToPol.pl : convert classic options to policy rules. 
mmxargs : 100% correctly deal with the problem of whitespace and/or "special" characters in the pathnames output as file lists by mmapplypolicy. This is somewhat tricky. EVEN IF you've already worked out your own policy rules and use policy RULE ... EXTERNAL ... EXEC 'myscript' you may want to use mmxargs or "lift" some of the code there-in -- because it is very likely your 'myscript' is not handling the problem of special characters correctly. FILESETs vs POOLs - yes these are "orthogonal" concepts in GPFS (Spectrum Scale!) BUT some customer/admins may choose to direct GPFS to assign to POOL based on FILESET using policy rules clauses like: FOR FILESET('a_fs', 'b_fs') /* handy to restrict a rule to one or a few filesets */ WHERE ... AND (FILESET_NAME LIKE 'xyz_%') AND ... /* restrict to filesets whose name matches a pattern */ -- marc of GPFS_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Nov 2 11:19:05 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 2 Nov 2017 11:19:05 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" Message-ID: One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hanks Reply-To: gpfsug main discussion list Date: Wednesday, November 1, 2017 at 5:55 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? Thanks, jbh -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 14:43:31 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 07:43:31 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: Thanks all for the suggestions. 
Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. We did have old .quota files laying around but removing them didn't have any impact. mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). Once I'm settled in this morning I'll try giving them a little extra space and see what happens. jbh On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < Robert.Oesterlin at nuance.com> wrote: > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: * on behalf of John > Hanks > *Reply-To: *gpfsug main discussion list > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_johnson at brown.edu Thu Nov 2 14:57:45 2017 From: david_johnson at brown.edu (David Johnson) Date: Thu, 2 Nov 2017 10:57:45 -0400 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> One thing that may be relevant is if you have snapshots, depending on your release level, inodes in the snapshot may considered immutable, and will not be migrated. 
Once the snapshots have been deleted, the inodes are freed up and you won?t see the (somewhat misleading) message about no space. ? ddj Dave Johnson Brown University > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). Once I'm settled in this morning I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert > wrote: > One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > From: > on behalf of John Hanks > > Reply-To: gpfsug main discussion list > > Date: Wednesday, November 1, 2017 at 5:55 PM > To: gpfsug > > Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" > > > > Hi all, <> > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
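Pulling the suggestions from this thread together, the quick checks before digging deeper are roughly the following (gsfs0 as in the posts above; the mount point in the df line is a placeholder, and option spellings are from memory of the man pages):

  mmlssnapshot gsfs0        # any snapshots left that still pin old inodes?
  mmdf gsfs0                # free space and free inodes per disk and per pool
  mmlsfileset gsfs0 -L -i   # per-fileset inode usage and limits (can take a while)
  mmrepquota -j gsfs0       # fileset quotas that may have been hit
  df -i /gsfs0              # overall inode usage of the mounted file system

Any of these being exhausted is the sort of condition the posts above suggest can come back as the rather unhelpful "No space left on device" during the restripe.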
URL: From griznog at gmail.com Thu Nov 2 15:33:11 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 08:33:11 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: We have no snapshots ( they were the first to go when we initially hit the full metadata NSDs). I've increased quotas so that no filesets have hit a space quota. Verified that there are no inode quotas anywhere. mmdf shows the least amount of free space on any nsd to be 9% free. Still getting this error: [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 Scanning file system metadata, phase 1 ... Scan completed successfully. Scanning file system metadata, phase 2 ... Scanning file system metadata for sas0 storage pool Scanning file system metadata for sata0 storage pool Scan completed successfully. Scanning file system metadata, phase 3 ... Scan completed successfully. Scanning file system metadata, phase 4 ... Scan completed successfully. Scanning user file metadata ... Error processing user file metadata. No space left on device Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. I should note too that this fails almost immediately, far to quickly to fill up any location it could be trying to write to. jbh On Thu, Nov 2, 2017 at 7:57 AM, David Johnson wrote: > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > >> One thing that I?ve run into before is that on older file systems you had >> the ?*.quota? files in the file system root. If you upgraded the file >> system to a newer version (so these files aren?t used) - There was a bug at >> one time where these didn?t get properly migrated during a restripe. 
>> Solution was to just remove them >> >> >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> *From: * on behalf of John >> Hanks >> *Reply-To: *gpfsug main discussion list > > >> *Date: *Wednesday, November 1, 2017 at 5:55 PM >> *To: *gpfsug >> *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on >> device" >> >> >> >> Hi all, >> >> >> >> I'm trying to do a restripe after setting some nsds to metadataOnly and I >> keep running into this error: >> >> >> >> Scanning user file metadata ... >> >> 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with >> total 531689 MB data processed) >> >> Error processing user file metadata. >> >> Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on >> scg-gs0 for inodes with broken disk addresses or failures. >> >> mmrestripefs: Command failed. Examine previous error messages to >> determine cause. >> >> >> >> The file it points to says: >> >> >> >> This inode list was generated in the Parallel Inode Traverse on Wed Nov >> 1 15:36:06 2017 >> >> INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID >> MEMO(INODE_FLAGS FILE_TYPE [ERROR]) >> >> 53504 0:0 0 1 0 >> illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device >> >> >> >> >> >> /var on the node I am running this on has > 128 GB free, all the NSDs >> have plenty of free space, the filesystem being restriped has plenty of >> free space and if I watch the node while running this no filesystem on it >> even starts to get full. Could someone tell me where mmrestripefs is >> attempting to write and/or how to point it at a different location? >> >> >> >> Thanks, >> >> >> >> jbh >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfadden at us.ibm.com Thu Nov 2 15:44:08 2017 From: sfadden at us.ibm.com (Scott Fadden) Date: Thu, 2 Nov 2017 15:44:08 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: , <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: An HTML attachment was scrubbed... URL: From sfadden at us.ibm.com Thu Nov 2 15:55:12 2017 From: sfadden at us.ibm.com (Scott Fadden) Date: Thu, 2 Nov 2017 15:55:12 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: , , <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 16:13:16 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 09:13:16 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Hmm, this sounds suspicious. We have 10 NSDs in a pool called system. These were previously set to data+metaData with a policy that placed our home directory filesets on this pool. A few weeks ago the NSDs in this pool all filled up. To remedy that I 1. removed old snapshots 2. deleted some old homedir filesets 3. set the NSDs in this pool to metadataOnly 4. changed the policy to point homedir filesets to another pool. 5. 
ran a migrate policy to migrate all homedir filesets to this other pool After all that I now have ~30% free space on the metadata pool. Our three pools are system (metadataOnly), sas0 (data), sata0 (data) mmrestripefs gsfs0 -r fails immdieately mmrestripefs gsfs0 -r -P system fails immediately mmrestripefs gsfs0 -r -P sas0 fails immediately mmrestripefs gsfs0 -r -P sata0 is running (currently about 3% done) Is the change from data+metadata to metadataOnly the same as removing a disk (for the purposes of this problem) or is it possible my policy is confusing things? [root at scg-gs0 ~]# mmlspolicy gsfs0 Policy for file system '/dev/gsfs0': Installed by root at scg-gs0 on Wed Nov 1 09:30:40 2017. First line of policy 'policy_placement.txt' is: RULE 'homedirs' SET POOL 'sas0' WHERE FILESET_NAME LIKE 'home.%' The policy I used to migrate these filesets is: RULE 'homedirs' MIGRATE TO POOL 'sas0' WHERE FILESET_NAME LIKE 'home.%' jbh On Thu, Nov 2, 2017 at 8:44 AM, Scott Fadden wrote: > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > ----- Original message ----- > From: John Hanks > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... > Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... > Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson > wrote: > > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > Thanks all for the suggestions. 
> > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: * on behalf of John > Hanks > *Reply-To: *gpfsug main discussion list > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. 
> org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r= > WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m= > hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s= > j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 16:19:55 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 09:19:55 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Addendum to last message: We haven't upgraded recently as far as I know (I just inherited this a couple of months ago.) but am planning an outage soon to upgrade from 4.2.0-4 to 4.2.3-5. My growing collection of output files generally contain something like This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 08:34:22 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53506 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device With that inode varying slightly. jbh On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden wrote: > Sorry just reread as I hit send and saw this was mmrestripe, in my case it > was mmdeledisk. > > Did you try running the command on just one pool. Or using -B instead? > > What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" > ? > > Looks like it could be related to the maxfeaturelevel of the cluster. Have > you recently upgraded? Is everything up to the same level? > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > ----- Original message ----- > From: Scott Fadden/Portland/IBM > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:44 AM > > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > ----- Original message ----- > From: John Hanks > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... 
> Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... > Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson > wrote: > > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: * on behalf of John > Hanks > *Reply-To: *gpfsug main discussion list > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. 
> > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r= > WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m= > hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s= > j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfadden at us.ibm.com Thu Nov 2 16:41:36 2017 From: sfadden at us.ibm.com (Scott Fadden) Date: Thu, 2 Nov 2017 16:41:36 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: , <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Thu Nov 2 16:45:30 2017 From: stockf at us.ibm.com (Frederick Stock) Date: Thu, 2 Nov 2017 11:45:30 -0500 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Assuming you are replicating data and metadata have you confirmed that all failure groups have the same free space? That is could it be that one of your failure groups has less space than the others? You can verify this with the output of mmdf and look at the NSD sizes and space available. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: John Hanks To: gpfsug main discussion list Date: 11/02/2017 12:20 PM Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org Addendum to last message: We haven't upgraded recently as far as I know (I just inherited this a couple of months ago.) but am planning an outage soon to upgrade from 4.2.0-4 to 4.2.3-5. 
My growing collection of output files generally contain something like This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 08:34:22 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53506 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device With that inode varying slightly. jbh On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden wrote: Sorry just reread as I hit send and saw this was mmrestripe, in my case it was mmdeledisk. Did you try running the command on just one pool. Or using -B instead? What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" ? Looks like it could be related to the maxfeaturelevel of the cluster. Have you recently upgraded? Is everything up to the same level? Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Scott Fadden/Portland/IBM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:44 AM I opened a defect on this the other day, in my case it was an incorrect error message. What it meant to say was,"The pool is not empty." Are you trying to remove the last disk in a pool? If so did you empty the pool with a MIGRATE policy first? Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: John Hanks Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:34 AM We have no snapshots ( they were the first to go when we initially hit the full metadata NSDs). I've increased quotas so that no filesets have hit a space quota. Verified that there are no inode quotas anywhere. mmdf shows the least amount of free space on any nsd to be 9% free. Still getting this error: [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 Scanning file system metadata, phase 1 ... Scan completed successfully. Scanning file system metadata, phase 2 ... Scanning file system metadata for sas0 storage pool Scanning file system metadata for sata0 storage pool Scan completed successfully. Scanning file system metadata, phase 3 ... Scan completed successfully. Scanning file system metadata, phase 4 ... Scan completed successfully. Scanning user file metadata ... Error processing user file metadata. No space left on device Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. I should note too that this fails almost immediately, far to quickly to fill up any location it could be trying to write to. jbh On Thu, Nov 2, 2017 at 7:57 AM, David Johnson wrote: One thing that may be relevant is if you have snapshots, depending on your release level, inodes in the snapshot may considered immutable, and will not be migrated. Once the snapshots have been deleted, the inodes are freed up and you won?t see the (somewhat misleading) message about no space. ? ddj Dave Johnson Brown University On Nov 2, 2017, at 10:43 AM, John Hanks wrote: Thanks all for the suggestions. 
Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. We did have old .quota files laying around but removing them didn't have any impact. mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). Once I'm settled in this morning I'll try giving them a little extra space and see what happens. jbh On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < Robert.Oesterlin at nuance.com> wrote: One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hanks < griznog at gmail.com> Reply-To: gpfsug main discussion list Date: Wednesday, November 1, 2017 at 5:55 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? 
Thanks, jbh _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s=j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=uLFESUsuxpmf07haYD3Sl-DpeYkm3t_r0WVV2AZ9Jk0&s=RGgSZEisfDpxsKl3PFUWh6DtzD_FF6spqHVpo_0joLY&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 17:16:36 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 10:16:36 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: We do have different amounts of space in the system pool which had the changes applied: [root at scg4-hn01 ~]# mmdf gsfs0 -P system disk disk size failure holds holds free KB free KB name in KB group metadata data in full blocks in fragments --------------- ------------- -------- -------- ----- -------------------- ------------------- Disks in storage pool: system (Maximum disk size allowed is 3.6 TB) VD000 377487360 100 Yes No 143109120 ( 38%) 35708688 ( 9%) DMD_NSD_804 377487360 100 Yes No 79526144 ( 21%) 2924584 ( 1%) VD002 377487360 100 Yes No 143067136 ( 38%) 35713888 ( 9%) DMD_NSD_802 377487360 100 Yes No 79570432 ( 21%) 2926672 ( 1%) VD004 377487360 100 Yes No 143107584 ( 38%) 35727776 ( 9%) DMD_NSD_805 377487360 200 Yes No 79555584 ( 21%) 2940040 ( 1%) VD001 377487360 200 Yes No 142964992 ( 38%) 35805384 ( 9%) DMD_NSD_803 377487360 200 Yes No 79580160 ( 21%) 2919560 ( 1%) VD003 377487360 200 Yes No 143132672 ( 38%) 35764200 ( 9%) DMD_NSD_801 377487360 200 Yes No 79550208 ( 21%) 2915232 ( 1%) ------------- -------------------- ------------------- (pool total) 3774873600 1113164032 ( 29%) 193346024 ( 5%) and mmldisk shows that there is a problem with replication: ... Number of quorum disks: 5 Read quorum value: 3 Write quorum value: 3 Attention: Due to an earlier configuration change the file system is no longer properly replicated. I thought a 'mmrestripe -r' would fix this, not that I have to fix it first before restriping? jbh On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock wrote: > Assuming you are replicating data and metadata have you confirmed that all > failure groups have the same free space? That is could it be that one of > your failure groups has less space than the others? You can verify this > with the output of mmdf and look at the NSD sizes and space available. 
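A note for anyone following along in the archive: one quick way to check Fred's point is to total the free space per failure group straight out of mmdf. This is only a rough sketch that assumes the column layout shown in the listing above; the device name and the VD*/DMD_NSD_* disk-name pattern are taken from this thread, so adjust both for your own cluster:

mmdf gsfs0 -P system | awk '
  /^(VD|DMD_NSD)/ { free[$3] += $6; size[$3] += $2 }   # $3 = failure group, $6 = free KB in full blocks
  END { for (fg in free)
          printf "failure group %s: %.0f GiB free of %.0f GiB\n",
                 fg, free[fg]/1048576, size[fg]/1048576 }'

If one failure group comes back much tighter than the other, that is one way a replicated restripe can report error 28 even while the pool as a whole still shows free space.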
> > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 <(720)%20430-8821> > stockf at us.ibm.com > > > > From: John Hanks > To: gpfsug main discussion list > Date: 11/02/2017 12:20 PM > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on > device" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Addendum to last message: > > We haven't upgraded recently as far as I know (I just inherited this a > couple of months ago.) but am planning an outage soon to upgrade from > 4.2.0-4 to 4.2.3-5. > > My growing collection of output files generally contain something like > > This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 > 08:34:22 2017 > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > 53506 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > With that inode varying slightly. > > jbh > > On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden <*sfadden at us.ibm.com* > > wrote: > Sorry just reread as I hit send and saw this was mmrestripe, in my case it > was mmdeledisk. > > Did you try running the command on just one pool. Or using -B instead? > > What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" > ? > > Looks like it could be related to the maxfeaturelevel of the cluster. Have > you recently upgraded? Is everything up to the same level? > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: *(503) 880-5833* <(503)%20880-5833> > *sfadden at us.ibm.com* > *http://www.ibm.com/systems/storage/spectrum/scale* > > > > ----- Original message ----- > From: Scott Fadden/Portland/IBM > To: *gpfsug-discuss at spectrumscale.org* > Cc: *gpfsug-discuss at spectrumscale.org* > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:44 AM > > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: *(503) 880-5833* <(503)%20880-5833> > *sfadden at us.ibm.com* > *http://www.ibm.com/systems/storage/spectrum/scale* > > > > ----- Original message ----- > From: John Hanks <*griznog at gmail.com* > > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... > Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... 
> Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson <*david_johnson at brown.edu* > > wrote: > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > On Nov 2, 2017, at 10:43 AM, John Hanks <*griznog at gmail.com* > > wrote: > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > *Robert.Oesterlin at nuance.com* > wrote: > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: *<*gpfsug-discuss-bounces at spectrumscale.org* > > on behalf of John Hanks < > *griznog at gmail.com* > > *Reply-To: *gpfsug main discussion list < > *gpfsug-discuss at spectrumscale.org* > > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug <*gpfsug-discuss at spectrumscale.org* > > > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. 
> > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman* > > /listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman* > > /listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s=j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e=* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_ > iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m= > uLFESUsuxpmf07haYD3Sl-DpeYkm3t_r0WVV2AZ9Jk0&s=RGgSZEisfDpxsKl3PFUWh6DtzD_ > FF6spqHVpo_0joLY&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Thu Nov 2 17:57:45 2017 From: stockf at us.ibm.com (Frederick Stock) Date: Thu, 2 Nov 2017 12:57:45 -0500 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Did you run the tsfindinode command to see where that file is located? Also, what does the mmdf show for your other pools notably the sas0 storage pool? 
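If tsfindinode is not handy (it sits with the generally undocumented ts* helpers in /usr/lpp/mmfs/bin), a deferred policy scan is another way to map an inode number back to a path for ordinary files. A minimal sketch, with made-up file names and the inode number taken from the interestingInodes output earlier in the thread:

cat > /tmp/find_inode.pol <<'EOF'
RULE EXTERNAL LIST 'byinode' EXEC ''
RULE 'findit' LIST 'byinode' WHERE INODE = 53506
EOF
mmapplypolicy gsfs0 -P /tmp/find_inode.pol -I defer -f /tmp/inode
# -I defer only writes the candidate list; the matching path, if any, should land in /tmp/inode.list.byinode

Reserved files (the old *.quota files among them) may not show up in a scan like that, which is itself a useful hint about what the inode belongs to.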
Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: John Hanks To: gpfsug main discussion list Date: 11/02/2017 01:17 PM Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org We do have different amounts of space in the system pool which had the changes applied: [root at scg4-hn01 ~]# mmdf gsfs0 -P system disk disk size failure holds holds free KB free KB name in KB group metadata data in full blocks in fragments --------------- ------------- -------- -------- ----- -------------------- ------------------- Disks in storage pool: system (Maximum disk size allowed is 3.6 TB) VD000 377487360 100 Yes No 143109120 ( 38%) 35708688 ( 9%) DMD_NSD_804 377487360 100 Yes No 79526144 ( 21%) 2924584 ( 1%) VD002 377487360 100 Yes No 143067136 ( 38%) 35713888 ( 9%) DMD_NSD_802 377487360 100 Yes No 79570432 ( 21%) 2926672 ( 1%) VD004 377487360 100 Yes No 143107584 ( 38%) 35727776 ( 9%) DMD_NSD_805 377487360 200 Yes No 79555584 ( 21%) 2940040 ( 1%) VD001 377487360 200 Yes No 142964992 ( 38%) 35805384 ( 9%) DMD_NSD_803 377487360 200 Yes No 79580160 ( 21%) 2919560 ( 1%) VD003 377487360 200 Yes No 143132672 ( 38%) 35764200 ( 9%) DMD_NSD_801 377487360 200 Yes No 79550208 ( 21%) 2915232 ( 1%) ------------- -------------------- ------------------- (pool total) 3774873600 1113164032 ( 29%) 193346024 ( 5%) and mmldisk shows that there is a problem with replication: ... Number of quorum disks: 5 Read quorum value: 3 Write quorum value: 3 Attention: Due to an earlier configuration change the file system is no longer properly replicated. I thought a 'mmrestripe -r' would fix this, not that I have to fix it first before restriping? jbh On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock wrote: Assuming you are replicating data and metadata have you confirmed that all failure groups have the same free space? That is could it be that one of your failure groups has less space than the others? You can verify this with the output of mmdf and look at the NSD sizes and space available. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: John Hanks To: gpfsug main discussion list Date: 11/02/2017 12:20 PM Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org Addendum to last message: We haven't upgraded recently as far as I know (I just inherited this a couple of months ago.) but am planning an outage soon to upgrade from 4.2.0-4 to 4.2.3-5. My growing collection of output files generally contain something like This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 08:34:22 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53506 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device With that inode varying slightly. jbh On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden wrote: Sorry just reread as I hit send and saw this was mmrestripe, in my case it was mmdeledisk. Did you try running the command on just one pool. Or using -B instead? What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" ? Looks like it could be related to the maxfeaturelevel of the cluster. Have you recently upgraded? Is everything up to the same level? 
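On the release/feature-level point, the quick things to compare are the daemon build, the cluster-wide minimum release level, and the file system format version. A short sketch of the sort of checks meant here (output formats vary a little by release):

mmdiag --version              # daemon build on the local node
mmlsconfig minReleaseLevel    # cluster-wide minimum release level
mmlsfs gsfs0 -V               # file system format version, and whether it can be upgraded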
Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Scott Fadden/Portland/IBM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:44 AM I opened a defect on this the other day, in my case it was an incorrect error message. What it meant to say was,"The pool is not empty." Are you trying to remove the last disk in a pool? If so did you empty the pool with a MIGRATE policy first? Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: John Hanks Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:34 AM We have no snapshots ( they were the first to go when we initially hit the full metadata NSDs). I've increased quotas so that no filesets have hit a space quota. Verified that there are no inode quotas anywhere. mmdf shows the least amount of free space on any nsd to be 9% free. Still getting this error: [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 Scanning file system metadata, phase 1 ... Scan completed successfully. Scanning file system metadata, phase 2 ... Scanning file system metadata for sas0 storage pool Scanning file system metadata for sata0 storage pool Scan completed successfully. Scanning file system metadata, phase 3 ... Scan completed successfully. Scanning file system metadata, phase 4 ... Scan completed successfully. Scanning user file metadata ... Error processing user file metadata. No space left on device Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. I should note too that this fails almost immediately, far to quickly to fill up any location it could be trying to write to. jbh On Thu, Nov 2, 2017 at 7:57 AM, David Johnson wrote: One thing that may be relevant is if you have snapshots, depending on your release level, inodes in the snapshot may considered immutable, and will not be migrated. Once the snapshots have been deleted, the inodes are freed up and you won?t see the (somewhat misleading) message about no space. ? ddj Dave Johnson Brown University On Nov 2, 2017, at 10:43 AM, John Hanks wrote: Thanks all for the suggestions. Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. We did have old .quota files laying around but removing them didn't have any impact. mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). 
Once I'm settled in this morning I'll try giving them a little extra space and see what happens. jbh On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < Robert.Oesterlin at nuance.com> wrote: One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hanks < griznog at gmail.com> Reply-To: gpfsug main discussion list Date: Wednesday, November 1, 2017 at 5:55 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? 
Thanks, jbh _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 18:14:44 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 11:14:44 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: tsfindinode tracked the file to user.quota, which somehow escaped my previous attempt to "mv *.quota /elsewhere/". I've moved that now and verified it is actually gone and will retry once the current restripe on the sata0 pool is wrapped up. jbh On Thu, Nov 2, 2017 at 10:57 AM, Frederick Stock wrote: > Did you run the tsfindinode command to see where that file is located? > Also, what does the mmdf show for your other pools notably the sas0 storage > pool?
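For the archive, a cheap cross-check once a specific file is suspected is to compare inode numbers directly; the mount point below is only an example, not from this cluster:

ls -li /gpfs/gsfs0/user.quota        # first column is the inode number; compare it with the entry in the interestingInodes file
stat -c '%i %n' /gpfs/gsfs0/*.quota  # same idea for all of the old quota files at once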
> > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 <(720)%20430-8821> > stockf at us.ibm.com > > > > From: John Hanks > To: gpfsug main discussion list > Date: 11/02/2017 01:17 PM > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on > device" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > We do have different amounts of space in the system pool which had the > changes applied: > > [root at scg4-hn01 ~]# mmdf gsfs0 -P system > disk disk size failure holds holds free > KB free KB > name in KB group metadata data in full > blocks in fragments > --------------- ------------- -------- -------- ----- -------------------- > ------------------- > Disks in storage pool: system (Maximum disk size allowed is 3.6 TB) > VD000 377487360 100 Yes No 143109120 ( > 38%) 35708688 ( 9%) > DMD_NSD_804 377487360 100 Yes No 79526144 ( > 21%) 2924584 ( 1%) > VD002 377487360 100 Yes No 143067136 ( > 38%) 35713888 ( 9%) > DMD_NSD_802 377487360 100 Yes No 79570432 ( > 21%) 2926672 ( 1%) > VD004 377487360 100 Yes No 143107584 ( > 38%) 35727776 ( 9%) > DMD_NSD_805 377487360 200 Yes No 79555584 ( > 21%) 2940040 ( 1%) > VD001 377487360 200 Yes No 142964992 ( > 38%) 35805384 ( 9%) > DMD_NSD_803 377487360 200 Yes No 79580160 ( > 21%) 2919560 ( 1%) > VD003 377487360 200 Yes No 143132672 ( > 38%) 35764200 ( 9%) > DMD_NSD_801 377487360 200 Yes No 79550208 ( > 21%) 2915232 ( 1%) > ------------- -------------------- > ------------------- > (pool total) 3774873600 1113164032 ( > 29%) 193346024 ( 5%) > > > and mmldisk shows that there is a problem with replication: > > ... > Number of quorum disks: 5 > Read quorum value: 3 > Write quorum value: 3 > Attention: Due to an earlier configuration change the file system > is no longer properly replicated. > > > I thought a 'mmrestripe -r' would fix this, not that I have to fix it > first before restriping? > > jbh > > > On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock <*stockf at us.ibm.com* > > wrote: > Assuming you are replicating data and metadata have you confirmed that all > failure groups have the same free space? That is could it be that one of > your failure groups has less space than the others? You can verify this > with the output of mmdf and look at the NSD sizes and space available. > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | *720-430-8821* <(720)%20430-8821> > *stockf at us.ibm.com* > > > > From: John Hanks <*griznog at gmail.com* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 11/02/2017 12:20 PM > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on > device" > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > Addendum to last message: > > We haven't upgraded recently as far as I know (I just inherited this a > couple of months ago.) but am planning an outage soon to upgrade from > 4.2.0-4 to 4.2.3-5. > > My growing collection of output files generally contain something like > > This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 > 08:34:22 2017 > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > 53506 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > With that inode varying slightly. 
> > jbh > > On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden <*sfadden at us.ibm.com* > > wrote: > Sorry just reread as I hit send and saw this was mmrestripe, in my case it > was mmdeledisk. > > Did you try running the command on just one pool. Or using -B instead? > > What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" > ? > > Looks like it could be related to the maxfeaturelevel of the cluster. Have > you recently upgraded? Is everything up to the same level? > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: *(503) 880-5833* <(503)%20880-5833> > *sfadden at us.ibm.com* > *http://www.ibm.com/systems/storage/spectrum/scale* > > > > ----- Original message ----- > From: Scott Fadden/Portland/IBM > To: *gpfsug-discuss at spectrumscale.org* > Cc: *gpfsug-discuss at spectrumscale.org* > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:44 AM > > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: *(503) 880-5833* <(503)%20880-5833> > *sfadden at us.ibm.com* > *http://www.ibm.com/systems/storage/spectrum/scale* > > > > ----- Original message ----- > From: John Hanks <*griznog at gmail.com* > > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... > Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... > Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson <*david_johnson at brown.edu* > > wrote: > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? 
ddj > Dave Johnson > Brown University > > On Nov 2, 2017, at 10:43 AM, John Hanks <*griznog at gmail.com* > > wrote: > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > *Robert.Oesterlin at nuance.com* > wrote: > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: *<*gpfsug-discuss-bounces at spectrumscale.org* > > on behalf of John Hanks < > *griznog at gmail.com* > > *Reply-To: *gpfsug main discussion list < > *gpfsug-discuss at spectrumscale.org* > > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug <*gpfsug-discuss at spectrumscale.org* > > > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? 
> >
> > Thanks,
> >
> > jbh
> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From griznog at gmail.com  Thu Nov  2 18:18:27 2017
From: griznog at gmail.com (John Hanks)
Date: Thu, 2 Nov 2017 11:18:27 -0700
Subject: [gpfsug-discuss] mmrestripefs "No space left on device"
In-Reply-To: 
References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu>
Message-ID: 

Yep, looks like Robert Oesterlin was right, it was the old quota files
causing the snag. Not sure how "mv *.quota" managed to move the group file
and not the user file, but I'll let that remain a mystery of the universe.
In any case I have a restripe running now and have learned a LOT about all
the bits in the process.

Many thanks to everyone who replied, I learn something from this list every
time I get near it.

Thank you,

jbh

On Thu, Nov 2, 2017 at 11:14 AM, John Hanks  wrote:

> tsfindinode tracked the file to user.quota, which somehow escaped my
> previous attempt to "mv *.quota /elsewhere/". I've moved that now and
> verified it is actually gone and will retry once the current restripe on
> the sata0 pool is wrapped up.
>
> jbh
>
> On Thu, Nov 2, 2017 at 10:57 AM, Frederick Stock 
> wrote:
>
>> Did you run the tsfindinode command to see where that file is located?
>> Also, what does the mmdf show for your other pools notably the sas0 storage
>> pool?
>> >> Fred >> __________________________________________________ >> Fred Stock | IBM Pittsburgh Lab | 720-430-8821 <(720)%20430-8821> >> stockf at us.ibm.com >> >> >> >> From: John Hanks >> To: gpfsug main discussion list >> Date: 11/02/2017 01:17 PM >> Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on >> device" >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> ------------------------------ >> >> >> >> We do have different amounts of space in the system pool which had the >> changes applied: >> >> [root at scg4-hn01 ~]# mmdf gsfs0 -P system >> disk disk size failure holds holds free >> KB free KB >> name in KB group metadata data in full >> blocks in fragments >> --------------- ------------- -------- -------- ----- >> -------------------- ------------------- >> Disks in storage pool: system (Maximum disk size allowed is 3.6 TB) >> VD000 377487360 100 Yes No 143109120 ( >> 38%) 35708688 ( 9%) >> DMD_NSD_804 377487360 100 Yes No 79526144 ( >> 21%) 2924584 ( 1%) >> VD002 377487360 100 Yes No 143067136 ( >> 38%) 35713888 ( 9%) >> DMD_NSD_802 377487360 100 Yes No 79570432 ( >> 21%) 2926672 ( 1%) >> VD004 377487360 100 Yes No 143107584 ( >> 38%) 35727776 ( 9%) >> DMD_NSD_805 377487360 200 Yes No 79555584 ( >> 21%) 2940040 ( 1%) >> VD001 377487360 200 Yes No 142964992 ( >> 38%) 35805384 ( 9%) >> DMD_NSD_803 377487360 200 Yes No 79580160 ( >> 21%) 2919560 ( 1%) >> VD003 377487360 200 Yes No 143132672 ( >> 38%) 35764200 ( 9%) >> DMD_NSD_801 377487360 200 Yes No 79550208 ( >> 21%) 2915232 ( 1%) >> ------------- >> -------------------- ------------------- >> (pool total) 3774873600 1113164032 ( >> 29%) 193346024 ( 5%) >> >> >> and mmldisk shows that there is a problem with replication: >> >> ... >> Number of quorum disks: 5 >> Read quorum value: 3 >> Write quorum value: 3 >> Attention: Due to an earlier configuration change the file system >> is no longer properly replicated. >> >> >> I thought a 'mmrestripe -r' would fix this, not that I have to fix it >> first before restriping? >> >> jbh >> >> >> On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock <*stockf at us.ibm.com* >> > wrote: >> Assuming you are replicating data and metadata have you confirmed that >> all failure groups have the same free space? That is could it be that one >> of your failure groups has less space than the others? You can verify this >> with the output of mmdf and look at the NSD sizes and space available. >> >> Fred >> __________________________________________________ >> Fred Stock | IBM Pittsburgh Lab | *720-430-8821* <(720)%20430-8821> >> *stockf at us.ibm.com* >> >> >> >> From: John Hanks <*griznog at gmail.com* > >> To: gpfsug main discussion list < >> *gpfsug-discuss at spectrumscale.org* > >> Date: 11/02/2017 12:20 PM >> Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on >> device" >> Sent by: *gpfsug-discuss-bounces at spectrumscale.org* >> >> ------------------------------ >> >> >> >> Addendum to last message: >> >> We haven't upgraded recently as far as I know (I just inherited this a >> couple of months ago.) but am planning an outage soon to upgrade from >> 4.2.0-4 to 4.2.3-5. 
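A quick way to double-check Fred's point about per-failure-group free space
is to total an mmdf listing by failure group. The fragment below is only a
rough sketch: the field positions ($3 = failure group, $6 = free KB in full
blocks) are assumed from the listing shown above and may differ between
releases, so adjust to your own output.

# Sum "free KB in full blocks" per failure group; fields assumed from the
# mmdf output above (numeric $2 = disk size in KB, numeric $3 = failure group).
mmdf gsfs0 | awk '$2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { free[$3] += $6 }
  END { for (fg in free) printf "FG %s: %d KB free in full blocks\n", fg, free[fg] }'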
>> >> My growing collection of output files generally contain something like >> >> This inode list was generated in the Parallel Inode Traverse on Thu Nov >> 2 08:34:22 2017 >> INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID >> MEMO(INODE_FLAGS FILE_TYPE [ERROR]) >> 53506 0:0 0 1 0 >> illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device >> >> With that inode varying slightly. >> >> jbh >> >> On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden <*sfadden at us.ibm.com* >> > wrote: >> Sorry just reread as I hit send and saw this was mmrestripe, in my case >> it was mmdeledisk. >> >> Did you try running the command on just one pool. Or using -B instead? >> >> What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" >> ? >> >> Looks like it could be related to the maxfeaturelevel of the cluster. >> Have you recently upgraded? Is everything up to the same level? >> >> Scott Fadden >> Spectrum Scale - Technical Marketing >> Phone: *(503) 880-5833* <(503)%20880-5833> >> *sfadden at us.ibm.com* >> *http://www.ibm.com/systems/storage/spectrum/scale* >> >> >> >> ----- Original message ----- >> From: Scott Fadden/Portland/IBM >> To: *gpfsug-discuss at spectrumscale.org* >> Cc: *gpfsug-discuss at spectrumscale.org* >> Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" >> Date: Thu, Nov 2, 2017 8:44 AM >> >> I opened a defect on this the other day, in my case it was an incorrect >> error message. What it meant to say was,"The pool is not empty." Are you >> trying to remove the last disk in a pool? If so did you empty the pool with >> a MIGRATE policy first? >> >> >> Scott Fadden >> Spectrum Scale - Technical Marketing >> Phone: *(503) 880-5833* <(503)%20880-5833> >> *sfadden at us.ibm.com* >> *http://www.ibm.com/systems/storage/spectrum/scale* >> >> >> >> ----- Original message ----- >> From: John Hanks <*griznog at gmail.com* > >> Sent by: *gpfsug-discuss-bounces at spectrumscale.org* >> >> To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* >> > >> Cc: >> Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" >> Date: Thu, Nov 2, 2017 8:34 AM >> >> We have no snapshots ( they were the first to go when we initially hit >> the full metadata NSDs). >> >> I've increased quotas so that no filesets have hit a space quota. >> >> Verified that there are no inode quotas anywhere. >> >> mmdf shows the least amount of free space on any nsd to be 9% free. >> >> Still getting this error: >> >> [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs >> 3 >> Scanning file system metadata, phase 1 ... >> Scan completed successfully. >> Scanning file system metadata, phase 2 ... >> Scanning file system metadata for sas0 storage pool >> Scanning file system metadata for sata0 storage pool >> Scan completed successfully. >> Scanning file system metadata, phase 3 ... >> Scan completed successfully. >> Scanning file system metadata, phase 4 ... >> Scan completed successfully. >> Scanning user file metadata ... >> Error processing user file metadata. >> No space left on device >> Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on >> scg-gs0 for inodes with broken disk addresses or failures. >> mmrestripefs: Command failed. Examine previous error messages to >> determine cause. >> >> I should note too that this fails almost immediately, far to quickly to >> fill up any location it could be trying to write to. 
>> >> jbh >> >> On Thu, Nov 2, 2017 at 7:57 AM, David Johnson <*david_johnson at brown.edu* >> > wrote: >> One thing that may be relevant is if you have snapshots, depending on >> your release level, >> inodes in the snapshot may considered immutable, and will not be >> migrated. Once the snapshots >> have been deleted, the inodes are freed up and you won?t see the >> (somewhat misleading) message >> about no space. >> >> ? ddj >> Dave Johnson >> Brown University >> >> On Nov 2, 2017, at 10:43 AM, John Hanks <*griznog at gmail.com* >> > wrote: >> Thanks all for the suggestions. >> >> Having our metadata NSDs fill up was what prompted this exercise, but >> space was previously feed up on those by switching them from metadata+data >> to metadataOnly and using a policy to migrate files out of that pool. So >> these now have about 30% free space (more if you include fragmented space). >> The restripe attempt is just to make a final move of any remaining data off >> those devices. All the NSDs now have free space on them. >> >> df -i shows inode usage at about 84%, so plenty of free inodes for the >> filesystem as a whole. >> >> We did have old .quota files laying around but removing them didn't have >> any impact. >> >> mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer >> while getting to work. >> >> mmrepquota does show about a half-dozen filesets that have hit their >> quota for space (we don't set quotas on inodes). Once I'm settled in this >> morning I'll try giving them a little extra space and see what happens. >> >> jbh >> >> >> On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < >> *Robert.Oesterlin at nuance.com* > wrote: >> One thing that I?ve run into before is that on older file systems you had >> the ?*.quota? files in the file system root. If you upgraded the file >> system to a newer version (so these files aren?t used) - There was a bug at >> one time where these didn?t get properly migrated during a restripe. >> Solution was to just remove them >> >> >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> *From: *<*gpfsug-discuss-bounces at spectrumscale.org* >> > on behalf of John Hanks < >> *griznog at gmail.com* > >> *Reply-To: *gpfsug main discussion list < >> *gpfsug-discuss at spectrumscale.org* > >> *Date: *Wednesday, November 1, 2017 at 5:55 PM >> *To: *gpfsug <*gpfsug-discuss at spectrumscale.org* >> > >> *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on >> device" >> >> >> >> Hi all, >> >> >> >> I'm trying to do a restripe after setting some nsds to metadataOnly and I >> keep running into this error: >> >> >> >> Scanning user file metadata ... >> >> 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with >> total 531689 MB data processed) >> >> Error processing user file metadata. >> >> Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on >> scg-gs0 for inodes with broken disk addresses or failures. >> >> mmrestripefs: Command failed. Examine previous error messages to >> determine cause. 
>> >> >> >> The file it points to says: >> >> >> >> This inode list was generated in the Parallel Inode Traverse on Wed Nov >> 1 15:36:06 2017 >> >> INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID >> MEMO(INODE_FLAGS FILE_TYPE [ERROR]) >> >> 53504 0:0 0 1 0 >> illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device >> >> >> >> >> >> /var on the node I am running this on has > 128 GB free, all the NSDs >> have plenty of free space, the filesystem being restriped has plenty of >> free space and if I watch the node while running this no filesystem on it >> even starts to get full. Could someone tell me where mmrestripefs is >> attempting to write and/or how to point it at a different location? >> >> >> >> Thanks, >> >> >> >> jbh >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman* >> >> /listinfo/gpfsug-discuss >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman* >> >> /listinfo/gpfsug-discuss >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> >> *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s=j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e=* >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> >> *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=uLFESUsuxpmf07haYD3Sl-DpeYkm3t_r0WVV2AZ9Jk0&s=RGgSZEisfDpxsKl3PFUWh6DtzD_FF6spqHVpo_0joLY&e=* >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.o >> rg_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObT >> bx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m= >> XPw1EyoosGN5bt3yLIT1JbUJ73B6iWH2gBaDJ2xHW8M&s=yDRpuvz3LOTwvP >> 2pkIJEU7NWUxwMOcYHyXBRoWCPF-s&e= >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Sat Nov 4 16:14:46 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Sat, 4 Nov 2017 12:14:46 -0400 Subject: [gpfsug-discuss] file layout API + file fragmentation Message-ID: <83ed4b5a-cf9e-12da-e460-e34a6492afcf@nasa.gov> I've got a question about the file layout API and how it reacts in the case of fragmented files. I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have some code based on tsGetDataBlk.C. 
I'm basing the block size based off of what's returned by filemapOut.blockSize but that only seems to return a value > 0 when filemapIn.startOffset is 0. In a case where a file were to be made up of a significant number of non-contiguous fragments (which... would be awful in of itself) how would this be reported by the file layout API? Does the interface technically just report the disk location information of the first block of the $blockSize range and assume that it's contiguous? Thanks! -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From makaplan at us.ibm.com Sun Nov 5 23:01:25 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Sun, 5 Nov 2017 18:01:25 -0500 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: I googled GPFS_FCNTL_GET_DATABLKDISKIDX and found this discussion: https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 In general, GPFS files ARE deliberately "fragmented" but we don't say that - we say they are "striped" over many disks -- and that is generally a good thing for parallel performance. Also, in GPFS, if the last would-be block of a file is less than a block, then it is stored in a "fragment" of a block. So you see we use "fragment" to mean something different than other file systems you may know. --marc From: Aaron Knister To: gpfsug main discussion list Date: 11/04/2017 12:22 PM Subject: [gpfsug-discuss] file layout API + file fragmentation Sent by: gpfsug-discuss-bounces at spectrumscale.org I've got a question about the file layout API and how it reacts in the case of fragmented files. I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have some code based on tsGetDataBlk.C. I'm basing the block size based off of what's returned by filemapOut.blockSize but that only seems to return a value > 0 when filemapIn.startOffset is 0. In a case where a file were to be made up of a significant number of non-contiguous fragments (which... would be awful in of itself) how would this be reported by the file layout API? Does the interface technically just report the disk location information of the first block of the $blockSize range and assume that it's contiguous? Thanks! -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Sun Nov 5 23:39:07 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Sun, 5 Nov 2017 18:39:07 -0500 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: <2c1a16ab-9be7-c019-8338-c1dc50d3e069@nasa.gov> Thanks Marc, that helps. I can't easily use tsdbfs for what I'm working on since it needs to be run as unprivileged users. Perhaps I'm not asking the right question. I'm wondering how the file layout api behaves if a given "block"-aligned offset in a file is made up of sub-blocks/fragments that are not all on the same NSD. 
The assumption based on how I've seen the API used so far is that all sub-blocks within a block at a given offset within a file are all on the same NSD. -Aaron On 11/5/17 6:01 PM, Marc A Kaplan wrote: > I googled GPFS_FCNTL_GET_DATABLKDISKIDX > > and found this discussion: > > ?https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 > > In general, GPFS files ARE deliberately "fragmented" but we don't say > that - we say they are "striped" over many disks -- and that is > generally a good thing for parallel performance. > > Also, in GPFS, if the last would-be block of a file is less than a > block, then it is stored in a "fragment" of a block. ? > So you see we use "fragment" to mean something different than other file > systems you may know. > > --marc > > > > From: ? ? ? ?Aaron Knister > To: ? ? ? ?gpfsug main discussion list > Date: ? ? ? ?11/04/2017 12:22 PM > Subject: ? ? ? ?[gpfsug-discuss] file layout API + file fragmentation > Sent by: ? ? ? ?gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > I've got a question about the file layout API and how it reacts in the > case of fragmented files. > > I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have some code > based on tsGetDataBlk.C. I'm basing the block size based off of what's > returned by filemapOut.blockSize but that only seems to return a value > > 0 when filemapIn.startOffset is 0. > > In a case where a file were to be made up of a significant number of > non-contiguous fragments (which... would be awful in of itself) how > would this be reported by the file layout API? Does the interface > technically just report the disk location information of the first block > of the $blockSize range and assume that it's contiguous? > > Thanks! > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From fschmuck at us.ibm.com Mon Nov 6 00:57:46 2017 From: fschmuck at us.ibm.com (Frank Schmuck) Date: Mon, 6 Nov 2017 00:57:46 +0000 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From mutantllama at gmail.com Mon Nov 6 03:35:58 2017 From: mutantllama at gmail.com (Carl) Date: Mon, 6 Nov 2017 14:35:58 +1100 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almost full Message-ID: Hi Folk, Does anyone have much experience with the performance of GPFS as it becomes close to full. In particular I am referring to split data/meta data, where the data pool goes over 80% utilisation. How much degradation do you see above 80% usage, 90% usage? Cheers, Carl. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aaron.s.knister at nasa.gov Mon Nov 6 05:10:30 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 6 Nov 2017 00:10:30 -0500 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: Thanks, Frank! That's truly fascinating and has some interesting implications that I hadn't thought of before. I just ran a test on an ~8G fs with a block size of 1M: for i in `seq 1 100000`; do dd if=/dev/zero of=foofile${i} bs=520K count=1 done The fs is "full" according to df/mmdf but there's 3.6G left in subblocks but yeah, I can't allocate any new files that wouldn't fit into the inode and I can't seem to allocate any new subblocks to existing files (e.g. append). What's interesting is if I do the same exercise but with a file size of 30K or even 260K I don't seem to run into the same issue. I'm not sure I understand that yet. I was curious about what this meant in the case of appending to a file where the last offset in the file was allocated to a fragment. By looking at "tsdbfs listda" and appending to a file I could see that the last DA would change (presumably to point to the DA of the start of a contiguous subblock) once the amount of data appended caused the file size to exceed the space available in the trailing subblocks. -Aaron On 11/5/17 7:57 PM, Frank Schmuck wrote: > In GPFS blocks within a file are never fragmented.? For example, if you > have a file of size 7.3 MB and your file system block size is 1MB, then > this file will be made up of 7 full blocks and one fragment of size 320k > (10 subblocks).? Each of the 7 full blocks will be contiguous on a singe > diks (LUN) behind a single NSD server.? The fragment that makes up the > last part of the file will also be contiguous on a single disk, just > shorter than a full block. > ? > Frank Schmuck > IBM Almaden Research Center > ? > ? > > ----- Original message ----- > From: Aaron Knister > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: > Cc: > Subject: Re: [gpfsug-discuss] file layout API + file fragmentation > Date: Sun, Nov 5, 2017 3:39 PM > ? > Thanks Marc, that helps. I can't easily use tsdbfs for what I'm working > on since it needs to be run as unprivileged users. > > Perhaps I'm not asking the right question. I'm wondering how the file > layout api behaves if a given "block"-aligned offset in a file is made > up of sub-blocks/fragments that are not all on the same NSD. The > assumption based on how I've seen the API used so far is that all > sub-blocks within a block at a given offset within a file are all on the > same NSD. > > -Aaron > > On 11/5/17 6:01 PM, Marc A Kaplan wrote: > > I googled GPFS_FCNTL_GET_DATABLKDISKIDX > > > > and found this discussion: > > > > > ??https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 > > > > In general, GPFS files ARE deliberately "fragmented" but we don't say > > that - we say they are "striped" over many disks -- and that is > > generally a good thing for parallel performance. > > > > Also, in GPFS, if the last would-be block of a file is less than a > > block, then it is stored in a "fragment" of a block. ?? > > So you see we use "fragment" to mean something different than > other file > > systems you may know. > > > > --marc > > > > > > > > From: ?? ?? ?? ??Aaron Knister > > To: ?? ?? ?? ??gpfsug main discussion list > > > Date: ?? ?? ?? ??11/04/2017 12:22 PM > > Subject: ?? ?? ?? ??[gpfsug-discuss] file layout API + file > fragmentation > > Sent by: ?? ?? ?? 
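(Side note on the 520K test above: the rough arithmetic below, which assumes
the classic layout of 32 subblocks per block and that a file smaller than a
block is stored as one contiguous run of subblocks, lines up with the ~3.6G
reported left in sub-blocks.)

# Back-of-the-envelope for 520K files on a filesystem with a 1M block size.
block_kb=1024
subblock_kb=$(( block_kb / 32 ))                                  # 32 KiB
file_kb=520
frag_subblocks=$(( (file_kb + subblock_kb - 1) / subblock_kb ))   # 17 subblocks
leftover=$(( 32 - frag_subblocks ))                               # 15 subblocks = 480 KiB
# A new 520 KiB file needs 17 contiguous subblocks, which never fit into the
# 15 left behind, so each 1 MiB block ends up hosting exactly one file and the
# remainder sits in free fragments that no further 520 KiB file can use.
echo "fragment uses $frag_subblocks subblocks, $leftover subblocks stranded per block"
echo "stranded on an 8 GiB pool: ~$(( 8 * 1024 * leftover / 32 )) MiB"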
??gpfsug-discuss-bounces at spectrumscale.org > > > ------------------------------------------------------------------------ > > > > > > > > I've got a question about the file layout API and how it reacts in the > > case of fragmented files. > > > > I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have > some code > > based on tsGetDataBlk.C. I'm basing the block size based off of what's > > returned by filemapOut.blockSize but that only seems to return a > value > > > 0 when filemapIn.startOffset is 0. > > > > In a case where a file were to be made up of a significant number of > > non-contiguous fragments (which... would be awful in of itself) how > > would this be reported by the file layout API? Does the interface > > technically just report the disk location information of the first > block > > of the $blockSize range and assume that it's contiguous? > > > > Thanks! > > > > -Aaron > > > > -- > > Aaron Knister > > NASA Center for Climate Simulation (Code 606.2) > > Goddard Space Flight Center > > (301) 286-2776 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= > > > > > > > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > ? > > ? > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From peter.chase at metoffice.gov.uk Mon Nov 6 09:20:11 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Mon, 6 Nov 2017 09:20:11 +0000 Subject: [gpfsug-discuss] Introduction/Question Message-ID: Hello to all! I'm pleased to have joined the GPFS UG mailing list, I'm experimenting with GPFS on zLinux running in z/VM on a z13 mainframe. I work for the UK Met Office in the GPCS team (general purpose compute service/mainframe team) and I'm based in Exeter, Devon. I've joined with a specific question to ask, in short: how can I automate sending files to a cloud object store as they arrive in GPFS and keep a copy of the file in GPFS? The longer spiel is this: We have a HPC that throws out a lot of NetCDF files via FTP for use in forecasts. 
We're currently undergoing a change in working practice, so that data processing is beginning to be done in the cloud. At the same time we're also attempting to de-duplicate the data being sent from the HPC by creating one space to receive it and then have consumers use it or send it on as necessary from there. The data is in terabytes a day sizes, and the timeliness of it's arrival to systems is fairly important (forecasts cease to be forecasts if they're too late). We're using zLinux because the mainframe already receives much of the data from the HPC and has access to a SAN with SSD storage, has the right network connections it needs and generally seems the least amount of work to put something in place. Getting a supported clustered filesystem on zLinux is tricky, but GPFS fits the bill and having hardware, storage, OS and filesystem from one provider (IBM) should hopefully save some headaches. We're using Amazon as our cloud provider, and have 2x10GB direct links to their London data centre with a ping of about 15ms, so fairly low latency. The developers using the data want it in s3 so they can access it from server-less environments and won't need to have ec2 instances loitering to look after the data. We were initially interested in using mmcloudgateway/cloud data sharing to send the data, but it's not available for s390x (only x86_64), so I'm now looking at setting up a external storage pool for talking to s3 and then having some kind of ilm soft quota trigger to send the data once enough of it has arrived, but I'm still exploring options. Options such as asking the user group of experienced folks what they think is best! So, any help or advice would be greatly appreciated! Regards, Peter Chase GPCS Team Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Mon Nov 6 09:37:15 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Mon, 6 Nov 2017 09:37:15 +0000 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: Message-ID: Peter, Welcome to the mailing list! Can I summarise in saying that you are looking for a way for GPFS to recognise that a file has just arrived in the filesystem (via FTP) and so trigger an action, in this case to trigger to push to Amazon S3 ? I think that you also have a second question about coping with the restrictions on GPFS on zLinux? ie CES is not supported and hence TCT isn?t either. Looking at the docs, there appears to be many restrictions on TCT for MultiCluster, AFM, Heterogeneous setups, DMAPI tape tiers, etc. So my question to add is; what success have people had in using a TCT in more than the simplest use case of a single small isolated x86 cluster? Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales + 44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 6 Nov 2017, at 09:20, Chase, Peter wrote: > > Hello to all! > > I?m pleased to have joined the GPFS UG mailing list, I?m experimenting with GPFS on zLinux running in z/VM on a z13 mainframe. I work for the UK Met Office in the GPCS team (general purpose compute service/mainframe team) and I?m based in Exeter, Devon. > > I?ve joined with a specific question to ask, in short: how can I automate sending files to a cloud object store as they arrive in GPFS and keep a copy of the file in GPFS? 
> > The longer spiel is this: We have a HPC that throws out a lot of NetCDF files via FTP for use in forecasts. We?re currently undergoing a change in working practice, so that data processing is beginning to be done in the cloud. At the same time we?re also attempting to de-duplicate the data being sent from the HPC by creating one space to receive it and then have consumers use it or send it on as necessary from there. The data is in terabytes a day sizes, and the timeliness of it?s arrival to systems is fairly important (forecasts cease to be forecasts if they?re too late). > > We?re using zLinux because the mainframe already receives much of the data from the HPC and has access to a SAN with SSD storage, has the right network connections it needs and generally seems the least amount of work to put something in place. > > Getting a supported clustered filesystem on zLinux is tricky, but GPFS fits the bill and having hardware, storage, OS and filesystem from one provider (IBM) should hopefully save some headaches. > > We?re using Amazon as our cloud provider, and have 2x10GB direct links to their London data centre with a ping of about 15ms, so fairly low latency. The developers using the data want it in s3 so they can access it from server-less environments and won?t need to have ec2 instances loitering to look after the data. > > We were initially interested in using mmcloudgateway/cloud data sharing to send the data, but it?s not available for s390x (only x86_64), so I?m now looking at setting up a external storage pool for talking to s3 and then having some kind of ilm soft quota trigger to send the data once enough of it has arrived, but I?m still exploring options. Options such as asking the user group of experienced folks what they think is best! > > So, any help or advice would be greatly appreciated! > > Regards, > > Peter Chase > GPCS Team > Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom > Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Mon Nov 6 10:00:39 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Mon, 6 Nov 2017 10:00:39 +0000 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: Message-ID: Frank, For clarity in the understanding the underlying mechanism in GPFS, could you describe what happens in the case say of a particular file that is appended to every 24 hours? ie. as that file gets to 7MB, it then writes to a new sub-block (1/32 of the next 1MB block). I guess that sub block could be 10th in a a block that already has 9 used. Later on, the file grows to need an 11th subblock and so on. So at what point does this growing file at 8MB occupy all 32 sunblocks of 8 full blocks? Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales + 44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 6 Nov 2017, at 00:57, Frank Schmuck wrote: > > In GPFS blocks within a file are never fragmented. For example, if you have a file of size 7.3 MB and your file system block size is 1MB, then this file will be made up of 7 full blocks and one fragment of size 320k (10 subblocks). Each of the 7 full blocks will be contiguous on a singe diks (LUN) behind a single NSD server. 
The fragment that makes up the last part of the file will also be contiguous on a single disk, just shorter than a full block. > > Frank Schmuck > IBM Almaden Research Center > > > ----- Original message ----- > From: Aaron Knister > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: > Cc: > Subject: Re: [gpfsug-discuss] file layout API + file fragmentation > Date: Sun, Nov 5, 2017 3:39 PM > > Thanks Marc, that helps. I can't easily use tsdbfs for what I'm working > on since it needs to be run as unprivileged users. > > Perhaps I'm not asking the right question. I'm wondering how the file > layout api behaves if a given "block"-aligned offset in a file is made > up of sub-blocks/fragments that are not all on the same NSD. The > assumption based on how I've seen the API used so far is that all > sub-blocks within a block at a given offset within a file are all on the > same NSD. > > -Aaron > > On 11/5/17 6:01 PM, Marc A Kaplan wrote: > > I googled GPFS_FCNTL_GET_DATABLKDISKIDX > > > > and found this discussion: > > > > ? https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 > > > > In general, GPFS files ARE deliberately "fragmented" but we don't say > > that - we say they are "striped" over many disks -- and that is > > generally a good thing for parallel performance. > > > > Also, in GPFS, if the last would-be block of a file is less than a > > block, then it is stored in a "fragment" of a block. ? > > So you see we use "fragment" to mean something different than other file > > systems you may know. > > > > --marc > > > > > > > > From: ? ? ? ? Aaron Knister > > To: ? ? ? ? gpfsug main discussion list > > Date: ? ? ? ? 11/04/2017 12:22 PM > > Subject: ? ? ? ? [gpfsug-discuss] file layout API + file fragmentation > > Sent by: ? ? ? ? gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > > > > > I've got a question about the file layout API and how it reacts in the > > case of fragmented files. > > > > I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have some code > > based on tsGetDataBlk.C. I'm basing the block size based off of what's > > returned by filemapOut.blockSize but that only seems to return a value > > > 0 when filemapIn.startOffset is 0. > > > > In a case where a file were to be made up of a significant number of > > non-contiguous fragments (which... would be awful in of itself) how > > would this be reported by the file layout API? Does the interface > > technically just report the disk location information of the first block > > of the $blockSize range and assume that it's contiguous? > > > > Thanks! 
> > > > -Aaron > > > > -- > > Aaron Knister > > NASA Center for Climate Simulation (Code 606.2) > > Goddard Space Flight Center > > (301) 286-2776 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= > > > > > > > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=WH1GLDCza1Rvd9bzdVYoz2Pdzs7l90XNnhUb40FYCqQ&s=LOkUY79m5Ow2FeKqfCqc31cfXZVmqYlvBuQRPirGOFU&e= > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Nov 6 10:01:28 2017 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 06 Nov 2017 10:01:28 +0000 Subject: [gpfsug-discuss] ACLs on AFM Filesets Message-ID: Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround: 1. Read the GPFS ACL from the remote directory => store in some file acl.txt 2. Link the AFM fileset to the local filesystem, 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time? Thanks! Luke. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Mon Nov 6 10:22:18 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Mon, 6 Nov 2017 15:52:18 +0530 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Is this problem happens only for the fileset root directory ? Could you try accessing the fileset as privileged user after the fileset link and verify if ACLs are set properly ? 
AFM reads the ACLs from home and sets in the cache automatically during the file/dir lookup. What is the Spectrum Scale version ? ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 03:32 PM Subject: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround: 1. Read the GPFS ACL from the remote directory => store in some file acl.txt 2. Link the AFM fileset to the local filesystem, 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time? Thanks! Luke._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Nov 6 12:25:43 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 6 Nov 2017 12:25:43 +0000 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almost full Message-ID: Hi Carl I don?t have any direct metrics, but we frequently run our file systems above the 80% level, run split data and metadata.I haven?t experienced any GPFS performance issues that I can attribute to high utilization. I know the documentation talks about this, and the lower values of blocks and sub-blocks will make the file system work harder, but so far I haven?t seen any issues. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Carl Reply-To: gpfsug main discussion list Date: Sunday, November 5, 2017 at 9:36 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] Performance of GPFS when filesystem is almost full Hi Folk, Does anyone have much experience with the performance of GPFS as it becomes close to full. In particular I am referring to split data/meta data, where the data pool goes over 80% utilisation. How much degradation do you see above 80% usage, 90% usage? Cheers, Carl. -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Nov 6 12:31:30 2017 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 06 Nov 2017 12:31:30 +0000 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Hi Venkat, This is only for the fileset root. All other files and directories pull the correct ACLs as expected when accessing the fileset as root user, or after setting the correct (missing) ACL on the fileset root. Multiple SS versions from around 4.1 to present. Thanks! Luke. On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, wrote: > Is this problem happens only for the fileset root directory ? Could you > try accessing the fileset as privileged user after the fileset link and > verify if ACLs are set properly ? 
AFM reads the ACLs from home and sets in > the cache automatically during the file/dir lookup. What is the Spectrum > Scale version ? > > ~Venkat (vpuvvada at in.ibm.com) > > > > From: Luke Raimbach > To: gpfsug main discussion list > Date: 11/06/2017 03:32 PM > Subject: [gpfsug-discuss] ACLs on AFM Filesets > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Dear SpectrumScale Experts, > > > > When creating an IW cache view of a directory in a remote GPFS filesystem, > I prepare the AFM "home" directory using 'mmafmconfig enable ' > command. > > I wish the cache fileset junction point to inherit the ACL for the home > directory when I link it to the filesystem. > > Currently I'm using a flimsy workaround: > > 1. Read the GPFS ACL from the remote directory => store in some file > acl.txt > > 2. Link the AFM fileset to the local filesystem, > > 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i > acl.txt > > Is there a way for the local cache fileset to automatically inherit/clone > the remote directory's ACL, e.g. at mmlinkfileset time? > > > > Thanks! > > Luke._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Mon Nov 6 13:39:20 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 6 Nov 2017 08:39:20 -0500 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: Aaron, brilliant! Your example is close to the worst case, where every file is 512K+1 bytes and the blocksize is 1024K. Yes, in the worse case 49.99999% of space is "lost" or wasted. Don't do that! One can construct such a worst case for any system that allocates by blocks or sectors or whatever you want to call it. Just fill the system with files that are each 0.5*Block_Size+1 bytes and argue that 1/2 the space is wasted. From: Aaron Knister To: Date: 11/06/2017 12:10 AM Subject: Re: [gpfsug-discuss] file layout API + file fragmentation Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks, Frank! That's truly fascinating and has some interesting implications that I hadn't thought of before. I just ran a test on an ~8G fs with a block size of 1M: for i in `seq 1 100000`; do dd if=/dev/zero of=foofile${i} bs=520K count=1 done The fs is "full" according to df/mmdf but there's 3.6G left in subblocks but yeah, I can't allocate any new files that wouldn't fit into the inode and I can't seem to allocate any new subblocks to existing files (e.g. append). What's interesting is if I do the same exercise but with a file size of 30K or even 260K I don't seem to run into the same issue. I'm not sure I understand that yet. I was curious about what this meant in the case of appending to a file where the last offset in the file was allocated to a fragment. 
By looking at "tsdbfs listda" and appending to a file I could see that the last DA would change (presumably to point to the DA of the start of a contiguous subblock) once the amount of data appended caused the file size to exceed the space available in the trailing subblocks. -Aaron On 11/5/17 7:57 PM, Frank Schmuck wrote: > In GPFS blocks within a file are never fragmented. For example, if you > have a file of size 7.3 MB and your file system block size is 1MB, then > this file will be made up of 7 full blocks and one fragment of size 320k > (10 subblocks). Each of the 7 full blocks will be contiguous on a singe > diks (LUN) behind a single NSD server. The fragment that makes up the > last part of the file will also be contiguous on a single disk, just > shorter than a full block. > > Frank Schmuck > IBM Almaden Research Center > > > > ----- Original message ----- > From: Aaron Knister > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: > Cc: > Subject: Re: [gpfsug-discuss] file layout API + file fragmentation > Date: Sun, Nov 5, 2017 3:39 PM > > Thanks Marc, that helps. I can't easily use tsdbfs for what I'm working > on since it needs to be run as unprivileged users. > > Perhaps I'm not asking the right question. I'm wondering how the file > layout api behaves if a given "block"-aligned offset in a file is made > up of sub-blocks/fragments that are not all on the same NSD. The > assumption based on how I've seen the API used so far is that all > sub-blocks within a block at a given offset within a file are all on the > same NSD. > > -Aaron > > On 11/5/17 6:01 PM, Marc A Kaplan wrote: > > I googled GPFS_FCNTL_GET_DATABLKDISKIDX > > > > and found this discussion: > > > > > ? https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 > > > > In general, GPFS files ARE deliberately "fragmented" but we don't say > > that - we say they are "striped" over many disks -- and that is > > generally a good thing for parallel performance. > > > > Also, in GPFS, if the last would-be block of a file is less than a > > block, then it is stored in a "fragment" of a block. ? > > So you see we use "fragment" to mean something different than > other file > > systems you may know. > > > > --marc > > > > > > > > From: ? ? ? ? Aaron Knister > > To: ? ? ? ? gpfsug main discussion list > > > Date: ? ? ? ? 11/04/2017 12:22 PM > > Subject: ? ? ? ? [gpfsug-discuss] file layout API + file > fragmentation > > Sent by: ? ? ? ? gpfsug-discuss-bounces at spectrumscale.org > > > ------------------------------------------------------------------------ > > > > > > > > I've got a question about the file layout API and how it reacts in the > > case of fragmented files. > > > > I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have > some code > > based on tsGetDataBlk.C. I'm basing the block size based off of what's > > returned by filemapOut.blockSize but that only seems to return a > value > > > 0 when filemapIn.startOffset is 0. > > > > In a case where a file were to be made up of a significant number of > > non-contiguous fragments (which... would be awful in of itself) how > > would this be reported by the file layout API? Does the interface > > technically just report the disk location information of the first > block > > of the $blockSize range and assume that it's contiguous? > > > > Thanks! 
> > > > -Aaron > > > > -- > > Aaron Knister > > NASA Center for Climate Simulation (Code 606.2) > > Goddard Space Flight Center > > (301) 286-2776 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= > > > > > > > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=_xM9xVsqOuNiCqn3ikx6ZaaIHChTPhz_8iDmEKoteX4&s=uy462L5sxX_3Mm3Dh824ptJIxtah9LVRPMmyKz1lAdg&e= > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=_xM9xVsqOuNiCqn3ikx6ZaaIHChTPhz_8iDmEKoteX4&s=uy462L5sxX_3Mm3Dh824ptJIxtah9LVRPMmyKz1lAdg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Nov 6 14:16:34 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 6 Nov 2017 14:16:34 +0000 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Message-ID: We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? 
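For reference, the registration we have in mind is something along these
lines (an untested sketch; the callback name and script path are made up):

# Sketch only: notify when a fileset trips its soft inode quota.
mmaddcallback filesetInodeWarn \
  --command /usr/local/sbin/fileset_inode_warn.sh \
  --event filesetLimitExceeded \
  --parms "%inodeUsage %inodeQuota"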
Simon From peter.smith at framestore.com Mon Nov 6 14:16:42 2017 From: peter.smith at framestore.com (Peter Smith) Date: Mon, 6 Nov 2017 14:16:42 +0000 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almost full In-Reply-To: References: Message-ID: Hi Carl. When we commissioned our system we ran an NFS stress tool, and filled the system to the top. No performance degradation was seen until it was 99.7% full. I believe that after this point it takes longer to find free blocks to write to. YMMV. On 6 November 2017 at 03:35, Carl wrote: > Hi Folk, > > Does anyone have much experience with the performance of GPFS as it > becomes close to full. In particular I am referring to split data/meta > data, where the data pool goes over 80% utilisation. > > How much degradation do you see above 80% usage, 90% usage? > > Cheers, > > Carl. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- [image: Framestore] Peter Smith ? Senior Systems Engineer London ? New York ? Los Angeles ? Chicago ? Montr?al T +44 (0)20 7344 8000 ? M +44 (0)7816 123009 <+44%20%280%297816%20123009> 19-23 Wells Street, London W1T 3PQ Twitter ? Facebook ? framestore.com [image: https://www.framestore.com/] -------------- next part -------------- An HTML attachment was scrubbed... URL: From Achim.Rehor at de.ibm.com Mon Nov 6 16:18:39 2017 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Mon, 6 Nov 2017 11:18:39 -0500 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almostfull In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 7182 bytes Desc: not available URL: From robbyb at us.ibm.com Mon Nov 6 18:02:14 2017 From: robbyb at us.ibm.com (Rob Basham) Date: Mon, 6 Nov 2017 18:02:14 +0000 Subject: [gpfsug-discuss] Fw: Introduction/Question Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15099587293244.png Type: image/png Size: 481 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15099587293245.png Type: image/png Size: 2741 bytes Desc: not available URL: From ewahl at osc.edu Mon Nov 6 19:43:28 2017 From: ewahl at osc.edu (Edward Wahl) Date: Mon, 6 Nov 2017 14:43:28 -0500 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: References: Message-ID: <20171106144328.58a233f2@osc.edu> On Mon, 6 Nov 2017 09:20:11 +0000 "Chase, Peter" wrote: > how can I automate sending files to a cloud object store as they arrive in > GPFS and keep a copy of the file in GPFS? Sounds like you already have an idea how to do this by using ILM policies. Either quota based as you mention or 'placement' policies should work, though I cannot speak to placement in an S3 environment, the policy engine has a way to call external commands for that if necessary. Though if you create an external pool, a placement policy may be much simpler and possibly faster as well as data would be sent to S3 on write, rather than on a quota trigger. If an external storage pool works properly for S3, I'd probably use a placement policy myself. This also would depend on how/when I needed the data on S3 and your mention of timeliness tells me placement rather than quota may be best. 
Weighing the solutions for this may be better tested(and timed!) than anything. EVERYONE wants a timely weather forecast. ^_- Ed -- Ed Wahl Ohio Supercomputer Center 614-292-9302 From scale at us.ibm.com Mon Nov 6 19:51:40 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Mon, 6 Nov 2017 14:51:40 -0500 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" To: "gpfsug-discuss at spectrumscale.org" Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From daniel.kidger at uk.ibm.com Mon Nov 6 20:48:45 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Mon, 6 Nov 2017 20:48:45 +0000 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From fschmuck at us.ibm.com Mon Nov 6 20:59:02 2017 From: fschmuck at us.ibm.com (Frank Schmuck) Date: Mon, 6 Nov 2017 20:59:02 +0000 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Nov 6 20:59:32 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 6 Nov 2017 20:59:32 +0000 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: Thanks Eric, One other question, when it says it must run on a manager node, I'm assuming that means a manager node in a storage cluster (we multi-cluster clients clusters in). Thanks Simon From: Eric Agar > on behalf of "scale at us.ibm.com" > Date: Monday, 6 November 2017 at 19:51 To: "gpfsug-discuss at spectrumscale.org" >, Simon Thompson > Cc: IBM Spectrum Scale > Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" > To: "gpfsug-discuss at spectrumscale.org" > Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. 
In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Mon Nov 6 21:09:18 2017 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 6 Nov 2017 21:09:18 +0000 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: <7f4c1bf980514e39b2691b15f9b35083@jumptrading.com> Hi Simon, It will only trigger the callback on the currently appointed File System Manager, so you need to make sure your callback scripts are installed on all nodes that can occupy this role. HTH, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Monday, November 06, 2017 3:00 PM To: scale at us.ibm.com; gpfsug main discussion list Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Note: External Email ________________________________ Thanks Eric, One other question, when it says it must run on a manager node, I'm assuming that means a manager node in a storage cluster (we multi-cluster clients clusters in). Thanks Simon From: Eric Agar > on behalf of "scale at us.ibm.com" > Date: Monday, 6 November 2017 at 19:51 To: "gpfsug-discuss at spectrumscale.org" >, Simon Thompson > Cc: IBM Spectrum Scale > Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. 
Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" > To: "gpfsug-discuss at spectrumscale.org" > Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Mon Nov 6 22:18:12 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Mon, 6 Nov 2017 17:18:12 -0500 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: Right, Bryan. To expand on that a bit, I'll make two additional points. 
(1) Only a node in the cluster that owns the file system can be appointed a file system manager for the file system. Nodes that remote mount the file system from other clusters cannot be appointed the file system manager of the remote file system. (2) A node need not have the manager designation (as seen in mmlscluster output) to become a file system manager; nodes with the manager designation are preferred, but one could use mmchmgr to assign the role to a non-manager node (for instance). Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Bryan Banister To: gpfsug main discussion list , "scale at us.ibm.com" Date: 11/06/2017 04:09 PM Subject: RE: [gpfsug-discuss] Callbacks / softQuotaExceeded Hi Simon, It will only trigger the callback on the currently appointed File System Manager, so you need to make sure your callback scripts are installed on all nodes that can occupy this role. HTH, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Monday, November 06, 2017 3:00 PM To: scale at us.ibm.com; gpfsug main discussion list Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Note: External Email Thanks Eric, One other question, when it says it must run on a manager node, I'm assuming that means a manager node in a storage cluster (we multi-cluster clients clusters in). Thanks Simon From: Eric Agar on behalf of "scale at us.ibm.com" < scale at us.ibm.com> Date: Monday, 6 November 2017 at 19:51 To: "gpfsug-discuss at spectrumscale.org" , Simon Thompson Cc: IBM Spectrum Scale Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. 
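To make the pieces above concrete, a hedged sketch of registering such a callback follows. The event and variable names are the ones confirmed in this thread; the script path is hypothetical, and the exact set of variables available to softQuotaExceeded should be checked against the mmaddcallback table linked earlier in the thread before relying on it.

#!/bin/bash
# Sketch only: register an async callback for soft quota events.  The script
# /usr/local/sbin/quota_alert.sh is a placeholder and, per the notes above,
# must exist on every node that could be appointed file system manager.
mmaddcallback filesetQuotaAlert \
    --command /usr/local/sbin/quota_alert.sh \
    --event softQuotaExceeded \
    --async \
    --parms "%eventName %fsName %quotaType %filesUsage %filesQuota %filesLimit"

The callback script then only has to read its positional arguments (event, file system, quota type, files used/quota/limit, in the order given above) and alert when the quota type is FILESET.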
Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" < S.J.Thompson at bham.ac.uk> To: "gpfsug-discuss at spectrumscale.org" < gpfsug-discuss at spectrumscale.org> Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Mon Nov 6 23:49:39 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 6 Nov 2017 18:49:39 -0500 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: References: , Message-ID: Placement policy rules "SET POOL 'xyz'... " may only name GPFS data pools. NOT "EXTERNAL POOLs" -- EXTERNAL POOL is a concept only supported by MIGRATE rules. 
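For reference, a minimal sketch of the shape being described: the external pool is declared with EXEC and is only ever the target of a MIGRATE rule, while a SET POOL placement rule (installed with mmchpolicy) can only name an internal data pool. The pool names, script path and WHERE clause below are illustrative and untested; the xattr test is just one possible way of not re-pushing the same files on every run.

#!/bin/bash
# Sketch only: write an illustrative policy and apply it.
cat > /tmp/s3-migrate.pol <<'EOF'
RULE EXTERNAL POOL 's3' EXEC '/usr/local/sbin/mmpolicy-s3.sh' OPTS '-v'

/* copy (pre-migrate) files that have not been pushed yet; the interface
   script decides what 'migrate' means for S3 (e.g. copy and then tag) */
RULE 'premigrate' MIGRATE FROM POOL 'system' TO POOL 's3'
     WHERE XATTR('user.s3.copied') IS NULL
EOF

mmapplypolicy /gpfs/fs1 -P /tmp/s3-migrate.pol -I yes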
However you may be interested in "mmcloudgateway" & co, which is all about combining GPFS with Cloud storage. AKA IBM Transparent Cloud Tiering https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Transparent%20Cloud%20Tiering -------------- next part -------------- An HTML attachment was scrubbed... URL: From mutantllama at gmail.com Tue Nov 7 00:12:11 2017 From: mutantllama at gmail.com (Carl) Date: Tue, 7 Nov 2017 11:12:11 +1100 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almostfull In-Reply-To: References: Message-ID: Thanks to all for the information. Im happy to say that it is close to what I hoped would be the case. Interesting to see the effect of the -n value. Reinforces the need to think about it and not go with the defaults. Thanks again, Carl. On 7 November 2017 at 03:18, Achim Rehor wrote: > I have no practical experience on these numbers, however, Peters > experience below is matching what i learned from Dan years ago. > > As long as the -n setting of the FS (the number of nodes potentially > mounting the fs) is more or less matching the actual number of mounts, > this 99.x % before degradation is expected. If you are far off with that > -n estimate, like having it set to 32, but the actual number of mounts is > in the thousands, > then degradation happens earlier, since the distribution of free blocks in > the allocation maps is not matching the actual setup as good as it could > be. > > Naturally, this depends also on how you do filling of the FS. If it is > only a small percentage of the nodes, doing the creates, then the > distribution can > be 'wrong' as well, and single nodes run earlier out of allocation map > space, and need to look for free blocks elsewhere, costing RPC cycles and > thus performance. > > Putting this in numbers seems quite difficult ;) > > > Mit freundlichen Gr??en / Kind regards > > *Achim Rehor* > > ------------------------------ > > Software Technical Support Specialist AIX/ Emea HPC Support > IBM Certified Advanced Technical Expert - Power Systems with AIX > TSCC Software Service, Dept. 7922 > Global Technology Services > > ------------------------------ > Phone: +49-7034-274-7862 <+49%207034%202747862> IBM Deutschland > E-Mail: Achim.Rehor at de.ibm.com Am Weiher 24 > 65451 Kelsterbach > Germany > > > > ------------------------------ > > IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter > Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Reinhard Reschke, > Dieter Scholz, Gregor Pillen, Ivo Koerner, Christian Noll > Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, > HRB 14562 WEEE-Reg.-Nr. DE 99369940 > > > > > > From: Peter Smith > To: gpfsug main discussion list > Date: 11/06/2017 09:17 AM > Subject: Re: [gpfsug-discuss] Performance of GPFS when filesystem > is almost full > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi Carl. > > When we commissioned our system we ran an NFS stress tool, and filled the > system to the top. > > No performance degradation was seen until it was 99.7% full. > > I believe that after this point it takes longer to find free blocks to > write to. > > YMMV. > > On 6 November 2017 at 03:35, Carl <*mutantllama at gmail.com* > > wrote: > Hi Folk, > > Does anyone have much experience with the performance of GPFS as it > becomes close to full. 
In particular I am referring to split data/meta > data, where the data pool goes over 80% utilisation. > > How much degradation do you see above 80% usage, 90% usage? > > Cheers, > > Carl. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > > > -- > *Peter Smith* ? Senior Systems Engineer > *London* ? New York ? Los Angeles ? Chicago ? Montr?al > T +44 (0)20 7344 8000 <+44%2020%207344%208000> ? M +44 (0)7816 123009 > <+44%20%280%297816%20123009> > *19-23 Wells Street, London W1T 3PQ* > > Twitter ? Facebook > ? framestore.com > > ______________________________ > _________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 7182 bytes Desc: not available URL: From vpuvvada at in.ibm.com Tue Nov 7 07:45:37 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 7 Nov 2017 13:15:37 +0530 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Luke, This issue has been fixed. As a workaround you could you also try resetting the same ACLs at home (instead of cache) or change directory ctime at home and verify that ACLs are updated correctly on fileset root. You can contact customer support or open a PMR and request efix. ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 06:01 PM Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Venkat, This is only for the fileset root. All other files and directories pull the correct ACLs as expected when accessing the fileset as root user, or after setting the correct (missing) ACL on the fileset root. Multiple SS versions from around 4.1 to present. Thanks! Luke. On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, wrote: Is this problem happens only for the fileset root directory ? Could you try accessing the fileset as privileged user after the fileset link and verify if ACLs are set properly ? AFM reads the ACLs from home and sets in the cache automatically during the file/dir lookup. What is the Spectrum Scale version ? ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 03:32 PM Subject: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround: 1. Read the GPFS ACL from the remote directory => store in some file acl.txt 2. Link the AFM fileset to the local filesystem, 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time? Thanks! 
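Purely for reference, the manual workaround described above written out as a sketch (device, fileset and path names are placeholders):

#!/bin/bash
# Sketch only: copy the home directory's ACL onto the cache junction by hand
# until the fileset root picks it up automatically.
mmgetacl -o /tmp/projA_acl.txt /remote/gpfs/projects/projA      # 1. read ACL at home
mmlinkfileset cachefs projA_cache -J /gpfs/cachefs/projA        # 2. link the AFM fileset
mmputacl -i /tmp/projA_acl.txt /gpfs/cachefs/projA              # 3. apply the saved ACL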
Luke._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=DkfRGRFLq0tUIu2HH7jpjSmG3Uwh3U1dpU1pqQCcCEc&s=jjWH6js9EaYogD2z76C7uDwY94_2yiavn0fmd7iilKQ&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Tue Nov 7 07:57:46 2017 From: john.hearns at asml.com (John Hearns) Date: Tue, 7 Nov 2017 07:57:46 +0000 Subject: [gpfsug-discuss] Spectrum Scale with NVMe Message-ID: I am looking for anyone with experience of using Spectrum Scale with nvme devices. I could use an offline brain dump... The specific issue I have is with the nsd device discovery and the naming. Before anyone replies, I am gettign excellent support from IBM and have been directed to the correct documentation. I am just looking for any wrinkles or tips that anyone has. Thanks -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at spectrumscale.org Tue Nov 7 09:18:52 2017 From: chair at spectrumscale.org (Spectrum Scale UG Chair (Simon Thompson)) Date: Tue, 07 Nov 2017 09:18:52 +0000 Subject: [gpfsug-discuss] SSUG CIUK Call for Speakers Message-ID: The last Spectrum Scale user group meeting of the year will be taking place as part of the Computing Insights UK (CIUK) event in December. We are currently looking for user speakers to talk about their Spectrum Scale implementation. It doesn't have to be a huge deployment, even just a small couple of nodes cluster, we'd love to hear how you are using Scale and about any challenges and successes you've had with it. If you are interested in speaking, you must be registered to attend CIUK and the user group will be taking place on Tuesday 12th December in the afternoon. 
More details on CIUK and registration at: http://www.stfc.ac.uk/news-events-and-publications/events/general-interest- events/computing-insight-uk/ If you would like to speak, please drop me an email and we can find a slot. Simon From daniel.kidger at uk.ibm.com Tue Nov 7 09:19:24 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Tue, 7 Nov 2017 09:19:24 +0000 Subject: [gpfsug-discuss] Performance of GPFS when filesystem isalmostfull In-Reply-To: Message-ID: I understand that this near linear performance is one of the differentiators of Spectrum Scale. Others with more field experience than me might want to comment on how Lustre and other distributed filesystem perform as they approaches near full capacity. Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales + 44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 7 Nov 2017, at 00:12, Carl wrote: > > Thanks to all for the information. > > Im happy to say that it is close to what I hoped would be the case. > > Interesting to see the effect of the -n value. Reinforces the need to think about it and not go with the defaults. > > Thanks again, > > Carl. > > >> On 7 November 2017 at 03:18, Achim Rehor wrote: >> I have no practical experience on these numbers, however, Peters experience below is matching what i learned from Dan years ago. >> >> As long as the -n setting of the FS (the number of nodes potentially mounting the fs) is more or less matching the actual number of mounts, >> this 99.x % before degradation is expected. If you are far off with that -n estimate, like having it set to 32, but the actual number of mounts is in the thousands, >> then degradation happens earlier, since the distribution of free blocks in the allocation maps is not matching the actual setup as good as it could be. >> >> Naturally, this depends also on how you do filling of the FS. If it is only a small percentage of the nodes, doing the creates, then the distribution can >> be 'wrong' as well, and single nodes run earlier out of allocation map space, and need to look for free blocks elsewhere, costing RPC cycles and thus performance. >> >> Putting this in numbers seems quite difficult ;) >> >> >> Mit freundlichen Gr??en / Kind regards >> Achim Rehor >> >> >> Software Technical Support Specialist AIX/ Emea HPC Support >> <_1_D95FF418D95FEE980059980B852581D0.gif> >> IBM Certified Advanced Technical Expert - Power Systems with AIX >> TSCC Software Service, Dept. 7922 >> Global Technology Services >> Phone: +49-7034-274-7862 IBM Deutschland >> E-Mail: Achim.Rehor at de.ibm.com Am Weiher 24 >> 65451 Kelsterbach >> Germany >> >> >> IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter >> Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Reinhard Reschke, Dieter Scholz, Gregor Pillen, Ivo Koerner, Christian Noll >> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 WEEE-Reg.-Nr. DE 99369940 >> >> >> >> >> >> From: Peter Smith >> To: gpfsug main discussion list >> Date: 11/06/2017 09:17 AM >> Subject: Re: [gpfsug-discuss] Performance of GPFS when filesystem is almost full >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> >> >> >> Hi Carl. >> >> When we commissioned our system we ran an NFS stress tool, and filled the system to the top. >> >> No performance degradation was seen until it was 99.7% full. >> >> I believe that after this point it takes longer to find free blocks to write to. >> >> YMMV. 
>> >> On 6 November 2017 at 03:35, Carl wrote: >> Hi Folk, >> >> Does anyone have much experience with the performance of GPFS as it becomes close to full. In particular I am referring to split data/meta data, where the data pool goes over 80% utilisation. >> >> How much degradation do you see above 80% usage, 90% usage? >> >> Cheers, >> >> Carl. >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> -- >> Peter Smith ? Senior Systems Engineer >> London ? New York ? Los Angeles ? Chicago ? Montr?al >> T +44 (0)20 7344 8000 ? M +44 (0)7816 123009 >> 19-23 Wells Street, London W1T 3PQ >> Twitter? Facebook? framestore.com >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckerner at illinois.edu Tue Nov 7 13:04:41 2017 From: ckerner at illinois.edu (Chad Kerner) Date: Tue, 7 Nov 2017 07:04:41 -0600 Subject: [gpfsug-discuss] Spectrum Scale with NVMe In-Reply-To: <64b6afd8efb34551a319b5d6e311bbfb@CITESHT4.ad.uillinois.edu> References: <64b6afd8efb34551a319b5d6e311bbfb@CITESHT4.ad.uillinois.edu> Message-ID: Hey John, Once you get /var/mmfs/etc/nsddevices set up, it is all straight forward. We have seen times on reboot where the devices were not ready before gpfs started and the file system started with those disks in an offline state. But, that was just a timing issue with the startup. Chad -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign On 11/7/17, John Hearns wrote: > I am looking for anyone with experience of using Spectrum Scale with nvme > devices. > > I could use an offline brain dump... > > > The specific issue I have is with the nsd device discovery and the naming. > > Before anyone replies, I am gettign excellent support from IBM and have been > directed to the correct documentation. > > I am just looking for any wrinkles or tips that anyone has. > > > Thanks > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. 
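For anyone tripping over the same discovery/naming issue, a hedged sketch of the kind of /var/mmfs/etc/nsddevices user exit being referred to is below. The device pattern and the exit-code convention should be checked against the shipped sample (/usr/lpp/mmfs/samples/nsddevices.sample) before deploying; this is not a drop-in script.

#!/bin/bash
# Sketch only: make NVMe namespaces visible to GPFS device discovery.
# Output format is "<device> <disk type>", with the device name relative to /dev.
for dev in /dev/nvme*n1; do
    [ -b "$dev" ] && echo "$(basename "$dev") generic"
done
# The exit code controls whether GPFS also runs its built-in discovery;
# follow the convention documented in the sample script mentioned above.
exit 0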
Neither the sender nor the > company/group of companies he or she represents shall be liable for the > proper and complete transmission of the information contained in this > communication, or for any delay in its receipt. > -- -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign From luke.raimbach at googlemail.com Tue Nov 7 16:24:56 2017 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Tue, 07 Nov 2017 16:24:56 +0000 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Hello Venkat, Thanks for the information. When was the issue fixed? I tried this on the most recent 4.2.3.5 release and was still experiencing the same behaviour. Cheers, Luke. On Tue, 7 Nov 2017 at 08:45 Venkateswara R Puvvada wrote: > Luke, > > This issue has been fixed. As a workaround you could you also try > resetting the same ACLs at home (instead of cache) or change directory > ctime at home and verify that ACLs are updated correctly on fileset root. > You can contact customer support or open a PMR and request efix. > > ~Venkat (vpuvvada at in.ibm.com) > > > > From: Luke Raimbach > To: gpfsug main discussion list > Date: 11/06/2017 06:01 PM > Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi Venkat, > > This is only for the fileset root. All other files and directories pull > the correct ACLs as expected when accessing the fileset as root user, or > after setting the correct (missing) ACL on the fileset root. > > Multiple SS versions from around 4.1 to present. > > Thanks! > Luke. > > > On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, <*vpuvvada at in.ibm.com* > > wrote: > > Is this problem happens only for the fileset root directory ? Could you > try accessing the fileset as privileged user after the fileset link and > verify if ACLs are set properly ? AFM reads the ACLs from home and sets in > the cache automatically during the file/dir lookup. What is the Spectrum > Scale version ? > > ~Venkat (*vpuvvada at in.ibm.com* ) > > > > From: Luke Raimbach <*luke.raimbach at googlemail.com* > > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 11/06/2017 03:32 PM > Subject: [gpfsug-discuss] ACLs on AFM Filesets > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > Dear SpectrumScale Experts, > > > When creating an IW cache view of a directory in a remote GPFS filesystem, > I prepare the AFM "home" directory using 'mmafmconfig enable ' > command. > > I wish the cache fileset junction point to inherit the ACL for the home > directory when I link it to the filesystem. > > Currently I'm using a flimsy workaround: > > 1. Read the GPFS ACL from the remote directory => store in some file > acl.txt > > 2. Link the AFM fileset to the local filesystem, > > 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i > acl.txt > > Is there a way for the local cache fileset to automatically inherit/clone > the remote directory's ACL, e.g. at mmlinkfileset time? > > > > Thanks! 
> > Luke._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e=* > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > _______________________________________________ > > > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=DkfRGRFLq0tUIu2HH7jpjSmG3Uwh3U1dpU1pqQCcCEc&s=jjWH6js9EaYogD2z76C7uDwY94_2yiavn0fmd7iilKQ&e= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex at calicolabs.com Tue Nov 7 17:50:54 2017 From: alex at calicolabs.com (Alex Chekholko) Date: Tue, 7 Nov 2017 09:50:54 -0800 Subject: [gpfsug-discuss] Performance of GPFS when filesystem isalmostfull In-Reply-To: References: Message-ID: One of the parameters that you need to choose at filesystem creation time is the block allocation type. -j {cluster|scatter} parameter to mmcrfs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1ins_blkalmap.htm#ballmap If you use "cluster", you will have quite high performance when the filesystem is close to empty. If you use "scatter", the performance will stay the same no matter the filesystem utilization because blocks for a given file will always be scattered randomly. Some vendors set up their GPFS filesystem using '-j cluster' and then show off their streaming write performance numbers. But the performance degrades considerably as the filesystem fills up. With "scatter", the filesystem performance is slower but stays consistent throughout its lifetime. On Tue, Nov 7, 2017 at 1:19 AM, Daniel Kidger wrote: > I understand that this near linear performance is one of the > differentiators of Spectrum Scale. > Others with more field experience than me might want to comment on how > Lustre and other distributed filesystem perform as they approaches near > full capacity. > > Daniel > [image: /spectrum_storage-banne] > > > [image: Spectrum Scale Logo] > > > *Dr Daniel Kidger* > IBM Technical Sales Specialist > Software Defined Solution Sales > > + <+%2044-7818%20522%20266> 44-(0)7818 522 266 <+%2044-7818%20522%20266> > daniel.kidger at uk.ibm.com > > On 7 Nov 2017, at 00:12, Carl wrote: > > Thanks to all for the information. > > Im happy to say that it is close to what I hoped would be the case. > > Interesting to see the effect of the -n value. Reinforces the need to > think about it and not go with the defaults. > > Thanks again, > > Carl. > > > On 7 November 2017 at 03:18, Achim Rehor wrote: > >> I have no practical experience on these numbers, however, Peters >> experience below is matching what i learned from Dan years ago. 
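To make the two knobs in this thread concrete (block allocation type and the -n estimate), a small sketch; the file system name and values are examples only.

# Inspect how an existing file system was created:
mmlsfs gpfs01 -j -n

# At creation time, choose them deliberately rather than taking the defaults.
# 'scatter' trades peak streaming speed for performance that stays flat as the
# file system fills; -n should be a realistic count of nodes that will mount it.
mmcrfs gpfs01 -F nsd.stanza -B 1M -j scatter -n 512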
>> >> As long as the -n setting of the FS (the number of nodes potentially >> mounting the fs) is more or less matching the actual number of mounts, >> this 99.x % before degradation is expected. If you are far off with that >> -n estimate, like having it set to 32, but the actual number of mounts is >> in the thousands, >> then degradation happens earlier, since the distribution of free blocks >> in the allocation maps is not matching the actual setup as good as it could >> be. >> >> Naturally, this depends also on how you do filling of the FS. If it is >> only a small percentage of the nodes, doing the creates, then the >> distribution can >> be 'wrong' as well, and single nodes run earlier out of allocation map >> space, and need to look for free blocks elsewhere, costing RPC cycles and >> thus performance. >> >> Putting this in numbers seems quite difficult ;) >> >> >> Mit freundlichen Gr??en / Kind regards >> >> *Achim Rehor* >> >> ------------------------------ >> >> Software Technical Support Specialist AIX/ Emea HPC Support >> <_1_D95FF418D95FEE980059980B852581D0.gif> >> IBM Certified Advanced Technical Expert - Power Systems with AIX >> TSCC Software Service, Dept. 7922 >> Global Technology Services >> >> ------------------------------ >> Phone: +49-7034-274-7862 <+49%207034%202747862> IBM Deutschland >> E-Mail: Achim.Rehor at de.ibm.com Am Weiher 24 >> 65451 Kelsterbach >> Germany >> >> >> >> ------------------------------ >> >> IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter >> Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Reinhard Reschke, >> Dieter Scholz, Gregor Pillen, Ivo Koerner, Christian Noll >> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, >> HRB 14562 WEEE-Reg.-Nr. DE 99369940 >> >> >> >> >> >> From: Peter Smith >> To: gpfsug main discussion list >> Date: 11/06/2017 09:17 AM >> Subject: Re: [gpfsug-discuss] Performance of GPFS when filesystem >> is almost full >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> ------------------------------ >> >> >> >> Hi Carl. >> >> When we commissioned our system we ran an NFS stress tool, and filled the >> system to the top. >> >> No performance degradation was seen until it was 99.7% full. >> >> I believe that after this point it takes longer to find free blocks to >> write to. >> >> YMMV. >> >> On 6 November 2017 at 03:35, Carl <*mutantllama at gmail.com* >> > wrote: >> Hi Folk, >> >> Does anyone have much experience with the performance of GPFS as it >> becomes close to full. In particular I am referring to split data/meta >> data, where the data pool goes over 80% utilisation. >> >> How much degradation do you see above 80% usage, 90% usage? >> >> Cheers, >> >> Carl. >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* >> >> >> >> >> >> -- >> *Peter Smith* ? Senior Systems Engineer >> *London* ? New York ? Los Angeles ? Chicago ? Montr?al >> T +44 (0)20 7344 8000 <+44%2020%207344%208000> ? M +44 (0)7816 123009 >> <+44%20%280%297816%20123009> >> *19-23 Wells Street, London W1T 3PQ* >> >> Twitter >> ? >> Facebook >> ? 
>> framestore.com >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Wed Nov 8 05:16:02 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Wed, 8 Nov 2017 10:46:02 +0530 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Luke, There are two issues here. ACLs are not updated on fileset root and other one is that ACLs get updated only when the files/dirs are accessed as root user. Fix for the later one is already part of 4.2.3.5. First issue was fixed after your email, you could request efix on top of 4.2.3.5. First issue will get corrected automatically when ctime is changed on target path at home. ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/07/2017 09:55 PM Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello Venkat, Thanks for the information. When was the issue fixed? I tried this on the most recent 4.2.3.5 release and was still experiencing the same behaviour. Cheers, Luke. On Tue, 7 Nov 2017 at 08:45 Venkateswara R Puvvada wrote: Luke, This issue has been fixed. As a workaround you could you also try resetting the same ACLs at home (instead of cache) or change directory ctime at home and verify that ACLs are updated correctly on fileset root. You can contact customer support or open a PMR and request efix. ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 06:01 PM Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Venkat, This is only for the fileset root. All other files and directories pull the correct ACLs as expected when accessing the fileset as root user, or after setting the correct (missing) ACL on the fileset root. Multiple SS versions from around 4.1 to present. Thanks! Luke. On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, wrote: Is this problem happens only for the fileset root directory ? Could you try accessing the fileset as privileged user after the fileset link and verify if ACLs are set properly ? AFM reads the ACLs from home and sets in the cache automatically during the file/dir lookup. What is the Spectrum Scale version ? ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 03:32 PM Subject: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. 
I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround: 1. Read the GPFS ACL from the remote directory => store in some file acl.txt 2. Link the AFM fileset to the local filesystem, 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time? Thanks! Luke._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=DkfRGRFLq0tUIu2HH7jpjSmG3Uwh3U1dpU1pqQCcCEc&s=jjWH6js9EaYogD2z76C7uDwY94_2yiavn0fmd7iilKQ&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=cbhhdq1uD9_Nmxeh3mRCS0Ic8vc_ts_4uvqXce4DdVc&s=WdJzTgnFn-ApJUW579JhxBPfnVqJ2L3z4x2AJybiVto&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.chase at metoffice.gov.uk Wed Nov 8 15:50:52 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Wed, 8 Nov 2017 15:50:52 +0000 Subject: [gpfsug-discuss] Default placement/External Pool Message-ID: Hello! A follow up to my previous question about automatically sending files to Amazon s3 as they arrive in GPFS. I have created an interface script to manage Amazon s3 storage as an external pool, I have created a migration policy that pre-migrates all files to the external pool and I have set that as the default policy for the file system. All good so far, but the problem I'm now facing is: Only some of the cluster nodes have access to Amazon due to network constraints. I read the statement "The mmapplypolicy command invokes the external pool script on all nodes in the cluster that have installed the script in its designated location."[1] and thought, 'Great! I'll only install the script on nodes that have access to Amazon' but that appears not to work for a placement policy/default policy and instead, the script runs on precisely no nodes. I assumed this happened because running the script on a non-Amazon facing node resulted in a horrible error (i.e. file not found), so I edited my script to return a non-zero response if being run on a node that isn't in my cloudNode class, then installed the script every where. But this appears to have had no effect what-so-ever. 
The only thing I can think of now is to control where a migration policy runs based on node class. But I don't know how to do that, or if it's possible, or where the documentation might be as I can't find any. Any assistance would once again be greatly appreciated. [1]=https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_impstorepool.htm Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk From Robert.Oesterlin at nuance.com Wed Nov 8 16:02:04 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 8 Nov 2017 16:02:04 +0000 Subject: [gpfsug-discuss] Default placement/External Pool Message-ID: Hi Peter mmapplypolicy has a "-N" parameter that should restrict it to a subset of nodes or node class if you define that. -N {all | mount | Node[,Node...] | NodeFile | NodeClass} Specifies the list of nodes that will run parallel instances of policy code in the GPFS home cluster. This command supports all defined node classes. The default is to run on the node where the mmapplypolicy command is running or the current value of the defaultHelperNodes parameter of the mmchconfig command. Bob Oesterlin Sr Principal Storage Engineer, Nuance ?On 11/8/17, 9:55 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Chase, Peter" wrote: The only thing I can think of now is to control where a migration policy runs based on node class. But I don't know how to do that, or if it's possible, or where the documentation might be as I can't find any. Any assistance would once again be greatly appreciated. From makaplan at us.ibm.com Wed Nov 8 19:21:19 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 8 Nov 2017 14:21:19 -0500 Subject: [gpfsug-discuss] Default placement/External Pool In-Reply-To: References: Message-ID: Peter, 1. to best exploit and integrate both Spectrum Scale and Cloud Storage, please consider: https://www.ibm.com/blogs/systems/spectrum-scale-transparent-cloud-tiering/ 2. Yes, you can use mmapplypolicy to push copies of files to an "external" system. But you'll probably need a strategy or technique to avoid redundantly pushing the "next time" you run the command... 3. Regarding mmapplypolicy nitty-gritty: you can use the -N option to say exactly which nodes you want to run the command. And regarding using ... EXTERNAL ... EXEC 'myscript' You can further restrict which nodes will act as mmapplypolicy "helpers" -- If on a particular node x, 'myscript' does not exist OR myscript TEST returns a non-zero exit code then node x will be excluded.... You will see a message like this: [I] Messages tagged with <3> are from node n3. <3> [E:73] Error on system(/ghome/makaplan/policies/mynodes.sh TEST '/foo/bar5' 2>&1) <3> [W] EXEC '/ghome/makaplan/policies/mynodes.sh' of EXTERNAL POOL or LIST 'x' fails TEST with code 73 on this node. OR [I] Messages tagged with <5> are from node n4. <5> sh: /tmp/mynodes.sh: No such file or directory <5> [E:127] Error on system(/tmp/mynodes.sh TEST '/foo/bar5' 2>&1) <5> [W] EXEC '/tmp/mynodes.sh' of EXTERNAL POOL or LIST 'x' fails TEST with code 127 on this node. From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/08/2017 10:51 AM Subject: [gpfsug-discuss] Default placement/External Pool Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello! A follow up to my previous question about automatically sending files to Amazon s3 as they arrive in GPFS. 
I have created an interface script to manage Amazon s3 storage as an external pool, I have created a migration policy that pre-migrates all files to the external pool and I have set that as the default policy for the file system. All good so far, but the problem I'm now facing is: Only some of the cluster nodes have access to Amazon due to network constraints. I read the statement "The mmapplypolicy command invokes the external pool script on all nodes in the cluster that have installed the script in its designated location."[1] and thought, 'Great! I'll only install the script on nodes that have access to Amazon' but that appears not to work for a placement policy/default policy and instead, the script runs on precisely no nodes. I assumed this happened because running the script on a non-Amazon facing node resulted in a horrible error (i.e. file not found), so I edited my script to return a non-zero response if being run on a node that isn't in my cloudNode class, then installed the script every where. But this appears to have had no effect what-so-ever. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbyb at us.ibm.com Wed Nov 8 20:39:54 2017 From: robbyb at us.ibm.com (Rob Basham) Date: Wed, 8 Nov 2017 20:39:54 +0000 Subject: [gpfsug-discuss] Default placement/External Pool In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 06:22:46 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 07:22:46 +0100 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Message-ID: Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From andreas.koeninger at de.ibm.com Fri Nov 10 10:06:19 2017 From: andreas.koeninger at de.ibm.com (Andreas Koeninger) Date: Fri, 10 Nov 2017 10:06:19 +0000 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 10:21:01 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 11:21:01 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Hi Andreas, the version of the GUI and the other packages are the following: gpfs.gui-4.2.3-0.noarch Yes, the collector is running locally on the GUI-Node and it is only one collector configured. 
The oupt of your command: [root at tower-daemon ~]# echo "get metrics cpu_user last 10 bucket_size 60" | /opt/IBM/zimon/zc 127.0.0.1 1: resolve1|CPU|cpu_user 2: resolve2|CPU|cpu_user 3: sbc-162150007|CPU|cpu_user 4: sbc-162150069|CPU|cpu_user 5: sbc-162150071|CPU|cpu_user 6: sbtl-176173009|CPU|cpu_user 7: tower-daemon|CPU|cpu_user Row Timestamp cpu_user cpu_user cpu_user cpu_user cpu_user cpu_user cpu_user 1 2017-11-10 11:06:00 2.525333 0.151667 0.854333 0.826833 0.836333 0.273833 0.800167 2 2017-11-10 11:07:00 3.052000 0.156833 0.964833 0.946833 0.881833 0.308167 0.896667 3 2017-11-10 11:08:00 4.267167 0.150500 1.134833 1.224833 1.063167 0.300333 0.855333 4 2017-11-10 11:09:00 4.505333 0.149833 1.155333 1.127667 1.098167 0.324500 0.822000 5 2017-11-10 11:10:00 4.023167 0.145667 1.136500 1.079500 1.016000 0.269000 0.836667 6 2017-11-10 11:11:00 2.127167 0.150333 0.903167 0.854833 0.798500 0.280833 0.854500 7 2017-11-10 11:12:00 4.210000 0.151167 0.877833 0.847167 0.836000 0.312500 1.110333 8 2017-11-10 11:13:00 14.388333 0.151000 1.009667 0.986167 0.950333 0.277167 0.814333 9 2017-11-10 11:14:00 18.513167 0.153167 1.048000 0.941333 0.949667 0.282833 0.808333 10 2017-11-10 11:15:00 1.613571 0.149063 0.789630 0.650741 0.826296 0.273333 0.676296 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ [root at tower-daemon ~]# psql postgres postgres -c "select os_host_name from fscc.node;" os_host_name ---------------------- tower sbtl-176173009-admin sbc-162150071-admin sbc-162150069-admin sbc-162150007-admin resolve1-admin resolve2-admin (7rows) The output seems to be ok. Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 11:06 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, 1.) Which GUI version are you running? 2.) Is the Collector running locally on the GUI? 3.) Is there more than one collector configured? 4.) Run the following command on the collector node to verify that there's data in the collector: > echo "get metrics cpu_user last 10 bucket_size 60" | /opt/IBM/zimon/zc 127.0.0.1 5.) 
Run the following command on the GUI node to verify which host name the GUI uses to query the performance data: psql postgres postgres -c "select os_host_name from fscc.node;" Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: Matthias.Knigge at rohde-schwarz.com Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 7:23 AM Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=TAzwoRuPR6uYNk_NNemAQPqsxILnSGfc34j4dabTVC0&s=OR8cwq9jfa_GaqXM00kDYFvhoIqPrKR5LT2Anpas3XA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 10:54:17 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 11:54:17 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Some more information: Only the GUI-Node is running on CentOS 7. The Clients are running on CentOS 6.x and RHEL 6.x. Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 11:06 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, 1.) Which GUI version are you running? 2.) Is the Collector running locally on the GUI? 3.) Is there more than one collector configured? 4.) Run the following command on the collector node to verify that there's data in the collector: > echo "get metrics cpu_user last 10 bucket_size 60" | /opt/IBM/zimon/zc 127.0.0.1 5.) 
Run the following command on the GUI node to verify which host name the GUI uses to query the performance data: psql postgres postgres -c "select os_host_name from fscc.node;" Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: Matthias.Knigge at rohde-schwarz.com Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 7:23 AM Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=TAzwoRuPR6uYNk_NNemAQPqsxILnSGfc34j4dabTVC0&s=OR8cwq9jfa_GaqXM00kDYFvhoIqPrKR5LT2Anpas3XA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil.wilson at metoffice.gov.uk Fri Nov 10 11:19:55 2017 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Fri, 10 Nov 2017 11:19:55 +0000 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the "hostname" field (it's blank by default) of the pmsensors cfg file on that node and restarted pmsensors - the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It's probably not the same for you, but might be worth trying out. 
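(For reference, the change Neil describes amounts to something like the fragment below in /opt/IBM/zimon/ZIMonSensors.cfg on the affected node, followed by a sensor restart. The host names are placeholders, and if the sensor configuration is distributed centrally a local edit may be overwritten on the next update.)

# /opt/IBM/zimon/ZIMonSensors.cfg  (fragment, example values)
hostname = "client01.example.com"      # blank by default; set to the name the collector/GUI expects
collectors = {
        host = "guinode.example.com"   # collector reachable from this node
        port = "4739"
}

systemctl restart pmsensors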
Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From andreas.koeninger at de.ibm.com Fri Nov 10 12:07:26 2017 From: andreas.koeninger at de.ibm.com (Andreas Koeninger) Date: Fri, 10 Nov 2017 12:07:26 +0000 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: Message-ID: An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 12:34:18 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 13:34:18 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Hi Andreas, hi Neil, the GUI-Node returned a hostname with a FQDN. The clients have no FQDN. Thanks for this tip. I will change the hostname in the first step. If this does not help then I will change the configuration files. I will give you feedback in the next week! Thanks, Matthias Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 13:07 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, what's "hostname" returning on your nodes? 1.) If it is not the one that the GUI has in it's database you can force a refresh by executing the below command on the GUI node: /usr/lpp/mmfs/gui/cli/runtask OS_DETECT --debug 2.) 
If it is not the one that's shown in the returned performance data you have to restart the pmsensor service on the nodes: systemctl restart pmsensors Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: "Wilson, Neil" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 12:20 PM Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the ?hostname? field (it?s blank by default) of the pmsensors cfg file on that node and restarted pmsensors ? the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It?s probably not the same for you, but might be worth trying out. Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=r2ldt2133nWuT-SD27LvI8nFqC4Kx7f47sYAeLaZH84&s=yDy0znk3CG9PZuuQ9yi81wOOwc48Aw8WbMvOjzW_uZI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Nov 10 13:25:40 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 10 Nov 2017 13:25:40 +0000 Subject: [gpfsug-discuss] Spectrum Scale with NVMe In-Reply-To: References: <64b6afd8efb34551a319b5d6e311bbfb@CITESHT4.ad.uillinois.edu> Message-ID: Chad, Thankyou for the reply. 
Indeed I had that issue - I only noticed because I looked at the utilisation of the NSDs and a set of them were not being filled with data... A set which were coincidentally all connected to the same server (me whistles innocently....) -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Chad Kerner Sent: Tuesday, November 07, 2017 2:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Spectrum Scale with NVMe Hey John, Once you get /var/mmfs/etc/nsddevices set up, it is all straight forward. We have seen times on reboot where the devices were not ready before gpfs started and the file system started with those disks in an offline state. But, that was just a timing issue with the startup. Chad -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign On 11/7/17, John Hearns wrote: > I am looking for anyone with experience of using Spectrum Scale with > nvme devices. > > I could use an offline brain dump... > > > The specific issue I have is with the nsd device discovery and the naming. > > Before anyone replies, I am getting excellent support from IBM and > have been directed to the correct documentation. > > I am just looking for any wrinkles or tips that anyone has. > > > Thanks > > -- The information contained in this communication and any attachments > is confidential and may be privileged, and is for the sole use of the > intended recipient(s). Any unauthorized review, use, disclosure or > distribution is prohibited. Unless explicitly stated otherwise in the > body of this communication or the attachment thereto (if any), the > information is provided on an AS-IS basis without any express or > implied warranties or liabilities. To the extent you are relying on > this information, you are doing so at your own risk. If you are not > the intended recipient, please notify the sender immediately by > replying to this message and destroy all copies of this message and > any attachments. Neither the sender nor the company/group of companies > he or she represents shall be liable for the proper and complete > transmission of the information contained in this communication, or for any delay in its receipt. > -- -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml.com%7Ce3875dc1def842e88ee308d525e01e80%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=pwti5NtVf7c4SClTUc1PWNz5YW4QHWjM5%2F%2BGLdYHoqQ%3D&reserved=0 -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk.
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. From peter.chase at metoffice.gov.uk Fri Nov 10 16:18:36 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Fri, 10 Nov 2017 16:18:36 +0000 Subject: [gpfsug-discuss] Specifying nodes in commands Message-ID: Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk From stockf at us.ibm.com Fri Nov 10 16:41:19 2017 From: stockf at us.ibm.com (Frederick Stock) Date: Fri, 10 Nov 2017 11:41:19 -0500 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: How do you determine if mmapplypolicy is running on a node? Normally mmapplypolicy as a process runs on a single node but its helper processes, policy-help or something similar, run on all the nodes which are referenced by the -N option. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/10/2017 11:18 AM Subject: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? 
United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=spXDnba2A_tVauiszV7sXhSkn6GeEljABN4lUEB4f8s&s=1Hd1SNkXtfLRcirmeRfg1JuAERuhbyiVqsLEdYlhFsM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Fri Nov 10 16:42:28 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 10 Nov 2017 11:42:28 -0500 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: mmapplypolicy ... -N nodeClass ... will use the nodes in nodeClass as helper nodes to get its work done. mmdsh -N nodeClass command ... will run the SAME command on each of the nodes -- probably not what you want to do with mmapplypolicy. To see more about what mmapplypolicy is doing use options -d 1 (debug info) If you are using -N because you have a lot of files to process, you should also use -g /some-gpfs-temp-directory (see doc) If you are running a small test case, it may happen that you don't see the helper nodes doing anything, because there's not enough time and work to get them going... For test purposes you can coax the helper nodes into action with: options -B 1 -m 1 so that each helper node only does one file at a time. From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/10/2017 11:18 AM Subject: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=WjhVVKkS23BlFGP2KHmkndM0AZ4yB2aC81UUHv8iIZs&s=-dPme1SlhBAqo45xVmtvVWNeAjumd7JrtEksW1U8o5w&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.chase at metoffice.gov.uk Fri Nov 10 17:15:55 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Fri, 10 Nov 2017 17:15:55 +0000 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: Hi Frederick, The ILM active policy (set by mmchpolicy) has an external list rule, the command for the external list runs the mmapplypolicy command. 
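(A minimal sketch of such an external list rule pair, with an invented script path and selection condition, just to show the shape of what is being described:)

RULE EXTERNAL LIST 'toCloud' EXEC '/gpfs1/s3upload/scripts/list-handler.sh'
RULE 'newFiles' LIST 'toCloud'
     WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) <= 1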
/gpfs1/s3upload/policies/migration.policy has external pool & a migration rule in it. The handler script for the external pool writes the hostname of the server running it out to a file, so that's how I'm trapping which server is running the policy, and that mmapplypolicy is being run. Hope that explains things, if not let me know and I'll have another try :) Regards, Peter

From peter.chase at metoffice.gov.uk Mon Nov 13 11:14:56 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Mon, 13 Nov 2017 11:14:56 +0000 Subject: [gpfsug-discuss] Specifying nodes in commands Message-ID: Hi Marc, Thanks for your response, there's some handy advice in there that I'll look at further. I'm still struggling a bit with mmapplypolicy and it's -N option. I've changed my external list command to point at a script, that script looks for "LIST" as the first argument, and runs "/usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -d 1 -N cloudNode -P /gpfs1/s3upload/policies/migration.policy >>/gpfs1/s3upload/external-list.log 2>&1". If the script is run from the command line on a node that's not in cloudNode class it works without issue and uses nodes in the cloudNode class as helpers, but if the script is called from the active policy, mmapplypolicy runs, but seems to ignore the -N and doesn't use the cloudNode nodes as helpers and instead seems to run locally (from which ever node started the active policy). So now my questions is: why does the -N option appear to be honoured when run from the command line, but not appear to be honoured when triggered by the active policy? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk

From makaplan at us.ibm.com Mon Nov 13 17:44:23 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 13 Nov 2017 12:44:23 -0500 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: My guess is you have some expectation of how things "ought to be" that does not match how things actually are.
If you haven't already done so, put some diagnostics into your script, such as env hostname echo "my args are: $*" And run mmapplypolicy with an explicit node list: mmapplypolicy /some/small-set-of-files -P /mypolicyfile -N node1,node2,node3 -I test -L 1 -d 1 And see how things go Hmmm... reading your post again... It seems perhaps you've got some things out of order or again, incorrect expectations or model of how the this world works... mmapplypolicy reads your policy rules and scans the files and calls the script(s) you've named in the EXEC options of your EXTERNAL rules The scripts are expected to process file lists -- NOT call mmapplypolicy again... Refer to examples in the documentation, and in samples/ilm - and try them! --marc From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/13/2017 06:15 AM Subject: Re: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Marc, Thanks for your response, there's some handy advice in there that I'll look at further. I'm still struggling a bit with mmapplypolicy and it's -N option. I've changed my external list command to point at a script, that script looks for "LIST" as the first argument, and runs "/usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -d 1 -N cloudNode -P /gpfs1/s3upload/policies/migration.policy >>/gpfs1/s3upload/external-list.log 2>&1". If the script is run from the command line on a node that's not in cloudNode class it works without issue and uses nodes in the cloudNode class as helpers, but if the script is called from the active policy, mmapplypolicy runs, but seems to ignore the -N and doesn't use the cloudNode nodes as helpers and instead seems to run locally (from which ever node started the active policy). So now my questions is: why does the -N option appear to be honoured when run from the command line, but not appear to be honoured when triggered by the active policy? Regards, Peter Chase GPCS Team Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=tNW4WqkmstX3B3t1dvbenDx32bw3S1FQ4BrpLrs1r4o&s=CBzS6KRLe_hQhI4zpeeuvNaYdraGbc7cCV-JTvCgDcM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From damir.krstic at gmail.com Mon Nov 13 20:49:07 2017 From: damir.krstic at gmail.com (Damir Krstic) Date: Mon, 13 Nov 2017 20:49:07 +0000 Subject: [gpfsug-discuss] verbsRdmaSend yes or no Message-ID: I am missing out on SC17 this year because of some instability with our 2 ESS storage arrays. We have just recently upgraded our ESS to 5.2 and we have a question about verbRdmaSend setting. Per IBM and GPFS guidelines for a large cluster, we have this setting off on all compute nodes. We were able to turn it off on ESS 1 (IO1 and IO2). However, IBM was unable to turn it off on ESS 2 (IO3 and IO4). ESS 1 has following filesystem: projects (1PB) ESS 2 has following filesystems: home and hpc All our client nodes have this setting off. So the question is, should we push through and get it disabled on IO3 and IO4 so that we are consistent across the environment? I assume the answer is yes. 
But I would also like to know what the impact is of leaving it enabled on IO3 and IO4. Thank you. Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 14 10:16:44 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 14 Nov 2017 10:16:44 +0000 Subject: [gpfsug-discuss] Backing up GPFS config Message-ID: All, A few months ago someone posted to the list all the commands they run to back up their GPFS configuration. Including mmlsfileset -L, the output of mmlsconfig etc, so that in the event of a proper "crap your pants" moment you can not only restore your data, but also your whole configuration. I cannot seem to find this post... does the OP remember and could kindly forward it on to me, or the list again? Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Tue Nov 14 13:35:46 2017 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 14 Nov 2017 14:35:46 +0100 Subject: [gpfsug-discuss] Backing up GPFS config In-Reply-To: References: Message-ID: Plese see https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Back%20Up%20GPFS%20Configuration But also check ?mmcesdr primary backup?. I don't rememner if it included all of mmbackupconfig/mmccr, but I think it did, and it also includes CES config. You don't need to be using CES DR to use it. -jf tir. 14. nov. 2017 kl. 03:16 skrev Sobey, Richard A : > All, > > > > A few months ago someone posted to the list all the commands they run to > back up their GPFS configuration. Including mmlsfileset -L, the output of > mmlsconfig etc, so that in the event of a proper ?crap your pants? moment > you can not only restore your data, but also your whole configuration. > > > > I cannot seem to find this post? does the OP remember and could kindly > forward it on to me, or the list again? > > > > Thanks > > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at u.washington.edu Tue Nov 14 14:41:50 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Tue, 14 Nov 2017 14:41:50 +0000 Subject: [gpfsug-discuss] Backing up GPFS config In-Reply-To: References: Message-ID: <20171114144149.7lmc46poy24of4yi@utumno.gs.washington.edu> I can't remember if I replied to that post or a different one, but these are the commands we capture output for before running mmbackup: mmlsconfig mmlsnsd mmlscluster mmlscluster --cnfs mmlscluster --ces mmlsnode mmlsdisk ${FS_NAME} -L mmlspool ${FS_NAME} all -L mmlslicense -L mmlspolicy ${FS_NAME} -L mmbackupconfig ${FS_NAME} All the commands but mmbackupconfig produce human-readable output, while mmbackupconfig produces machine-readable output suitable for recovering the filesystem in a disaster. On Tue, Nov 14, 2017 at 10:16:44AM +0000, Sobey, Richard A wrote: > All, > > A few months ago someone posted to the list all the commands they run to back up their GPFS configuration. Including mmlsfileset -L, the output of mmlsconfig etc, so that in the event of a proper "crap your pants" moment you can not only restore your data, but also your whole configuration. > > I cannot seem to find this post... does the OP remember and could kindly forward it on to me, or the list again? 
> > Thanks > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From Matthias.Knigge at rohde-schwarz.com Tue Nov 14 15:15:58 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Tue, 14 Nov 2017 16:15:58 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Changing the hostname without FQDN does not help. When I change back that the admin-interface is in the same network as the daemon then it works again. Could it be that for the GUI a daemon-interface must set? If yes, where can I set this interface? Thanks, Matthias Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 13:07 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, what's "hostname" returning on your nodes? 1.) If it is not the one that the GUI has in it's database you can force a refresh by executing the below command on the GUI node: /usr/lpp/mmfs/gui/cli/runtask OS_DETECT --debug 2.) If it is not the one that's shown in the returned performance data you have to restart the pmsensor service on the nodes: systemctl restart pmsensors Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: "Wilson, Neil" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 12:20 PM Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the ?hostname? field (it?s blank by default) of the pmsensors cfg file on that node and restarted pmsensors ? the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It?s probably not the same for you, but might be worth trying out. 
Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=r2ldt2133nWuT-SD27LvI8nFqC4Kx7f47sYAeLaZH84&s=yDy0znk3CG9PZuuQ9yi81wOOwc48Aw8WbMvOjzW_uZI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Tue Nov 14 15:18:23 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Tue, 14 Nov 2017 16:18:23 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME In-Reply-To: References: Message-ID: mmfind or rather the convert-script is great! Thanks, Matthias Von: "Marc A Kaplan" An: gpfsug main discussion list Datum: 01.11.2017 15:43 Betreff: [Newsletter] Re: [gpfsug-discuss] Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Thanks Jonathan B for your comments and tips on experience using mmapplypolicy and policy rules. Good to see that some of the features we put into the product are actually useful. For those not quite as familiar, and have come somewhat later to the game, like Matthias K - I have a few remarks and tips that may be helpful: You can think of and use mmapplypolicy as a fast, parallelized version of the classic `find ... | xargs ... ` pipeline. In fact we've added some "sample" scripts with options that make this easy: samples/ilm/mmfind : "understands" the classic find search arguments as well as all the mmapplypolicy options and the recent versions also support an -xargs option so you can write the classic pipepline as one command: mmfind ... -xargs ... There are debug/diagnostic options so you can see the underlying GPFS commands and policy rules that are generated, so if mmfind doesn't do exactly what you were hoping, you can capture the commands and rules that it does do and tweak/hack those. Two of the most crucial and tricky parts of mmfind are available as separate scripts that can be used separately: tr_findToPol.pl : convert classic options to policy rules. mmxargs : 100% correctly deal with the problem of whitespace and/or "special" characters in the pathnames output as file lists by mmapplypolicy. This is somewhat tricky. EVEN IF you've already worked out your own policy rules and use policy RULE ... EXTERNAL ... 
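(A made-up example of that one-command pipeline, assuming the samples/ilm tools have been installed and made executable; paths and the helper script are illustrative only.)

# classic pipeline:  find /gpfs1/projects -type f -mtime +365 | xargs archive-one.sh
# mmfind equivalent, driven by the policy engine in parallel:
/usr/lpp/mmfs/samples/ilm/mmfind /gpfs1/projects -type f -mtime +365 -xargs /usr/local/bin/archive-one.sh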
EXEC 'myscript' you may want to use mmxargs or "lift" some of the code there-in -- because it is very likely your 'myscript' is not handling the problem of special characters correctly. FILESETs vs POOLs - yes these are "orthogonal" concepts in GPFS (Spectrum Scale!) BUT some customer/admins may choose to direct GPFS to assign to POOL based on FILESET using policy rules clauses like: FOR FILESET('a_fs', 'b_fs') /* handy to restrict a rule to one or a few filesets */ WHERE ... AND (FILESET_NAME LIKE 'xyz_%') AND ... /* restrict to filesets whose name matches a pattern */ -- marc of GPFS_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 14 16:30:18 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 14 Nov 2017 16:30:18 +0000 Subject: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server In-Reply-To: References: <20171016132932.g5j7vep2frxnsvpf@utumno.gs.washington.edu>, <4B32CB5C696F2849BDEF7DF9EACE884B633F4ACF@SDEB-EXC01.meteo.dz> Message-ID: Hi Scott This looks like what I?m after (thank you Skylar and all others who responded too!) For the uninitiated, what exactly is a User Exit in the context of the following line: ?One way to automate this collection of GPFS configuration data is to use a User Exit. ? Or to put it another way, what is calling the script to be run on the basis of running mmchconfig someparam=someval? I?d like to understand it more. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Scott Fadden Sent: 16 October 2017 16:35 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server There are some comments on this in the wiki: Backup Spectrum Scale configuration Let me know if anything is missing. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Skylar Thompson > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server Date: Mon, Oct 16, 2017 6:29 AM I'm not familiar with GSS, but we have a script that executes the following before backing up a GPFS filesystem so that we have human-readable configuration information: mmlsconfig mmlsnsd mmlscluster mmlsnode mmlsdisk ${FS_NAME} -L mmlsfileset ${FS_NAME} -L mmlspool ${FS_NAME} all -L mmlslicense -L mmlspolicy ${FS_NAME} -L And then executes this for the benefit of GPFS: mmbackupconfig Of course there's quite a bit of overlap for clusters that have more than one filesystem, and even more for filesystems that we backup at the fileset level, but disk is cheap and the hope is it'll make a DR scenario a little bit less harrowing. On Sun, Oct 15, 2017 at 12:44:42PM +0000, atmane khiredine wrote: > Dear All, > > Is there a way to save the GPS configuration? 
> > OR how backup all GSS > > no backup of data or metadata only configuration for disaster recovery > > for example: > stanza > vdisk > pdisk > RAID code > recovery group > array > > Thank you > > Atmane Khiredine > HPC System Administrator | Office National de la M??t??orologie > T??l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= Click here to report this email as spam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 14 16:57:52 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 14 Nov 2017 16:57:52 +0000 Subject: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server In-Reply-To: References: <20171016132932.g5j7vep2frxnsvpf@utumno.gs.washington.edu>, <4B32CB5C696F2849BDEF7DF9EACE884B633F4ACF@SDEB-EXC01.meteo.dz> Message-ID: To answer my own question: https://www.ibm.com/support/knowledgecenter/en/SSFKCN_3.5.0/com.ibm.cluster.gpfs.v3r5.gpfs100.doc/bl1adm_uxtsdrb.htm It?s built in. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 14 November 2017 16:30 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server Hi Scott This looks like what I?m after (thank you Skylar and all others who responded too!) For the uninitiated, what exactly is a User Exit in the context of the following line: ?One way to automate this collection of GPFS configuration data is to use a User Exit. ? Or to put it another way, what is calling the script to be run on the basis of running mmchconfig someparam=someval? I?d like to understand it more. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Scott Fadden Sent: 16 October 2017 16:35 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server There are some comments on this in the wiki: Backup Spectrum Scale configuration Let me know if anything is missing. 
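The page Richard links to describes the mmsdrbackup user exit: if an executable script exists at /var/mmfs/etc/mmsdrbackup, GPFS runs it every time the cluster configuration data changes (mmchconfig and friends), which is what answers the "what is calling the script" question. There should be a sample under /usr/lpp/mmfs/samples to start from. Below is a rough sketch of a user exit that combines Skylar's human-readable listings with mmbackupconfig; the filesystem name and destination directory are made-up values, not something from this thread:

  #!/bin/bash
  # Sketch of a /var/mmfs/etc/mmsdrbackup user exit. GPFS invokes it whenever
  # the cluster configuration data changes. Adjust FS_NAME and DEST.
  PATH=/usr/lpp/mmfs/bin:$PATH
  FS_NAME=gpfs0                        # assumption: your filesystem device
  DEST=/var/local/gpfs-config-backup   # assumption: keep it off GPFS itself
  mkdir -p "$DEST"

  # Human-readable listings (the same set Skylar posted)
  for cmd in "mmlsconfig" "mmlsnsd" "mmlscluster" "mmlsnode" \
             "mmlsdisk $FS_NAME -L" "mmlsfileset $FS_NAME -L" \
             "mmlspool $FS_NAME all -L" "mmlslicense -L" \
             "mmlspolicy $FS_NAME -L"; do
      $cmd > "$DEST/$(echo "$cmd" | tr ' ' '_').out" 2>&1
  done

  # Machine-readable image that mmrestoreconfig can consume later
  mmbackupconfig "$FS_NAME" -o "$DEST/${FS_NAME}.backupconfig"
  exit 0

Whether this runs from the user exit or from cron just before the nightly backup is a matter of taste; the main point is that the output ends up somewhere that is not on the filesystem you would be trying to recover.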
Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Skylar Thompson > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server Date: Mon, Oct 16, 2017 6:29 AM I'm not familiar with GSS, but we have a script that executes the following before backing up a GPFS filesystem so that we have human-readable configuration information: mmlsconfig mmlsnsd mmlscluster mmlsnode mmlsdisk ${FS_NAME} -L mmlsfileset ${FS_NAME} -L mmlspool ${FS_NAME} all -L mmlslicense -L mmlspolicy ${FS_NAME} -L And then executes this for the benefit of GPFS: mmbackupconfig Of course there's quite a bit of overlap for clusters that have more than one filesystem, and even more for filesystems that we backup at the fileset level, but disk is cheap and the hope is it'll make a DR scenario a little bit less harrowing. On Sun, Oct 15, 2017 at 12:44:42PM +0000, atmane khiredine wrote: > Dear All, > > Is there a way to save the GPS configuration? > > OR how backup all GSS > > no backup of data or metadata only configuration for disaster recovery > > for example: > stanza > vdisk > pdisk > RAID code > recovery group > array > > Thank you > > Atmane Khiredine > HPC System Administrator | Office National de la M??t??orologie > T??l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= Click here to report this email as spam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Wed Nov 15 08:43:28 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 15 Nov 2017 09:43:28 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Strange... I think it is the order of configuration changes. Now it works with severed networks and FQDN. I configured the admin-interface with another network and back to the daemon-network. Then again to the admin-interface and it works fine. So the FQDN should be not the problem. Sometimes a linux system needs a reboot too. 
;-) Thanks, Matthias Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 13:07 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, what's "hostname" returning on your nodes? 1.) If it is not the one that the GUI has in it's database you can force a refresh by executing the below command on the GUI node: /usr/lpp/mmfs/gui/cli/runtask OS_DETECT --debug 2.) If it is not the one that's shown in the returned performance data you have to restart the pmsensor service on the nodes: systemctl restart pmsensors Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: "Wilson, Neil" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 12:20 PM Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the ?hostname? field (it?s blank by default) of the pmsensors cfg file on that node and restarted pmsensors ? the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It?s probably not the same for you, but might be worth trying out. Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! 
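On the "where can I set this interface" question from earlier in the thread: the daemon and admin node names are per-node attributes of the cluster, so mmlscluster shows what GPFS currently uses and mmchnode changes it. A short sketch with placeholder node names (interface changes may need GPFS stopped on the affected node first):

  # show the daemon node name and admin node name GPFS has for each node
  mmlscluster

  # point the admin traffic of one node at a different interface
  mmchnode --admin-interface=node01-adm.example.com -N node01

  # or change the daemon interface (usually with GPFS down on that node)
  mmchnode --daemon-interface=node01-ib.example.com -N node01

  # afterwards, refresh the GUI's view and the sensors, as Andreas described
  /usr/lpp/mmfs/gui/cli/runtask OS_DETECT --debug
  systemctl restart pmsensors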
Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=r2ldt2133nWuT-SD27LvI8nFqC4Kx7f47sYAeLaZH84&s=yDy0znk3CG9PZuuQ9yi81wOOwc48Aw8WbMvOjzW_uZI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Wed Nov 15 16:24:52 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Wed, 15 Nov 2017 17:24:52 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size Message-ID: Hello everybody, together with my colleagues we are actually running some tests on a new DSS G220 system and we see some unexpected behaviour. What we actually see is that write performances (we did not test read yet) decreases with the decrease of filesystem size. I will not go into the details of the tests, but here are some numbers: - with a filesystem using the full 1.2 PB space we get 14 GB/s as the sum of the disk activity on the two IO servers; - with a filesystem using half of the space we get 10 GB/s; - with a filesystem using 1/4 of the space we get 5 GB/s. We also saw that performances are not affected by the vdisks layout, ie. taking the full space with one big vdisk or 2 half-size vdisks per RG gives the same performances. To our understanding the IO should be spread evenly across all the pdisks in the declustered array, and looking at iostat all disks seem to be accessed. But so there must be some other element that affects performances. Am I missing something? Is this an expected behaviour and someone has an explanation for this? Thank you, Ivano From kums at us.ibm.com Wed Nov 15 16:56:36 2017 From: kums at us.ibm.com (Kumaran Rajaram) Date: Wed, 15 Nov 2017 11:56:36 -0500 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Hi, >>Am I missing something? Is this an expected behaviour and someone has an explanation for this? Based on your scenario, write degradation as the file-system is populated is possible if you had formatted the file-system with "-j cluster". For consistent file-system performance, we recommend mmcrfs "-j scatter" layoutMap. Also, we need to ensure the mmcrfs "-n" is set properly. [snip from mmcrfs] # mmlsfs | egrep 'Block allocation| Estimated number' -j scatter Block allocation type -n 128 Estimated number of nodes that will mount file system [/snip] [snip from man mmcrfs] layoutMap={scatter | cluster} Specifies the block allocation map type. When allocating blocks for a given file, GPFS first uses a round?robin algorithm to spread the data across all disks in the storage pool. After a disk is selected, the location of the data block on the disk is determined by the block allocation map type. If cluster is specified, GPFS attempts to allocate blocks in clusters. Blocks that belong to a particular file are kept adjacent to each other within each cluster. If scatter is specified, the location of the block is chosen randomly. The cluster allocation method may provide better disk performance for some disk subsystems in relatively small installations. 
The benefits of clustered block allocation diminish when the number of nodes in the cluster or the number of disks in a file system increases, or when the file system?s free space becomes fragmented. The cluster allocation method is the default for GPFS clusters with eight or fewer nodes and for file systems with eight or fewer disks. The scatter allocation method provides more consistent file system performance by averaging out performance variations due to block location (for many disk subsystems, the location of the data relative to the disk edge has a substantial effect on performance). This allocation method is appropriate in most cases and is the default for GPFS clusters with more than eight nodes or file systems with more than eight disks. The block allocation map type cannot be changed after the storage pool has been created. -n NumNodes The estimated number of nodes that will mount the file system in the local cluster and all remote clusters. This is used as a best guess for the initial size of some file system data structures. The default is 32. This value can be changed after the file system has been created but it does not change the existing data structures. Only the newly created data structure is affected by the new value. For example, new storage pool. When you create a GPFS file system, you might want to overestimate the number of nodes that will mount the file system. GPFS uses this information for creating data structures that are essential for achieving maximum parallelism in file system operations (For more information, see GPFS architecture in IBM Spectrum Scale: Concepts, Planning, and Installation Guide ). If you are sure there will never be more than 64 nodes, allow the default value to be applied. If you are planning to add nodes to your system, you should specify a number larger than the default. [/snip from man mmcrfs] Regards, -Kums From: Ivano Talamo To: Date: 11/15/2017 11:25 AM Subject: [gpfsug-discuss] Write performances and filesystem size Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello everybody, together with my colleagues we are actually running some tests on a new DSS G220 system and we see some unexpected behaviour. What we actually see is that write performances (we did not test read yet) decreases with the decrease of filesystem size. I will not go into the details of the tests, but here are some numbers: - with a filesystem using the full 1.2 PB space we get 14 GB/s as the sum of the disk activity on the two IO servers; - with a filesystem using half of the space we get 10 GB/s; - with a filesystem using 1/4 of the space we get 5 GB/s. We also saw that performances are not affected by the vdisks layout, ie. taking the full space with one big vdisk or 2 half-size vdisks per RG gives the same performances. To our understanding the IO should be spread evenly across all the pdisks in the declustered array, and looking at iostat all disks seem to be accessed. But so there must be some other element that affects performances. Am I missing something? Is this an expected behaviour and someone has an explanation for this? 
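To make the -j / -n suggestion easy to check against an existing filesystem, a quick sketch follows. The device name, stanza file, block size and mount point are placeholders, not values from Ivano's system:

  # what the existing filesystem was created with
  mmlsfs fs1 | egrep 'Block allocation|Estimated number'

  # roughly what the relevant options look like at creation time; note that
  # the block allocation map type cannot be changed afterwards, and on a
  # GSS/ESS system the -B block size has to match the data vdisk block size
  mmcrfs fs1 -F vdisk_stanza.txt -j scatter -n 128 -B 8M -T /gpfs/fs1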
Thank you, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Nov 15 18:25:59 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 15 Nov 2017 13:25:59 -0500 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Wed Nov 15 23:48:18 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Wed, 15 Nov 2017 23:48:18 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: Message-ID: My 2c ... Be careful here about mixing up three different possible effects seen in filesystems 1. Performance degradation as the filesystem approaches 100% full, often due to the difficulty of finding the remaining unallocated blocks. GPFS doesn?t noticeably suffer from this effect compared to its competitors. 2. Performance degradation over time as files get fragmented and so cause extra movement of the actuator arm of a HDD. (hence defrag on Windows and the idea of short stroking drives). 3. Performance degradation as blocks are written further from the fastest part of a hard disk drive. SSDs do not show this effect. Benchmarks on newly formatted empty filesystems are often artificially high compared to performance after say 12 months whether or not the filesystem is near 90%+ capacity utilisation. The -j scatter option allows for more realistic performance measurement when designing for the long term usage of the filesystem. But this is due to the distributed location of the blocks not how full the filesystem is. Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales + 44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 15 Nov 2017, at 11:26, Olaf Weiser wrote: > > to add a comment ... .. very simply... depending on how you allocate the physical block storage .... if you - simply - using less physical resources when reducing the capacity (in the same ratio) .. you get , what you see.... > > so you need to tell us, how you allocate your block-storage .. (Do you using RAID controllers , where are your LUNs coming from, are then less RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the hardware can deliver.. if you reduce resource.. ... you'll get less , if you enhance your hardware .. you get more... almost regardless of the total capacity in #blocks .. > > > > > > > From: "Kumaran Rajaram" > To: gpfsug main discussion list > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Hi, > > >>Am I missing something? Is this an expected behaviour and someone has an explanation for this? > > Based on your scenario, write degradation as the file-system is populated is possible if you had formatted the file-system with "-j cluster". > > For consistent file-system performance, we recommend mmcrfs "-j scatter" layoutMap. Also, we need to ensure the mmcrfs "-n" is set properly. 
> > [snip from mmcrfs] > # mmlsfs | egrep 'Block allocation| Estimated number' > -j scatter Block allocation type > -n 128 Estimated number of nodes that will mount file system > [/snip] > > > [snip from man mmcrfs] > layoutMap={scatter| cluster} > Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first > uses a round?robin algorithm to spread the data > across all disks in the storage pool. After a > disk is selected, the location of the data > block on the disk is determined by the block > allocation map type. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular > file are kept adjacent to each other within > each cluster. If scatter is specified, > the location of the block is chosen randomly. > > The cluster allocation method may provide > better disk performance for some disk > subsystems in relatively small installations. > The benefits of clustered block allocation > diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. The cluster > allocation method is the default for GPFS > clusters with eight or fewer nodes and for file > systems with eight or fewer disks. > > The scatter allocation method provides > more consistent file system performance by > averaging out performance variations due to > block location (for many disk subsystems, the > location of the data relative to the disk edge > has a substantial effect on performance).This > allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than > eight disks. > > The block allocation map type cannot be changed > after the storage pool has been created. > > > -n NumNodes > The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. > This is used as a best guess for the initial size of > some file system data structures. The default is 32. > This value can be changed after the file system has been > created but it does not change the existing data > structures. Only the newly created data structure is > affected by the new value. For example, new storage > pool. > > When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the > file system. GPFS uses this information for creating > data structures that are essential for achieving maximum > parallelism in file system operations (For more > information, see GPFS architecture in IBM Spectrum > Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, > allow the default value to be applied. If you are > planning to add nodes to your system, you should specify > a number larger than the default. > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > To: > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. 
> > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone has an > explanation for this? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e= > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=Yu5Gt0RPmbb6KaS_emGivhq5C2A33w5DeecdU2aLViQ&s=K0Mz-y4oBH66YUf1syIXaQ3hxck6WjeEMsM-HNHhqAU&e= > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Thu Nov 16 02:34:57 2017 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 16 Nov 2017 02:34:57 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same number of spindles for any size filesystem, so I would also expect them to perform the same. -jf ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser : > to add a comment ... .. very simply... depending on how you allocate the > physical block storage .... if you - simply - using less physical resources > when reducing the capacity (in the same ratio) .. you get , what you > see.... > > so you need to tell us, how you allocate your block-storage .. (Do you > using RAID controllers , where are your LUNs coming from, are then less > RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the hardware can > deliver.. if you reduce resource.. ... you'll get less , if you enhance > your hardware .. you get more... almost regardless of the total capacity in > #blocks .. > > > > > > > From: "Kumaran Rajaram" > To: gpfsug main discussion list > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and filesystem > size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi, > > >>Am I missing something? 
Is this an expected behaviour and someone has an > explanation for this? > > Based on your scenario, write degradation as the file-system is populated > is possible if you had formatted the file-system with "-j cluster". > > For consistent file-system performance, we recommend *mmcrfs "-j scatter" > layoutMap.* Also, we need to ensure the mmcrfs "-n" is set properly. > > [snip from mmcrfs] > > > *# mmlsfs | egrep 'Block allocation| Estimated number' -j > scatter Block allocation type -n 128 > Estimated number of nodes that will mount file system* > [/snip] > > > [snip from man mmcrfs] > * layoutMap={scatter|** cluster}* > > > > > > > > > > > > * Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first uses > a round?robin algorithm to spread the data across all > disks in the storage pool. After a disk is selected, the > location of the data block on the disk is determined by > the block allocation map type. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular file are > kept adjacent to each other within each cluster. If > scatter is specified, the location of the block is chosen > randomly.* > > > > > > > > > * The cluster allocation method may provide > better disk performance for some disk subsystems in > relatively small installations. The benefits of clustered > block allocation diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. **The cluster* > > > * allocation method is the default for GPFS > clusters with eight or fewer nodes and for file systems > with eight or fewer disks.* > > > > > > > * The scatter allocation method provides > more consistent file system performance by averaging out > performance variations due to block location (for many > disk subsystems, the location of the data relative to the > disk edge has a substantial effect on performance).* > > > > *This allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than eight > disks.* > > > * The block allocation map type cannot be changed > after the storage pool has been created.* > > > *-n** NumNodes* > > > > > > > > > * The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. This is used > as a best guess for the initial size of some file system data > structures. The default is 32. This value can be changed after the > file system has been created but it does not change the existing > data structures. Only the newly created data structure is > affected by the new value. For example, new storage pool.* > > > > > > > > > > > > * When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the file system. > GPFS uses this information for creating data structures that are > essential for achieving maximum parallelism in file system > operations (For more information, see GPFS architecture in IBM > Spectrum Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, allow > the default value to be applied. 
If you are planning to add nodes > to your system, you should specify a number larger than the > default.* > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > To: > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. > > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone has an > explanation for this? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=* > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Thu Nov 16 03:42:05 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 03:42:05 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: Message-ID: Sure... as long we assume that really all physical disk are used .. the fact that was told 1/2 or 1/4 might turn out that one / two complet enclosures 're eliminated ... ? ..that s why I was asking for more details .. I dont see this degration in my environments. . as long the vdisks are big enough to span over all pdisks ( which should be the case for capacity in a range of TB ) ... the performance stays the same Gesendet von IBM Verse Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and filesystem size --- Von:"Jan-Frode Myklebust" An:"gpfsug main discussion list" Datum:Mi. 15.11.2017 21:35Betreff:Re: [gpfsug-discuss] Write performances and filesystem size Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same number of spindles for any size filesystem, so I would also expect them to perform the same. -jf ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser : to add a comment ... .. very simply... 
depending on how you allocate the physical block storage .... if you - simply - using less physical resources when reducing the capacity (in the same ratio) .. you get , what you see.... so you need to tell us, how you allocate your block-storage .. (Do you using RAID controllers , where are your LUNs coming from, are then less RAID groups involved, when reducing the capacity ?...) GPFS can be configured to give you pretty as much as what the hardware can deliver.. if you reduce resource.. ... you'll get less , if you enhance your hardware .. you get more... almost regardless of the total capacity in #blocks .. From: "Kumaran Rajaram" To: gpfsug main discussion list Date: 11/15/2017 11:56 AM Subject: Re: [gpfsug-discuss] Write performances and filesystem size Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, >>Am I missing something? Is this an expected behaviour and someone has an explanation for this? Based on your scenario, write degradation as the file-system is populated is possible if you had formatted the file-system with "-j cluster". For consistent file-system performance, we recommend mmcrfs "-j scatter" layoutMap. Also, we need to ensure the mmcrfs "-n" is set properly. [snip from mmcrfs] # mmlsfs | egrep 'Block allocation| Estimated number' -j scatter Block allocation type -n 128 Estimated number of nodes that will mount file system [/snip] [snip from man mmcrfs] layoutMap={scatter| cluster} Specifies the block allocation map type. When allocating blocks for a given file, GPFS first uses a round?robin algorithm to spread the data across all disks in the storage pool. After a disk is selected, the location of the data block on the disk is determined by the block allocation map type. If cluster is specified, GPFS attempts to allocate blocks in clusters. Blocks that belong to a particular file are kept adjacent to each other within each cluster. If scatter is specified, the location of the block is chosen randomly. The cluster allocation method may provide better disk performance for some disk subsystems in relatively small installations. The benefits of clustered block allocation diminish when the number of nodes in the cluster or the number of disks in a file system increases, or when the file system?s free space becomes fragmented. The cluster allocation method is the default for GPFS clusters with eight or fewer nodes and for file systems with eight or fewer disks. The scatter allocation method provides more consistent file system performance by averaging out performance variations due to block location (for many disk subsystems, the location of the data relative to the disk edge has a substantial effect on performance).This allocation method is appropriate in most cases and is the default for GPFS clusters with more than eight nodes or file systems with more than eight disks. The block allocation map type cannot be changed after the storage pool has been created. -n NumNodes The estimated number of nodes that will mount the file system in the local cluster and all remote clusters. This is used as a best guess for the initial size of some file system data structures. The default is 32. This value can be changed after the file system has been created but it does not change the existing data structures. Only the newly created data structure is affected by the new value. For example, new storage pool. When you create a GPFS file system, you might want to overestimate the number of nodes that will mount the file system. 
GPFS uses this information for creating data structures that are essential for achieving maximum parallelism in file system operations (For more information, see GPFS architecture in IBM Spectrum Scale: Concepts, Planning, and Installation Guide ). If you are sure there will never be more than 64 nodes, allow the default value to be applied. If you are planning to add nodes to your system, you should specify a number larger than the default. [/snip from man mmcrfs] Regards, -Kums From: Ivano Talamo To: Date: 11/15/2017 11:25 AM Subject: [gpfsug-discuss] Write performances and filesystem size Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello everybody, together with my colleagues we are actually running some tests on a new DSS G220 system and we see some unexpected behaviour. What we actually see is that write performances (we did not test read yet) decreases with the decrease of filesystem size. I will not go into the details of the tests, but here are some numbers: - with a filesystem using the full 1.2 PB space we get 14 GB/s as the sum of the disk activity on the two IO servers; - with a filesystem using half of the space we get 10 GB/s; - with a filesystem using 1/4 of the space we get 5 GB/s. We also saw that performances are not affected by the vdisks layout, ie. taking the full space with one big vdisk or 2 half-size vdisks per RG gives the same performances. To our understanding the IO should be spread evenly across all the pdisks in the declustered array, and looking at iostat all disks seem to be accessed. But so there must be some other element that affects performances. Am I missing something? Is this an expected behaviour and someone has an explanation for this? Thank you, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Thu Nov 16 08:44:06 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Thu, 16 Nov 2017 09:44:06 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: <658ae385-ef78-2303-2eef-1b5ac8824c42@psi.ch> Hello Olaf, yes, I confirm that is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total. Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size. Regarding the layout allocation we used scatter. The tests were done on the just created filesystem, so no close-to-full effect. And we run gpfsperf write seq. Thanks, Ivano Il 16/11/17 04:42, Olaf Weiser ha scritto: > Sure... as long we assume that really all physical disk are used .. the > fact that was told 1/2 or 1/4 might turn out that one / two complet > enclosures 're eliminated ... ? ..that s why I was asking for more > details .. > > I dont see this degration in my environments. 
. as long the vdisks are > big enough to span over all pdisks ( which should be the case for > capacity in a range of TB ) ... the performance stays the same > > Gesendet von IBM Verse > > Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and > filesystem size --- > > Von: "Jan-Frode Myklebust" > An: "gpfsug main discussion list" > Datum: Mi. 15.11.2017 21:35 > Betreff: Re: [gpfsug-discuss] Write performances and filesystem size > > ------------------------------------------------------------------------ > > Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same > number of spindles for any size filesystem, so I would also expect them > to perform the same. > > > > -jf > > > ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >: > > to add a comment ... .. very simply... depending on how you > allocate the physical block storage .... if you - simply - using > less physical resources when reducing the capacity (in the same > ratio) .. you get , what you see.... > > so you need to tell us, how you allocate your block-storage .. (Do > you using RAID controllers , where are your LUNs coming from, are > then less RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the > hardware can deliver.. if you reduce resource.. ... you'll get less > , if you enhance your hardware .. you get more... almost regardless > of the total capacity in #blocks .. > > > > > > > From: "Kumaran Rajaram" > > To: gpfsug main discussion list > > > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and > filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hi, > > >>Am I missing something? Is this an expected behaviour and someone > has an explanation for this? > > Based on your scenario, write degradation as the file-system is > populated is possible if you had formatted the file-system with "-j > cluster". > > For consistent file-system performance, we recommend *mmcrfs "-j > scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is > set properly. > > [snip from mmcrfs]/ > # mmlsfs | egrep 'Block allocation| Estimated number' > -j scatter Block allocation type > -n 128 Estimated number of > nodes that will mount file system/ > [/snip] > > > [snip from man mmcrfs]/ > *layoutMap={scatter|*//*cluster}*// > Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first > uses a round?robin algorithm to spread the data > across all disks in the storage pool. After a > disk is selected, the location of the data > block on the disk is determined by the block > allocation map type*. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular > file are kept adjacent to each other within > each cluster. If scatter is specified, > the location of the block is chosen randomly.*/ > / > * The cluster allocation method may provide > better disk performance for some disk > subsystems in relatively small installations. > The benefits of clustered block allocation > diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. 
*//The *cluster*// > allocation method is the default for GPFS > clusters with eight or fewer nodes and for file > systems with eight or fewer disks./ > / > *The scatter allocation method provides > more consistent file system performance by > averaging out performance variations due to > block location (for many disk subsystems, the > location of the data relative to the disk edge > has a substantial effect on performance).*//This > allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than > eight disks./ > / > The block allocation map type cannot be changed > after the storage pool has been created./ > > */ > -n/*/*NumNodes*// > The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. > This is used as a best guess for the initial size of > some file system data structures. The default is 32. > This value can be changed after the file system has been > created but it does not change the existing data > structures. Only the newly created data structure is > affected by the new value. For example, new storage > pool./ > / > When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the > file system. GPFS uses this information for creating > data structures that are essential for achieving maximum > parallelism in file system operations (For more > information, see GPFS architecture in IBM Spectrum > Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, > allow the default value to be applied. If you are > planning to add nodes to your system, you should specify > a number larger than the default./ > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > > To: > > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. > > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, > ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks > seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone > has an > explanation for this? 
> > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org _ > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From olaf.weiser at de.ibm.com Thu Nov 16 12:03:16 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 12:03:16 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: Message-ID: Rjx, that makes it a bit clearer.. as your vdisk is big enough to span over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... should bring the same performance. .. You mean something about vdisk Layout. .. So in your test, for the full capacity test, you use just one vdisk per RG - so 2 in total for 'data' - right? What about Md .. did you create separate vdisk for MD / what size then ? Gesendet von IBM Verse Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem size --- Von:"Ivano Talamo" An:"gpfsug main discussion list" Datum:Do. 16.11.2017 03:49Betreff:Re: [gpfsug-discuss] Write performances and filesystem size Hello Olaf,yes, I confirm that is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total.Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size.Regarding the layout allocation we used scatter.The tests were done on the just created filesystem, so no close-to-full effect. And we run gpfsperf write seq.Thanks,IvanoIl 16/11/17 04:42, Olaf Weiser ha scritto:> Sure... as long we assume that really all physical disk are used .. the> fact that was told 1/2 or 1/4 might turn out that one / two complet> enclosures 're eliminated ... ? ..that s why I was asking for more> details ..>> I dont see this degration in my environments. . as long the vdisks are> big enough to span over all pdisks ( which should be the case for> capacity in a range of TB ) ... the performance stays the same>> Gesendet von IBM Verse>> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and> filesystem size --->> Von: "Jan-Frode Myklebust" > An: "gpfsug main discussion list" > Datum: Mi. 15.11.2017 21:35> Betreff: Re: [gpfsug-discuss] Write performances and filesystem size>> ------------------------------------------------------------------------>> Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same> number of spindles for any size filesystem, so I would also expect them> to perform the same.>>>> -jf>>> ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >:>> to add a comment ... .. very simply... depending on how you> allocate the physical block storage .... if you - simply - using> less physical resources when reducing the capacity (in the same> ratio) .. 
you get , what you see....>> so you need to tell us, how you allocate your block-storage .. (Do> you using RAID controllers , where are your LUNs coming from, are> then less RAID groups involved, when reducing the capacity ?...)>> GPFS can be configured to give you pretty as much as what the> hardware can deliver.. if you reduce resource.. ... you'll get less> , if you enhance your hardware .. you get more... almost regardless> of the total capacity in #blocks ..>>>>>>> From: "Kumaran Rajaram" >> To: gpfsug main discussion list> >> Date: 11/15/2017 11:56 AM> Subject: Re: [gpfsug-discuss] Write performances and> filesystem size> Sent by: gpfsug-discuss-bounces at spectrumscale.org> > ------------------------------------------------------------------------>>>> Hi,>> >>Am I missing something? Is this an expected behaviour and someone> has an explanation for this?>> Based on your scenario, write degradation as the file-system is> populated is possible if you had formatted the file-system with "-j> cluster".>> For consistent file-system performance, we recommend *mmcrfs "-j> scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is> set properly.>> [snip from mmcrfs]/> # mmlsfs | egrep 'Block allocation| Estimated number'> -j scatter Block allocation type> -n 128 Estimated number of> nodes that will mount file system/> [/snip]>>> [snip from man mmcrfs]/> *layoutMap={scatter|*//*cluster}*//> Specifies the block allocation map type. When> allocating blocks for a given file, GPFS first> uses a round?robin algorithm to spread the data> across all disks in the storage pool. After a> disk is selected, the location of the data> block on the disk is determined by the block> allocation map type*. If cluster is> specified, GPFS attempts to allocate blocks in> clusters. Blocks that belong to a particular> file are kept adjacent to each other within> each cluster. If scatter is specified,> the location of the block is chosen randomly.*/> /> * The cluster allocation method may provide> better disk performance for some disk> subsystems in relatively small installations.> The benefits of clustered block allocation> diminish when the number of nodes in the> cluster or the number of disks in a file system> increases, or when the file system?s free space> becomes fragmented. *//The *cluster*//> allocation method is the default for GPFS> clusters with eight or fewer nodes and for file> systems with eight or fewer disks./> /> *The scatter allocation method provides> more consistent file system performance by> averaging out performance variations due to> block location (for many disk subsystems, the> location of the data relative to the disk edge> has a substantial effect on performance).*//This> allocation method is appropriate in most cases> and is the default for GPFS clusters with more> than eight nodes or file systems with more than> eight disks./> /> The block allocation map type cannot be changed> after the storage pool has been created./>> */> -n/*/*NumNodes*//> The estimated number of nodes that will mount the file> system in the local cluster and all remote clusters.> This is used as a best guess for the initial size of> some file system data structures. The default is 32.> This value can be changed after the file system has been> created but it does not change the existing data> structures. Only the newly created data structure is> affected by the new value. 
For example, new storage> pool./> /> When you create a GPFS file system, you might want to> overestimate the number of nodes that will mount the> file system. GPFS uses this information for creating> data structures that are essential for achieving maximum> parallelism in file system operations (For more> information, see GPFS architecture in IBM Spectrum> Scale: Concepts, Planning, and Installation Guide ). If> you are sure there will never be more than 64 nodes,> allow the default value to be applied. If you are> planning to add nodes to your system, you should specify> a number larger than the default./>> [/snip from man mmcrfs]>> Regards,> -Kums>>>>>> From: Ivano Talamo >> To: >> Date: 11/15/2017 11:25 AM> Subject: [gpfsug-discuss] Write performances and filesystem size> Sent by: gpfsug-discuss-bounces at spectrumscale.org> > ------------------------------------------------------------------------>>>> Hello everybody,>> together with my colleagues we are actually running some tests on a new> DSS G220 system and we see some unexpected behaviour.>> What we actually see is that write performances (we did not test read> yet) decreases with the decrease of filesystem size.>> I will not go into the details of the tests, but here are some numbers:>> - with a filesystem using the full 1.2 PB space we get 14 GB/s as the> sum of the disk activity on the two IO servers;> - with a filesystem using half of the space we get 10 GB/s;> - with a filesystem using 1/4 of the space we get 5 GB/s.>> We also saw that performances are not affected by the vdisks layout,> ie.> taking the full space with one big vdisk or 2 half-size vdisks per RG> gives the same performances.>> To our understanding the IO should be spread evenly across all the> pdisks in the declustered array, and looking at iostat all disks> seem to> be accessed. But so there must be some other element that affects> performances.>> Am I missing something? Is this an expected behaviour and someone> has an> explanation for this?>> Thank you,> Ivano> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org _> __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_>>> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss>>>> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss>>>>> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org> http://gpfsug.org/mailman/listinfo/gpfsug-discuss>_______________________________________________gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alvise.dorigo at psi.ch Thu Nov 16 12:37:41 2017 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Thu, 16 Nov 2017 12:37:41 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: , Message-ID: <83A6EEB0EC738F459A39439733AE80451BB738BC@MBX214.d.ethz.ch> Hi Olaf, yes we have separate vdisks for MD: 2 vdisks, each is 100GBytes large, 1MBytes blocksize, 3WayReplication. A ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Olaf Weiser [olaf.weiser at de.ibm.com] Sent: Thursday, November 16, 2017 1:03 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Write performances and filesystem size Rjx, that makes it a bit clearer.. as your vdisk is big enough to span over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... should bring the same performance. .. You mean something about vdisk Layout. .. So in your test, for the full capacity test, you use just one vdisk per RG - so 2 in total for 'data' - right? What about Md .. did you create separate vdisk for MD / what size then ? Gesendet von IBM Verse Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem size --- Von: "Ivano Talamo" An: "gpfsug main discussion list" Datum: Do. 16.11.2017 03:49 Betreff: Re: [gpfsug-discuss] Write performances and filesystem size ________________________________ Hello Olaf, yes, I confirm that is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total. Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size. Regarding the layout allocation we used scatter. The tests were done on the just created filesystem, so no close-to-full effect. And we run gpfsperf write seq. Thanks, Ivano Il 16/11/17 04:42, Olaf Weiser ha scritto: > Sure... as long we assume that really all physical disk are used .. the > fact that was told 1/2 or 1/4 might turn out that one / two complet > enclosures 're eliminated ... ? ..that s why I was asking for more > details .. > > I dont see this degration in my environments. . as long the vdisks are > big enough to span over all pdisks ( which should be the case for > capacity in a range of TB ) ... the performance stays the same > > Gesendet von IBM Verse > > Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and > filesystem size --- > > Von: "Jan-Frode Myklebust" > An: "gpfsug main discussion list" > Datum: Mi. 15.11.2017 21:35 > Betreff: Re: [gpfsug-discuss] Write performances and filesystem size > > ------------------------------------------------------------------------ > > Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same > number of spindles for any size filesystem, so I would also expect them > to perform the same. > > > > -jf > > > ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >: > > to add a comment ... .. very simply... depending on how you > allocate the physical block storage .... if you - simply - using > less physical resources when reducing the capacity (in the same > ratio) .. you get , what you see.... > > so you need to tell us, how you allocate your block-storage .. (Do > you using RAID controllers , where are your LUNs coming from, are > then less RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the > hardware can deliver.. if you reduce resource.. ... 
you'll get less > , if you enhance your hardware .. you get more... almost regardless > of the total capacity in #blocks .. > > > > > > > From: "Kumaran Rajaram" > > To: gpfsug main discussion list > > > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and > filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hi, > > >>Am I missing something? Is this an expected behaviour and someone > has an explanation for this? > > Based on your scenario, write degradation as the file-system is > populated is possible if you had formatted the file-system with "-j > cluster". > > For consistent file-system performance, we recommend *mmcrfs "-j > scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is > set properly. > > [snip from mmcrfs]/ > # mmlsfs | egrep 'Block allocation| Estimated number' > -j scatter Block allocation type > -n 128 Estimated number of > nodes that will mount file system/ > [/snip] > > > [snip from man mmcrfs]/ > *layoutMap={scatter|*//*cluster}*// > Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first > uses a round?robin algorithm to spread the data > across all disks in the storage pool. After a > disk is selected, the location of the data > block on the disk is determined by the block > allocation map type*. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular > file are kept adjacent to each other within > each cluster. If scatter is specified, > the location of the block is chosen randomly.*/ > / > * The cluster allocation method may provide > better disk performance for some disk > subsystems in relatively small installations. > The benefits of clustered block allocation > diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. *//The *cluster*// > allocation method is the default for GPFS > clusters with eight or fewer nodes and for file > systems with eight or fewer disks./ > / > *The scatter allocation method provides > more consistent file system performance by > averaging out performance variations due to > block location (for many disk subsystems, the > location of the data relative to the disk edge > has a substantial effect on performance).*//This > allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than > eight disks./ > / > The block allocation map type cannot be changed > after the storage pool has been created./ > > */ > -n/*/*NumNodes*// > The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. > This is used as a best guess for the initial size of > some file system data structures. The default is 32. > This value can be changed after the file system has been > created but it does not change the existing data > structures. Only the newly created data structure is > affected by the new value. For example, new storage > pool./ > / > When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the > file system. 
GPFS uses this information for creating > data structures that are essential for achieving maximum > parallelism in file system operations (For more > information, see GPFS architecture in IBM Spectrum > Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, > allow the default value to be applied. If you are > planning to add nodes to your system, you should specify > a number larger than the default./ > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > > To: > > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. > > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, > ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks > seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone > has an > explanation for this? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org _ > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Thu Nov 16 13:51:51 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Thu, 16 Nov 2017 14:51:51 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Hi, as additional information I past the recovery group information in the full and half size cases. In both cases: - data is on sf_g_01_vdisk01 - metadata on sf_g_01_vdisk02 - sf_g_01_vdisk07 is not used in the filesystem. 
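For reference, the listings that follow look like standard mmlsrecoverygroup output from the GNR/ESS layer. On a comparable system, and assuming the recovery group really is named sf-g-01 as shown, the commands would be roughly:

mmlsrecoverygroup                        # list all recovery groups known to the cluster
mmlsrecoverygroup sf-g-01 -L             # declustered arrays, vdisks and fault tolerance for one RG
mmlsrecoverygroup sf-g-01 -L --pdisk     # the same view, including the individual pdisks

Option spellings can differ slightly between GPFS/Spectrum Scale releases, so treat this as a sketch rather than a recipe.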
This is with the full-space filesystem:

                       declustered     current         allowable
 recovery group         arrays  vdisks  pdisks  format version  format version
 -----------------   -----------  ------  ------  --------------  --------------
 sf-g-01                       3       6      86         4.2.2.0         4.2.2.0

 declustered   needs                        replace                        scrub     background activity
    array     service  vdisks  pdisks  spares  threshold  free space  duration     task  progress  priority
 -----------  -------  ------  ------  ------  ---------  ----------  --------  -------------------------
         NVR       no       1       2     0,0          1    3632 MiB   14 days    scrub       95%      low
         DA1       no       4      83    2,44          1      57 TiB   14 days    scrub        0%      low
         SSD       no       1       1     0,0          1     372 GiB   14 days    scrub       79%      low

                                           declustered                                     checksum
 vdisk                 RAID code              array  vdisk size  block size  granularity  state  remarks
 --------------------  ------------------  -----------  ----------  ----------  -----------  -----  -------
 sf_g_01_logTip        2WayReplication     NVR          48 MiB       2 MiB        4096      ok     logTip
 sf_g_01_logTipBackup  Unreplicated        SSD          48 MiB       2 MiB        4096      ok     logTipBackup
 sf_g_01_logHome       4WayReplication     DA1         144 GiB       2 MiB        4096      ok     log
 sf_g_01_vdisk02       3WayReplication     DA1         103 GiB       1 MiB      32 KiB      ok
 sf_g_01_vdisk07       3WayReplication     DA1         103 GiB       1 MiB      32 KiB      ok
 sf_g_01_vdisk01       8+2p                DA1         540 TiB      16 MiB      32 KiB      ok

 config data           declustered array   spare space    remarks
 --------------------  ------------------  -------------  -------
 rebuild space         DA1                 53 pdisk       increasing VCD spares is suggested

 config data           disk group fault tolerance         remarks
 --------------------  ---------------------------------  -------
 rg descriptor         1 enclosure + 1 drawer + 2 pdisk   limited by rebuild space
 system index          1 enclosure + 1 drawer + 2 pdisk   limited by rebuild space

 vdisk                 disk group fault tolerance         remarks
 --------------------  ---------------------------------  -------
 sf_g_01_logTip        1 pdisk
 sf_g_01_logTipBackup  0 pdisk
 sf_g_01_logHome       1 enclosure + 1 drawer + 1 pdisk   limited by rebuild space
 sf_g_01_vdisk02       1 enclosure + 1 drawer             limited by rebuild space
 sf_g_01_vdisk07       1 enclosure + 1 drawer             limited by rebuild space
 sf_g_01_vdisk01       2 pdisk

This is with the half-space filesystem:

                       declustered     current         allowable
 recovery group         arrays  vdisks  pdisks  format version  format version
 -----------------   -----------  ------  ------  --------------  --------------
 sf-g-01                       3       6      86         4.2.2.0         4.2.2.0

 declustered   needs                        replace                        scrub     background activity
    array     service  vdisks  pdisks  spares  threshold  free space  duration     task  progress  priority
 -----------  -------  ------  ------  ------  ---------  ----------  --------  -------------------------
         NVR       no       1       2     0,0          1    3632 MiB   14 days    scrub        4%      low
         DA1       no       4      83    2,44          1     395 TiB   14 days    scrub        0%      low
         SSD       no       1       1     0,0          1     372 GiB   14 days    scrub       79%      low

                                           declustered                                     checksum
 vdisk                 RAID code              array  vdisk size  block size  granularity  state  remarks
 --------------------  ------------------  -----------  ----------  ----------  -----------  -----  -------
 sf_g_01_logTip        2WayReplication     NVR          48 MiB       2 MiB        4096      ok     logTip
 sf_g_01_logTipBackup  Unreplicated        SSD          48 MiB       2 MiB        4096      ok     logTipBackup
 sf_g_01_logHome       4WayReplication     DA1         144 GiB       2 MiB        4096      ok     log
 sf_g_01_vdisk02       3WayReplication     DA1         103 GiB       1 MiB      32 KiB      ok
 sf_g_01_vdisk07       3WayReplication     DA1         103 GiB       1 MiB      32 KiB      ok
 sf_g_01_vdisk01       8+2p                DA1         270 TiB      16 MiB      32 KiB      ok

 config data           declustered array   spare space    remarks
 --------------------  ------------------  -------------  -------
 rebuild space         DA1                 68 pdisk       increasing VCD spares is suggested

 config data           disk group fault tolerance  remarks
 --------------------  --------------------------  -------
 rg descriptor         1 node + 3 pdisk            limited by rebuild space
 system index          1 node + 3 pdisk            limited by rebuild space

 vdisk
disk group fault tolerance remarks ------------------ --------------------------------- ------- sf_g_01_logTip 1 pdisk sf_g_01_logTipBackup 0 pdisk sf_g_01_logHome 1 node + 2 pdisk limited by rebuild space sf_g_01_vdisk02 1 node + 1 pdisk limited by rebuild space sf_g_01_vdisk07 1 node + 1 pdisk limited by rebuild space sf_g_01_vdisk01 2 pdisk Thanks, Ivano Il 16/11/17 13:03, Olaf Weiser ha scritto: > Rjx, that makes it a bit clearer.. as your vdisk is big enough to span > over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... > should bring the same performance. .. > > You mean something about vdisk Layout. .. > So in your test, for the full capacity test, you use just one vdisk per > RG - so 2 in total for 'data' - right? > > What about Md .. did you create separate vdisk for MD / what size then > ? > > Gesendet von IBM Verse > > Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem > size --- > > Von: "Ivano Talamo" > An: "gpfsug main discussion list" > Datum: Do. 16.11.2017 03:49 > Betreff: Re: [gpfsug-discuss] Write performances and filesystem size > > ------------------------------------------------------------------------ > > Hello Olaf, > > yes, I confirm that is the Lenovo version of the ESS GL2, so 2 > enclosures/4 drawers/166 disks in total. > > Each recovery group has one declustered array with all disks inside, so > vdisks use all the physical ones, even in the case of a vdisk that is > 1/4 of the total size. > > Regarding the layout allocation we used scatter. > > The tests were done on the just created filesystem, so no close-to-full > effect. And we run gpfsperf write seq. > > Thanks, > Ivano > > > Il 16/11/17 04:42, Olaf Weiser ha scritto: >> Sure... as long we assume that really all physical disk are used .. the >> fact that was told 1/2 or 1/4 might turn out that one / two complet >> enclosures 're eliminated ... ? ..that s why I was asking for more >> details .. >> >> I dont see this degration in my environments. . as long the vdisks are >> big enough to span over all pdisks ( which should be the case for >> capacity in a range of TB ) ... the performance stays the same >> >> Gesendet von IBM Verse >> >> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and >> filesystem size --- >> >> Von: "Jan-Frode Myklebust" >> An: "gpfsug main discussion list" >> Datum: Mi. 15.11.2017 21:35 >> Betreff: Re: [gpfsug-discuss] Write performances and filesystem size >> >> ------------------------------------------------------------------------ >> >> Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same >> number of spindles for any size filesystem, so I would also expect them >> to perform the same. >> >> >> >> -jf >> >> >> ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser > >: >> >> to add a comment ... .. very simply... depending on how you >> allocate the physical block storage .... if you - simply - using >> less physical resources when reducing the capacity (in the same >> ratio) .. you get , what you see.... >> >> so you need to tell us, how you allocate your block-storage .. (Do >> you using RAID controllers , where are your LUNs coming from, are >> then less RAID groups involved, when reducing the capacity ?...) >> >> GPFS can be configured to give you pretty as much as what the >> hardware can deliver.. if you reduce resource.. ... you'll get less >> , if you enhance your hardware .. you get more... almost regardless >> of the total capacity in #blocks .. 
>> >> >> >> >> >> >> From: "Kumaran Rajaram" > > >> To: gpfsug main discussion list >> > > >> Date: 11/15/2017 11:56 AM >> Subject: Re: [gpfsug-discuss] Write performances and >> filesystem size >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> > ------------------------------------------------------------------------ >> >> >> >> Hi, >> >> >>Am I missing something? Is this an expected behaviour and someone >> has an explanation for this? >> >> Based on your scenario, write degradation as the file-system is >> populated is possible if you had formatted the file-system with "-j >> cluster". >> >> For consistent file-system performance, we recommend *mmcrfs "-j >> scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is >> set properly. >> >> [snip from mmcrfs]/ >> # mmlsfs | egrep 'Block allocation| Estimated number' >> -j scatter Block allocation type >> -n 128 Estimated number of >> nodes that will mount file system/ >> [/snip] >> >> >> [snip from man mmcrfs]/ >> *layoutMap={scatter|*//*cluster}*// >> Specifies the block allocation map type. When >> allocating blocks for a given file, GPFS first >> uses a round?robin algorithm to spread the data >> across all disks in the storage pool. After a >> disk is selected, the location of the data >> block on the disk is determined by the block >> allocation map type*. If cluster is >> specified, GPFS attempts to allocate blocks in >> clusters. Blocks that belong to a particular >> file are kept adjacent to each other within >> each cluster. If scatter is specified, >> the location of the block is chosen randomly.*/ >> / >> * The cluster allocation method may provide >> better disk performance for some disk >> subsystems in relatively small installations. >> The benefits of clustered block allocation >> diminish when the number of nodes in the >> cluster or the number of disks in a file system >> increases, or when the file system?s free space >> becomes fragmented. *//The *cluster*// >> allocation method is the default for GPFS >> clusters with eight or fewer nodes and for file >> systems with eight or fewer disks./ >> / >> *The scatter allocation method provides >> more consistent file system performance by >> averaging out performance variations due to >> block location (for many disk subsystems, the >> location of the data relative to the disk edge >> has a substantial effect on performance).*//This >> allocation method is appropriate in most cases >> and is the default for GPFS clusters with more >> than eight nodes or file systems with more than >> eight disks./ >> / >> The block allocation map type cannot be changed >> after the storage pool has been created./ >> >> */ >> -n/*/*NumNodes*// >> The estimated number of nodes that will mount the file >> system in the local cluster and all remote clusters. >> This is used as a best guess for the initial size of >> some file system data structures. The default is 32. >> This value can be changed after the file system has been >> created but it does not change the existing data >> structures. Only the newly created data structure is >> affected by the new value. For example, new storage >> pool./ >> / >> When you create a GPFS file system, you might want to >> overestimate the number of nodes that will mount the >> file system. 
GPFS uses this information for creating >> data structures that are essential for achieving maximum >> parallelism in file system operations (For more >> information, see GPFS architecture in IBM Spectrum >> Scale: Concepts, Planning, and Installation Guide ). If >> you are sure there will never be more than 64 nodes, >> allow the default value to be applied. If you are >> planning to add nodes to your system, you should specify >> a number larger than the default./ >> >> [/snip from man mmcrfs] >> >> Regards, >> -Kums >> >> >> >> >> >> From: Ivano Talamo > > >> To: > > >> Date: 11/15/2017 11:25 AM >> Subject: [gpfsug-discuss] Write performances and filesystem > size >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> > ------------------------------------------------------------------------ >> >> >> >> Hello everybody, >> >> together with my colleagues we are actually running some tests on > a new >> DSS G220 system and we see some unexpected behaviour. >> >> What we actually see is that write performances (we did not test read >> yet) decreases with the decrease of filesystem size. >> >> I will not go into the details of the tests, but here are some > numbers: >> >> - with a filesystem using the full 1.2 PB space we get 14 GB/s as the >> sum of the disk activity on the two IO servers; >> - with a filesystem using half of the space we get 10 GB/s; >> - with a filesystem using 1/4 of the space we get 5 GB/s. >> >> We also saw that performances are not affected by the vdisks layout, >> ie. >> taking the full space with one big vdisk or 2 half-size vdisks per RG >> gives the same performances. >> >> To our understanding the IO should be spread evenly across all the >> pdisks in the declustered array, and looking at iostat all disks >> seem to >> be accessed. But so there must be some other element that affects >> performances. >> >> Am I missing something? Is this an expected behaviour and someone >> has an >> explanation for this? 
>> >> Thank you, >> Ivano >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org _ >> > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From sandeep.patil at in.ibm.com Thu Nov 16 14:45:18 2017 From: sandeep.patil at in.ibm.com (Sandeep Ramesh) Date: Thu, 16 Nov 2017 20:15:18 +0530 Subject: [gpfsug-discuss] Latest Technical Blogs on Spectrum Scale Message-ID: Dear User Group members, Here are the Development Blogs in last 3 months on Spectrum Scale Technical Topics. Spectrum Scale Monitoring ? Know More ? https://developer.ibm.com/storage/2017/11/16/spectrum-scale-monitoring-know/ IBM Spectrum Scale 5.0 Release ? What?s coming ! https://developer.ibm.com/storage/2017/11/14/ibm-spectrum-scale-5-0-release-whats-coming/ Four Essentials things to know for managing data ACLs on IBM Spectrum Scale? from Windows https://developer.ibm.com/storage/2017/11/13/four-essentials-things-know-managing-data-acls-ibm-spectrum-scale-windows/ GSSUTILS: A new way of running SSR, Deploying or Upgrading ESS Server https://developer.ibm.com/storage/2017/11/13/gssutils/ IBM Spectrum Scale Object Authentication https://developer.ibm.com/storage/2017/11/02/spectrum-scale-object-authentication/ Video Surveillance ? Choosing the right storage https://developer.ibm.com/storage/2017/11/02/video-surveillance-choosing-right-storage/ IBM Spectrum scale object deep dive training with problem determination https://www.slideshare.net/SmitaRaut/ibm-spectrum-scale-object-deep-dive-training Spectrum Scale as preferred software defined storage for Ubuntu OpenStack https://developer.ibm.com/storage/2017/09/29/spectrum-scale-preferred-software-defined-storage-ubuntu-openstack/ IBM Elastic Storage Server 2U24 Storage ? an All-Flash offering, a performance workhorse https://developer.ibm.com/storage/2017/10/06/ess-5-2-flash-storage/ A Complete Guide to Configure LDAP-based authentication with IBM Spectrum Scale? 
for File Access https://developer.ibm.com/storage/2017/09/21/complete-guide-configure-ldap-based-authentication-ibm-spectrum-scale-file-access/ Deploying IBM Spectrum Scale on AWS Quick Start https://developer.ibm.com/storage/2017/09/18/deploy-ibm-spectrum-scale-on-aws-quick-start/ Monitoring Spectrum Scale Object metrics https://developer.ibm.com/storage/2017/09/14/monitoring-spectrum-scale-object-metrics/ Tier your data with ease to Spectrum Scale Private Cloud(s) using Moonwalk Universal https://developer.ibm.com/storage/2017/09/14/tier-data-ease-spectrum-scale-private-clouds-using-moonwalk-universal/ Why do I see owner as ?Nobody? for my export mounted using NFSV4 Protocol on IBM Spectrum Scale?? https://developer.ibm.com/storage/2017/09/08/see-owner-nobody-export-mounted-using-nfsv4-protocol-ibm-spectrum-scale/ IBM Spectrum Scale? Authentication using Active Directory and LDAP https://developer.ibm.com/storage/2017/09/01/ibm-spectrum-scale-authentication-using-active-directory-ldap/ IBM Spectrum Scale? Authentication using Active Directory and RFC2307 https://developer.ibm.com/storage/2017/09/01/ibm-spectrum-scale-authentication-using-active-directory-rfc2307/ High Availability Implementation with IBM Spectrum Virtualize and IBM Spectrum Scale https://developer.ibm.com/storage/2017/08/30/high-availability-implementation-ibm-spectrum-virtualize-ibm-spectrum-scale/ 10 Frequently asked Questions on configuring Authentication using AD + AUTO ID mapping on IBM Spectrum Scale?. https://developer.ibm.com/storage/2017/08/04/10-frequently-asked-questions-configuring-authentication-using-ad-auto-id-mapping-ibm-spectrum-scale/ IBM Spectrum Scale? Authentication using Active Directory https://developer.ibm.com/storage/2017/07/30/ibm-spectrum-scale-auth-using-active-directory/ Five cool things that you didn?t know Transparent Cloud Tiering on Spectrum Scale can do https://developer.ibm.com/storage/2017/07/29/five-cool-things-didnt-know-transparent-cloud-tiering-spectrum-scale-can/ IBM Spectrum Scale GUI videos https://developer.ibm.com/storage/2017/07/25/ibm-spectrum-scale-gui-videos/ IBM Spectrum Scale? Authentication ? Planning for NFS Access https://developer.ibm.com/storage/2017/07/24/ibm-spectrum-scale-planning-nfs-access/ For more : Search /browse here: https://developer.ibm.com/storage/blog Consolidation list: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/White%20Papers%20%26%20Media -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Thu Nov 16 16:08:18 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 11:08:18 -0500 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From aelkhouly at sidra.org Thu Nov 16 18:40:51 2017 From: aelkhouly at sidra.org (Ahmad El Khouly) Date: Thu, 16 Nov 2017 18:40:51 +0000 Subject: [gpfsug-discuss] GPFS long waiter Message-ID: <66C328F7-94E9-474F-8AE4-7A4A50DF70E7@sidra.org> Hello all I?m facing long waiter issue and I could not find any way to clear it, I can see all filesystems are responsive and look normal but I can not perform any GPFS commands like mmdf or adding or removing any vdisk, could you please advise how to show more details about this waiter and which pool it is talking about? and any workaround to clear it. 
0x7FA0446BF1A0 ( 27706) waiting 20634.654553503 seconds, TSDFCmdThread: on ThCond 0x1803173EE10 (0xFFFFC9003173EE10) (AllocManagerCond), reason 'waiting for pool freeSpace recovery' Ahmed M. Elkhouly Systems Administrator, Scientific Computing Bioinformatics Division Disclaimer: This email and its attachments may be confidential and are intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient, any reading, printing, storage, disclosure, copying or any other action taken in respect of this e-mail is prohibited and may be unlawful. If you are not the intended recipient, please notify the sender immediately by using the reply function and then permanently delete what you have received. Any views or opinions expressed are solely those of the author and do not necessarily represent those of Sidra Medical and Research Center. -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Thu Nov 16 23:51:39 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 18:51:39 -0500 Subject: [gpfsug-discuss] GPFS long waiter In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Fri Nov 17 13:03:48 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Fri, 17 Nov 2017 13:03:48 +0000 Subject: [gpfsug-discuss] GPFS long waiter Message-ID: Hi Ahmed You might take a look at the file system manager nodes (mmlsmgr) and see if any of them are having problems. It looks like some previous ?mmdf? command was launched and got hung up (and perhaps was terminated by ctrl-c) and the helper process is still running. I have seen mmdf get hung up before, and it?s (almost always) associated with the file system manager node in some way. And I?ve had a few PMRs open on this (vers 4.1, early 4.2) ? I have not seen this on any of the latest code levels) But, as Olaf states, getting a mmsnap and opening a PMR might be worthwhile ? what level of GPFS are you running on? Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Ahmad El Khouly Reply-To: gpfsug main discussion list Date: Thursday, November 16, 2017 at 12:41 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] GPFS long waiter I?m facing long waiter issue and I could not find any way to clear it, I can see all filesystems are responsive and look normal but I can not perform any GPFS commands like mmdf or adding or removing any vdisk, could you please advise how to show more details about this waiter and which pool it is talking about? and any workaround to clear it. 0x7FA0446BF1A0 ( 27706) waiting 20634.654553503 seconds, TSDFCmdThread: on ThCond 0x1803173EE10 (0xFFFFC9003173EE10) (AllocManagerCond), reason 'waiting for pool freeSpace recovery' -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 17 13:39:47 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 17 Nov 2017 14:39:47 +0100 Subject: [gpfsug-discuss] gpfs.so vfs samba module is missing Message-ID: Hello at all, anyone know in which package I can find the gpfs vfs module? Currently I am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module is still missing. Any ideas for me? 
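A generic way to double-check whether a given Samba build ships the module at all is to look in that build's module directory; the share stanza below is only an illustration of how vfs_gpfs is normally switched on, with made-up share and path names:

smbd -b | grep MODULESDIR                                           # where this smbd looks for its modules
find "$(smbd -b | awk '/MODULESDIR/ {print $2}')" -name gpfs.so     # VFS modules usually sit in a vfs/ subdirectory

# and, if gpfs.so is present, it is enabled per share in smb.conf, e.g.:
[projects]
    path = /gpfs/projects
    vfs objects = gpfs
    gpfs:sharemodes = yes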
Thanks, Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Fri Nov 17 16:51:02 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Fri, 17 Nov 2017 11:51:02 -0500 Subject: [gpfsug-discuss] gpfs.so vfs samba module is missing In-Reply-To: References: Message-ID: <8805.1510937462@turing-police.cc.vt.edu> On Fri, 17 Nov 2017 14:39:47 +0100, Matthias.Knigge at rohde-schwarz.com said: > anyone know in which package I can find the gpfs vfs module? Currently I > am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package > provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module > is still missing. If you're running the IBM-supported protocols server config, you want the rpm 'gpfs.smb'. If you're trying to build your own, your best bet is to punt and install the IBM code. ;) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 17 19:04:03 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 17 Nov 2017 20:04:03 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** In-Reply-To: <8805.1510937462@turing-police.cc.vt.edu> References: <8805.1510937462@turing-police.cc.vt.edu> Message-ID: https://manpages.debian.org/testing/samba-vfs-modules/vfs_gpfs.8.en.html I do not think so, the module is a part of samba. I installed the package gpfs.smb too but with the same result. Before I use the normal version of samba I used the version of sernet. There was the module available. Now I am working with CentOS 7.3 and samba of the offical repository of CentOS. Thanks, Matthias Von: valdis.kletnieks at vt.edu An: gpfsug main discussion list Datum: 17.11.2017 17:51 Betreff: [Newsletter] Re: [gpfsug-discuss] gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** Gesendet von: gpfsug-discuss-bounces at spectrumscale.org On Fri, 17 Nov 2017 14:39:47 +0100, Matthias.Knigge at rohde-schwarz.com said: > anyone know in which package I can find the gpfs vfs module? Currently I > am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package > provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module > is still missing. If you're running the IBM-supported protocols server config, you want the rpm 'gpfs.smb'. If you're trying to build your own, your best bet is to punt and install the IBM code. ;) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss [Anhang "RohdeSchwarzSecure_E-Mail.html" gel?scht von Matthias Knigge/DVS] -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Nov 17 19:45:30 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 17 Nov 2017 19:45:30 +0000 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** In-Reply-To: References: , <8805.1510937462@turing-police.cc.vt.edu> Message-ID: An HTML attachment was scrubbed... 
URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 17 19:50:27 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 17 Nov 2017 20:50:27 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing In-Reply-To: References: , <8805.1510937462@turing-police.cc.vt.edu> Message-ID: That helps me! Thanks! Von: "Christof Schmitt" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 17.11.2017 20:45 Betreff: [Newsletter] Re: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Whether the gpfs.so module is included depends on each Samba build. Samba provided by Linux distributions typically does not include the gpfs.so module. Sernet package include it. The gpfs.smb Samba build we use in Spectrum Scale also obviously includes the gpfs.so module: # rpm -ql gpfs.smb | grep gpfs.so /usr/lpp/mmfs/lib64/samba/vfs/gpfs.so The main point from a Spectrum Scale point of view: Spectrum Scale only supports the Samba from the gpfs.smb package that was provided with the product. Using any other Samba version is outside of the scope of Spectrum Scale support. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: Matthias.Knigge at rohde-schwarz.com Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** Date: Fri, Nov 17, 2017 12:04 PM https://manpages.debian.org/testing/samba-vfs-modules/vfs_gpfs.8.en.html I do not think so, the module is a part of samba. I installed the package gpfs.smb too but with the same result. Before I use the normal version of samba I used the version of sernet. There was the module available. Now I am working with CentOS 7.3 and samba of the offical repository of CentOS. Thanks, Matthias Von: valdis.kletnieks at vt.edu An: gpfsug main discussion list Datum: 17.11.2017 17:51 Betreff: [Newsletter] Re: [gpfsug-discuss] gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** Gesendet von: gpfsug-discuss-bounces at spectrumscale.org On Fri, 17 Nov 2017 14:39:47 +0100, Matthias.Knigge at rohde-schwarz.com said: > anyone know in which package I can find the gpfs vfs module? Currently I > am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package > provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module > is still missing. If you're running the IBM-supported protocols server config, you want the rpm 'gpfs.smb'. If you're trying to build your own, your best bet is to punt and install the IBM code. 
;) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss [Anhang "RohdeSchwarzSecure_E-Mail.html" gel?scht von Matthias Knigge/DVS] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=M1Ebd4GVVmaCFs3t0xgGUpgZUM9CzrxWR9I6cvzUqns&s=ONPhff8MP60AoglpZvh9xBAPlV98nW-SmuWoN4EVzUk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcus at koenighome.de Wed Nov 22 03:13:18 2017 From: marcus at koenighome.de (Marcus Koenig) Date: Wed, 22 Nov 2017 16:13:18 +1300 Subject: [gpfsug-discuss] setxattr via policy Message-ID: Hi there, I've got a question around setting userdefined extended attributes. I have played around a bit with setting certain attributes via mmchattr - but now I want to run a policy to do this for me for certain filesets or file sizes. How would I write my policy to set an attribute like user.testflag1=projectX on a number of files in a fileset that are bigger than 1G for example? Thanks folks. Cheers, Marcus -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Wed Nov 22 06:23:08 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 22 Nov 2017 07:23:08 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] setxattr via policy In-Reply-To: References: Message-ID: Good morning, take a look in this directory: cd /usr/lpp/mmfs/samples/ilm/ mmfind or rather tr_findToPol.pl could help you to create a rule/policy. Regards, Matthias Von: Marcus Koenig An: gpfsug-discuss at spectrumscale.org Datum: 22.11.2017 04:13 Betreff: [Newsletter] [gpfsug-discuss] setxattr via policy Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi there, I've got a question around setting userdefined extended attributes. I have played around a bit with setting certain attributes via mmchattr - but now I want to run a policy to do this for me for certain filesets or file sizes. How would I write my policy to set an attribute like user.testflag1=projectX on a number of files in a fileset that are bigger than 1G for example? Thanks folks. Cheers, Marcus_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Wed Nov 22 08:23:22 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Wed, 22 Nov 2017 09:23:22 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Hello Olaf, thank you for your reply and for confirming that this is not expected, as we also thought. We did repeat the test with 2 vdisks only without dedicated ones for metadata but the result did not change. We now opened a PMR. Thanks, Ivano Il 16/11/17 17:08, Olaf Weiser ha scritto: > Hi Ivano, > so from this output, the performance degradation is not explainable .. > in my current environments.. 
, having multiple file systems (so vdisks > on one BB) .. and it works fine .. > > as said .. just open a PMR.. I would'nt consider this as the "expected > behavior" > the only thing is.. the MD disks are a bit small.. so maybe redo your > tests and for a simple compare between 1/2 1/1 or 1/4 capacity test > with 2 vdisks only and /dataAndMetadata/ > cheers > > > > > > From: Ivano Talamo > To: gpfsug main discussion list > Date: 11/16/2017 08:52 AM > Subject: Re: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > Hi, > > as additional information I past the recovery group information in the > full and half size cases. > In both cases: > - data is on sf_g_01_vdisk01 > - metadata on sf_g_01_vdisk02 > - sf_g_01_vdisk07 is not used in the filesystem. > > This is with the full-space filesystem: > > declustered current allowable > recovery group arrays vdisks pdisks format version format > version > ----------------- ----------- ------ ------ -------------- > -------------- > sf-g-01 3 6 86 4.2.2.0 4.2.2.0 > > > declustered needs replace > scrub background activity > array service vdisks pdisks spares threshold free space > duration task progress priority > ----------- ------- ------ ------ ------ --------- ---------- > -------- ------------------------- > NVR no 1 2 0,0 1 3632 MiB > 14 days scrub 95% low > DA1 no 4 83 2,44 1 57 TiB > 14 days scrub 0% low > SSD no 1 1 0,0 1 372 GiB > 14 days scrub 79% low > > declustered > checksum > vdisk RAID code array vdisk size block > size granularity state remarks > ------------------ ------------------ ----------- ---------- > ---------- ----------- ----- ------- > sf_g_01_logTip 2WayReplication NVR 48 MiB 2 > MiB 4096 ok logTip > sf_g_01_logTipBackup Unreplicated SSD 48 MiB > 2 MiB 4096 ok logTipBackup > sf_g_01_logHome 4WayReplication DA1 144 GiB 2 > MiB 4096 ok log > sf_g_01_vdisk02 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk07 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk01 8+2p DA1 540 TiB 16 > MiB 32 KiB ok > > config data declustered array spare space remarks > ------------------ ------------------ ------------- ------- > rebuild space DA1 53 pdisk > increasing VCD spares is suggested > > config data disk group fault tolerance remarks > ------------------ --------------------------------- ------- > rg descriptor 1 enclosure + 1 drawer + 2 pdisk limited by > rebuild space > system index 1 enclosure + 1 drawer + 2 pdisk limited by > rebuild space > > vdisk disk group fault tolerance remarks > ------------------ --------------------------------- ------- > sf_g_01_logTip 1 pdisk > sf_g_01_logTipBackup 0 pdisk > sf_g_01_logHome 1 enclosure + 1 drawer + 1 pdisk limited by > rebuild space > sf_g_01_vdisk02 1 enclosure + 1 drawer limited by > rebuild space > sf_g_01_vdisk07 1 enclosure + 1 drawer limited by > rebuild space > sf_g_01_vdisk01 2 pdisk > > > This is with the half-space filesystem: > > declustered current allowable > recovery group arrays vdisks pdisks format version format > version > ----------------- ----------- ------ ------ -------------- > -------------- > sf-g-01 3 6 86 4.2.2.0 4.2.2.0 > > > declustered needs replace > scrub background activity > array service vdisks pdisks spares threshold free space > duration task progress priority > ----------- ------- ------ ------ ------ --------- ---------- > -------- ------------------------- > NVR no 1 2 0,0 1 3632 MiB > 
14 days scrub 4% low > DA1 no 4 83 2,44 1 395 TiB > 14 days scrub 0% low > SSD no 1 1 0,0 1 372 GiB > 14 days scrub 79% low > > declustered > checksum > vdisk RAID code array vdisk size block > size granularity state remarks > ------------------ ------------------ ----------- ---------- > ---------- ----------- ----- ------- > sf_g_01_logTip 2WayReplication NVR 48 MiB 2 > MiB 4096 ok logTip > sf_g_01_logTipBackup Unreplicated SSD 48 MiB > 2 MiB 4096 ok logTipBackup > sf_g_01_logHome 4WayReplication DA1 144 GiB 2 > MiB 4096 ok log > sf_g_01_vdisk02 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk07 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk01 8+2p DA1 270 TiB 16 > MiB 32 KiB ok > > config data declustered array spare space remarks > ------------------ ------------------ ------------- ------- > rebuild space DA1 68 pdisk > increasing VCD spares is suggested > > config data disk group fault tolerance remarks > ------------------ --------------------------------- ------- > rg descriptor 1 node + 3 pdisk limited by > rebuild space > system index 1 node + 3 pdisk limited by > rebuild space > > vdisk disk group fault tolerance remarks > ------------------ --------------------------------- ------- > sf_g_01_logTip 1 pdisk > sf_g_01_logTipBackup 0 pdisk > sf_g_01_logHome 1 node + 2 pdisk limited by > rebuild space > sf_g_01_vdisk02 1 node + 1 pdisk limited by > rebuild space > sf_g_01_vdisk07 1 node + 1 pdisk limited by > rebuild space > sf_g_01_vdisk01 2 pdisk > > > Thanks, > Ivano > > > > > Il 16/11/17 13:03, Olaf Weiser ha scritto: >> Rjx, that makes it a bit clearer.. as your vdisk is big enough to span >> over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... >> should bring the same performance. .. >> >> You mean something about vdisk Layout. .. >> So in your test, for the full capacity test, you use just one vdisk per >> RG - so 2 in total for 'data' - right? >> >> What about Md .. did you create separate vdisk for MD / what size then >> ? >> >> Gesendet von IBM Verse >> >> Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem >> size --- >> >> Von: "Ivano Talamo" >> An: "gpfsug main discussion list" > >> Datum: Do. 16.11.2017 03:49 >> Betreff: Re: [gpfsug-discuss] Write performances and > filesystem size >> >> ------------------------------------------------------------------------ >> >> Hello Olaf, >> >> yes, I confirm that is the Lenovo version of the ESS GL2, so 2 >> enclosures/4 drawers/166 disks in total. >> >> Each recovery group has one declustered array with all disks inside, so >> vdisks use all the physical ones, even in the case of a vdisk that is >> 1/4 of the total size. >> >> Regarding the layout allocation we used scatter. >> >> The tests were done on the just created filesystem, so no close-to-full >> effect. And we run gpfsperf write seq. >> >> Thanks, >> Ivano >> >> >> Il 16/11/17 04:42, Olaf Weiser ha scritto: >>> Sure... as long we assume that really all physical disk are used .. the >>> fact that was told 1/2 or 1/4 might turn out that one / two complet >>> enclosures 're eliminated ... ? ..that s why I was asking for more >>> details .. >>> >>> I dont see this degration in my environments. . as long the vdisks are >>> big enough to span over all pdisks ( which should be the case for >>> capacity in a range of TB ) ... 
the performance stays the same >>> >>> Gesendet von IBM Verse >>> >>> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and >>> filesystem size --- >>> >>> Von: "Jan-Frode Myklebust" >>> An: "gpfsug main discussion list" >>> Datum: Mi. 15.11.2017 21:35 >>> Betreff: Re: [gpfsug-discuss] Write performances and filesystem size >>> >>> ------------------------------------------------------------------------ >>> >>> Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same >>> number of spindles for any size filesystem, so I would also expect them >>> to perform the same. >>> >>> >>> >>> -jf >>> >>> >>> ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >> >: >>> >>> to add a comment ... .. very simply... depending on how you >>> allocate the physical block storage .... if you - simply - using >>> less physical resources when reducing the capacity (in the same >>> ratio) .. you get , what you see.... >>> >>> so you need to tell us, how you allocate your block-storage .. (Do >>> you using RAID controllers , where are your LUNs coming from, are >>> then less RAID groups involved, when reducing the capacity ?...) >>> >>> GPFS can be configured to give you pretty as much as what the >>> hardware can deliver.. if you reduce resource.. ... you'll get less >>> , if you enhance your hardware .. you get more... almost regardless >>> of the total capacity in #blocks .. >>> >>> >>> >>> >>> >>> >>> From: "Kumaran Rajaram" >> > >>> To: gpfsug main discussion list >>> >> > >>> Date: 11/15/2017 11:56 AM >>> Subject: Re: [gpfsug-discuss] Write performances and >>> filesystem size >>> Sent by: gpfsug-discuss-bounces at spectrumscale.org >>> >>> >> ------------------------------------------------------------------------ >>> >>> >>> >>> Hi, >>> >>> >>Am I missing something? Is this an expected behaviour and someone >>> has an explanation for this? >>> >>> Based on your scenario, write degradation as the file-system is >>> populated is possible if you had formatted the file-system with "-j >>> cluster". >>> >>> For consistent file-system performance, we recommend *mmcrfs "-j >>> scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is >>> set properly. >>> >>> [snip from mmcrfs]/ >>> # mmlsfs | egrep 'Block allocation| Estimated number' >>> -j scatter Block allocation type >>> -n 128 Estimated number of >>> nodes that will mount file system/ >>> [/snip] >>> >>> >>> [snip from man mmcrfs]/ >>> *layoutMap={scatter|*//*cluster}*// >>> Specifies the block allocation map type. When >>> allocating blocks for a given file, GPFS first >>> uses a round?robin algorithm to spread the data >>> across all disks in the storage pool. After a >>> disk is selected, the location of the data >>> block on the disk is determined by the block >>> allocation map type*. If cluster is >>> specified, GPFS attempts to allocate blocks in >>> clusters. Blocks that belong to a particular >>> file are kept adjacent to each other within >>> each cluster. If scatter is specified, >>> the location of the block is chosen randomly.*/ >>> / >>> * The cluster allocation method may provide >>> better disk performance for some disk >>> subsystems in relatively small installations. >>> The benefits of clustered block allocation >>> diminish when the number of nodes in the >>> cluster or the number of disks in a file system >>> increases, or when the file system?s free space >>> becomes fragmented. 
*//The *cluster*// >>> allocation method is the default for GPFS >>> clusters with eight or fewer nodes and for file >>> systems with eight or fewer disks./ >>> / >>> *The scatter allocation method provides >>> more consistent file system performance by >>> averaging out performance variations due to >>> block location (for many disk subsystems, the >>> location of the data relative to the disk edge >>> has a substantial effect on performance).*//This >>> allocation method is appropriate in most cases >>> and is the default for GPFS clusters with more >>> than eight nodes or file systems with more than >>> eight disks./ >>> / >>> The block allocation map type cannot be changed >>> after the storage pool has been created./ >>> >>> */ >>> -n/*/*NumNodes*// >>> The estimated number of nodes that will mount the file >>> system in the local cluster and all remote clusters. >>> This is used as a best guess for the initial size of >>> some file system data structures. The default is 32. >>> This value can be changed after the file system has been >>> created but it does not change the existing data >>> structures. Only the newly created data structure is >>> affected by the new value. For example, new storage >>> pool./ >>> / >>> When you create a GPFS file system, you might want to >>> overestimate the number of nodes that will mount the >>> file system. GPFS uses this information for creating >>> data structures that are essential for achieving maximum >>> parallelism in file system operations (For more >>> information, see GPFS architecture in IBM Spectrum >>> Scale: Concepts, Planning, and Installation Guide ). If >>> you are sure there will never be more than 64 nodes, >>> allow the default value to be applied. If you are >>> planning to add nodes to your system, you should specify >>> a number larger than the default./ >>> >>> [/snip from man mmcrfs] >>> >>> Regards, >>> -Kums >>> >>> >>> >>> >>> >>> From: Ivano Talamo >> > >>> To: >> > >>> Date: 11/15/2017 11:25 AM >>> Subject: [gpfsug-discuss] Write performances and filesystem >> size >>> Sent by: gpfsug-discuss-bounces at spectrumscale.org >>> >>> >> ------------------------------------------------------------------------ >>> >>> >>> >>> Hello everybody, >>> >>> together with my colleagues we are actually running some tests on >> a new >>> DSS G220 system and we see some unexpected behaviour. >>> >>> What we actually see is that write performances (we did not test read >>> yet) decreases with the decrease of filesystem size. >>> >>> I will not go into the details of the tests, but here are some >> numbers: >>> >>> - with a filesystem using the full 1.2 PB space we get 14 GB/s as the >>> sum of the disk activity on the two IO servers; >>> - with a filesystem using half of the space we get 10 GB/s; >>> - with a filesystem using 1/4 of the space we get 5 GB/s. >>> >>> We also saw that performances are not affected by the vdisks layout, >>> ie. >>> taking the full space with one big vdisk or 2 half-size vdisks per RG >>> gives the same performances. >>> >>> To our understanding the IO should be spread evenly across all the >>> pdisks in the declustered array, and looking at iostat all disks >>> seem to >>> be accessed. But so there must be some other element that affects >>> performances. >>> >>> Am I missing something? Is this an expected behaviour and someone >>> has an >>> explanation for this? 
>>> >>> Thank you, >>> Ivano >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >_ >>> >> > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org > >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org > >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From jtucker at pixitmedia.com Wed Nov 22 09:20:55 2017 From: jtucker at pixitmedia.com (Jez Tucker) Date: Wed, 22 Nov 2017 09:20:55 +0000 Subject: [gpfsug-discuss] setxattr via policy In-Reply-To: References: Message-ID: <7b426e0a-2096-ff6a-f9f1-8eeda7114f11@pixitmedia.com> Hi Marcus, ? Something like this should do you: RULE 'setxattr' LIST 'do_setxattr' FOR FILESET ('xattrfileset') WEIGHT(DIRECTORY_HASH) ACTION(SETXATTR('user.testflag1','projectX')) WHERE ??? KB_ALLOCATED >? [insert required file size limit] Then with one file larger and another file smaller than the limit: root at elmo:/mmfs1/policies# getfattr -n user.testflag1 /mmfs1/data/xattrfileset/* getfattr: Removing leading '/' from absolute path names # file: mmfs1/data/xattrfileset/file.1 user.testflag1="projectX" /mmfs1/data/xattrfileset/file.2: user.testflag1: No such attribute As xattrs are a superb way of automating data operations, for those of you with our Python API have a look over the xattr examples in the git repo: https://github.com/arcapix/gpfsapi-examples as an alternative Pythonic way to achieve this. Cheers, Jez On 22/11/17 03:13, Marcus Koenig wrote: > Hi there, > > I've got a question around setting userdefined extended attributes. I > have played around a bit with setting certain attributes via mmchattr > - but now I want to run a policy to do this for me for certain > filesets or file sizes. > > How would I write my policy to set an attribute like > user.testflag1=projectX on a number of files in a fileset that are > bigger than 1G for example? > > Thanks folks. 
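For the specific 1G threshold in the question, a minimal variant of the rule above (fileset name, attribute value and policy file name are placeholders) might look like this, run through mmapplypolicy:

   RULE 'setxattr_1g' LIST 'do_setxattr'
     FOR FILESET ('xattrfileset')
     ACTION(SETXATTR('user.testflag1','projectX'))
     WHERE KB_ALLOCATED > 1048576   /* 1 GiB expressed in KB */

   mmapplypolicy /mmfs1/data/xattrfileset -P setxattr_1g.pol

KB_ALLOCATED is the space the file occupies on disk; if the threshold should apply to the file's logical size instead, FILE_SIZE can be used in the WHERE clause.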
> > Cheers, > Marcus > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- *Jez Tucker* Head of Research and Development, Pixit Media 07764193820 | jtucker at pixitmedia.com www.pixitmedia.com | Tw:@pixitmedia.com -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcus at koenighome.de Wed Nov 22 09:28:56 2017 From: marcus at koenighome.de (Marcus Koenig) Date: Wed, 22 Nov 2017 22:28:56 +1300 Subject: [gpfsug-discuss] setxattr via policy In-Reply-To: <7b426e0a-2096-ff6a-f9f1-8eeda7114f11@pixitmedia.com> References: <7b426e0a-2096-ff6a-f9f1-8eeda7114f11@pixitmedia.com> Message-ID: Thanks guys - will test it now - much appreciated. On Wed, Nov 22, 2017 at 10:20 PM, Jez Tucker wrote: > Hi Marcus, > > Something like this should do you: > > RULE 'setxattr' LIST 'do_setxattr' > FOR FILESET ('xattrfileset') > WEIGHT(DIRECTORY_HASH) > ACTION(SETXATTR('user.testflag1','projectX')) > WHERE > KB_ALLOCATED > [insert required file size limit] > > > Then with one file larger and another file smaller than the limit: > > root at elmo:/mmfs1/policies# getfattr -n user.testflag1 > /mmfs1/data/xattrfileset/* > getfattr: Removing leading '/' from absolute path names > # file: mmfs1/data/xattrfileset/file.1 > user.testflag1="projectX" > > /mmfs1/data/xattrfileset/file.2: user.testflag1: No such attribute > > > As xattrs are a superb way of automating data operations, for those of you > with our Python API have a look over the xattr examples in the git repo: > https://github.com/arcapix/gpfsapi-examples as an alternative Pythonic > way to achieve this. > > Cheers, > > Jez > > > > > On 22/11/17 03:13, Marcus Koenig wrote: > > Hi there, > > I've got a question around setting userdefined extended attributes. I have > played around a bit with setting certain attributes via mmchattr - but now > I want to run a policy to do this for me for certain filesets or file sizes. > > How would I write my policy to set an attribute like > user.testflag1=projectX on a number of files in a fileset that are bigger > than 1G for example? > > Thanks folks. > > Cheers, > Marcus > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > -- > *Jez Tucker* > Head of Research and Development, Pixit Media > 07764193820 <07764%20193820> | jtucker at pixitmedia.com > www.pixitmedia.com | Tw:@pixitmedia.com > > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. 
Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Nov 22 16:51:27 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 22 Nov 2017 11:51:27 -0500 Subject: [gpfsug-discuss] setxattr via policy - extended attributes - tips and hints In-Reply-To: References: Message-ID: Assuming you have a recent version of Spectrum Scale... You can use ACTION(SetXattr(...)) in mmapplypolicy {MIGRATE,LIST} rules and/or in {SET POOL} rules that are evaluated at file creation time. Later... You can use WHERE .... Xattr(...) in any policy rules to test/compare an extended attribute. But watch out for NULL! See the "tips" section of the ILM chapter of the admin guide for some ways to deal with NULL (hints: COALESCE , expr IS NULL, expr IS NOT NULL, CASE ... ) See also mm{ch|ls}attr -d -X --hex-attr and so forth. Also can be used compatibly with {set|get}fattr on Linux --marc -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Thu Nov 23 06:21:10 2017 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Thu, 23 Nov 2017 06:21:10 +0000 Subject: [gpfsug-discuss] tar sparse file data loss Message-ID: Somehow this nugget of joy (that?s most definitely sarcasm, this really sucks) slipped past my radar: http://www-01.ibm.com/support/docview.wss?uid=isg1IV96475 Anyone know if there?s a fix in the 4.1 stream? In my opinion this is 100% a tar bug as the APAR suggests but GPFS has implemented a workaround. See this post from the tar mailing list: https://www.mail-archive.com/bug-tar at gnu.org/msg04209.html It looks like the troublesome code may still exist upstream: http://git.savannah.gnu.org/cgit/tar.git/tree/src/sparse.c#n273 No better way to ensure you?ll hit a problem than to assume you won?t :) -Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Thu Nov 23 23:02:46 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Thu, 23 Nov 2017 23:02:46 +0000 Subject: [gpfsug-discuss] tar sparse file data loss In-Reply-To: References: Message-ID: <61aa823e50ad4cf3a59de063528e6d12@exch1-cdc.nexus.csiro.au> I logged perhaps the original service request on this but must admit we haven?t tried it of late as have worked around the issue. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP] Sent: Thursday, 23 November 2017 4:21 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] tar sparse file data loss Somehow this nugget of joy (that?s most definitely sarcasm, this really sucks) slipped past my radar: http://www-01.ibm.com/support/docview.wss?uid=isg1IV96475 Anyone know if there?s a fix in the 4.1 stream? In my opinion this is 100% a tar bug as the APAR suggests but GPFS has implemented a workaround. 
See this post from the tar mailing list:
https://www.mail-archive.com/bug-tar at gnu.org/msg04209.html

It looks like the troublesome code may still exist upstream:
http://git.savannah.gnu.org/cgit/tar.git/tree/src/sparse.c#n273

No better way to ensure you'll hit a problem than to assume you won't :)

-Aaron
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From aaron.knister at gmail.com Sun Nov 26 18:00:37 2017
From: aaron.knister at gmail.com (Aaron Knister)
Date: Sun, 26 Nov 2017 13:00:37 -0500
Subject: [gpfsug-discuss] Online data migration tool
Message-ID: 

With the release of Scale 5.0 it's no secret that some of the performance features of 5.0 require a new disk format, and existing filesystems cannot be migrated in place to get these features.

There's also an issue for long time customers who have had Scale since before the 4.1 days, where filesystems created prior to (I think) 4.1 are not 4K aligned and thus cannot use 4K sector LUNs to hold metadata. At some point we're not going to be able to buy storage that's not got 4K sectors.

In both situations IBM has hamstrung its customer base with large filesystems by requiring them to undergo extremely disruptive and expensive filesystem migrations to either keep using their filesystem with new hardware or take advantage of new features. The expensive part comes from having to purchase new storage hardware in order to migrate the data.

My question is this: I know filesystem migration tools are complicated (I believe that's why customers purchase support), but why on earth are there no migration tools for these features? How are customers supposed to take the product seriously as a platform for long term storage when IBM is so willing to break the on-disk format and leave customers stuck, unable to replace aging storage hardware or leverage new features? What message does this send to customers who have had the product on site for over a decade?

There is at least one open RFE on this issue, and it has sat there for some time with no movement. That speaks volumes. Frankly I'm a little tired of bringing problems to the mailing list, being told to open RFEs, then having the RFEs denied or left to sit there stagnant.

-Aaron

From Greg.Lehmann at csiro.au Sun Nov 26 22:33:45 2017
From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au)
Date: Sun, 26 Nov 2017 22:33:45 +0000
Subject: [gpfsug-discuss] Online data migration tool
In-Reply-To: 
References: 
Message-ID: 

I personally don't think lack of a migration tool is a problem. I do think that 2 format changes in such quick succession is a problem. I am willing to migrate occasionally, but then the amount of data we have in GPFS is still small. I do value my data, so I'd trust a manual migration using standard tools that have been around for a while over a custom migration tool any day. This last format change seems fairly major to me, so doubly so in this case.

Trying to find a plus in this, maybe use it to test DR procedures at the same time. Apologies in advance to those that simply can't. I hope you get your migration tool. To IBM, thank you for making GPFS faster.
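The "standard tools" route is typically some variant of rsync run project by project or fileset by fileset (a rough sketch only; the paths are placeholders, and plain rsync will not carry GPFS NFSv4 ACLs, filesets or quotas, which have to be recreated separately):

   # first pass while the data is live, then repeat until the delta is small
   rsync -aHAX --numeric-ids /gpfs/old/projectA/ /gpfs/new/projectA/
   # final pass with writes quiesced, removing anything deleted at the source
   rsync -aHAX --numeric-ids --delete /gpfs/old/projectA/ /gpfs/new/projectA/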
Greg -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, 27 November 2017 4:01 AM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Online data migration tool With the release of Scale 5.0 it?s no secret that some of the performance features of 5.0 require a new disk format and existing filesystems cannot be migrated in place to get these features. There?s also an issue for long time customers who have had scale since before the 4.1 days where filesystems crested prior to I think 4.1 are not 4K aligned and thus cannot use 4K sector LUNs to hold metadata. At some point we?re not going to be able to buy storage that?s not got 4K sectors. In both situations IBM has hamstrung its customer base with large filesystems by requiring them to undergo extremely disruptive and expensive filesystem migrations to either keep using their filesystem with new hardware or take advantage of new features. The expensive part comes from having to purchase new storage hardware in order migrate the data. My question is this? I know filesystem migration tools are complicated (I believe that?s why customers purchase support) but why on earth are there no migration tools for these features? How are customers supposed to take the product seriously as a platform for long term storage when IBM is so willing to break the on disk format and leave customers stuck unable to replacing aging storage hardware or leverage new features? What message does this send to customers who have had the product on site for over a decade? There is at least one open RFE on this issue and has been for some time that has seen no movement. That speaks volumes. Frankly I?m a little tired of bringing problems to the mailing list, being told to open RFEs then having the RFEs denied or just sit there stagnant. -Aaron _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Sun Nov 26 22:39:48 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Sun, 26 Nov 2017 22:39:48 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: I agree that migration is not easy. We thought we might be able to accomplish it using SOBAR, but the block size has to match in the old and new file-systems. In fact mmfsd asserts if you try. I had a PMR open on this and was told SoBAR can only be used to restore to the same block size and they aren't going to fix it. (Seriously how many people using SOBAR for DR are likely to be able to restore to identical hardware?). Second we thought maybe AFM would help, but we use IFS and child dependent filesets and we can't replicate the structure in the AFM cache. Given there is no other supported way of moving data or converting file-systems, like you we are stuck with significant disruption when we want to replace some aging hardware next year. 
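For anyone who has not looked at SOBAR, the flow is roughly the following (a sketch only; the device and file names are placeholders, and the image backup/restore steps need a Spectrum Protect/HSM backend behind them):

   # on the source cluster: capture the file system configuration, then the metadata image
   mmbackupconfig gpfs_old -o /tmp/gpfs_old.config
   # ... followed by mmimgbackup to write the metadata image out ...

   # on the replacement hardware: recreate the configuration, then restore the image
   mmrestoreconfig gpfs_new -i /tmp/gpfs_old.config
   # ... followed by mmimgrestore and recall of the file data ...

It is at that restore step that the block size of the new file system currently has to line up with the old one, which is the assert described above.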
Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of aaron.knister at gmail.com [aaron.knister at gmail.com] Sent: 26 November 2017 18:00 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Online data migration tool With the release of Scale 5.0 it?s no secret that some of the performance features of 5.0 require a new disk format and existing filesystems cannot be migrated in place to get these features. There?s also an issue for long time customers who have had scale since before the 4.1 days where filesystems crested prior to I think 4.1 are not 4K aligned and thus cannot use 4K sector LUNs to hold metadata. At some point we?re not going to be able to buy storage that?s not got 4K sectors. In both situations IBM has hamstrung its customer base with large filesystems by requiring them to undergo extremely disruptive and expensive filesystem migrations to either keep using their filesystem with new hardware or take advantage of new features. The expensive part comes from having to purchase new storage hardware in order migrate the data. My question is this? I know filesystem migration tools are complicated (I believe that?s why customers purchase support) but why on earth are there no migration tools for these features? How are customers supposed to take the product seriously as a platform for long term storage when IBM is so willing to break the on disk format and leave customers stuck unable to replacing aging storage hardware or leverage new features? What message does this send to customers who have had the product on site for over a decade? There is at least one open RFE on this issue and has been for some time that has seen no movement. That speaks volumes. Frankly I?m a little tired of bringing problems to the mailing list, being told to open RFEs then having the RFEs denied or just sit there stagnant. -Aaron _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From abeattie at au1.ibm.com Sun Nov 26 22:46:13 2017 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Sun, 26 Nov 2017 22:46:13 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Mon Nov 27 14:56:56 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 27 Nov 2017 14:56:56 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: <1511794616.18554.121.camel@strath.ac.uk> On Sun, 2017-11-26 at 13:00 -0500, Aaron Knister wrote: > With the release of Scale 5.0 it?s no secret that some of the > performance features of 5.0 require a new disk format and existing > filesystems cannot be migrated in place to get these features.? > > There?s also an issue for long time customers who have had scale > since before the 4.1 days where filesystems crested prior to I think > 4.1 are not 4K aligned and thus cannot use 4K sector LUNs to hold > metadata. At some point we?re not going to be able to buy storage > that?s not got 4K sectors.? This has been going on since forever. We have had change to 64bit inodes for more than 2 billion files and the ability to mount on Windows. They are like 2.3 and 3.0 changes from memory going back around a decade now. 
I have a feeling there was another change for mounting HSM'ed file systems on Windows too. I just don't think IBM care. The answer has always been well just start again. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From yguvvala at cambridgecomputer.com Wed Nov 29 16:00:33 2017 From: yguvvala at cambridgecomputer.com (Yugendra Guvvala) Date: Wed, 29 Nov 2017 11:00:33 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1511794616.18554.121.camel@strath.ac.uk> References: <1511794616.18554.121.camel@strath.ac.uk> Message-ID: Hi, I am trying to understand the technical challenges to migrate to GPFS 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to see 5.0 release and hear about some promising features available. But not sure about complexity involved to migrate. ? Thanks, Yugi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 29 16:35:04 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 29 Nov 2017 16:35:04 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> Message-ID: <1511973304.18554.133.camel@strath.ac.uk> On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: > Hi,? > > I am trying to understand the technical challenges to migrate to GPFS > 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to > see 5.0 release and hear about some promising features available. But > not sure about complexity involved to migrate.? > Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more) then reformat your files system with the new disk format then restore all your data to your shiny new file system. Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From makaplan at us.ibm.com Wed Nov 29 16:37:02 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 29 Nov 2017 11:37:02 -0500 Subject: [gpfsug-discuss] SOBAR restore with new blocksize and/or inodesize In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> Message-ID: This redbook http://w3-03.ibm.com/support/techdocs/atsmastr.nsf/3af3af29ce1f19cf86256c7100727a9f/335d8a48048ea78d85258059006dad33/$FILE/SOBAR_Migration_SpectrumScale_v1.0.pdf has these and other hints: -B blocksize, should match the file system block size of the source system, but can also be larger (not smaller). To obtain the file system block size in the source system use the command: mmlsfs -B -i inodesize, should match the file system inode size of the source system, but can also be larger (not smaller). To obtain the inode size in the source system use the following command: mmlsfs -i. Note, in Spectrum Scale it is recommended to use a inodesize of 4K because this well aligns to disk I/O. Our tests have shown that having a greater inode size on the target than on the source works as well. If you really want to shrink the blocksize, some internal testing indicates that works also. 
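For reference, both values can be captured from the source file system before formatting the target (gpfs_source is a placeholder device name):

   mmlsfs gpfs_source -B    # file system block size
   mmlsfs gpfs_source -i    # inode size in bytes

The target then needs to be formatted with values at least as large, per the hints above.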
Shrinking the inodesize also works, although this will impact the efficiency of small file and extended attributes in-the-inode support. -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Wed Nov 29 16:39:25 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 29 Nov 2017 16:39:25 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1511973304.18554.133.camel@strath.ac.uk> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: Could we utilise free capacity in the existing filesystem and empty NSDs, create a new FS and AFM migrate data in stages? Terribly long winded and frought with danger and peril... do not pass go... ah, answered my own question. ? Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 29 November 2017 16:35 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Online data migration tool On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: > Hi, > > I am trying to understand the technical challenges to migrate to GPFS > 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to > see 5.0 release and hear about some promising features available. But > not sure about complexity involved to migrate. > Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more) then reformat your files system with the new disk format then restore all your data to your shiny new file system. Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From scottg at emailhosting.com Wed Nov 29 16:38:07 2017 From: scottg at emailhosting.com (scott) Date: Wed, 29 Nov 2017 11:38:07 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1511973304.18554.133.camel@strath.ac.uk> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: Question: Who at IBM is going to reach out to ESPN - a 24/7 online user - with >15PETABYTES of content? Asking customers to copy, reformat, copy back will just cause IBM to have to support the older version for a longer period of time Just my $.03 (adjusted for inflation) On 11/29/2017 11:35 AM, Jonathan Buzzard wrote: > On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: >> Hi, >> >> I am trying to understand the technical challenges to migrate to GPFS >> 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to >> see 5.0 release and hear about some promising features available. But >> not sure about complexity involved to migrate. >> > Oh that's simple. You copy all your data somewhere else (good luck if > you happen to have a few hundred TB or maybe a PB or more) then > reformat your files system with the new disk format then restore all > your data to your shiny new file system. 
> > Over the years there have been a number of these "reformats" to get all > the new shiny features, which is the cause of the grumbles because it > is not funny and most people don't have the disk space to just hold > another copy of the data, and even if they did it is extremely > disruptive. > > JAB. > From jonathan.buzzard at strath.ac.uk Wed Nov 29 16:47:27 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 29 Nov 2017 16:47:27 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: <1511974047.18554.135.camel@strath.ac.uk> On Wed, 2017-11-29 at 11:38 -0500, scott wrote: > Question: Who at IBM is going to reach out to ESPN - a 24/7 online > user? > - with >15PETABYTES of content? > > Asking customers to copy, reformat, copy back will just cause IBM to? > have to support the older version for a longer period of time > > Just my $.03 (adjusted for inflation) > Oh you can upgrade to 5.0, it's just if your file system was created with a previous version then you won't get to use all the new features.? I would imagine if you still had a file system created under 2.3 you could mount it on 5.0. Just you would be missing a bunch of features like support for more than 2 billion files, or the ability to mount in on Windows or ... JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From Kevin.Buterbaugh at Vanderbilt.Edu Wed Nov 29 16:51:51 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 29 Nov 2017 16:51:51 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> Hi All, Well, actually a year ago we started the process of doing pretty much what Richard describes below ? the exception being that we rsync?d data over to the new filesystem group by group. It was no fun but it worked. And now GPFS (and it will always be GPFS ? it will never be Spectrum Scale) version 5 is coming and there are compelling reasons to want to do the same thing over again ? despite the pain. Having said all that, I think it would be interesting to have someone from IBM give an explanation of why Apple can migrate millions of devices to a new filesystem with 99.999999% of the users never even knowing they did it ? but IBM can?t provide a way to migrate to a new filesystem ?in place.? And to be fair to IBM, they do ship AIX with root having a password and Apple doesn?t, so we all have our strengths and weaknesses! ;-) Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Nov 29, 2017, at 10:39 AM, Sobey, Richard A > wrote: Could we utilise free capacity in the existing filesystem and empty NSDs, create a new FS and AFM migrate data in stages? Terribly long winded and frought with danger and peril... do not pass go... ah, answered my own question. ? 
Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 29 November 2017 16:35 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Online data migration tool On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: Hi, I am trying to understand the technical challenges to migrate to GPFS 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to see 5.0 release and hear about some promising features available. But not sure about complexity involved to migrate. Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more) then reformat your files system with the new disk format then restore all your data to your shiny new file system. Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 29 16:55:46 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 29 Nov 2017 16:55:46 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> Message-ID: <1511974546.18554.138.camel@strath.ac.uk> On Wed, 2017-11-29 at 16:51 +0000, Buterbaugh, Kevin L wrote: [SNIP] > And now GPFS (and it will always be GPFS ? it will never be > Spectrum Scale) Splitter, its Tiger Shark forever ;-) JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From makaplan at us.ibm.com Wed Nov 29 17:37:29 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 29 Nov 2017 12:37:29 -0500 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk><1511973304.18554.133.camel@strath.ac.uk> Message-ID: Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfosburg at mdanderson.org Wed Nov 29 17:40:51 2017 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Wed, 29 Nov 2017 17:40:51 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. From: on behalf of Marc A Kaplan Reply-To: gpfsug main discussion list Date: Wednesday, November 29, 2017 at 11:38 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 5.0 features? Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. 
This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Wed Nov 29 17:43:11 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Wed, 29 Nov 2017 17:43:11 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> , Message-ID: You can in place upgrade. I think what people are referring to is likely things like the new sub block sizing for **new** filesystems. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of jfosburg at mdanderson.org [jfosburg at mdanderson.org] Sent: 29 November 2017 17:40 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 5.0 features? I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. From: on behalf of Marc A Kaplan Reply-To: gpfsug main discussion list Date: Wednesday, November 29, 2017 at 11:38 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 5.0 features? Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Nov 29 17:50:50 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 29 Nov 2017 17:50:50 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: <4FB50580-B5E2-45AD-BABB-C2BE9E99012F@vanderbilt.edu> Simon in correct ? I?d love to be able to support a larger block size for my users who have sane workflows while still not wasting a ton of space for the biomedical folks?. ;-) A question ? will the new, much improved, much faster mmrestripefs that was touted at SC17 require a filesystem that was created with GPFS / Tiger Shark / Spectrum Scale / Multi-media filesystem () version 5 or simply one that has been ?upgraded? to that format? Thanks? Kevin > On Nov 29, 2017, at 11:43 AM, Simon Thompson (IT Research Support) wrote: > > You can in place upgrade. 
> > I think what people are referring to is likely things like the new sub block sizing for **new** filesystems. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of jfosburg at mdanderson.org [jfosburg at mdanderson.org] > Sent: 29 November 2017 17:40 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. > > From: on behalf of Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Wednesday, November 29, 2017 at 11:38 AM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? > > > The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C755e8b13215f48e4e21508d53750ac45%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636475741979446614&sdata=RpfsLbGTRtlZQ06Winrn65jXQlDYjFHdWuKMvEyZwBI%3D&reserved=0 From knop at us.ibm.com Wed Nov 29 18:27:40 2017 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 29 Nov 2017 13:27:40 -0500 Subject: [gpfsug-discuss] 5.0 features? -- mmrestripefs -b In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk><1511973304.18554.133.camel@strath.ac.uk> Message-ID: Kevin, The improved rebalance function (mmrestripefs -b) only depends on the cluster level being (at least) 5.0.0, and will work with older file system formats as well. This particular improvement did not require a change in the format/structure of the file system. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 (845) 433-9314 T/L 293-9314 From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 11/29/2017 12:51 PM Subject: Re: [gpfsug-discuss] 5.0 features? Sent by: gpfsug-discuss-bounces at spectrumscale.org Simon in correct ? I?d love to be able to support a larger block size for my users who have sane workflows while still not wasting a ton of space for the biomedical folks?. ;-) A question ? will the new, much improved, much faster mmrestripefs that was touted at SC17 require a filesystem that was created with GPFS / Tiger Shark / Spectrum Scale / Multi-media filesystem () version 5 or simply one that has been ?upgraded? to that format? Thanks? Kevin > On Nov 29, 2017, at 11:43 AM, Simon Thompson (IT Research Support) wrote: > > You can in place upgrade. 
> > I think what people are referring to is likely things like the new sub block sizing for **new** filesystems. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of jfosburg at mdanderson.org [jfosburg at mdanderson.org] > Sent: 29 November 2017 17:40 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. > > From: on behalf of Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Wednesday, November 29, 2017 at 11:38 AM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? > > > The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=https-3A__na01.safelinks.protection.outlook.com_-3Furl-3Dhttp-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss-26data-3D02-257C01-257CKevin.Buterbaugh-2540vanderbilt.edu-257C755e8b13215f48e4e21508d53750ac45-257Cba5a7f39e3be4ab3b45067fa80faecad-257C0-257C0-257C636475741979446614-26sdata-3DRpfsLbGTRtlZQ06Winrn65jXQlDYjFHdWuKMvEyZwBI-253D-26reserved-3D0&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=oNT2koCZX0xmWlSlLblR9Q&m=T_wlNQsuQkBDoQhdS2fe4nbIoDOo5oywJRYfJ6849M8&s=C6m8yyvkVEqEmpozrpgGHNidk4SwpbgpCWO1fvYKffA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=oNT2koCZX0xmWlSlLblR9Q&m=T_wlNQsuQkBDoQhdS2fe4nbIoDOo5oywJRYfJ6849M8&s=JFaXBwXQ8aaDrZ1mdCvsZ6siAktHtOVvZr7vqiy_Tp4&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From nikhilk at us.ibm.com Wed Nov 29 19:08:11 2017 From: nikhilk at us.ibm.com (Nikhil Khandelwal) Date: Wed, 29 Nov 2017 12:08:11 -0700 Subject: [gpfsug-discuss] Online data migration tool Message-ID: Hi, I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. 
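As a rough sketch of what finishing that in-place path looks like once every node is running the 5.0.0 code (fs1 is a placeholder file system name; check the upgrade section of the documentation for your level before running any of these):

   mmlsconfig minReleaseLevel     # confirm where the cluster currently stands
   mmchconfig release=LATEST      # commit the cluster to the new release level
   mmchfs fs1 -V full             # enable new format features on the existing file system
   mmlsfs fs1 -V                  # show the resulting file system format version
   mmrestripefs fs1 -b            # optional: the improved rebalance noted above works on older format file systems too

mmchfs -V full changes the file system format version in place without touching the data; the one thing it cannot retrofit is the new sub-block layout described next.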
That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems. Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline. I hope that clarifies things a little and makes the upgrade path more accessible. Please let me know if there are any other questions or concerns. Thank you, Nikhil Khandelwal Spectrum Scale Development Client Adoption -------------- next part -------------- An HTML attachment was scrubbed... URL: From ulmer at ulmer.org Wed Nov 29 19:19:11 2017 From: ulmer at ulmer.org (Stephen Ulmer) Date: Wed, 29 Nov 2017 14:19:11 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> Message-ID: <49425FCD-D1CA-46FE-B1F1-98E5F464707C@ulmer.org> About five years ago (I think) Apple slipped a volume manager[1] in on the unsuspecting. :) If you have a Mac, you might have noticed that the mount type/pattern changed with Lion. CoreStorage was the beginning of building the infrastructure to change a million(?) Macs and several hundred million iPhones and iPads under the users? noses. :) Has anyone seen list of the features that would require the on-disk upgrade? If there isn?t one yet, I think that the biggest failing is not not publishing it ? the natives are restless and it?s not like IBM wouldn?t know... [1] This is what Apple calls it. If you?ve ever used AIX or Linux you?ll just chuckle when you look at the limitations. -- Stephen > On Nov 29, 2017, at 11:51 AM, Buterbaugh, Kevin L wrote: > > Hi All, > > Well, actually a year ago we started the process of doing pretty much what Richard describes below ? the exception being that we rsync?d data over to the new filesystem group by group. It was no fun but it worked. And now GPFS (and it will always be GPFS ? it will never be Spectrum Scale) version 5 is coming and there are compelling reasons to want to do the same thing over again ? despite the pain. > > Having said all that, I think it would be interesting to have someone from IBM give an explanation of why Apple can migrate millions of devices to a new filesystem with 99.999999% of the users never even knowing they did it ? but IBM can?t provide a way to migrate to a new filesystem ?in place.? > > And to be fair to IBM, they do ship AIX with root having a password and Apple doesn?t, so we all have our strengths and weaknesses! ;-) > > Kevin > ? 
> Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > >> On Nov 29, 2017, at 10:39 AM, Sobey, Richard A wrote: >> >> Could we utilise free capacity in the existing filesystem and empty NSDs, create a new FS and AFM migrate data in stages? Terribly long winded and frought with danger and peril... do not pass go... ah, answered my own question. >> >> ? >> >> Richard >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard >> Sent: 29 November 2017 16:35 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Online data migration tool >> >> On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: >>> Hi, >>> >>> I am trying to understand the technical challenges to migrate to GPFS >>> 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to >>> see 5.0 release and hear about some promising features available. But >>> not sure about complexity involved to migrate. >>> >> >> Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more) then reformat your files system with the new disk format then restore all your data to your shiny new file system. >> >> Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Tel: +44141-5483420 >> HPC System Administrator, ARCHIE-WeSt. >> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From ulmer at ulmer.org Wed Nov 29 19:21:00 2017 From: ulmer at ulmer.org (Stephen Ulmer) Date: Wed, 29 Nov 2017 14:21:00 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: Thank you. -- Stephen > On Nov 29, 2017, at 2:08 PM, Nikhil Khandelwal > wrote: > > Hi, > > I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. > > That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems. 
> > Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline. > > I hope that clarifies things a little and makes the upgrade path more accessible. > > Please let me know if there are any other questions or concerns. > > Thank you, > Nikhil Khandelwal > Spectrum Scale Development > Client Adoption > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.knister at gmail.com Wed Nov 29 22:41:48 2017 From: aaron.knister at gmail.com (Aaron Knister) Date: Wed, 29 Nov 2017 17:41:48 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: Thanks, Nikhil. Most of that was consistent with my understnading, however I was under the impression that the >32 subblocks code is required to achieve the touted 50k file creates/second that Sven has talked about a bunch of times: http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf http://files.gpfsug.org/presentations/2017/Ehningen/31_-_SSUG17DE_-_Sven_Oehme_-_News_from_Research.pdf http://files.gpfsug.org/presentations/2016/SC16/12_-_Sven_Oehme_Dean_Hildebrand_-_News_from_IBM_Research.pdf from those presentations regarding 32 subblocks: "It has a significant performance penalty for small files in large block size filesystems" although I'm not clear on the specific definition of "large". Many filesystems I encounter only have a 1M block size so it may not matter there, although that same presentation clearly shows the benefit of larger block sizes which is yet *another* thing for which a migration tool would be helpful. -Aaron On Wed, Nov 29, 2017 at 2:08 PM, Nikhil Khandelwal wrote: > Hi, > > I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For > all Spectrum Scale clusters that are currently at 4.X.X, it is possible to > migrate to 5.0.0 with no offline data migration and no need to move data. > Once these clusters are at 5.0.0, they will benefit from the performance > improvements, new features (such as file audit logging), and various > enhancements that are included in 5.0.0. > > That being said, there is one enhancement that will not be applied to > these clusters, and that is the increased number of sub-blocks per block > for small file allocation. This means that for file systems with a large > block size and a lot of small files, the overall space utilization will be > the same it currently is in 4.X.X. Since file systems created at 4.X.X and > earlier used a block size that kept this allocation in mind, there should > be very little impact on existing file systems. > > Outside of that one particular function, the remainder of the performance > improvements, metadata improvements, updated compatibility, new > functionality, and all of the other enhancements will be immediately > available to you once you complete the upgrade to 5.0.0 -- with no need to > reformat, move data, or take your data offline. > > I hope that clarifies things a little and makes the upgrade path more > accessible. > > Please let me know if there are any other questions or concerns. 
> > Thank you, > Nikhil Khandelwal > Spectrum Scale Development > Client Adoption > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nikhilk at us.ibm.com Thu Nov 30 00:00:23 2017 From: nikhilk at us.ibm.com (Nikhil Khandelwal) Date: Wed, 29 Nov 2017 17:00:23 -0700 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: Hi Aaron, By large block size we are primarily talking about block sizes 4 MB and greater. You are correct, in my previous message I neglected to mention the file create performance for small files on these larger block sizes due to the subblock change. In addition to the added space efficiency, small file creation (for example 32kB files) on large block size filesystems will improve. In the case of a 1 MB block size, there would be no real difference in file creates. For a 16 MB block size, however there will be a performance improvement for small file creation as a part of the subblock change for new filesystems. For users who are upgrading from 4.X.X to 5.0.0, the file creation speed will remain the same after the upgrade. I hope that helps, sorry for the confusion. Thank you, Nikhil Khandelwal Spectrum Scale Development Client Adoption From: Aaron Knister To: gpfsug main discussion list Date: 11/29/2017 03:42 PM Subject: Re: [gpfsug-discuss] Online data migration tool Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks, Nikhil. Most of that was consistent with my understnading, however I was under the impression that the >32 subblocks code is required to achieve the touted 50k file creates/second that Sven has talked about a bunch of times: http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf http://files.gpfsug.org/presentations/2017/Ehningen/31_-_SSUG17DE_-_Sven_Oehme_-_News_from_Research.pdf http://files.gpfsug.org/presentations/2016/SC16/12_-_Sven_Oehme_Dean_Hildebrand_-_News_from_IBM_Research.pdf from those presentations regarding 32 subblocks: "It has a significant performance penalty for small files in large block size filesystems" although I'm not clear on the specific definition of "large". Many filesystems I encounter only have a 1M block size so it may not matter there, although that same presentation clearly shows the benefit of larger block sizes which is yet *another* thing for which a migration tool would be helpful. -Aaron On Wed, Nov 29, 2017 at 2:08 PM, Nikhil Khandelwal wrote: Hi, I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems. 
Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline. I hope that clarifies things a little and makes the upgrade path more accessible. Please let me know if there are any other questions or concerns. Thank you, Nikhil Khandelwal Spectrum Scale Development Client Adoption _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WUJ15T9xHCCIfLm1wqC74jhfu28fXGLotYoHQvJlMCg&m=GNrHjCLvQL1u_WHVimX2lAlYOGPzciCFrYHGlae3h_E&s=VtVgCRl7kxNRgcl5QeHdZJ0Rz6jCA-jfQXyLztbr5TY&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From abeattie at au1.ibm.com Thu Nov 30 01:55:54 2017 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Thu, 30 Nov 2017 01:55:54 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: , <1511794616.18554.121.camel@strath.ac.uk><1511973304.18554.133.camel@strath.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From aaron.knister at gmail.com Thu Nov 30 15:35:32 2017 From: aaron.knister at gmail.com (Aaron Knister) Date: Thu, 30 Nov 2017 10:35:32 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: Oh? I specifically remember Sven talking about the >32 subblocks on the context of file creation speed in addition to space efficiency. If what you?re saying is true, then why do those charts show that feature in the context of file creation performance and specifically mention it as a performance bottleneck? Are the slides incorrect or am I just reading them wrong? Sent from my iPhone > On Nov 30, 2017, at 10:05, Lyle Gayne wrote: > > Aaron, > that is a misunderstanding. The new feature for larger numbers of sub-blocks (varying by block size) has nothing to do with the 50K creates per second or many other performance patterns in GPFS. > > The improved create (and other metadata ops) rates came from identifying and mitigating various locking bottlenecks and optimizing the code paths specifically involved in those ops. > > Thanks > Lyle > > > Aaron Knister ---11/29/2017 05:42:26 PM---Thanks, Nikhil. Most of that was consistent with my understnading, however I was under the impressio > > From: Aaron Knister > To: gpfsug main discussion list > Date: 11/29/2017 05:42 PM > Subject: Re: [gpfsug-discuss] Online data migration tool > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > Thanks, Nikhil. 
Most of that was consistent with my understnading, however I was under the impression that the >32 subblocks code is required to achieve the touted 50k file creates/second that Sven has talked about a bunch of times: > > http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf > http://files.gpfsug.org/presentations/2017/Ehningen/31_-_SSUG17DE_-_Sven_Oehme_-_News_from_Research.pdf > http://files.gpfsug.org/presentations/2016/SC16/12_-_Sven_Oehme_Dean_Hildebrand_-_News_from_IBM_Research.pdf > > from those presentations regarding 32 subblocks: > > "It has a significant performance penalty for small files in large block size filesystems" > > although I'm not clear on the specific definition of "large". Many filesystems I encounter only have a 1M block size so it may not matter there, although that same presentation clearly shows the benefit of larger block sizes which is yet *another* thing for which a migration tool would be helpful. > > -Aaron > > > On Wed, Nov 29, 2017 at 2:08 PM, Nikhil Khandelwal wrote: > Hi, > > I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. > > That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems. > > Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline. > > I hope that clarifies things a little and makes the upgrade path more accessible. > > Please let me know if there are any other questions or concerns. > > Thank you, > Nikhil Khandelwal > Spectrum Scale Development > Client Adoption > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=irBRNHjLNBazoPW27vuMTJGyZjdo_8yqZZNkY7RRh5I&s=8nZVi2Wp8LPbXo0Pg6ItJv6GEOk5jINHR05MY_H7a4w&e= > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From jonathan.buzzard at strath.ac.uk  Thu Nov 30 16:13:30 2017
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Thu, 30 Nov 2017 16:13:30 +0000
Subject: [gpfsug-discuss] Online data migration tool
In-Reply-To: 
References: 
Message-ID: <1512058410.18554.151.camel@strath.ac.uk>

On Wed, 2017-11-29 at 12:08 -0700, Nikhil Khandelwal wrote:

[SNIP]

> Since file systems created at 4.X.X and earlier used a block size
> that kept this allocation in mind, there should be very little impact
> on existing file systems.

That is quite a presumption. I would say that file systems created at
4.X.X and earlier potentially used a block size that was the best
*compromise*, and the new options would work a lot better.

So for example supporting a larger block size for users who have sane
workflows while still not wasting a ton of space for the biomedical
folks who abuse the file system as a database.

Though I have come to the conclusion that the way to stop them using the
file system as a database (no, don't do ls in that directory, there are
200,000 files and it takes minutes to come back) is to put your BOFH hat
on, quota them on maximum file numbers and suggest to them that they use
a database even if it is just sticking it all in SQLite :-D

JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From valdis.kletnieks at vt.edu  Thu Nov 30 16:27:39 2017
From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu)
Date: Thu, 30 Nov 2017 11:27:39 -0500
Subject: [gpfsug-discuss] mmauth/mmremotecluster wonkyness?
Message-ID: <20014.1512059259@turing-police.cc.vt.edu>

We have a 10-node cluster running gpfs 4.2.2.3, where 8 nodes are GPFS
contact nodes for 2 filesystems, and 2 are protocol nodes doing NFS
exports of the filesystems. But we see some nodes in remote clusters
trying to GPFS connect to the 2 protocol nodes anyhow.

My reading of the manpages is that the remote cluster is responsible for
setting '-n contactNodes' when they do the 'mmremotecluster add', and
there's no way to sanity check or enforce that at the local end, and
fail/flag connections to unintended non-contact nodes if the remote
admin forgets/botches the -n.

Is that actually correct? If so, is it time for an RFE?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 486 bytes
Desc: not available
URL: 

From S.J.Thompson at bham.ac.uk  Thu Nov 30 16:31:48 2017
From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support))
Date: Thu, 30 Nov 2017 16:31:48 +0000
Subject: [gpfsug-discuss] mmauth/mmremotecluster wonkyness?
In-Reply-To: <20014.1512059259@turing-police.cc.vt.edu>
References: <20014.1512059259@turing-police.cc.vt.edu>
Message-ID: 

Um no, you are still talking the GPFS protocol between cluster nodes in a
multicluster setup. Contact nodes are where the remote cluster goes to
start with, but after that it's just normal node to node gpfs traffic
(not just the contact nodes). At least that is my understanding.

If you want traffic separation, you need something like AFM.

Simon
________________________________________
From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of valdis.kletnieks at vt.edu [valdis.kletnieks at vt.edu]
Sent: 30 November 2017 16:27
To: gpfsug-discuss at spectrumscale.org
Subject: [gpfsug-discuss] mmauth/mmremotecluster wonkyness?
We have a 10-node cluster running gpfs 4.2.2.3, where 8 nodes are GPFS
contact nodes for 2 filesystems, and 2 are protocol nodes doing NFS
exports of the filesystems. But we see some nodes in remote clusters
trying to GPFS connect to the 2 protocol nodes anyhow.

My reading of the manpages is that the remote cluster is responsible for
setting '-n contactNodes' when they do the 'mmremotecluster add', and
there's no way to sanity check or enforce that at the local end, and
fail/flag connections to unintended non-contact nodes if the remote
admin forgets/botches the -n.

Is that actually correct? If so, is it time for an RFE?

From aaron.s.knister at nasa.gov  Thu Nov 30 16:35:04 2017
From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP])
Date: Thu, 30 Nov 2017 16:35:04 +0000
Subject: Re: [gpfsug-discuss] mmauth/mmremotecluster wonkyness?
In-Reply-To: <20014.1512059259@turing-police.cc.vt.edu>
References: <20014.1512059259@turing-police.cc.vt.edu>
Message-ID: 

It's my understanding and experience that all member nodes of two clusters
that are multi-clustered must be able to (and will eventually, given enough
time/activity) make connections to any and all nodes in both clusters. Even
if you don't designate the 2 protocol nodes as contact nodes I would expect
to see connections from remote clusters to the protocol nodes just because
of the nature of the beast. If you don't want remote nodes to make
connections to the protocol nodes then I believe you would need to put the
protocol nodes in their own cluster. CES/CNFS hasn't always supported this
but I think it is now supported, at least with NFS.

On November 30, 2017 at 11:28:03 EST, valdis.kletnieks at vt.edu wrote:
We have a 10-node cluster running gpfs 4.2.2.3, where 8 nodes are GPFS
contact nodes for 2 filesystems, and 2 are protocol nodes doing NFS
exports of the filesystems. But we see some nodes in remote clusters
trying to GPFS connect to the 2 protocol nodes anyhow.

My reading of the manpages is that the remote cluster is responsible for
setting '-n contactNodes' when they do the 'mmremotecluster add', and
there's no way to sanity check or enforce that at the local end, and
fail/flag connections to unintended non-contact nodes if the remote
admin forgets/botches the -n.

Is that actually correct? If so, is it time for an RFE?
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nikhilk at us.ibm.com  Thu Nov 30 17:00:08 2017
From: nikhilk at us.ibm.com (Nikhil Khandelwal)
Date: Thu, 30 Nov 2017 10:00:08 -0700
Subject: Re: [gpfsug-discuss] Online data migration tool
In-Reply-To: 
References: 
Message-ID: 

That is fair, there certainly are compromises that have to be made with
regards to file space/size/performance when choosing a block size,
especially with varied workloads or users who may create 200,000 files at
a time :). With the increased number of subblocks, the compromises and
parameters going into this choice change.

However, I just didn't want to lose sight of the fact that the remainder
of the 5.0.0 features and enhancements (and there are a lot :-) ) are
available to all systems, with no need to go through painful data movement
or recreating of filesystems.
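For anyone who wants to see where an existing file system stands before and
after the upgrade, mmlsfs reports the relevant attributes. A quick check
along these lines (the device name "gpfs0" is only an example):

# block size, subblock (minimum fragment) size, and on-disk format version
mmlsfs gpfs0 -B -f -V

After the cluster is upgraded and "mmchfs gpfs0 -V full" has been run, the
format version moves up and the new function becomes available, but the
block/subblock geometry of an existing file system stays as it was created.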
Thanks, Nikhil Khandelwal Spectrum Scale Development Client Adoption From: Jonathan Buzzard To: gpfsug main discussion list Date: 11/30/2017 09:13 AM Subject: Re: [gpfsug-discuss] Online data migration tool Sent by: gpfsug-discuss-bounces at spectrumscale.org On Wed, 2017-11-29 at 12:08 -0700, Nikhil Khandelwal wrote: [SNIP] > Since file systems created at 4.X.X and earlier used a block size > that kept this allocation in mind, there should be very little impact > on existing file systems. That is quite a presumption. I would say that file systems created at 4.X.X and earlier potentially used a block size that was the best *compromise*, and the new options would work a lot better. So for example supporting a larger block size for users who have sane workflows while still not wasting a ton of space for the biomedical folks who abuse the file system as a database. Though I have come to the conclusion to stop them using the file system as a database (no don't do ls in that directory there is 200,000 files and takes minutes to come back) is to put your BOFH hat on quota them on maximum file numbers and suggest to them that they use a database even if it is just sticking it all in SQLite :-D JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WUJ15T9xHCCIfLm1wqC74jhfu28fXGLotYoHQvJlMCg&m=RrwCj4KWyu_ykACVG1SYu8EJiDZnH6edu-2rnoalOg4&s=p7xlojuTYL5csXYA94NyL-R5hk7OgLH0qKGTN0peGFk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From skylar2 at u.washington.edu Thu Nov 30 18:01:48 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Thu, 30 Nov 2017 18:01:48 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1512058410.18554.151.camel@strath.ac.uk> References: <1512058410.18554.151.camel@strath.ac.uk> Message-ID: <20171130180148.jlarxyjgoc4mvre3@utumno.gs.washington.edu> On Thu, Nov 30, 2017 at 04:13:30PM +0000, Jonathan Buzzard wrote: > On Wed, 2017-11-29 at 12:08 -0700, Nikhil Khandelwal wrote: > > [SNIP] > > > Since file systems created at 4.X.X and earlier used a block size > > that kept this allocation in mind, there should be very little impact > > on existing file systems. > > That is quite a presumption. I would say that file systems created at > 4.X.X and earlier potentially used a block size that was the best > *compromise*, and the new options would work a lot better. > > So for example supporting a larger block size for users who have sane > workflows while still not wasting a ton of space for the biomedical > folks who abuse the file system as a database. 
> > Though I have come to the conclusion to stop them using the file system > as a database (no don't do ls in that directory there is 200,000 files > and takes minutes to come back) is to put your BOFH hat on quota them > on maximum file numbers and suggest to them that they use a database > even if it is just sticking it all in SQLite :-D To be fair, a lot of our biomedical/informatics folks have no choice in the matter because the vendors are imposing a filesystem-as-a-database paradigm on them. Each of our Illumina sequencers, for instance, generates a few million files per run, many of which are images containing raw data from the sequencers that are used to justify refunds for defective reagents. Sure, we could turn them off, but then we're eating $$$ we could be getting back from the vendor. At least SSD prices have come down far enough that we can put our metadata on fast disks now, even if we can't take advantage of the more efficient small file allocation yet. -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From makaplan at us.ibm.com Thu Nov 30 18:34:05 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 30 Nov 2017 13:34:05 -0500 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: It would be interesting to know how well Spectrum Scale large directory and small file features work in these sort of DB-ish applications. You might want to optimize by creating a file system provisioned and tuned for such application... Regardless of file system, `ls -1 | grep ...` in a huge directory is not going to be a good idea. But stats and/or opens on a huge directory to look for a particular file should work pretty well... -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at u.washington.edu Thu Nov 30 18:41:52 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Thu, 30 Nov 2017 18:41:52 +0000 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: <20171130184152.ivvduyzjlp7etys2@utumno.gs.washington.edu> On Thu, Nov 30, 2017 at 01:34:05PM -0500, Marc A Kaplan wrote: > It would be interesting to know how well Spectrum Scale large directory > and small file features work in these sort of DB-ish applications. > > You might want to optimize by creating a file system provisioned and tuned > for such application... > > Regardless of file system, `ls -1 | grep ...` in a huge directory is not > going to be a good idea. But stats and/or opens on a huge directory to > look for a particular file should work pretty well... I've wondered if it would be worthwhile having POSIX look-alike commands like ls and find that plug into the GPFS API rather than making VFS calls. That's of course a project for my Copious Free Time... -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From makaplan at us.ibm.com Thu Nov 30 20:52:09 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 30 Nov 2017 15:52:09 -0500 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: Generally the GPFS API will give you access to some information and functionality that are not available via the Posix API. 
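(For the "fast ls over a huge directory" case that started this thread, the
policy engine is usually the more practical route than coding to the API
directly. A rough sketch - the list name, path and output prefix are all
just examples:

/* empty EXEC: with -I defer the generated file lists are simply left behind */
RULE EXTERNAL LIST 'census' EXEC ''
RULE 'all' LIST 'census' SHOW(VARCHAR(FILE_SIZE) || ' ' || VARCHAR(MODIFICATION_TIME))

mmapplypolicy /gpfs/fs0/some/huge/directory -P census.pol -I defer -f /tmp/census

walks the tree in parallel and writes one record per file, which is about as
close to a "fast ls" as you can get without writing code.)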
But I don't think you'll find significant performance difference in cases where there is functional overlap. Going either way (Posix or GPFS-specific) - for each API call the execution path drops into the kernel - and then if required - an inter-process call to the mmfsd daemon process. From: Skylar Thompson To: gpfsug-discuss at spectrumscale.org Date: 11/30/2017 01:42 PM Subject: Re: [gpfsug-discuss] FIle system vs Database Sent by: gpfsug-discuss-bounces at spectrumscale.org On Thu, Nov 30, 2017 at 01:34:05PM -0500, Marc A Kaplan wrote: > It would be interesting to know how well Spectrum Scale large directory > and small file features work in these sort of DB-ish applications. > > You might want to optimize by creating a file system provisioned and tuned > for such application... > > Regardless of file system, `ls -1 | grep ...` in a huge directory is not > going to be a good idea. But stats and/or opens on a huge directory to > look for a particular file should work pretty well... I've wondered if it would be worthwhile having POSIX look-alike commands like ls and find that plug into the GPFS API rather than making VFS calls. That's of course a project for my Copious Free Time... -- -- Skylar Thompson (skylar2 at u.washington.edu) -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at u.washington.edu Thu Nov 30 21:42:21 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Thu, 30 Nov 2017 21:42:21 +0000 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: <20171130214220.pqtizt2q6ysu6cds@utumno.gs.washington.edu> Interesting, thanks for the information Marc. Could there be an improvement for something like "ls -l some-dir" using the API, though? Instead of getdents + stat for every file (entering and leaving kernel mode many times), could it be done in one operation with one context switch? On Thu, Nov 30, 2017 at 03:52:09PM -0500, Marc A Kaplan wrote: > Generally the GPFS API will give you access to some information and > functionality that are not available via the Posix API. > > But I don't think you'll find significant performance difference in cases > where there is functional overlap. > > Going either way (Posix or GPFS-specific) - for each API call the > execution path drops into the kernel - and then if required - an > inter-process call to the mmfsd daemon process. -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From jonathan.buzzard at strath.ac.uk Thu Nov 30 22:02:35 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 30 Nov 2017 22:02:35 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <20171130180148.jlarxyjgoc4mvre3@utumno.gs.washington.edu> References: <1512058410.18554.151.camel@strath.ac.uk> <20171130180148.jlarxyjgoc4mvre3@utumno.gs.washington.edu> Message-ID: <17e108bf-67af-78af-3e2d-e4a4b99c178d@strath.ac.uk> On 30/11/17 18:01, Skylar Thompson wrote: [SNIP] > To be fair, a lot of our biomedical/informatics folks have no choice in the > matter because the vendors are imposing a filesystem-as-a-database paradigm > on them. Each of our Illumina sequencers, for instance, generates a few > million files per run, many of which are images containing raw data from > the sequencers that are used to justify refunds for defective reagents. 
> Sure, we could turn them off, but then we're eating $$$ we could be getting > back from the vendor. > Been there too. What worked was having a find script that ran through their files, found directories that had not been accessed for a week and zipped them all up, before nuking the original files. The other thing I would suggest is if they want to buy sequencers from vendors who are brain dead, then that's fine but they are going to have to pay extra for the storage because they are costing way more than the average to store their files. Far to much buying of kit goes on without any thought of the consequences of how to deal with the data it generates. Then there where the proteomics bunch who basically just needed a good thrashing with a very large clue stick, because the zillions of files where the result of their own Perl scripts. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From Matthias.Knigge at rohde-schwarz.com Wed Nov 1 10:55:31 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 1 Nov 2017 11:55:31 +0100 Subject: [gpfsug-discuss] Combine different rules Message-ID: Hi at all, I configured a tiered storage with two pools. pool1 >> fast >> ssd pool2 >> slow >> sata First I created a fileset and a placement rule to copy the files to the fast storage. After a time of no access the files and folders should be moved to the slower storage. This could be done by a migration rule. I want to move the whole project folder to the slower storage. If a file in a project folder on the slower storage will be accessed this whole folder should be moved back to the faster storage. The rules must not run automatically. It is ok when this could be done by a cronjob over night. I am a beginner in writing rules. My idea is to write rules which listed files by date and by access and put the output into a file. After that a bash script can change the attributes of these files or rather folders. This could be done by the mmchattr command. If it is possible the mmapplypolicy command could be useful. Someone experiences in those cases? Many thanks in advance! Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 1 12:17:45 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 01 Nov 2017 12:17:45 +0000 Subject: [gpfsug-discuss] Combine different rules In-Reply-To: References: Message-ID: <1509538665.18554.1.camel@strath.ac.uk> On Wed, 2017-11-01 at 11:55 +0100, Matthias.Knigge at rohde-schwarz.com wrote: > Hi at all,? > > I configured a tiered storage with two pools.? > > pool1 ? ? ? ?>> ? ? ? ?fast ? ? ? ?>> ? ? ? ?ssd? > pool2 ? ? ? ?>> ? ? ? ?slow ? ? ? ?>> ? ? ? ?sata? > > First I created a fileset and a placement rule to copy the files to > the fast storage.? > > After a time of no access the files and folders should be moved to > the slower storage. This could be done by a migration rule. I want to > move the whole project folder to the slower storage.? Why move the whole project? Just wait if the files are not been accessed they will get moved in short order. You are really making it more complicated for no useful or practical gain. This is a basic policy to move old stuff from fast to slow disks. 
define(age,(DAYS(CURRENT_TIMESTAMP)-DAYS(ACCESS_TIME))) define(weighting, CASE ????????WHEN age>365 ????????????THEN age*KB_ALLOCATED ????????WHEN age<30 ????????????THEN 0 ????????ELSE ????????????KB_ALLOCATED ???????END ) RULE 'ilm' MIGRATE FROM POOL 'fast' THRESHOLD(90,70) WEIGHT(weighting) TO POOL 'slow' RULE 'new' SET POOL 'fast' LIMIT(95) RULE 'spillover' SET POOL 'slow' Basically it says when fast pool is 90% full, flush it down to 70% full, based on a weighting of the size and age. Basically older bigger files go first. The last two are critical. Allocate new files to the fast pool till it gets 95% full then start using the slow pool. Basically you have to stop allocating files to the fast pool long before it gets full otherwise you will end up with problems. Basically imagine there is 100KB left in the fast pool. I create a file which succeeds because there is space and start writing. When I get to 100KB the write fails because there is no space left in the pool, and a file can only be in one pool at a time. Generally programs will cleanup deleting the failed write at which point there will be space left and so the cycle goes on. You might want to force some file types onto slower disk. For example ISO images?don't really benefit from ever being on the fast disk. /* force ISO images onto nearline storage */ RULE 'iso' SET POOL 'slow' WHERE LOWER(NAME) LIKE '%.iso' You also might want to punish people storing inappropriate files on your server so /* force MP3's and the like onto nearline storage forever */ RULE 'mp3' SET POOL 'slow' ????WHERE LOWER(NAME) LIKE '%.mp3' OR LOWER(NAME) LIKE '%.m4a' OR LOWER(NAME) LIKE '%.wma' Another rule I used was to migrate files over to a certain size to the slow pool too. > > If a file in a project folder on the slower storage will be accessed > this whole folder should be moved back to the faster storage.? > Waste of time. In my experience the slow disks when not actually taking new files from a flush of the fast pools will be doing jack all. That is under 10 IOPS per second. That's because if you have everything sized correctly and the right rules people rarely go back to old files. As such the penalty for being on the slower disks is most none existent because there is loads of spare IO capacity on those disks. Secondly by the time you have spotted the files need moving the chances are your users have finished with them so moving them gains nothing. Thirdly if the users start working with those files any change to the file will result in a new file being written which will automatically go to the fast disks. It's the standard dance when you save a file; create new temporary file, write the contents, then do some renaming before deleting the old one. If you are insistent then something like the following would be a start, but moving a whole project would be a *lot* more complicated. I disabled the rule because it was a waste of time. I suggest running a similar rule that prints the files out so you can see how pointless it is. /* migrate recently accessed files back the fast disks */ RULE 'restore' MIGRATE FROM POOL 'slow' WEIGHT(KB_ALLOCATED) TO POOL 'fast' WHERE age < 1 Depending on the number of "projects" you anticipate you could allocate a project to a fileset and then move whole filesets about but I really think the idea is one of those that looks sensible at a high level but in practice is not sensible. > The rules must ?not run automatically. It is ok when this could be > done by a cronjob over night.? 
> I would argue strongly, very strongly that while you might want to flush the fast pool down every night to a certain amount free, you must have it set so that should it become full during the day an automatic flush is triggered. Failure to do so is guaranteed to bite you in the backside some time down the line. > I am a beginner in writing rules. My idea is to write rules which > listed files by date and by access and put the output into a file. > After that a bash script can change the attributes of these files or > rather folders.? Eh, you apply the policy and it does the work!!! More reading required on the subject I think. A bash script would be horribly slow. IBM have put a lot of work into making the policy engine really really fast. Messing about changing thousands if not millions of files with a bash script will be much much slower and is a recipe for disaster. Your users will put all sorts of random crap into file and directory names; backtick's, asterix's, question marks, newlines, UTF-8 characters etc. that will invariably break your bash script unless carefully escaped. There is no way for you to prevent this. It's the reason find/xargs have the -print0/-0 options, otherwise stuff will just mysteriously break on you. It's really better to just sidestep the whole issue and not process the files with scripts. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From david_johnson at brown.edu Wed Nov 1 12:21:05 2017 From: david_johnson at brown.edu (david_johnson at brown.edu) Date: Wed, 1 Nov 2017 08:21:05 -0400 Subject: [gpfsug-discuss] Combine different rules In-Reply-To: References: Message-ID: <3D17430A-B572-4E8E-8CA3-0C308D38AE7B@brown.edu> Filesets and storage pools are for the most part orthogonal concepts. You would sort your users and apply quotas with filesets. You would use storage pools underneath filesets and the filesystem to migrate between faster and slower media. Migration between storage pools is done well by the policy engine with mmapplypolicy. Moving between filesets is entirely up to you, but the path names will change. Migration within a filesystem using storage pools preserves path names. -- ddj Dave Johnson > On Nov 1, 2017, at 6:55 AM, Matthias.Knigge at rohde-schwarz.com wrote: > > Hi at all, > > I configured a tiered storage with two pools. > > pool1 >> fast >> ssd > pool2 >> slow >> sata > > First I created a fileset and a placement rule to copy the files to the fast storage. > > After a time of no access the files and folders should be moved to the slower storage. This could be done by a migration rule. I want to move the whole project folder to the slower storage. > > If a file in a project folder on the slower storage will be accessed this whole folder should be moved back to the faster storage. > > The rules must not run automatically. It is ok when this could be done by a cronjob over night. > > I am a beginner in writing rules. My idea is to write rules which listed files by date and by access and put the output into a file. After that a bash script can change the attributes of these files or rather folders. > > This could be done by the mmchattr command. If it is possible the mmapplypolicy command could be useful. > > Someone experiences in those cases? > > Many thanks in advance! 
> > Matthias > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From douglasof at us.ibm.com Wed Nov 1 12:36:18 2017 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Wed, 1 Nov 2017 07:36:18 -0500 Subject: [gpfsug-discuss] SC17 Spectrum Scale U/G Message-ID: Reminder: Please sign up so we have numbers for planning the happy hour. http://www.spectrumscale.org/ssug-at-sc17/ Douglas O'Flaherty IBM Spectrum Solutions -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Wed Nov 1 14:01:35 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 1 Nov 2017 15:01:35 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules In-Reply-To: <1509538665.18554.1.camel@strath.ac.uk> References: <1509538665.18554.1.camel@strath.ac.uk> Message-ID: Hi JAB, many thanks for your answer. Ok, some more background information: We are working with video realtime applications and uncompressed files. So one project is one folder and some subfolders. The size of one project could be more than 1TB. That is the reason why I want to move the whole folder tree. Moving old stuff to the slower storage is not the problem but moving the files back for working with the realtime applications. Not every file will be accessed when you open a project. The Clients get access via GPFS-Client (Windows) and over Samba. Another tool on storage side scan the files for creating playlists etc. While the migration the playout of the video files may not dropped. So I think the best way is to find a solution with mmapplypolicy manually or via crontab. Im must check the access time and the types of files. If I do not do this never a file will be moved the slower storage because the special tool always have access to the files. I will try some concepts and give feedback which solution is working for me. Matthias Von: Jonathan Buzzard An: gpfsug main discussion list Datum: 01.11.2017 13:18 Betreff: [Newsletter] Re: [gpfsug-discuss] Combine different rules Gesendet von: gpfsug-discuss-bounces at spectrumscale.org On Wed, 2017-11-01 at 11:55 +0100, Matthias.Knigge at rohde-schwarz.com wrote: > Hi at all, > > I configured a tiered storage with two pools. > > pool1 >> fast >> ssd > pool2 >> slow >> sata > > First I created a fileset and a placement rule to copy the files to > the fast storage. > > After a time of no access the files and folders should be moved to > the slower storage. This could be done by a migration rule. I want to > move the whole project folder to the slower storage. Why move the whole project? Just wait if the files are not been accessed they will get moved in short order. You are really making it more complicated for no useful or practical gain. This is a basic policy to move old stuff from fast to slow disks. define(age,(DAYS(CURRENT_TIMESTAMP)-DAYS(ACCESS_TIME))) define(weighting, CASE WHEN age>365 THEN age*KB_ALLOCATED WHEN age<30 THEN 0 ELSE KB_ALLOCATED END ) RULE 'ilm' MIGRATE FROM POOL 'fast' THRESHOLD(90,70) WEIGHT(weighting) TO POOL 'slow' RULE 'new' SET POOL 'fast' LIMIT(95) RULE 'spillover' SET POOL 'slow' Basically it says when fast pool is 90% full, flush it down to 70% full, based on a weighting of the size and age. Basically older bigger files go first. 
The last two are critical. Allocate new files to the fast pool till it gets 95% full then start using the slow pool. Basically you have to stop allocating files to the fast pool long before it gets full otherwise you will end up with problems. Basically imagine there is 100KB left in the fast pool. I create a file which succeeds because there is space and start writing. When I get to 100KB the write fails because there is no space left in the pool, and a file can only be in one pool at a time. Generally programs will cleanup deleting the failed write at which point there will be space left and so the cycle goes on. You might want to force some file types onto slower disk. For example ISO images don't really benefit from ever being on the fast disk. /* force ISO images onto nearline storage */ RULE 'iso' SET POOL 'slow' WHERE LOWER(NAME) LIKE '%.iso' You also might want to punish people storing inappropriate files on your server so /* force MP3's and the like onto nearline storage forever */ RULE 'mp3' SET POOL 'slow' WHERE LOWER(NAME) LIKE '%.mp3' OR LOWER(NAME) LIKE '%.m4a' OR LOWER(NAME) LIKE '%.wma' Another rule I used was to migrate files over to a certain size to the slow pool too. > > If a file in a project folder on the slower storage will be accessed > this whole folder should be moved back to the faster storage. > Waste of time. In my experience the slow disks when not actually taking new files from a flush of the fast pools will be doing jack all. That is under 10 IOPS per second. That's because if you have everything sized correctly and the right rules people rarely go back to old files. As such the penalty for being on the slower disks is most none existent because there is loads of spare IO capacity on those disks. Secondly by the time you have spotted the files need moving the chances are your users have finished with them so moving them gains nothing. Thirdly if the users start working with those files any change to the file will result in a new file being written which will automatically go to the fast disks. It's the standard dance when you save a file; create new temporary file, write the contents, then do some renaming before deleting the old one. If you are insistent then something like the following would be a start, but moving a whole project would be a *lot* more complicated. I disabled the rule because it was a waste of time. I suggest running a similar rule that prints the files out so you can see how pointless it is. /* migrate recently accessed files back the fast disks */ RULE 'restore' MIGRATE FROM POOL 'slow' WEIGHT(KB_ALLOCATED) TO POOL 'fast' WHERE age < 1 Depending on the number of "projects" you anticipate you could allocate a project to a fileset and then move whole filesets about but I really think the idea is one of those that looks sensible at a high level but in practice is not sensible. > The rules must not run automatically. It is ok when this could be > done by a cronjob over night. > I would argue strongly, very strongly that while you might want to flush the fast pool down every night to a certain amount free, you must have it set so that should it become full during the day an automatic flush is triggered. Failure to do so is guaranteed to bite you in the backside some time down the line. > I am a beginner in writing rules. My idea is to write rules which > listed files by date and by access and put the output into a file. > After that a bash script can change the attributes of these files or > rather folders. 
Eh, you apply the policy and it does the work!!! More reading required on the subject I think. A bash script would be horribly slow. IBM have put a lot of work into making the policy engine really really fast. Messing about changing thousands if not millions of files with a bash script will be much much slower and is a recipe for disaster. Your users will put all sorts of random crap into file and directory names; backtick's, asterix's, question marks, newlines, UTF-8 characters etc. that will invariably break your bash script unless carefully escaped. There is no way for you to prevent this. It's the reason find/xargs have the -print0/-0 options, otherwise stuff will just mysteriously break on you. It's really better to just sidestep the whole issue and not process the files with scripts. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 1 14:12:43 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 01 Nov 2017 14:12:43 +0000 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules In-Reply-To: References: <1509538665.18554.1.camel@strath.ac.uk> Message-ID: <1509545563.18554.3.camel@strath.ac.uk> On Wed, 2017-11-01 at 15:01 +0100, Matthias.Knigge at rohde-schwarz.com wrote: > Hi JAB,? > > many thanks for your answer.? > > Ok, some more background information:? > > We are working with video realtime applications and uncompressed > files. So one project is one folder and some subfolders. The size of > one project could be more than 1TB. That is the reason why I want to > move the whole folder tree.? > That is not a reason to move the whole folder tree. If the "project" is inactive then the files in it are inactive and the normal "this file has not been accessed" type rules will in due course move the whole lot over to the slower storage. > Moving old stuff to the slower storage is not the problem but moving > the files back for working with the realtime applications. Not every > file will be accessed when you open a project.? > Yeah but you don't want these sorts of policies kicking in automatically. Further if someone where just to check or update a summary document stored with the videos, the whole lot would get moved back to fast disk. By the sounds of it you are going to have to run manual mmapplypolicies to move the groups of files around. Automating what you want is going to be next to impossible. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From makaplan at us.ibm.com Wed Nov 1 14:43:27 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 1 Nov 2017 09:43:27 -0500 Subject: [gpfsug-discuss] Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME In-Reply-To: References: Message-ID: Thanks Jonathan B for your comments and tips on experience using mmapplypolicy and policy rules. Good to see that some of the features we put into the product are actually useful. 
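Before getting to the mmfind tips below, one point from Jonathan's reply
bears repeating: you can dry-run any set of rules before letting them move a
single byte. A minimal sketch (the path and file names are only examples):

mmapplypolicy /gpfs/fs0/projects -P migrate.pol -I test -L 2

With -I test the rules are evaluated and the candidate files are summarized,
but nothing is migrated; raise the -L level for more detail. Only when the
output looks right do you re-run with -I yes (the default).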
For those not quite as familiar, and have come somewhat later to the game, like Matthias K - I have a few remarks and tips that may be helpful: You can think of and use mmapplypolicy as a fast, parallelized version of the classic `find ... | xargs ... ` pipeline. In fact we've added some "sample" scripts with options that make this easy: samples/ilm/mmfind : "understands" the classic find search arguments as well as all the mmapplypolicy options and the recent versions also support an -xargs option so you can write the classic pipepline as one command: mmfind ... -xargs ... There are debug/diagnostic options so you can see the underlying GPFS commands and policy rules that are generated, so if mmfind doesn't do exactly what you were hoping, you can capture the commands and rules that it does do and tweak/hack those. Two of the most crucial and tricky parts of mmfind are available as separate scripts that can be used separately: tr_findToPol.pl : convert classic options to policy rules. mmxargs : 100% correctly deal with the problem of whitespace and/or "special" characters in the pathnames output as file lists by mmapplypolicy. This is somewhat tricky. EVEN IF you've already worked out your own policy rules and use policy RULE ... EXTERNAL ... EXEC 'myscript' you may want to use mmxargs or "lift" some of the code there-in -- because it is very likely your 'myscript' is not handling the problem of special characters correctly. FILESETs vs POOLs - yes these are "orthogonal" concepts in GPFS (Spectrum Scale!) BUT some customer/admins may choose to direct GPFS to assign to POOL based on FILESET using policy rules clauses like: FOR FILESET('a_fs', 'b_fs') /* handy to restrict a rule to one or a few filesets */ WHERE ... AND (FILESET_NAME LIKE 'xyz_%') AND ... /* restrict to filesets whose name matches a pattern */ -- marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Nov 1 14:59:22 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 1 Nov 2017 09:59:22 -0500 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules - STAGING a fileset to a particular POOL In-Reply-To: References: <1509538665.18554.1.camel@strath.ac.uk> Message-ID: Not withstanding JAB's remark that this may not necessary: Some customers/admins will want to "stage" a fileset in anticipation of using the data therein. Conversely you can "destage" - just set the TO POOL accordingly. This can be accomplished with a policy rule like: RULE 'stage' MIGRATE FOR FILESET('myfileset') TO POOL 'mypool' /* no FROM POOL clause is required, files will come from any pool - for files already in mypool, no work is done */ And running a command like: mmapplypolicy /path-to/myfileset -P file-with-the-above-policy-rule -g /path-to/shared-temp -N nodelist-to-do-the-work ... (Specifying the path-to/myfileset on the command line will restrict the directory scan, making it go faster.) As JAB remarked, for GPFS POOL to GPFS POOL this may be overkill, but if the files have been "HSMed" migrated or archived to some really slow storage like TAPE ... they an analyst who want to explore the data interactively, might request a migration back to "real" disks (or SSDs) then go to lunch or go to bed ... --marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From griznog at gmail.com Wed Nov 1 22:54:04 2017 From: griznog at gmail.com (John Hanks) Date: Wed, 1 Nov 2017 15:54:04 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" Message-ID: Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? Thanks, jbh -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Thu Nov 2 07:11:58 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Thu, 02 Nov 2017 03:11:58 -0400 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: <44655.1509606718@turing-police.cc.vt.edu> On Wed, 01 Nov 2017 15:54:04 -0700, John Hanks said: > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device Check 'df -i' to make sure no file systems are out of inodes. That's From YARD at il.ibm.com Thu Nov 2 07:28:06 2017 From: YARD at il.ibm.com (Yaron Daniel) Date: Thu, 2 Nov 2017 09:28:06 +0200 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: Hi Please check mmdf output to see that MetaData disks are not full, or you have i-nodes issue. In case you have Independent File-Sets , please run : mmlsfileset -L -i to get the status of each fileset inodes. Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services - Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: John Hanks To: gpfsug Date: 11/02/2017 12:54 AM Subject: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. 
The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? Thanks, jbh_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=Bn1XE9uK2a9CZQ8qKnJE3Q&m=WTfQpWOsmp-BdHZ0PWDbaYsxq-5Q1ZH26IyfrBRe3_c&s=SJg8NrUXWEpaxDhqECkwkbJ71jtxjLZz5jX7FxmYMBk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4376 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: From Matthias.Knigge at rohde-schwarz.com Thu Nov 2 09:07:48 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Thu, 2 Nov 2017 10:07:48 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME In-Reply-To: References: Message-ID: Thanks for this tip. I will try these commands and give feedback in the next week. Matthias Von: "Marc A Kaplan" An: gpfsug main discussion list Datum: 01.11.2017 15:43 Betreff: [Newsletter] Re: [gpfsug-discuss] Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Thanks Jonathan B for your comments and tips on experience using mmapplypolicy and policy rules. Good to see that some of the features we put into the product are actually useful. For those not quite as familiar, and have come somewhat later to the game, like Matthias K - I have a few remarks and tips that may be helpful: You can think of and use mmapplypolicy as a fast, parallelized version of the classic `find ... | xargs ... ` pipeline. In fact we've added some "sample" scripts with options that make this easy: samples/ilm/mmfind : "understands" the classic find search arguments as well as all the mmapplypolicy options and the recent versions also support an -xargs option so you can write the classic pipepline as one command: mmfind ... -xargs ... There are debug/diagnostic options so you can see the underlying GPFS commands and policy rules that are generated, so if mmfind doesn't do exactly what you were hoping, you can capture the commands and rules that it does do and tweak/hack those. Two of the most crucial and tricky parts of mmfind are available as separate scripts that can be used separately: tr_findToPol.pl : convert classic options to policy rules. 
mmxargs : 100% correctly deal with the problem of whitespace and/or "special" characters in the pathnames output as file lists by mmapplypolicy. This is somewhat tricky. EVEN IF you've already worked out your own policy rules and use policy RULE ... EXTERNAL ... EXEC 'myscript' you may want to use mmxargs or "lift" some of the code there-in -- because it is very likely your 'myscript' is not handling the problem of special characters correctly. FILESETs vs POOLs - yes these are "orthogonal" concepts in GPFS (Spectrum Scale!) BUT some customer/admins may choose to direct GPFS to assign to POOL based on FILESET using policy rules clauses like: FOR FILESET('a_fs', 'b_fs') /* handy to restrict a rule to one or a few filesets */ WHERE ... AND (FILESET_NAME LIKE 'xyz_%') AND ... /* restrict to filesets whose name matches a pattern */ -- marc of GPFS_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Nov 2 11:19:05 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 2 Nov 2017 11:19:05 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" Message-ID: One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hanks Reply-To: gpfsug main discussion list Date: Wednesday, November 1, 2017 at 5:55 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? Thanks, jbh -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 14:43:31 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 07:43:31 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: Message-ID: Thanks all for the suggestions. 
Having our metadata NSDs fill up was what prompted this exercise, but space was previously freed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them.

df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole.

We did have old .quota files laying around but removing them didn't have any impact.

mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work.

mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). Once I'm settled in this morning I'll try giving them a little extra space and see what happens.

jbh

On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < Robert.Oesterlin at nuance.com> wrote:

One thing that I've run into before is that on older file systems you had the "*.quota" files in the file system root. If you upgraded the file system to a newer version (so these files aren't used) - There was a bug at one time where these didn't get properly migrated during a restripe. Solution was to just remove them

Bob Oesterlin
Sr Principal Storage Engineer, Nuance

From: on behalf of John Hanks < griznog at gmail.com>
Reply-To: gpfsug main discussion list
Date: Wednesday, November 1, 2017 at 5:55 PM
To: gpfsug
Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device"

Hi all,

I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error:

Scanning user file metadata ...
0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed)
Error processing user file metadata.
Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures.
mmrestripefs: Command failed. Examine previous error messages to determine cause.

The file it points to says:

This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017
INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR])
53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device

/var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location?

Thanks,

jbh

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From david_johnson at brown.edu Thu Nov 2 14:57:45 2017
From: david_johnson at brown.edu (David Johnson)
Date: Thu, 2 Nov 2017 10:57:45 -0400
Subject: [gpfsug-discuss] mmrestripefs "No space left on device"
In-Reply-To: 
References: 
Message-ID: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu>

One thing that may be relevant is if you have snapshots, depending on your release level, inodes in the snapshot may be considered immutable, and will not be migrated.
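If there is any doubt about snapshots still hanging around, something along these lines should list them and how much data they hold (gsfs0 is the file system name used elsewhere in this thread; the snapshot name is a placeholder):

mmlssnapshot gsfs0 -d
mmdelsnapshot gsfs0 some_old_snapshot    # only once that snapshot is truly no longer needed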
Once the snapshots have been deleted, the inodes are freed up and you won?t see the (somewhat misleading) message about no space. ? ddj Dave Johnson Brown University > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). Once I'm settled in this morning I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert > wrote: > One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > From: > on behalf of John Hanks > > Reply-To: gpfsug main discussion list > > Date: Wednesday, November 1, 2017 at 5:55 PM > To: gpfsug > > Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" > > > > Hi all, <> > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From griznog at gmail.com Thu Nov 2 15:33:11 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 08:33:11 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: We have no snapshots ( they were the first to go when we initially hit the full metadata NSDs). I've increased quotas so that no filesets have hit a space quota. Verified that there are no inode quotas anywhere. mmdf shows the least amount of free space on any nsd to be 9% free. Still getting this error: [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 Scanning file system metadata, phase 1 ... Scan completed successfully. Scanning file system metadata, phase 2 ... Scanning file system metadata for sas0 storage pool Scanning file system metadata for sata0 storage pool Scan completed successfully. Scanning file system metadata, phase 3 ... Scan completed successfully. Scanning file system metadata, phase 4 ... Scan completed successfully. Scanning user file metadata ... Error processing user file metadata. No space left on device Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. I should note too that this fails almost immediately, far to quickly to fill up any location it could be trying to write to. jbh On Thu, Nov 2, 2017 at 7:57 AM, David Johnson wrote: > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > >> One thing that I?ve run into before is that on older file systems you had >> the ?*.quota? files in the file system root. If you upgraded the file >> system to a newer version (so these files aren?t used) - There was a bug at >> one time where these didn?t get properly migrated during a restripe. 
>> Solution was to just remove them >> >> >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> *From: * on behalf of John >> Hanks >> *Reply-To: *gpfsug main discussion list > > >> *Date: *Wednesday, November 1, 2017 at 5:55 PM >> *To: *gpfsug >> *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on >> device" >> >> >> >> Hi all, >> >> >> >> I'm trying to do a restripe after setting some nsds to metadataOnly and I >> keep running into this error: >> >> >> >> Scanning user file metadata ... >> >> 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with >> total 531689 MB data processed) >> >> Error processing user file metadata. >> >> Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on >> scg-gs0 for inodes with broken disk addresses or failures. >> >> mmrestripefs: Command failed. Examine previous error messages to >> determine cause. >> >> >> >> The file it points to says: >> >> >> >> This inode list was generated in the Parallel Inode Traverse on Wed Nov >> 1 15:36:06 2017 >> >> INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID >> MEMO(INODE_FLAGS FILE_TYPE [ERROR]) >> >> 53504 0:0 0 1 0 >> illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device >> >> >> >> >> >> /var on the node I am running this on has > 128 GB free, all the NSDs >> have plenty of free space, the filesystem being restriped has plenty of >> free space and if I watch the node while running this no filesystem on it >> even starts to get full. Could someone tell me where mmrestripefs is >> attempting to write and/or how to point it at a different location? >> >> >> >> Thanks, >> >> >> >> jbh >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfadden at us.ibm.com Thu Nov 2 15:44:08 2017 From: sfadden at us.ibm.com (Scott Fadden) Date: Thu, 2 Nov 2017 15:44:08 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: , <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: An HTML attachment was scrubbed... URL: From sfadden at us.ibm.com Thu Nov 2 15:55:12 2017 From: sfadden at us.ibm.com (Scott Fadden) Date: Thu, 2 Nov 2017 15:55:12 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: , , <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 16:13:16 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 09:13:16 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Hmm, this sounds suspicious. We have 10 NSDs in a pool called system. These were previously set to data+metaData with a policy that placed our home directory filesets on this pool. A few weeks ago the NSDs in this pool all filled up. To remedy that I 1. removed old snapshots 2. deleted some old homedir filesets 3. set the NSDs in this pool to metadataOnly 4. changed the policy to point homedir filesets to another pool. 5. 
ran a migrate policy to migrate all homedir filesets to this other pool After all that I now have ~30% free space on the metadata pool. Our three pools are system (metadataOnly), sas0 (data), sata0 (data) mmrestripefs gsfs0 -r fails immdieately mmrestripefs gsfs0 -r -P system fails immediately mmrestripefs gsfs0 -r -P sas0 fails immediately mmrestripefs gsfs0 -r -P sata0 is running (currently about 3% done) Is the change from data+metadata to metadataOnly the same as removing a disk (for the purposes of this problem) or is it possible my policy is confusing things? [root at scg-gs0 ~]# mmlspolicy gsfs0 Policy for file system '/dev/gsfs0': Installed by root at scg-gs0 on Wed Nov 1 09:30:40 2017. First line of policy 'policy_placement.txt' is: RULE 'homedirs' SET POOL 'sas0' WHERE FILESET_NAME LIKE 'home.%' The policy I used to migrate these filesets is: RULE 'homedirs' MIGRATE TO POOL 'sas0' WHERE FILESET_NAME LIKE 'home.%' jbh On Thu, Nov 2, 2017 at 8:44 AM, Scott Fadden wrote: > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > ----- Original message ----- > From: John Hanks > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... > Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... > Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson > wrote: > > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > Thanks all for the suggestions. 
> > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: * on behalf of John > Hanks > *Reply-To: *gpfsug main discussion list > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. 
> org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r= > WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m= > hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s= > j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From griznog at gmail.com Thu Nov 2 16:19:55 2017 From: griznog at gmail.com (John Hanks) Date: Thu, 2 Nov 2017 09:19:55 -0700 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Addendum to last message: We haven't upgraded recently as far as I know (I just inherited this a couple of months ago.) but am planning an outage soon to upgrade from 4.2.0-4 to 4.2.3-5. My growing collection of output files generally contain something like This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 08:34:22 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53506 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device With that inode varying slightly. jbh On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden wrote: > Sorry just reread as I hit send and saw this was mmrestripe, in my case it > was mmdeledisk. > > Did you try running the command on just one pool. Or using -B instead? > > What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" > ? > > Looks like it could be related to the maxfeaturelevel of the cluster. Have > you recently upgraded? Is everything up to the same level? > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > ----- Original message ----- > From: Scott Fadden/Portland/IBM > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:44 AM > > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > ----- Original message ----- > From: John Hanks > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... 
> Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... > Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson > wrote: > > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > > On Nov 2, 2017, at 10:43 AM, John Hanks wrote: > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: * on behalf of John > Hanks > *Reply-To: *gpfsug main discussion list > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. 
> > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r= > WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m= > hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s= > j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfadden at us.ibm.com Thu Nov 2 16:41:36 2017 From: sfadden at us.ibm.com (Scott Fadden) Date: Thu, 2 Nov 2017 16:41:36 +0000 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: , <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Thu Nov 2 16:45:30 2017 From: stockf at us.ibm.com (Frederick Stock) Date: Thu, 2 Nov 2017 11:45:30 -0500 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Assuming you are replicating data and metadata have you confirmed that all failure groups have the same free space? That is could it be that one of your failure groups has less space than the others? You can verify this with the output of mmdf and look at the NSD sizes and space available. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: John Hanks To: gpfsug main discussion list Date: 11/02/2017 12:20 PM Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org Addendum to last message: We haven't upgraded recently as far as I know (I just inherited this a couple of months ago.) but am planning an outage soon to upgrade from 4.2.0-4 to 4.2.3-5. 
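For what it's worth, one way to compare the committed cluster level against the file system version is something like the following (gsfs0 as elsewhere in this thread; exact output varies by release):

mmlsconfig minReleaseLevel
mmlsfs gsfs0 -V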
My growing collection of output files generally contain something like This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 08:34:22 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53506 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device With that inode varying slightly. jbh On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden wrote: Sorry just reread as I hit send and saw this was mmrestripe, in my case it was mmdeledisk. Did you try running the command on just one pool. Or using -B instead? What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" ? Looks like it could be related to the maxfeaturelevel of the cluster. Have you recently upgraded? Is everything up to the same level? Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Scott Fadden/Portland/IBM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:44 AM I opened a defect on this the other day, in my case it was an incorrect error message. What it meant to say was,"The pool is not empty." Are you trying to remove the last disk in a pool? If so did you empty the pool with a MIGRATE policy first? Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: John Hanks Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:34 AM We have no snapshots ( they were the first to go when we initially hit the full metadata NSDs). I've increased quotas so that no filesets have hit a space quota. Verified that there are no inode quotas anywhere. mmdf shows the least amount of free space on any nsd to be 9% free. Still getting this error: [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 Scanning file system metadata, phase 1 ... Scan completed successfully. Scanning file system metadata, phase 2 ... Scanning file system metadata for sas0 storage pool Scanning file system metadata for sata0 storage pool Scan completed successfully. Scanning file system metadata, phase 3 ... Scan completed successfully. Scanning file system metadata, phase 4 ... Scan completed successfully. Scanning user file metadata ... Error processing user file metadata. No space left on device Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. I should note too that this fails almost immediately, far to quickly to fill up any location it could be trying to write to. jbh On Thu, Nov 2, 2017 at 7:57 AM, David Johnson wrote: One thing that may be relevant is if you have snapshots, depending on your release level, inodes in the snapshot may considered immutable, and will not be migrated. Once the snapshots have been deleted, the inodes are freed up and you won?t see the (somewhat misleading) message about no space. ? ddj Dave Johnson Brown University On Nov 2, 2017, at 10:43 AM, John Hanks wrote: Thanks all for the suggestions. 
Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. We did have old .quota files laying around but removing them didn't have any impact. mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). Once I'm settled in this morning I'll try giving them a little extra space and see what happens. jbh On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < Robert.Oesterlin at nuance.com> wrote: One thing that I?ve run into before is that on older file systems you had the ?*.quota? files in the file system root. If you upgraded the file system to a newer version (so these files aren?t used) - There was a bug at one time where these didn?t get properly migrated during a restripe. Solution was to just remove them Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hanks < griznog at gmail.com> Reply-To: gpfsug main discussion list Date: Wednesday, November 1, 2017 at 5:55 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device" Hi all, I'm trying to do a restripe after setting some nsds to metadataOnly and I keep running into this error: Scanning user file metadata ... 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed) Error processing user file metadata. Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. The file it points to says: This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device /var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location? 
Thanks,

jbh
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From griznog at gmail.com Thu Nov 2 17:16:36 2017
From: griznog at gmail.com (John Hanks)
Date: Thu, 2 Nov 2017 10:16:36 -0700
Subject: [gpfsug-discuss] mmrestripefs "No space left on device"
In-Reply-To: 
References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu>
Message-ID: 

We do have different amounts of space in the system pool which had the changes applied:

[root at scg4-hn01 ~]# mmdf gsfs0 -P system
disk                disk size  failure holds    holds          free KB             free KB
name                    in KB    group metadata data       in full blocks       in fragments
--------------- ------------- -------- -------- ----- -------------------- -------------------
Disks in storage pool: system (Maximum disk size allowed is 3.6 TB)
VD000               377487360      100 Yes      No       143109120 ( 38%)     35708688 ( 9%)
DMD_NSD_804         377487360      100 Yes      No        79526144 ( 21%)      2924584 ( 1%)
VD002               377487360      100 Yes      No       143067136 ( 38%)     35713888 ( 9%)
DMD_NSD_802         377487360      100 Yes      No        79570432 ( 21%)      2926672 ( 1%)
VD004               377487360      100 Yes      No       143107584 ( 38%)     35727776 ( 9%)
DMD_NSD_805         377487360      200 Yes      No        79555584 ( 21%)      2940040 ( 1%)
VD001               377487360      200 Yes      No       142964992 ( 38%)     35805384 ( 9%)
DMD_NSD_803         377487360      200 Yes      No        79580160 ( 21%)      2919560 ( 1%)
VD003               377487360      200 Yes      No       143132672 ( 38%)     35764200 ( 9%)
DMD_NSD_801         377487360      200 Yes      No        79550208 ( 21%)      2915232 ( 1%)
                -------------                        -------------------- -------------------
(pool total)       3774873600                            1113164032 ( 29%)    193346024 ( 5%)

and mmlsdisk shows that there is a problem with replication:

...
Number of quorum disks: 5
Read quorum value: 3
Write quorum value: 3
Attention: Due to an earlier configuration change the file system is no longer properly replicated.

I thought an 'mmrestripefs -r' would fix this, not that I have to fix it first before restriping?

jbh

On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock wrote:
> Assuming you are replicating data and metadata have you confirmed that all
> failure groups have the same free space? That is could it be that one of
> your failure groups has less space than the others? You can verify this
> with the output of mmdf and look at the NSD sizes and space available.
> > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 <(720)%20430-8821> > stockf at us.ibm.com > > > > From: John Hanks > To: gpfsug main discussion list > Date: 11/02/2017 12:20 PM > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on > device" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Addendum to last message: > > We haven't upgraded recently as far as I know (I just inherited this a > couple of months ago.) but am planning an outage soon to upgrade from > 4.2.0-4 to 4.2.3-5. > > My growing collection of output files generally contain something like > > This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 > 08:34:22 2017 > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > 53506 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > With that inode varying slightly. > > jbh > > On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden <*sfadden at us.ibm.com* > > wrote: > Sorry just reread as I hit send and saw this was mmrestripe, in my case it > was mmdeledisk. > > Did you try running the command on just one pool. Or using -B instead? > > What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" > ? > > Looks like it could be related to the maxfeaturelevel of the cluster. Have > you recently upgraded? Is everything up to the same level? > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: *(503) 880-5833* <(503)%20880-5833> > *sfadden at us.ibm.com* > *http://www.ibm.com/systems/storage/spectrum/scale* > > > > ----- Original message ----- > From: Scott Fadden/Portland/IBM > To: *gpfsug-discuss at spectrumscale.org* > Cc: *gpfsug-discuss at spectrumscale.org* > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:44 AM > > I opened a defect on this the other day, in my case it was an incorrect > error message. What it meant to say was,"The pool is not empty." Are you > trying to remove the last disk in a pool? If so did you empty the pool with > a MIGRATE policy first? > > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: *(503) 880-5833* <(503)%20880-5833> > *sfadden at us.ibm.com* > *http://www.ibm.com/systems/storage/spectrum/scale* > > > > ----- Original message ----- > From: John Hanks <*griznog at gmail.com* > > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Cc: > Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" > Date: Thu, Nov 2, 2017 8:34 AM > > We have no snapshots ( they were the first to go when we initially hit the > full metadata NSDs). > > I've increased quotas so that no filesets have hit a space quota. > > Verified that there are no inode quotas anywhere. > > mmdf shows the least amount of free space on any nsd to be 9% free. > > Still getting this error: > > [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 > Scanning file system metadata, phase 1 ... > Scan completed successfully. > Scanning file system metadata, phase 2 ... > Scanning file system metadata for sas0 storage pool > Scanning file system metadata for sata0 storage pool > Scan completed successfully. > Scanning file system metadata, phase 3 ... > Scan completed successfully. > Scanning file system metadata, phase 4 ... 
> Scan completed successfully. > Scanning user file metadata ... > Error processing user file metadata. > No space left on device > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on > scg-gs0 for inodes with broken disk addresses or failures. > mmrestripefs: Command failed. Examine previous error messages to determine > cause. > > I should note too that this fails almost immediately, far to quickly to > fill up any location it could be trying to write to. > > jbh > > On Thu, Nov 2, 2017 at 7:57 AM, David Johnson <*david_johnson at brown.edu* > > wrote: > One thing that may be relevant is if you have snapshots, depending on your > release level, > inodes in the snapshot may considered immutable, and will not be > migrated. Once the snapshots > have been deleted, the inodes are freed up and you won?t see the (somewhat > misleading) message > about no space. > > ? ddj > Dave Johnson > Brown University > > On Nov 2, 2017, at 10:43 AM, John Hanks <*griznog at gmail.com* > > wrote: > Thanks all for the suggestions. > > Having our metadata NSDs fill up was what prompted this exercise, but > space was previously feed up on those by switching them from metadata+data > to metadataOnly and using a policy to migrate files out of that pool. So > these now have about 30% free space (more if you include fragmented space). > The restripe attempt is just to make a final move of any remaining data off > those devices. All the NSDs now have free space on them. > > df -i shows inode usage at about 84%, so plenty of free inodes for the > filesystem as a whole. > > We did have old .quota files laying around but removing them didn't have > any impact. > > mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer > while getting to work. > > mmrepquota does show about a half-dozen filesets that have hit their quota > for space (we don't set quotas on inodes). Once I'm settled in this morning > I'll try giving them a little extra space and see what happens. > > jbh > > > On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert < > *Robert.Oesterlin at nuance.com* > wrote: > One thing that I?ve run into before is that on older file systems you had > the ?*.quota? files in the file system root. If you upgraded the file > system to a newer version (so these files aren?t used) - There was a bug at > one time where these didn?t get properly migrated during a restripe. > Solution was to just remove them > > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > *From: *<*gpfsug-discuss-bounces at spectrumscale.org* > > on behalf of John Hanks < > *griznog at gmail.com* > > *Reply-To: *gpfsug main discussion list < > *gpfsug-discuss at spectrumscale.org* > > *Date: *Wednesday, November 1, 2017 at 5:55 PM > *To: *gpfsug <*gpfsug-discuss at spectrumscale.org* > > > *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on > device" > > > > Hi all, > > > > I'm trying to do a restripe after setting some nsds to metadataOnly and I > keep running into this error: > > > > Scanning user file metadata ... > > 0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with > total 531689 MB data processed) > > Error processing user file metadata. > > Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on > scg-gs0 for inodes with broken disk addresses or failures. > > mmrestripefs: Command failed. Examine previous error messages to determine > cause. 
> > > > The file it points to says: > > > > This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 > 15:36:06 2017 > > INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID > MEMO(INODE_FLAGS FILE_TYPE [ERROR]) > > 53504 0:0 0 1 0 > illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device > > > > > > /var on the node I am running this on has > 128 GB free, all the NSDs have > plenty of free space, the filesystem being restriped has plenty of free > space and if I watch the node while running this no filesystem on it even > starts to get full. Could someone tell me where mmrestripefs is attempting > to write and/or how to point it at a different location? > > > > Thanks, > > > > jbh > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman* > > /listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman* > > /listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=hKtOnoUDijNQoFnSlxQfek9m6h2qKbqjcCswbjHg2-E&s=j7eYU1VnwYXrTnflbJki13EfnMjqAro0RdCiLkVrgzE&e=* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_ > iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m= > uLFESUsuxpmf07haYD3Sl-DpeYkm3t_r0WVV2AZ9Jk0&s=RGgSZEisfDpxsKl3PFUWh6DtzD_ > FF6spqHVpo_0joLY&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Thu Nov 2 17:57:45 2017 From: stockf at us.ibm.com (Frederick Stock) Date: Thu, 2 Nov 2017 12:57:45 -0500 Subject: [gpfsug-discuss] mmrestripefs "No space left on device" In-Reply-To: References: <58E1A256-CCE0-4C87-83C5-D0A7AC50A880@brown.edu> Message-ID: Did you run the tsfindinode command to see where that file is located? Also, what does the mmdf show for your other pools notably the sas0 storage pool? 
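For example, something along these lines (pool names as used earlier in this thread) would show the free space per failure group in the data pools:

mmdf gsfs0 -P sas0
mmdf gsfs0 -P sata0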
Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: John Hanks To: gpfsug main discussion list Date: 11/02/2017 01:17 PM Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org We do have different amounts of space in the system pool which had the changes applied: [root at scg4-hn01 ~]# mmdf gsfs0 -P system disk disk size failure holds holds free KB free KB name in KB group metadata data in full blocks in fragments --------------- ------------- -------- -------- ----- -------------------- ------------------- Disks in storage pool: system (Maximum disk size allowed is 3.6 TB) VD000 377487360 100 Yes No 143109120 ( 38%) 35708688 ( 9%) DMD_NSD_804 377487360 100 Yes No 79526144 ( 21%) 2924584 ( 1%) VD002 377487360 100 Yes No 143067136 ( 38%) 35713888 ( 9%) DMD_NSD_802 377487360 100 Yes No 79570432 ( 21%) 2926672 ( 1%) VD004 377487360 100 Yes No 143107584 ( 38%) 35727776 ( 9%) DMD_NSD_805 377487360 200 Yes No 79555584 ( 21%) 2940040 ( 1%) VD001 377487360 200 Yes No 142964992 ( 38%) 35805384 ( 9%) DMD_NSD_803 377487360 200 Yes No 79580160 ( 21%) 2919560 ( 1%) VD003 377487360 200 Yes No 143132672 ( 38%) 35764200 ( 9%) DMD_NSD_801 377487360 200 Yes No 79550208 ( 21%) 2915232 ( 1%) ------------- -------------------- ------------------- (pool total) 3774873600 1113164032 ( 29%) 193346024 ( 5%) and mmldisk shows that there is a problem with replication: ... Number of quorum disks: 5 Read quorum value: 3 Write quorum value: 3 Attention: Due to an earlier configuration change the file system is no longer properly replicated. I thought a 'mmrestripe -r' would fix this, not that I have to fix it first before restriping? jbh On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock wrote: Assuming you are replicating data and metadata have you confirmed that all failure groups have the same free space? That is could it be that one of your failure groups has less space than the others? You can verify this with the output of mmdf and look at the NSD sizes and space available. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: John Hanks To: gpfsug main discussion list Date: 11/02/2017 12:20 PM Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Sent by: gpfsug-discuss-bounces at spectrumscale.org Addendum to last message: We haven't upgraded recently as far as I know (I just inherited this a couple of months ago.) but am planning an outage soon to upgrade from 4.2.0-4 to 4.2.3-5. My growing collection of output files generally contain something like This inode list was generated in the Parallel Inode Traverse on Thu Nov 2 08:34:22 2017 INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR]) 53506 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device With that inode varying slightly. jbh On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden wrote: Sorry just reread as I hit send and saw this was mmrestripe, in my case it was mmdeledisk. Did you try running the command on just one pool. Or using -B instead? What is the file it is complaining about in "/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" ? Looks like it could be related to the maxfeaturelevel of the cluster. Have you recently upgraded? Is everything up to the same level? 
Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Scott Fadden/Portland/IBM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:44 AM I opened a defect on this the other day, in my case it was an incorrect error message. What it meant to say was,"The pool is not empty." Are you trying to remove the last disk in a pool? If so did you empty the pool with a MIGRATE policy first? Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: John Hanks Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device" Date: Thu, Nov 2, 2017 8:34 AM We have no snapshots ( they were the first to go when we initially hit the full metadata NSDs). I've increased quotas so that no filesets have hit a space quota. Verified that there are no inode quotas anywhere. mmdf shows the least amount of free space on any nsd to be 9% free. Still getting this error: [root at scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3 Scanning file system metadata, phase 1 ... Scan completed successfully. Scanning file system metadata, phase 2 ... Scanning file system metadata for sas0 storage pool Scanning file system metadata for sata0 storage pool Scan completed successfully. Scanning file system metadata, phase 3 ... Scan completed successfully. Scanning file system metadata, phase 4 ... Scan completed successfully. Scanning user file metadata ... Error processing user file metadata. No space left on device Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on scg-gs0 for inodes with broken disk addresses or failures. mmrestripefs: Command failed. Examine previous error messages to determine cause. I should note too that this fails almost immediately, far to quickly to fill up any location it could be trying to write to. jbh On Thu, Nov 2, 2017 at 7:57 AM, David Johnson wrote: One thing that may be relevant is if you have snapshots, depending on your release level, inodes in the snapshot may considered immutable, and will not be migrated. Once the snapshots have been deleted, the inodes are freed up and you won?t see the (somewhat misleading) message about no space. ? ddj Dave Johnson Brown University On Nov 2, 2017, at 10:43 AM, John Hanks wrote: Thanks all for the suggestions. Having our metadata NSDs fill up was what prompted this exercise, but space was previously feed up on those by switching them from metadata+data to metadataOnly and using a policy to migrate files out of that pool. So these now have about 30% free space (more if you include fragmented space). The restripe attempt is just to make a final move of any remaining data off those devices. All the NSDs now have free space on them. df -i shows inode usage at about 84%, so plenty of free inodes for the filesystem as a whole. We did have old .quota files laying around but removing them didn't have any impact. mmlsfileset fs -L -i is taking a while to complete, I'll let it simmer while getting to work. mmrepquota does show about a half-dozen filesets that have hit their quota for space (we don't set quotas on inodes). 
Once I'm settled in this morning I'll try giving them a little extra space and see what happens.

jbh

On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert <Robert.Oesterlin at nuance.com> wrote:
One thing that I've run into before is that on older file systems you had the "*.quota" files in the file system root. If you upgraded the file system to a newer version (so these files aren't used) - there was a bug at one time where these didn't get properly migrated during a restripe. Solution was to just remove them.

Bob Oesterlin
Sr Principal Storage Engineer, Nuance

From: on behalf of John Hanks <griznog at gmail.com>
Reply-To: gpfsug main discussion list
Date: Wednesday, November 1, 2017 at 5:55 PM
To: gpfsug <gpfsug-discuss at spectrumscale.org>
Subject: [EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on device"

Hi all,

I'm trying to do a restripe after setting some NSDs to metadataOnly and I keep running into this error:

Scanning user file metadata ...
0.01 % complete on Wed Nov 1 15:36:01 2017 ( 40960 inodes with total 531689 MB data processed)
Error processing user file metadata.
Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on scg-gs0 for inodes with broken disk addresses or failures.
mmrestripefs: Command failed. Examine previous error messages to determine cause.

The file it points to says:

This inode list was generated in the Parallel Inode Traverse on Wed Nov 1 15:36:06 2017
INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(INODE_FLAGS FILE_TYPE [ERROR])
53504 0:0 0 1 0 illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device

/var on the node I am running this on has > 128 GB free, all the NSDs have plenty of free space, the filesystem being restriped has plenty of free space, and if I watch the node while running this no filesystem on it even starts to get full. Could someone tell me where mmrestripefs is attempting to write and/or how to point it at a different location?
Thanks,

jbh

From griznog at gmail.com Thu Nov 2 18:14:44 2017
From: griznog at gmail.com (John Hanks)
Date: Thu, 2 Nov 2017 11:14:44 -0700
Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device"
Message-ID:

tsfindinode tracked the file to user.quota, which somehow escaped my previous attempt to "mv *.quota /elsewhere/". I've moved that now and verified it is actually gone and will retry once the current restripe on the sata0 pool is wrapped up.

jbh

On Thu, Nov 2, 2017 at 10:57 AM, Frederick Stock wrote:
> Did you run the tsfindinode command to see where that file is located? Also, what does the mmdf show for your other pools, notably the sas0 storage pool?
From griznog at gmail.com Thu Nov 2 18:18:27 2017
From: griznog at gmail.com (John Hanks)
Date: Thu, 2 Nov 2017 11:18:27 -0700
Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device"
Message-ID:

Yep, looks like Robert Oesterlin was right: it was the old quota files causing the snag. Not sure how "mv *.quota" managed to move the group file and not the user file, but I'll let that remain a mystery of the universe. In any case I have a restripe running now and have learned a LOT about all the bits in the process.

Many thanks to everyone who replied, I learn something from this list every time I get near it.

Thank you,

jbh

On Thu, Nov 2, 2017 at 11:14 AM, John Hanks wrote:
> tsfindinode tracked the file to user.quota, which somehow escaped my previous attempt to "mv *.quota /elsewhere/". I've moved that now and verified it is actually gone and will retry once the current restripe on the sata0 pool is wrapped up.
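Pulling the fixes from this thread together, a rough pre-flight check before rerunning the restripe might look like the sketch below. It is only a sketch: the mount point and holding directory are assumptions, and the *.quota files should only be moved aside if they really are the obsolete pre-upgrade ones left in the file system root.

#!/bin/bash
# Pre-flight checks before "mmrestripefs -r", based on what was worked out above.
FS=gsfs0
MNT=/srv/gsfs0            # assumption: adjust to your mount point
HOLD=/root/old-quota-files

# 1. Is the file system still flagged as improperly replicated?
mmlsdisk $FS -L | grep -i "no longer properly replicated"

# 2. Per-NSD free space and failure groups (uneven groups can also cause ENOSPC)
mmdf $FS

# 3. Old-style quota files left in the file system root from an earlier format version?
ls -l $MNT/*.quota 2>/dev/null

# 4. If they are the obsolete ones, move them out of the way, then restripe
mkdir -p $HOLD
mv $MNT/user.quota $MNT/group.quota $MNT/fileset.quota $HOLD/ 2>/dev/null
mmrestripefs $FS -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3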
From aaron.s.knister at nasa.gov Sat Nov 4 16:14:46 2017
From: aaron.s.knister at nasa.gov (Aaron Knister)
Date: Sat, 4 Nov 2017 12:14:46 -0400
Subject: [gpfsug-discuss] file layout API + file fragmentation
Message-ID: <83ed4b5a-cf9e-12da-e460-e34a6492afcf at nasa.gov>

I've got a question about the file layout API and how it reacts in the case of fragmented files.

I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have some code based on tsGetDataBlk.C.
I'm basing the block size off of what's returned by filemapOut.blockSize, but that only seems to return a value > 0 when filemapIn.startOffset is 0.

In a case where a file were to be made up of a significant number of non-contiguous fragments (which... would be awful in and of itself) how would this be reported by the file layout API? Does the interface technically just report the disk location information of the first block of the $blockSize range and assume that it's contiguous?

Thanks!

-Aaron

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776

From makaplan at us.ibm.com Sun Nov 5 23:01:25 2017
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Sun, 5 Nov 2017 18:01:25 -0500
Subject: Re: [gpfsug-discuss] file layout API + file fragmentation
Message-ID:

I googled GPFS_FCNTL_GET_DATABLKDISKIDX and found this discussion:

https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50

In general, GPFS files ARE deliberately "fragmented" but we don't say that - we say they are "striped" over many disks -- and that is generally a good thing for parallel performance.

Also, in GPFS, if the last would-be block of a file is less than a block, then it is stored in a "fragment" of a block. So you see we use "fragment" to mean something different than other file systems you may know.

--marc

From: Aaron Knister
To: gpfsug main discussion list
Date: 11/04/2017 12:22 PM
Subject: [gpfsug-discuss] file layout API + file fragmentation

From aaron.s.knister at nasa.gov Sun Nov 5 23:39:07 2017
From: aaron.s.knister at nasa.gov (Aaron Knister)
Date: Sun, 5 Nov 2017 18:39:07 -0500
Subject: Re: [gpfsug-discuss] file layout API + file fragmentation
Message-ID: <2c1a16ab-9be7-c019-8338-c1dc50d3e069 at nasa.gov>

Thanks Marc, that helps. I can't easily use tsdbfs for what I'm working on since it needs to be run as unprivileged users.

Perhaps I'm not asking the right question. I'm wondering how the file layout API behaves if a given "block"-aligned offset in a file is made up of sub-blocks/fragments that are not all on the same NSD.
The assumption based on how I've seen the API used so far is that all sub-blocks within a block at a given offset within a file are all on the same NSD.

-Aaron

On 11/5/17 6:01 PM, Marc A Kaplan wrote:
> I googled GPFS_FCNTL_GET_DATABLKDISKIDX and found this discussion:
> https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776

From fschmuck at us.ibm.com Mon Nov 6 00:57:46 2017
From: fschmuck at us.ibm.com (Frank Schmuck)
Date: Mon, 6 Nov 2017 00:57:46 +0000
Subject: Re: [gpfsug-discuss] file layout API + file fragmentation
Message-ID:

An HTML attachment was scrubbed...

From mutantllama at gmail.com Mon Nov 6 03:35:58 2017
From: mutantllama at gmail.com (Carl)
Date: Mon, 6 Nov 2017 14:35:58 +1100
Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almost full
Message-ID:

Hi Folks,

Does anyone have much experience with the performance of GPFS as it becomes close to full? In particular I am referring to split data/metadata, where the data pool goes over 80% utilisation.

How much degradation do you see above 80% usage, 90% usage?

Cheers,

Carl.
From aaron.s.knister at nasa.gov Mon Nov 6 05:10:30 2017
From: aaron.s.knister at nasa.gov (Aaron Knister)
Date: Mon, 6 Nov 2017 00:10:30 -0500
Subject: Re: [gpfsug-discuss] file layout API + file fragmentation
Message-ID:

Thanks, Frank! That's truly fascinating and has some interesting implications that I hadn't thought of before.

I just ran a test on an ~8G fs with a block size of 1M:

for i in `seq 1 100000`; do
  dd if=/dev/zero of=foofile${i} bs=520K count=1
done

The fs is "full" according to df/mmdf but there's 3.6G left in subblocks, but yeah, I can't allocate any new files that wouldn't fit into the inode and I can't seem to allocate any new subblocks to existing files (e.g. append).

What's interesting is if I do the same exercise but with a file size of 30K or even 260K I don't seem to run into the same issue. I'm not sure I understand that yet.

I was curious about what this meant in the case of appending to a file where the last offset in the file was allocated to a fragment. By looking at "tsdbfs listda" and appending to a file I could see that the last DA would change (presumably to point to the DA of the start of a contiguous subblock) once the amount of data appended caused the file size to exceed the space available in the trailing subblocks.

-Aaron

On 11/5/17 7:57 PM, Frank Schmuck wrote:
> In GPFS blocks within a file are never fragmented. For example, if you have a file of size 7.3 MB and your file system block size is 1MB, then this file will be made up of 7 full blocks and one fragment of size 320k (10 subblocks). Each of the 7 full blocks will be contiguous on a single disk (LUN) behind a single NSD server. The fragment that makes up the last part of the file will also be contiguous on a single disk, just shorter than a full block.
>
> Frank Schmuck
> IBM Almaden Research Center
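A rough way to read those numbers, assuming the 32-subblocks-per-block layout implied by Frank's 320k/10-subblock example: a 520K file needs 17 of the 32 subblocks in a 1M block, a fragment has to be contiguous inside a single block, so each block can hold only one such fragment and the remaining 15 subblocks can never be used by another 520K file. The arithmetic below is a sketch of that reading, not something stated in the thread.

#!/bin/bash
# Back-of-the-envelope check of the 520K-file experiment above.
# Assumes a 1M block split into 32 subblocks (32K each).
BLOCK_KB=1024
SUBBLOCKS_PER_BLOCK=32
SUBBLOCK_KB=$(( BLOCK_KB / SUBBLOCKS_PER_BLOCK ))          # 32K

FILE_KB=520
NEEDED=$(( (FILE_KB + SUBBLOCK_KB - 1) / SUBBLOCK_KB ))    # 17 subblocks
STRANDED=$(( SUBBLOCKS_PER_BLOCK - NEEDED ))               # 15 subblocks per block

echo "each ${FILE_KB}K file occupies ${NEEDED} subblocks, stranding ${STRANDED}"
echo "stranded fraction: $(( STRANDED * 100 / SUBBLOCKS_PER_BLOCK ))%"   # about 46%
# Roughly 46% of an ~8G data pool is in the same ballpark as the 3.6G left
# "in fragments" above, and a 30K or 260K file (1 or 9 subblocks) can still
# fit into those holes, which would explain why the smaller-file runs
# don't show the problem.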
--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776

From peter.chase at metoffice.gov.uk Mon Nov 6 09:20:11 2017
From: peter.chase at metoffice.gov.uk (Chase, Peter)
Date: Mon, 6 Nov 2017 09:20:11 +0000
Subject: [gpfsug-discuss] Introduction/Question
Message-ID:

Hello to all!

I'm pleased to have joined the GPFS UG mailing list. I'm experimenting with GPFS on zLinux running in z/VM on a z13 mainframe. I work for the UK Met Office in the GPCS team (general purpose compute service/mainframe team) and I'm based in Exeter, Devon.

I've joined with a specific question to ask, in short: how can I automate sending files to a cloud object store as they arrive in GPFS and keep a copy of the file in GPFS?

The longer spiel is this: We have a HPC that throws out a lot of NetCDF files via FTP for use in forecasts.
We're currently undergoing a change in working practice, so that data processing is beginning to be done in the cloud. At the same time we're also attempting to de-duplicate the data being sent from the HPC by creating one space to receive it and then have consumers use it or send it on as necessary from there. The data is terabytes a day in size, and the timeliness of its arrival to systems is fairly important (forecasts cease to be forecasts if they're too late).

We're using zLinux because the mainframe already receives much of the data from the HPC, has access to a SAN with SSD storage, has the right network connections it needs and generally seems the least amount of work to put something in place.

Getting a supported clustered filesystem on zLinux is tricky, but GPFS fits the bill and having hardware, storage, OS and filesystem from one provider (IBM) should hopefully save some headaches.

We're using Amazon as our cloud provider, and have 2x10GB direct links to their London data centre with a ping of about 15ms, so fairly low latency. The developers using the data want it in s3 so they can access it from server-less environments and won't need to have ec2 instances loitering to look after the data.

We were initially interested in using mmcloudgateway/cloud data sharing to send the data, but it's not available for s390x (only x86_64), so I'm now looking at setting up an external storage pool for talking to s3 and then having some kind of ILM soft quota trigger to send the data once enough of it has arrived, but I'm still exploring options. Options such as asking the user group of experienced folks what they think is best!

So, any help or advice would be greatly appreciated!

Regards,

Peter Chase
GPCS Team
Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom
Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk

From daniel.kidger at uk.ibm.com Mon Nov 6 09:37:15 2017
From: daniel.kidger at uk.ibm.com (Daniel Kidger)
Date: Mon, 6 Nov 2017 09:37:15 +0000
Subject: [gpfsug-discuss] Introduction/Question
Message-ID:

Peter,

Welcome to the mailing list!

Can I summarise by saying that you are looking for a way for GPFS to recognise that a file has just arrived in the filesystem (via FTP) and so trigger an action, in this case to push it to Amazon S3?

I think that you also have a second question about coping with the restrictions on GPFS on zLinux, i.e. CES is not supported and hence TCT isn't either. Looking at the docs, there appear to be many restrictions on TCT for MultiCluster, AFM, heterogeneous setups, DMAPI tape tiers, etc. So my question to add is: what success have people had in using TCT in more than the simplest use case of a single small isolated x86 cluster?

Daniel

Dr Daniel Kidger
IBM Technical Sales Specialist
Software Defined Solution Sales
+44-(0)7818 522 266
daniel.kidger at uk.ibm.com
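For the external-pool / policy-trigger route Peter mentions, one rough shape for it is to let mmapplypolicy build a list of newly arrived files with an EXTERNAL LIST rule and have a wrapper copy each one to S3 with the AWS CLI, leaving the GPFS copy in place. This is only a sketch: the GPFS path, bucket name, file-name pattern and the "modified in the last day" rule are all assumptions, and the exact list-file record format should be checked against the ILM documentation for your release.

#!/bin/bash
# Sketch: push recently-arrived files to S3 while keeping them in GPFS.
# Run from cron (or an ILM-triggered callback); adjust names and paths.
FS=/gpfs/incoming             # assumption: GPFS path holding the HPC output
BUCKET=s3://my-forecast-data  # assumption: target bucket
WORK=/tmp/s3push.$$

cat > $WORK.rules <<'EOF'
/* Candidate files: NetCDF output modified within the last day */
RULE EXTERNAL LIST 'tos3' EXEC ''
RULE 'newfiles' LIST 'tos3'
    WHERE LOWER(NAME) LIKE '%.nc'
      AND (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) <= 1
EOF

# -I defer with -f writes the matches to $WORK.list.tos3 instead of acting on them
mmapplypolicy $FS -P $WORK.rules -I defer -f $WORK

# Each list record ends with " -- <full path>"; strip the leading bookkeeping fields
sed 's/.* -- //' $WORK.list.tos3 | while read -r path; do
    aws s3 cp "$path" "$BUCKET${path#$FS}"
done

rm -f $WORK.rules $WORK.list.tos3

The same list could instead be fed to a proper migrate script for an EXTERNAL POOL if you later decide you do want the GPFS copy stubbed rather than kept.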
From daniel.kidger at uk.ibm.com Mon Nov 6 10:00:39 2017
From: daniel.kidger at uk.ibm.com (Daniel Kidger)
Date: Mon, 6 Nov 2017 10:00:39 +0000
Subject: Re: [gpfsug-discuss] file layout API + file fragmentation
Message-ID:

Frank,

For clarity in understanding the underlying mechanism in GPFS, could you describe what happens in the case, say, of a particular file that is appended to every 24 hours? i.e. as that file gets to 7MB, it then writes to a new sub-block (1/32 of the next 1MB block). I guess that sub-block could be the 10th in a block that already has 9 used. Later on, the file grows to need an 11th subblock and so on.

So at what point does this growing file at 8MB occupy all 32 subblocks of 8 full blocks?

Daniel

Dr Daniel Kidger
IBM Technical Sales Specialist
Software Defined Solution Sales
+44-(0)7818 522 266
daniel.kidger at uk.ibm.com

On 6 Nov 2017, at 00:57, Frank Schmuck wrote:
> In GPFS blocks within a file are never fragmented.
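One way to watch this behaviour directly, rather than reason about it, is to grow a test file one subblock at a time and compare apparent size with allocated space; the jumps in the allocated figure show when the trailing fragment is reallocated. A small sketch, assuming a scratch directory on the file system in question:

#!/bin/bash
# Grow a file in 32K steps (one subblock for a 1M block size) and watch
# how much space GPFS actually allocates for it.
cd /gpfs/scratch/append-test || exit 1    # assumption: adjust path

rm -f growfile
for step in $(seq 1 64); do
    dd if=/dev/zero bs=32K count=1 >> growfile 2>/dev/null
    # apparent size in bytes vs allocated KB reported by du
    echo "$(stat -c %s growfile) bytes  $(du -k growfile | cut -f1) KB allocated"
done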
From luke.raimbach at googlemail.com Mon Nov 6 10:01:28 2017
From: luke.raimbach at googlemail.com (Luke Raimbach)
Date: Mon, 06 Nov 2017 10:01:28 +0000
Subject: [gpfsug-discuss] ACLs on AFM Filesets
Message-ID:

Dear SpectrumScale Experts,

When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using the 'mmafmconfig enable ' command.

I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround:

1. Read the GPFS ACL from the remote directory => store in some file acl.txt
2. Link the AFM fileset to the local filesystem,
3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt

Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time?

Thanks!
Luke.

From vpuvvada at in.ibm.com Mon Nov 6 10:22:18 2017
From: vpuvvada at in.ibm.com (Venkateswara R Puvvada)
Date: Mon, 6 Nov 2017 15:52:18 +0530
Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets
Message-ID:

Does this problem happen only for the fileset root directory? Could you try accessing the fileset as a privileged user after the fileset link and verify whether the ACLs are set properly?
AFM reads the ACLs from home and sets in the cache automatically during the file/dir lookup. What is the Spectrum Scale version ? ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 03:32 PM Subject: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround: 1. Read the GPFS ACL from the remote directory => store in some file acl.txt 2. Link the AFM fileset to the local filesystem, 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time? Thanks! Luke._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Nov 6 12:25:43 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 6 Nov 2017 12:25:43 +0000 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almost full Message-ID: Hi Carl I don?t have any direct metrics, but we frequently run our file systems above the 80% level, run split data and metadata.I haven?t experienced any GPFS performance issues that I can attribute to high utilization. I know the documentation talks about this, and the lower values of blocks and sub-blocks will make the file system work harder, but so far I haven?t seen any issues. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Carl Reply-To: gpfsug main discussion list Date: Sunday, November 5, 2017 at 9:36 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] Performance of GPFS when filesystem is almost full Hi Folk, Does anyone have much experience with the performance of GPFS as it becomes close to full. In particular I am referring to split data/meta data, where the data pool goes over 80% utilisation. How much degradation do you see above 80% usage, 90% usage? Cheers, Carl. -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Nov 6 12:31:30 2017 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 06 Nov 2017 12:31:30 +0000 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Hi Venkat, This is only for the fileset root. All other files and directories pull the correct ACLs as expected when accessing the fileset as root user, or after setting the correct (missing) ACL on the fileset root. Multiple SS versions from around 4.1 to present. Thanks! Luke. On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, wrote: > Is this problem happens only for the fileset root directory ? Could you > try accessing the fileset as privileged user after the fileset link and > verify if ACLs are set properly ? 
AFM reads the ACLs from home and sets in > the cache automatically during the file/dir lookup. What is the Spectrum > Scale version ? > > ~Venkat (vpuvvada at in.ibm.com) > > > > From: Luke Raimbach > To: gpfsug main discussion list > Date: 11/06/2017 03:32 PM > Subject: [gpfsug-discuss] ACLs on AFM Filesets > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Dear SpectrumScale Experts, > > > > When creating an IW cache view of a directory in a remote GPFS filesystem, > I prepare the AFM "home" directory using 'mmafmconfig enable ' > command. > > I wish the cache fileset junction point to inherit the ACL for the home > directory when I link it to the filesystem. > > Currently I'm using a flimsy workaround: > > 1. Read the GPFS ACL from the remote directory => store in some file > acl.txt > > 2. Link the AFM fileset to the local filesystem, > > 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i > acl.txt > > Is there a way for the local cache fileset to automatically inherit/clone > the remote directory's ACL, e.g. at mmlinkfileset time? > > > > Thanks! > > Luke._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Mon Nov 6 13:39:20 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 6 Nov 2017 08:39:20 -0500 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: Aaron, brilliant! Your example is close to the worst case, where every file is 512K+1 bytes and the blocksize is 1024K. Yes, in the worse case 49.99999% of space is "lost" or wasted. Don't do that! One can construct such a worst case for any system that allocates by blocks or sectors or whatever you want to call it. Just fill the system with files that are each 0.5*Block_Size+1 bytes and argue that 1/2 the space is wasted. From: Aaron Knister To: Date: 11/06/2017 12:10 AM Subject: Re: [gpfsug-discuss] file layout API + file fragmentation Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks, Frank! That's truly fascinating and has some interesting implications that I hadn't thought of before. I just ran a test on an ~8G fs with a block size of 1M: for i in `seq 1 100000`; do dd if=/dev/zero of=foofile${i} bs=520K count=1 done The fs is "full" according to df/mmdf but there's 3.6G left in subblocks but yeah, I can't allocate any new files that wouldn't fit into the inode and I can't seem to allocate any new subblocks to existing files (e.g. append). What's interesting is if I do the same exercise but with a file size of 30K or even 260K I don't seem to run into the same issue. I'm not sure I understand that yet. I was curious about what this meant in the case of appending to a file where the last offset in the file was allocated to a fragment. 
By looking at "tsdbfs listda" and appending to a file I could see that the last DA would change (presumably to point to the DA of the start of a contiguous subblock) once the amount of data appended caused the file size to exceed the space available in the trailing subblocks. -Aaron On 11/5/17 7:57 PM, Frank Schmuck wrote: > In GPFS blocks within a file are never fragmented. For example, if you > have a file of size 7.3 MB and your file system block size is 1MB, then > this file will be made up of 7 full blocks and one fragment of size 320k > (10 subblocks). Each of the 7 full blocks will be contiguous on a singe > diks (LUN) behind a single NSD server. The fragment that makes up the > last part of the file will also be contiguous on a single disk, just > shorter than a full block. > > Frank Schmuck > IBM Almaden Research Center > > > > ----- Original message ----- > From: Aaron Knister > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: > Cc: > Subject: Re: [gpfsug-discuss] file layout API + file fragmentation > Date: Sun, Nov 5, 2017 3:39 PM > > Thanks Marc, that helps. I can't easily use tsdbfs for what I'm working > on since it needs to be run as unprivileged users. > > Perhaps I'm not asking the right question. I'm wondering how the file > layout api behaves if a given "block"-aligned offset in a file is made > up of sub-blocks/fragments that are not all on the same NSD. The > assumption based on how I've seen the API used so far is that all > sub-blocks within a block at a given offset within a file are all on the > same NSD. > > -Aaron > > On 11/5/17 6:01 PM, Marc A Kaplan wrote: > > I googled GPFS_FCNTL_GET_DATABLKDISKIDX > > > > and found this discussion: > > > > > ? https://www.ibm.com/developerworks/community/forums/html/topic?id=db48b190-4f2f-4e24-a035-25d3e2b06b2d&ps=50 > > > > In general, GPFS files ARE deliberately "fragmented" but we don't say > > that - we say they are "striped" over many disks -- and that is > > generally a good thing for parallel performance. > > > > Also, in GPFS, if the last would-be block of a file is less than a > > block, then it is stored in a "fragment" of a block. ? > > So you see we use "fragment" to mean something different than > other file > > systems you may know. > > > > --marc > > > > > > > > From: ? ? ? ? Aaron Knister > > To: ? ? ? ? gpfsug main discussion list > > > Date: ? ? ? ? 11/04/2017 12:22 PM > > Subject: ? ? ? ? [gpfsug-discuss] file layout API + file > fragmentation > > Sent by: ? ? ? ? gpfsug-discuss-bounces at spectrumscale.org > > > ------------------------------------------------------------------------ > > > > > > > > I've got a question about the file layout API and how it reacts in the > > case of fragmented files. > > > > I'm using the GPFS_FCNTL_GET_DATABLKDISKIDX structure and have > some code > > based on tsGetDataBlk.C. I'm basing the block size based off of what's > > returned by filemapOut.blockSize but that only seems to return a > value > > > 0 when filemapIn.startOffset is 0. > > > > In a case where a file were to be made up of a significant number of > > non-contiguous fragments (which... would be awful in of itself) how > > would this be reported by the file layout API? Does the interface > > technically just report the disk location information of the first > block > > of the $blockSize range and assume that it's contiguous? > > > > Thanks! 
> > > > -Aaron > > > > -- > > Aaron Knister > > NASA Center for Climate Simulation (Code 606.2) > > Goddard Space Flight Center > > (301) 286-2776 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=wnR7m6d4urZ_8dM4mkHQjMbFD9xJEeesmJyzt1osCnM&s=-dgGO6O5i1EqWj-8MmzjxJ1Iz2I5gT1aRmtyP44Cvdg&e= > > > > > > > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=ai3ddVzf50ktH78ovGv6NU4O2LZUOWLpiUiggb8lEgA&m=pUdB4fbWLD03ZTAhk9OlpRdIasz628Oa_yG8z8NOjsk&s=kisarJ7IVnyYBx05ZZiGzdwaXnPqNR8UJoywU1OJNRU&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=_xM9xVsqOuNiCqn3ikx6ZaaIHChTPhz_8iDmEKoteX4&s=uy462L5sxX_3Mm3Dh824ptJIxtah9LVRPMmyKz1lAdg&e= > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=_xM9xVsqOuNiCqn3ikx6ZaaIHChTPhz_8iDmEKoteX4&s=uy462L5sxX_3Mm3Dh824ptJIxtah9LVRPMmyKz1lAdg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Nov 6 14:16:34 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 6 Nov 2017 14:16:34 +0000 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Message-ID: We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? 
Simon From peter.smith at framestore.com Mon Nov 6 14:16:42 2017 From: peter.smith at framestore.com (Peter Smith) Date: Mon, 6 Nov 2017 14:16:42 +0000 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almost full In-Reply-To: References: Message-ID: Hi Carl. When we commissioned our system we ran an NFS stress tool, and filled the system to the top. No performance degradation was seen until it was 99.7% full. I believe that after this point it takes longer to find free blocks to write to. YMMV. On 6 November 2017 at 03:35, Carl wrote: > Hi Folk, > > Does anyone have much experience with the performance of GPFS as it > becomes close to full. In particular I am referring to split data/meta > data, where the data pool goes over 80% utilisation. > > How much degradation do you see above 80% usage, 90% usage? > > Cheers, > > Carl. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- [image: Framestore] Peter Smith ? Senior Systems Engineer London ? New York ? Los Angeles ? Chicago ? Montr?al T +44 (0)20 7344 8000 ? M +44 (0)7816 123009 <+44%20%280%297816%20123009> 19-23 Wells Street, London W1T 3PQ Twitter ? Facebook ? framestore.com [image: https://www.framestore.com/] -------------- next part -------------- An HTML attachment was scrubbed... URL: From Achim.Rehor at de.ibm.com Mon Nov 6 16:18:39 2017 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Mon, 6 Nov 2017 11:18:39 -0500 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almostfull In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 7182 bytes Desc: not available URL: From robbyb at us.ibm.com Mon Nov 6 18:02:14 2017 From: robbyb at us.ibm.com (Rob Basham) Date: Mon, 6 Nov 2017 18:02:14 +0000 Subject: [gpfsug-discuss] Fw: Introduction/Question Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15099587293244.png Type: image/png Size: 481 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15099587293245.png Type: image/png Size: 2741 bytes Desc: not available URL: From ewahl at osc.edu Mon Nov 6 19:43:28 2017 From: ewahl at osc.edu (Edward Wahl) Date: Mon, 6 Nov 2017 14:43:28 -0500 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: References: Message-ID: <20171106144328.58a233f2@osc.edu> On Mon, 6 Nov 2017 09:20:11 +0000 "Chase, Peter" wrote: > how can I automate sending files to a cloud object store as they arrive in > GPFS and keep a copy of the file in GPFS? Sounds like you already have an idea how to do this by using ILM policies. Either quota based as you mention or 'placement' policies should work, though I cannot speak to placement in an S3 environment, the policy engine has a way to call external commands for that if necessary. Though if you create an external pool, a placement policy may be much simpler and possibly faster as well as data would be sent to S3 on write, rather than on a quota trigger. If an external storage pool works properly for S3, I'd probably use a placement policy myself. This also would depend on how/when I needed the data on S3 and your mention of timeliness tells me placement rather than quota may be best. 
Weighing the solutions for this may be better tested(and timed!) than anything. EVERYONE wants a timely weather forecast. ^_- Ed -- Ed Wahl Ohio Supercomputer Center 614-292-9302 From scale at us.ibm.com Mon Nov 6 19:51:40 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Mon, 6 Nov 2017 14:51:40 -0500 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" To: "gpfsug-discuss at spectrumscale.org" Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From daniel.kidger at uk.ibm.com Mon Nov 6 20:48:45 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Mon, 6 Nov 2017 20:48:45 +0000 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From fschmuck at us.ibm.com Mon Nov 6 20:59:02 2017 From: fschmuck at us.ibm.com (Frank Schmuck) Date: Mon, 6 Nov 2017 20:59:02 +0000 Subject: [gpfsug-discuss] file layout API + file fragmentation In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Nov 6 20:59:32 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 6 Nov 2017 20:59:32 +0000 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: Thanks Eric, One other question, when it says it must run on a manager node, I'm assuming that means a manager node in a storage cluster (we multi-cluster clients clusters in). Thanks Simon From: Eric Agar > on behalf of "scale at us.ibm.com" > Date: Monday, 6 November 2017 at 19:51 To: "gpfsug-discuss at spectrumscale.org" >, Simon Thompson > Cc: IBM Spectrum Scale > Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" > To: "gpfsug-discuss at spectrumscale.org" > Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. 
In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Mon Nov 6 21:09:18 2017 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 6 Nov 2017 21:09:18 +0000 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: <7f4c1bf980514e39b2691b15f9b35083@jumptrading.com> Hi Simon, It will only trigger the callback on the currently appointed File System Manager, so you need to make sure your callback scripts are installed on all nodes that can occupy this role. HTH, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Monday, November 06, 2017 3:00 PM To: scale at us.ibm.com; gpfsug main discussion list Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Note: External Email ________________________________ Thanks Eric, One other question, when it says it must run on a manager node, I'm assuming that means a manager node in a storage cluster (we multi-cluster clients clusters in). Thanks Simon From: Eric Agar > on behalf of "scale at us.ibm.com" > Date: Monday, 6 November 2017 at 19:51 To: "gpfsug-discuss at spectrumscale.org" >, Simon Thompson > Cc: IBM Spectrum Scale > Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. 
Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" > To: "gpfsug-discuss at spectrumscale.org" > Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Mon Nov 6 22:18:12 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Mon, 6 Nov 2017 17:18:12 -0500 Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded In-Reply-To: References: Message-ID: Right, Bryan. To expand on that a bit, I'll make two additional points. 
(1) Only a node in the cluster that owns the file system can be appointed a file system manager for the file system. Nodes that remote mount the file system from other clusters cannot be appointed the file system manager of the remote file system. (2) A node need not have the manager designation (as seen in mmlscluster output) to become a file system manager; nodes with the manager designation are preferred, but one could use mmchmgr to assign the role to a non-manager node (for instance). Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Bryan Banister To: gpfsug main discussion list , "scale at us.ibm.com" Date: 11/06/2017 04:09 PM Subject: RE: [gpfsug-discuss] Callbacks / softQuotaExceeded Hi Simon, It will only trigger the callback on the currently appointed File System Manager, so you need to make sure your callback scripts are installed on all nodes that can occupy this role. HTH, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Monday, November 06, 2017 3:00 PM To: scale at us.ibm.com; gpfsug main discussion list Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Note: External Email Thanks Eric, One other question, when it says it must run on a manager node, I'm assuming that means a manager node in a storage cluster (we multi-cluster clients clusters in). Thanks Simon From: Eric Agar on behalf of "scale at us.ibm.com" < scale at us.ibm.com> Date: Monday, 6 November 2017 at 19:51 To: "gpfsug-discuss at spectrumscale.org" , Simon Thompson Cc: IBM Spectrum Scale Subject: Re: [gpfsug-discuss] Callbacks / softQuotaExceeded Simon, Based on my reading of the code, when a softQuotaExceeded event callback is invoked with %quotaType having the value "FILESET", the following arguments correspond with each other for filesetLimitExceeded and softQuotaExceeded: - filesetLimitExceeded %inodeUsage and softQuotaExceeded %filesUsage - filesetLimitExceeded %inodeQuota and softQuotaExceeded %filesQuota - filesetLimitExceeded %inodeLimit and softQuotaExceeded %filesLimit - filesetLimitExceeded %filesetSize and softQuotaExceeded %blockUsage - filesetLimitExceeded %softLimit and softQuotaExceeded %blockQuota - filesetLimitExceeded %hardLimit and softQuotaExceeded %blockLimit So, terms have changed to make them a little friendlier and to generalize them. An inode is a file. Limits related to inodes and to blocks are being reported. 
Regards, The Spectrum Scale (GPFS) team Eric Agar ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Simon Thompson (IT Research Support)" < S.J.Thompson at bham.ac.uk> To: "gpfsug-discuss at spectrumscale.org" < gpfsug-discuss at spectrumscale.org> Date: 11/06/2017 09:17 AM Subject: [gpfsug-discuss] Callbacks / softQuotaExceeded Sent by: gpfsug-discuss-bounces at spectrumscale.org We were looking at adding some callbacks to notify us when file-sets go over their inode limit by implementing it as a soft inode quota. In the docs: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectru m.scale.v4r23.doc/bl1adm_mmaddcallback.htm#mmaddcallback__Table1 There is an event filesetLimitExceeded, which has parameters: %inodeUsage %inodeQuota, however the docs say that we should instead use softQuotaExceeded as filesetLimitExceeded "It exists only for compatibility (and may be deleted in a future version); therefore, using softQuotaExceeded is recommended instead" However. softQuotaExceeded seems to have no %inodeQuota of %inodeUsage parameters. Is this a doc error or is there genuinely no way to get the inodeQuota/Usage with softQuotaExceeded? The same applies to passing %quotaEventType. Any suggestions? Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=7fytZP7U6ExP93umOcOUIXEUXD2KWdWEsrEqMtxOB0I&s=BiROZ43JuhZRhqOOpqTvHvl7bTqjPFxIrCxqIWAWa7U&e= Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Mon Nov 6 23:49:39 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 6 Nov 2017 18:49:39 -0500 Subject: [gpfsug-discuss] Introduction/Question In-Reply-To: References: , Message-ID: Placement policy rules "SET POOL 'xyz'... " may only name GPFS data pools. NOT "EXTERNAL POOLs" -- EXTERNAL POOL is a concept only supported by MIGRATE rules. 
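To make that distinction concrete, below is a minimal, untested sketch (not Marc's nor IBM's recommended configuration): the external pool is declared with an EXTERNAL POOL rule whose interface script path, pool names and 30-day cutoff are all placeholders; new files are still placed with SET POOL into an internal data pool; and data only ever reaches the external pool through a MIGRATE rule driven by mmapplypolicy.

# write the policy to a file, then dry-run it; the device name 'gpfs0' is a placeholder
cat > /tmp/cloud-migrate.pol <<'EOF'
/* user-provided interface script for the external pool; path is a placeholder */
RULE EXTERNAL POOL 'cloudpool' EXEC '/usr/local/bin/cloud_pool_interface'

/* placement (SET POOL) may only name internal GPFS data pools */
RULE 'newfiles' SET POOL 'fast'

/* data reaches the external pool via MIGRATE, e.g. run nightly from cron */
RULE 'tocloud' MIGRATE FROM POOL 'slow' TO POOL 'cloudpool'
    WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30
EOF

# -I test reports what would be migrated without actually moving anything
mmapplypolicy gpfs0 -P /tmp/cloud-migrate.pol -I test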
However you may be interested in "mmcloudgateway" & co, which is all about combining GPFS with Cloud storage. AKA IBM Transparent Cloud Tiering https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Transparent%20Cloud%20Tiering -------------- next part -------------- An HTML attachment was scrubbed... URL: From mutantllama at gmail.com Tue Nov 7 00:12:11 2017 From: mutantllama at gmail.com (Carl) Date: Tue, 7 Nov 2017 11:12:11 +1100 Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almostfull In-Reply-To: References: Message-ID: Thanks to all for the information. Im happy to say that it is close to what I hoped would be the case. Interesting to see the effect of the -n value. Reinforces the need to think about it and not go with the defaults. Thanks again, Carl. On 7 November 2017 at 03:18, Achim Rehor wrote: > I have no practical experience on these numbers, however, Peters > experience below is matching what i learned from Dan years ago. > > As long as the -n setting of the FS (the number of nodes potentially > mounting the fs) is more or less matching the actual number of mounts, > this 99.x % before degradation is expected. If you are far off with that > -n estimate, like having it set to 32, but the actual number of mounts is > in the thousands, > then degradation happens earlier, since the distribution of free blocks in > the allocation maps is not matching the actual setup as good as it could > be. > > Naturally, this depends also on how you do filling of the FS. If it is > only a small percentage of the nodes, doing the creates, then the > distribution can > be 'wrong' as well, and single nodes run earlier out of allocation map > space, and need to look for free blocks elsewhere, costing RPC cycles and > thus performance. > > Putting this in numbers seems quite difficult ;) > > > Mit freundlichen Gr??en / Kind regards > > *Achim Rehor* > > ------------------------------ > > Software Technical Support Specialist AIX/ Emea HPC Support > IBM Certified Advanced Technical Expert - Power Systems with AIX > TSCC Software Service, Dept. 7922 > Global Technology Services > > ------------------------------ > Phone: +49-7034-274-7862 <+49%207034%202747862> IBM Deutschland > E-Mail: Achim.Rehor at de.ibm.com Am Weiher 24 > 65451 Kelsterbach > Germany > > > > ------------------------------ > > IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter > Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Reinhard Reschke, > Dieter Scholz, Gregor Pillen, Ivo Koerner, Christian Noll > Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, > HRB 14562 WEEE-Reg.-Nr. DE 99369940 > > > > > > From: Peter Smith > To: gpfsug main discussion list > Date: 11/06/2017 09:17 AM > Subject: Re: [gpfsug-discuss] Performance of GPFS when filesystem > is almost full > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi Carl. > > When we commissioned our system we ran an NFS stress tool, and filled the > system to the top. > > No performance degradation was seen until it was 99.7% full. > > I believe that after this point it takes longer to find free blocks to > write to. > > YMMV. > > On 6 November 2017 at 03:35, Carl <*mutantllama at gmail.com* > > wrote: > Hi Folk, > > Does anyone have much experience with the performance of GPFS as it > becomes close to full. 
In particular I am referring to split data/meta > data, where the data pool goes over 80% utilisation. > > How much degradation do you see above 80% usage, 90% usage? > > Cheers, > > Carl. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > > > -- > *Peter Smith* ? Senior Systems Engineer > *London* ? New York ? Los Angeles ? Chicago ? Montr?al > T +44 (0)20 7344 8000 <+44%2020%207344%208000> ? M +44 (0)7816 123009 > <+44%20%280%297816%20123009> > *19-23 Wells Street, London W1T 3PQ* > > Twitter ? Facebook > ? framestore.com > > ______________________________ > _________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 7182 bytes Desc: not available URL: From vpuvvada at in.ibm.com Tue Nov 7 07:45:37 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 7 Nov 2017 13:15:37 +0530 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Luke, This issue has been fixed. As a workaround you could you also try resetting the same ACLs at home (instead of cache) or change directory ctime at home and verify that ACLs are updated correctly on fileset root. You can contact customer support or open a PMR and request efix. ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 06:01 PM Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Venkat, This is only for the fileset root. All other files and directories pull the correct ACLs as expected when accessing the fileset as root user, or after setting the correct (missing) ACL on the fileset root. Multiple SS versions from around 4.1 to present. Thanks! Luke. On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, wrote: Is this problem happens only for the fileset root directory ? Could you try accessing the fileset as privileged user after the fileset link and verify if ACLs are set properly ? AFM reads the ACLs from home and sets in the cache automatically during the file/dir lookup. What is the Spectrum Scale version ? ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 03:32 PM Subject: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem. Currently I'm using a flimsy workaround: 1. Read the GPFS ACL from the remote directory => store in some file acl.txt 2. Link the AFM fileset to the local filesystem, 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time? Thanks! 
Luke._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=hGpW-C4GuPv5jPnC27siEC3S5TJjLxO4o2HIOLlPdeo&s=pMpWqJdImjhuKhLKAmsS7mnVSRuMfNOjJ3_HjNVW2Po&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=DkfRGRFLq0tUIu2HH7jpjSmG3Uwh3U1dpU1pqQCcCEc&s=jjWH6js9EaYogD2z76C7uDwY94_2yiavn0fmd7iilKQ&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Tue Nov 7 07:57:46 2017 From: john.hearns at asml.com (John Hearns) Date: Tue, 7 Nov 2017 07:57:46 +0000 Subject: [gpfsug-discuss] Spectrum Scale with NVMe Message-ID: I am looking for anyone with experience of using Spectrum Scale with nvme devices. I could use an offline brain dump... The specific issue I have is with the nsd device discovery and the naming. Before anyone replies, I am gettign excellent support from IBM and have been directed to the correct documentation. I am just looking for any wrinkles or tips that anyone has. Thanks -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at spectrumscale.org Tue Nov 7 09:18:52 2017 From: chair at spectrumscale.org (Spectrum Scale UG Chair (Simon Thompson)) Date: Tue, 07 Nov 2017 09:18:52 +0000 Subject: [gpfsug-discuss] SSUG CIUK Call for Speakers Message-ID: The last Spectrum Scale user group meeting of the year will be taking place as part of the Computing Insights UK (CIUK) event in December. We are currently looking for user speakers to talk about their Spectrum Scale implementation. It doesn't have to be a huge deployment, even just a small couple of nodes cluster, we'd love to hear how you are using Scale and about any challenges and successes you've had with it. If you are interested in speaking, you must be registered to attend CIUK and the user group will be taking place on Tuesday 12th December in the afternoon. 
More details on CIUK and registration at: http://www.stfc.ac.uk/news-events-and-publications/events/general-interest- events/computing-insight-uk/ If you would like to speak, please drop me an email and we can find a slot. Simon From daniel.kidger at uk.ibm.com Tue Nov 7 09:19:24 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Tue, 7 Nov 2017 09:19:24 +0000 Subject: [gpfsug-discuss] Performance of GPFS when filesystem isalmostfull In-Reply-To: Message-ID: I understand that this near linear performance is one of the differentiators of Spectrum Scale. Others with more field experience than me might want to comment on how Lustre and other distributed filesystem perform as they approaches near full capacity. Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales + 44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 7 Nov 2017, at 00:12, Carl wrote: > > Thanks to all for the information. > > Im happy to say that it is close to what I hoped would be the case. > > Interesting to see the effect of the -n value. Reinforces the need to think about it and not go with the defaults. > > Thanks again, > > Carl. > > >> On 7 November 2017 at 03:18, Achim Rehor wrote: >> I have no practical experience on these numbers, however, Peters experience below is matching what i learned from Dan years ago. >> >> As long as the -n setting of the FS (the number of nodes potentially mounting the fs) is more or less matching the actual number of mounts, >> this 99.x % before degradation is expected. If you are far off with that -n estimate, like having it set to 32, but the actual number of mounts is in the thousands, >> then degradation happens earlier, since the distribution of free blocks in the allocation maps is not matching the actual setup as good as it could be. >> >> Naturally, this depends also on how you do filling of the FS. If it is only a small percentage of the nodes, doing the creates, then the distribution can >> be 'wrong' as well, and single nodes run earlier out of allocation map space, and need to look for free blocks elsewhere, costing RPC cycles and thus performance. >> >> Putting this in numbers seems quite difficult ;) >> >> >> Mit freundlichen Gr??en / Kind regards >> Achim Rehor >> >> >> Software Technical Support Specialist AIX/ Emea HPC Support >> <_1_D95FF418D95FEE980059980B852581D0.gif> >> IBM Certified Advanced Technical Expert - Power Systems with AIX >> TSCC Software Service, Dept. 7922 >> Global Technology Services >> Phone: +49-7034-274-7862 IBM Deutschland >> E-Mail: Achim.Rehor at de.ibm.com Am Weiher 24 >> 65451 Kelsterbach >> Germany >> >> >> IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter >> Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Reinhard Reschke, Dieter Scholz, Gregor Pillen, Ivo Koerner, Christian Noll >> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 WEEE-Reg.-Nr. DE 99369940 >> >> >> >> >> >> From: Peter Smith >> To: gpfsug main discussion list >> Date: 11/06/2017 09:17 AM >> Subject: Re: [gpfsug-discuss] Performance of GPFS when filesystem is almost full >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> >> >> >> Hi Carl. >> >> When we commissioned our system we ran an NFS stress tool, and filled the system to the top. >> >> No performance degradation was seen until it was 99.7% full. >> >> I believe that after this point it takes longer to find free blocks to write to. >> >> YMMV. 
>> >> On 6 November 2017 at 03:35, Carl wrote: >> Hi Folk, >> >> Does anyone have much experience with the performance of GPFS as it becomes close to full. In particular I am referring to split data/meta data, where the data pool goes over 80% utilisation. >> >> How much degradation do you see above 80% usage, 90% usage? >> >> Cheers, >> >> Carl. >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> -- >> Peter Smith ? Senior Systems Engineer >> London ? New York ? Los Angeles ? Chicago ? Montr?al >> T +44 (0)20 7344 8000 ? M +44 (0)7816 123009 >> 19-23 Wells Street, London W1T 3PQ >> Twitter? Facebook? framestore.com >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckerner at illinois.edu Tue Nov 7 13:04:41 2017 From: ckerner at illinois.edu (Chad Kerner) Date: Tue, 7 Nov 2017 07:04:41 -0600 Subject: [gpfsug-discuss] Spectrum Scale with NVMe In-Reply-To: <64b6afd8efb34551a319b5d6e311bbfb@CITESHT4.ad.uillinois.edu> References: <64b6afd8efb34551a319b5d6e311bbfb@CITESHT4.ad.uillinois.edu> Message-ID: Hey John, Once you get /var/mmfs/etc/nsddevices set up, it is all straight forward. We have seen times on reboot where the devices were not ready before gpfs started and the file system started with those disks in an offline state. But, that was just a timing issue with the startup. Chad -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign On 11/7/17, John Hearns wrote: > I am looking for anyone with experience of using Spectrum Scale with nvme > devices. > > I could use an offline brain dump... > > > The specific issue I have is with the nsd device discovery and the naming. > > Before anyone replies, I am gettign excellent support from IBM and have been > directed to the correct documentation. > > I am just looking for any wrinkles or tips that anyone has. > > > Thanks > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. 
Neither the sender nor the > company/group of companies he or she represents shall be liable for the > proper and complete transmission of the information contained in this > communication, or for any delay in its receipt. > -- -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign From luke.raimbach at googlemail.com Tue Nov 7 16:24:56 2017 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Tue, 07 Nov 2017 16:24:56 +0000 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Hello Venkat, Thanks for the information. When was the issue fixed? I tried this on the most recent 4.2.3.5 release and was still experiencing the same behaviour. Cheers, Luke. On Tue, 7 Nov 2017 at 08:45 Venkateswara R Puvvada wrote: > Luke, > > This issue has been fixed. As a workaround you could you also try > resetting the same ACLs at home (instead of cache) or change directory > ctime at home and verify that ACLs are updated correctly on fileset root. > You can contact customer support or open a PMR and request efix. > > ~Venkat (vpuvvada at in.ibm.com) > > > > From: Luke Raimbach > To: gpfsug main discussion list > Date: 11/06/2017 06:01 PM > Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi Venkat, > > This is only for the fileset root. All other files and directories pull > the correct ACLs as expected when accessing the fileset as root user, or > after setting the correct (missing) ACL on the fileset root. > > Multiple SS versions from around 4.1 to present. > > Thanks! > Luke. > > > On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, <*vpuvvada at in.ibm.com* > > wrote: > > Is this problem happens only for the fileset root directory ? Could you > try accessing the fileset as privileged user after the fileset link and > verify if ACLs are set properly ? AFM reads the ACLs from home and sets in > the cache automatically during the file/dir lookup. What is the Spectrum > Scale version ? > > ~Venkat (*vpuvvada at in.ibm.com* ) > > > > From: Luke Raimbach <*luke.raimbach at googlemail.com* > > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 11/06/2017 03:32 PM > Subject: [gpfsug-discuss] ACLs on AFM Filesets > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > Dear SpectrumScale Experts, > > > When creating an IW cache view of a directory in a remote GPFS filesystem, > I prepare the AFM "home" directory using 'mmafmconfig enable ' > command. > > I wish the cache fileset junction point to inherit the ACL for the home > directory when I link it to the filesystem. > > Currently I'm using a flimsy workaround: > > 1. Read the GPFS ACL from the remote directory => store in some file > acl.txt > > 2. Link the AFM fileset to the local filesystem, > > 3. Set the GPFS ACL on the local fileset junction point with mmputacl -i > acl.txt > > Is there a way for the local cache fileset to automatically inherit/clone > the remote directory's ACL, e.g. at mmlinkfileset time? > > > > Thanks! 
> Luke.
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From alex at calicolabs.com  Tue Nov  7 17:50:54 2017
From: alex at calicolabs.com (Alex Chekholko)
Date: Tue, 7 Nov 2017 09:50:54 -0800
Subject: [gpfsug-discuss] Performance of GPFS when filesystem is almost full
In-Reply-To: 
References: 
Message-ID: 

One of the parameters that you need to choose at filesystem creation time is the block allocation type, the -j {cluster|scatter} parameter to mmcrfs:
https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1ins_blkalmap.htm#ballmap

If you use "cluster", you will have quite high performance when the filesystem is close to empty. If you use "scatter", the performance will stay the same no matter the filesystem utilization, because blocks for a given file will always be scattered randomly.

Some vendors set up their GPFS filesystem using '-j cluster' and then show off their streaming write performance numbers, but the performance degrades considerably as the filesystem fills up. With "scatter", the filesystem performance is slower but stays consistent throughout its lifetime.

On Tue, Nov 7, 2017 at 1:19 AM, Daniel Kidger wrote:
> I understand that this near-linear performance is one of the
> differentiators of Spectrum Scale.
> Others with more field experience than me might want to comment on how
> Lustre and other distributed filesystems perform as they approach near
> full capacity.
>
> Daniel
>
> Dr Daniel Kidger
> IBM Technical Sales Specialist
> Software Defined Solution Sales
> +44 (0)7818 522 266
> daniel.kidger at uk.ibm.com
>
> On 7 Nov 2017, at 00:12, Carl wrote:
>
> Thanks to all for the information.
>
> I'm happy to say that it is close to what I hoped would be the case.
>
> Interesting to see the effect of the -n value. Reinforces the need to
> think about it and not go with the defaults.
>
> Thanks again,
>
> Carl.
>
> On 7 November 2017 at 03:18, Achim Rehor wrote:
>> I have no practical experience on these numbers, however, Peter's
>> experience below is matching what I learned from Dan years ago.
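To put Alex's advice in command form: the allocation map layout is fixed when the filesystem is created, so -j (and a realistic -n, which the discussion below picks up) has to be chosen up front. An illustrative invocation only -- the stanza file, block size and node count are made-up values, not a recommendation:

mmcrfs fs1 -F /tmp/fs1_nsd.stanza -j scatter -n 512 -B 1M -T /gpfs/fs1

# check afterwards; -j in particular cannot be changed after creation
mmlsfs fs1 -j -n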
>> >> As long as the -n setting of the FS (the number of nodes potentially >> mounting the fs) is more or less matching the actual number of mounts, >> this 99.x % before degradation is expected. If you are far off with that >> -n estimate, like having it set to 32, but the actual number of mounts is >> in the thousands, >> then degradation happens earlier, since the distribution of free blocks >> in the allocation maps is not matching the actual setup as good as it could >> be. >> >> Naturally, this depends also on how you do filling of the FS. If it is >> only a small percentage of the nodes, doing the creates, then the >> distribution can >> be 'wrong' as well, and single nodes run earlier out of allocation map >> space, and need to look for free blocks elsewhere, costing RPC cycles and >> thus performance. >> >> Putting this in numbers seems quite difficult ;) >> >> >> Mit freundlichen Gr??en / Kind regards >> >> *Achim Rehor* >> >> ------------------------------ >> >> Software Technical Support Specialist AIX/ Emea HPC Support >> <_1_D95FF418D95FEE980059980B852581D0.gif> >> IBM Certified Advanced Technical Expert - Power Systems with AIX >> TSCC Software Service, Dept. 7922 >> Global Technology Services >> >> ------------------------------ >> Phone: +49-7034-274-7862 <+49%207034%202747862> IBM Deutschland >> E-Mail: Achim.Rehor at de.ibm.com Am Weiher 24 >> 65451 Kelsterbach >> Germany >> >> >> >> ------------------------------ >> >> IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter >> Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Reinhard Reschke, >> Dieter Scholz, Gregor Pillen, Ivo Koerner, Christian Noll >> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, >> HRB 14562 WEEE-Reg.-Nr. DE 99369940 >> >> >> >> >> >> From: Peter Smith >> To: gpfsug main discussion list >> Date: 11/06/2017 09:17 AM >> Subject: Re: [gpfsug-discuss] Performance of GPFS when filesystem >> is almost full >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> ------------------------------ >> >> >> >> Hi Carl. >> >> When we commissioned our system we ran an NFS stress tool, and filled the >> system to the top. >> >> No performance degradation was seen until it was 99.7% full. >> >> I believe that after this point it takes longer to find free blocks to >> write to. >> >> YMMV. >> >> On 6 November 2017 at 03:35, Carl <*mutantllama at gmail.com* >> > wrote: >> Hi Folk, >> >> Does anyone have much experience with the performance of GPFS as it >> becomes close to full. In particular I am referring to split data/meta >> data, where the data pool goes over 80% utilisation. >> >> How much degradation do you see above 80% usage, 90% usage? >> >> Cheers, >> >> Carl. >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at *spectrumscale.org* >> >> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* >> >> >> >> >> >> -- >> *Peter Smith* ? Senior Systems Engineer >> *London* ? New York ? Los Angeles ? Chicago ? Montr?al >> T +44 (0)20 7344 8000 <+44%2020%207344%208000> ? M +44 (0)7816 123009 >> <+44%20%280%297816%20123009> >> *19-23 Wells Street, London W1T 3PQ* >> >> Twitter >> ? >> Facebook >> ? 
>> framestore.com >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Wed Nov 8 05:16:02 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Wed, 8 Nov 2017 10:46:02 +0530 Subject: [gpfsug-discuss] ACLs on AFM Filesets In-Reply-To: References: Message-ID: Luke, There are two issues here. ACLs are not updated on fileset root and other one is that ACLs get updated only when the files/dirs are accessed as root user. Fix for the later one is already part of 4.2.3.5. First issue was fixed after your email, you could request efix on top of 4.2.3.5. First issue will get corrected automatically when ctime is changed on target path at home. ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/07/2017 09:55 PM Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello Venkat, Thanks for the information. When was the issue fixed? I tried this on the most recent 4.2.3.5 release and was still experiencing the same behaviour. Cheers, Luke. On Tue, 7 Nov 2017 at 08:45 Venkateswara R Puvvada wrote: Luke, This issue has been fixed. As a workaround you could you also try resetting the same ACLs at home (instead of cache) or change directory ctime at home and verify that ACLs are updated correctly on fileset root. You can contact customer support or open a PMR and request efix. ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 06:01 PM Subject: Re: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Venkat, This is only for the fileset root. All other files and directories pull the correct ACLs as expected when accessing the fileset as root user, or after setting the correct (missing) ACL on the fileset root. Multiple SS versions from around 4.1 to present. Thanks! Luke. On Mon, 6 Nov 2017, 10:22 Venkateswara R Puvvada, wrote: Is this problem happens only for the fileset root directory ? Could you try accessing the fileset as privileged user after the fileset link and verify if ACLs are set properly ? AFM reads the ACLs from home and sets in the cache automatically during the file/dir lookup. What is the Spectrum Scale version ? ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 11/06/2017 03:32 PM Subject: [gpfsug-discuss] ACLs on AFM Filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Dear SpectrumScale Experts, When creating an IW cache view of a directory in a remote GPFS filesystem, I prepare the AFM "home" directory using 'mmafmconfig enable ' command. 
I wish the cache fileset junction point to inherit the ACL for the home directory when I link it to the filesystem.

Currently I'm using a flimsy workaround:

1. Read the GPFS ACL from the remote directory => store in some file acl.txt
2. Link the AFM fileset to the local filesystem,
3. Set the GPFS ACL on the local fileset junction point with mmputacl -i acl.txt

Is there a way for the local cache fileset to automatically inherit/clone the remote directory's ACL, e.g. at mmlinkfileset time?

Thanks!
Luke.
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From peter.chase at metoffice.gov.uk  Wed Nov  8 15:50:52 2017
From: peter.chase at metoffice.gov.uk (Chase, Peter)
Date: Wed, 8 Nov 2017 15:50:52 +0000
Subject: [gpfsug-discuss] Default placement/External Pool
Message-ID: 

Hello! A follow up to my previous question about automatically sending files to Amazon s3 as they arrive in GPFS.

I have created an interface script to manage Amazon s3 storage as an external pool, I have created a migration policy that pre-migrates all files to the external pool and I have set that as the default policy for the file system. All good so far, but the problem I'm now facing is: only some of the cluster nodes have access to Amazon due to network constraints.

I read the statement "The mmapplypolicy command invokes the external pool script on all nodes in the cluster that have installed the script in its designated location."[1] and thought, 'Great! I'll only install the script on nodes that have access to Amazon', but that appears not to work for a placement policy/default policy and instead the script runs on precisely no nodes. I assumed this happened because running the script on a non-Amazon-facing node resulted in a horrible error (i.e. file not found), so I edited my script to return a non-zero response if being run on a node that isn't in my cloudNode class, then installed the script everywhere. But this appears to have had no effect whatsoever.
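For anyone following along, the arrangement described above is usually expressed as an external pool definition plus a premigration rule, roughly like the sketch below. The pool, fileset and script names are invented for illustration; this is not the poster's actual policy.

/* sketch only - names are illustrative */
RULE EXTERNAL POOL 'awss3'
    EXEC '/var/mmfs/etc/mmpolicyExec-s3'   /* hypothetical interface script */

/* copy (premigrate) everything in the fileset out to the external pool */
RULE 'premig' MIGRATE FROM POOL 'system'
    THRESHOLD(0,100,0)
    TO POOL 'awss3'
    FOR FILESET('aws')

/* placement rule so newly created files still land somewhere sensible */
RULE 'default' SET POOL 'system'

Note that only the SET POOL rule acts at file-creation time when installed with mmchpolicy; the MIGRATE/premigrate work happens only when mmapplypolicy runs, which is the distinction the replies below turn on.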
The only thing I can think of now is to control where a migration policy runs based on node class. But I don't know how to do that, or if it's possible, or where the documentation might be as I can't find any. Any assistance would once again be greatly appreciated. [1]=https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_impstorepool.htm Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk From Robert.Oesterlin at nuance.com Wed Nov 8 16:02:04 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 8 Nov 2017 16:02:04 +0000 Subject: [gpfsug-discuss] Default placement/External Pool Message-ID: Hi Peter mmapplypolicy has a "-N" parameter that should restrict it to a subset of nodes or node class if you define that. -N {all | mount | Node[,Node...] | NodeFile | NodeClass} Specifies the list of nodes that will run parallel instances of policy code in the GPFS home cluster. This command supports all defined node classes. The default is to run on the node where the mmapplypolicy command is running or the current value of the defaultHelperNodes parameter of the mmchconfig command. Bob Oesterlin Sr Principal Storage Engineer, Nuance ?On 11/8/17, 9:55 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Chase, Peter" wrote: The only thing I can think of now is to control where a migration policy runs based on node class. But I don't know how to do that, or if it's possible, or where the documentation might be as I can't find any. Any assistance would once again be greatly appreciated. From makaplan at us.ibm.com Wed Nov 8 19:21:19 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 8 Nov 2017 14:21:19 -0500 Subject: [gpfsug-discuss] Default placement/External Pool In-Reply-To: References: Message-ID: Peter, 1. to best exploit and integrate both Spectrum Scale and Cloud Storage, please consider: https://www.ibm.com/blogs/systems/spectrum-scale-transparent-cloud-tiering/ 2. Yes, you can use mmapplypolicy to push copies of files to an "external" system. But you'll probably need a strategy or technique to avoid redundantly pushing the "next time" you run the command... 3. Regarding mmapplypolicy nitty-gritty: you can use the -N option to say exactly which nodes you want to run the command. And regarding using ... EXTERNAL ... EXEC 'myscript' You can further restrict which nodes will act as mmapplypolicy "helpers" -- If on a particular node x, 'myscript' does not exist OR myscript TEST returns a non-zero exit code then node x will be excluded.... You will see a message like this: [I] Messages tagged with <3> are from node n3. <3> [E:73] Error on system(/ghome/makaplan/policies/mynodes.sh TEST '/foo/bar5' 2>&1) <3> [W] EXEC '/ghome/makaplan/policies/mynodes.sh' of EXTERNAL POOL or LIST 'x' fails TEST with code 73 on this node. OR [I] Messages tagged with <5> are from node n4. <5> sh: /tmp/mynodes.sh: No such file or directory <5> [E:127] Error on system(/tmp/mynodes.sh TEST '/foo/bar5' 2>&1) <5> [W] EXEC '/tmp/mynodes.sh' of EXTERNAL POOL or LIST 'x' fails TEST with code 127 on this node. From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/08/2017 10:51 AM Subject: [gpfsug-discuss] Default placement/External Pool Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello! A follow up to my previous question about automatically sending files to Amazon s3 as they arrive in GPFS. 
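Marc's TEST detail above is the useful hook here: if the interface script deliberately fails the TEST call on nodes with no route to S3, mmapplypolicy drops those nodes from its helper list by itself. A rough sketch of such a script follows -- the command names it receives are the standard ones, but the marker file, bucket and upload call are invented assumptions:

#!/bin/bash
# Hypothetical external-pool interface script (sketch, not production code).
# mmapplypolicy invokes it as:  <script> <command> <argument> [<options>]
# where <command> is TEST, MIGRATE, PREMIGRATE, PURGE, LIST, ...
CMD=$1
ARG=$2      # a directory for TEST, a file list for the data-moving commands

case $CMD in
  TEST)
      # Fail TEST on nodes that cannot reach S3 so they are excluded as helpers.
      # The marker file is just one way of tagging S3-capable nodes; use
      # whatever matches your cloudNode class membership.
      [ -e /var/mmfs/etc/node_has_s3_route ] || exit 1
      exit 0
      ;;
  MIGRATE|PREMIGRATE|LIST)
      # Each file-list record ends in " -- <full path>"; push the file out.
      while IFS= read -r rec; do
          path=${rec##* -- }
          aws s3 cp "$path" "s3://example-bucket${path}"   # illustrative
      done < "$ARG"
      exit 0
      ;;
  *)
      exit 0
      ;;
esac

With that in place, a run restricted to the S3-capable helpers looks like the command already quoted in this thread, e.g. mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy -N cloudNode --scope fileset.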
I have created an interface script to manage Amazon s3 storage as an external pool, I have created a migration policy that pre-migrates all files to the external pool and I have set that as the default policy for the file system. All good so far, but the problem I'm now facing is: Only some of the cluster nodes have access to Amazon due to network constraints. I read the statement "The mmapplypolicy command invokes the external pool script on all nodes in the cluster that have installed the script in its designated location."[1] and thought, 'Great! I'll only install the script on nodes that have access to Amazon' but that appears not to work for a placement policy/default policy and instead, the script runs on precisely no nodes. I assumed this happened because running the script on a non-Amazon facing node resulted in a horrible error (i.e. file not found), so I edited my script to return a non-zero response if being run on a node that isn't in my cloudNode class, then installed the script every where. But this appears to have had no effect what-so-ever. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbyb at us.ibm.com Wed Nov 8 20:39:54 2017 From: robbyb at us.ibm.com (Rob Basham) Date: Wed, 8 Nov 2017 20:39:54 +0000 Subject: [gpfsug-discuss] Default placement/External Pool In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 06:22:46 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 07:22:46 +0100 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Message-ID: Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From andreas.koeninger at de.ibm.com Fri Nov 10 10:06:19 2017 From: andreas.koeninger at de.ibm.com (Andreas Koeninger) Date: Fri, 10 Nov 2017 10:06:19 +0000 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 10:21:01 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 11:21:01 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Hi Andreas, the version of the GUI and the other packages are the following: gpfs.gui-4.2.3-0.noarch Yes, the collector is running locally on the GUI-Node and it is only one collector configured. 
The oupt of your command: [root at tower-daemon ~]# echo "get metrics cpu_user last 10 bucket_size 60" | /opt/IBM/zimon/zc 127.0.0.1 1: resolve1|CPU|cpu_user 2: resolve2|CPU|cpu_user 3: sbc-162150007|CPU|cpu_user 4: sbc-162150069|CPU|cpu_user 5: sbc-162150071|CPU|cpu_user 6: sbtl-176173009|CPU|cpu_user 7: tower-daemon|CPU|cpu_user Row Timestamp cpu_user cpu_user cpu_user cpu_user cpu_user cpu_user cpu_user 1 2017-11-10 11:06:00 2.525333 0.151667 0.854333 0.826833 0.836333 0.273833 0.800167 2 2017-11-10 11:07:00 3.052000 0.156833 0.964833 0.946833 0.881833 0.308167 0.896667 3 2017-11-10 11:08:00 4.267167 0.150500 1.134833 1.224833 1.063167 0.300333 0.855333 4 2017-11-10 11:09:00 4.505333 0.149833 1.155333 1.127667 1.098167 0.324500 0.822000 5 2017-11-10 11:10:00 4.023167 0.145667 1.136500 1.079500 1.016000 0.269000 0.836667 6 2017-11-10 11:11:00 2.127167 0.150333 0.903167 0.854833 0.798500 0.280833 0.854500 7 2017-11-10 11:12:00 4.210000 0.151167 0.877833 0.847167 0.836000 0.312500 1.110333 8 2017-11-10 11:13:00 14.388333 0.151000 1.009667 0.986167 0.950333 0.277167 0.814333 9 2017-11-10 11:14:00 18.513167 0.153167 1.048000 0.941333 0.949667 0.282833 0.808333 10 2017-11-10 11:15:00 1.613571 0.149063 0.789630 0.650741 0.826296 0.273333 0.676296 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ [root at tower-daemon ~]# psql postgres postgres -c "select os_host_name from fscc.node;" os_host_name ---------------------- tower sbtl-176173009-admin sbc-162150071-admin sbc-162150069-admin sbc-162150007-admin resolve1-admin resolve2-admin (7rows) The output seems to be ok. Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 11:06 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, 1.) Which GUI version are you running? 2.) Is the Collector running locally on the GUI? 3.) Is there more than one collector configured? 4.) Run the following command on the collector node to verify that there's data in the collector: > echo "get metrics cpu_user last 10 bucket_size 60" | /opt/IBM/zimon/zc 127.0.0.1 5.) 
Run the following command on the GUI node to verify which host name the GUI uses to query the performance data: psql postgres postgres -c "select os_host_name from fscc.node;" Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: Matthias.Knigge at rohde-schwarz.com Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 7:23 AM Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=TAzwoRuPR6uYNk_NNemAQPqsxILnSGfc34j4dabTVC0&s=OR8cwq9jfa_GaqXM00kDYFvhoIqPrKR5LT2Anpas3XA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 10:54:17 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 11:54:17 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Some more information: Only the GUI-Node is running on CentOS 7. The Clients are running on CentOS 6.x and RHEL 6.x. Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 11:06 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, 1.) Which GUI version are you running? 2.) Is the Collector running locally on the GUI? 3.) Is there more than one collector configured? 4.) Run the following command on the collector node to verify that there's data in the collector: > echo "get metrics cpu_user last 10 bucket_size 60" | /opt/IBM/zimon/zc 127.0.0.1 5.) 
Run the following command on the GUI node to verify which host name the GUI uses to query the performance data: psql postgres postgres -c "select os_host_name from fscc.node;" Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: Matthias.Knigge at rohde-schwarz.com Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 7:23 AM Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=TAzwoRuPR6uYNk_NNemAQPqsxILnSGfc34j4dabTVC0&s=OR8cwq9jfa_GaqXM00kDYFvhoIqPrKR5LT2Anpas3XA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil.wilson at metoffice.gov.uk Fri Nov 10 11:19:55 2017 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Fri, 10 Nov 2017 11:19:55 +0000 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the "hostname" field (it's blank by default) of the pmsensors cfg file on that node and restarted pmsensors - the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It's probably not the same for you, but might be worth trying out. 
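To make Neil's suggestion concrete, the field in question lives in /opt/IBM/zimon/ZIMonSensors.cfg on the client node. A fragment with placeholder values might look like this (only the hostname entry and collectors block are shown; leave the rest of the file as generated):

# /opt/IBM/zimon/ZIMonSensors.cfg (fragment, placeholder values)
collectors = {
        host = "gui-node.example.com"
        port = "4739"
}
# empty by default; set it to the name the GUI/collector knows this node by,
# e.g. the FQDN of its admin interface
hostname = "client01.example.com"

sensors = {
        name = "CPU"
        period = 1
}

A restart of the sensor (systemctl restart pmsensors) on that node is needed afterwards, as mentioned further down the thread.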
Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From andreas.koeninger at de.ibm.com Fri Nov 10 12:07:26 2017 From: andreas.koeninger at de.ibm.com (Andreas Koeninger) Date: Fri, 10 Nov 2017 12:07:26 +0000 Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: Message-ID: An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 10 12:34:18 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 10 Nov 2017 13:34:18 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Hi Andreas, hi Neil, the GUI-Node returned a hostname with a FQDN. The clients have no FQDN. Thanks for this tip. I will change the hostname in the first step. If this does not help then I will change the configuration files. I will give you feedback in the next week! Thanks, Matthias Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 13:07 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, what's "hostname" returning on your nodes? 1.) If it is not the one that the GUI has in it's database you can force a refresh by executing the below command on the GUI node: /usr/lpp/mmfs/gui/cli/runtask OS_DETECT --debug 2.) 
If it is not the one that's shown in the returned performance data you have to restart the pmsensor service on the nodes: systemctl restart pmsensors Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: "Wilson, Neil" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 12:20 PM Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the ?hostname? field (it?s blank by default) of the pmsensors cfg file on that node and restarted pmsensors ? the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It?s probably not the same for you, but might be worth trying out. Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=r2ldt2133nWuT-SD27LvI8nFqC4Kx7f47sYAeLaZH84&s=yDy0znk3CG9PZuuQ9yi81wOOwc48Aw8WbMvOjzW_uZI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Nov 10 13:25:40 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 10 Nov 2017 13:25:40 +0000 Subject: [gpfsug-discuss] Spectrum Scale with NVMe In-Reply-To: References: <64b6afd8efb34551a319b5d6e311bbfb@CITESHT4.ad.uillinois.edu> Message-ID: Chad, Thankyou for the reply. 
Indded I had that issue - I only noticed because I looked at the utisation of the NSDs and a set of them were not being filled with data... A set which were coincidentally all connected to the same server (me whistles innocently....) -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Chad Kerner Sent: Tuesday, November 07, 2017 2:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Spectrum Scale with NVMe Hey John, Once you get /var/mmfs/etc/nsddevices set up, it is all straight forward. We have seen times on reboot where the devices were not ready before gpfs started and the file system started with those disks in an offline state. But, that was just a timing issue with the startup. Chad -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign On 11/7/17, John Hearns wrote: > I am looking for anyone with experience of using Spectrum Scale with > nvme devices. > > I could use an offline brain dump... > > > The specific issue I have is with the nsd device discovery and the naming. > > Before anyone replies, I am gettign excellent support from IBM and > have been directed to the correct documentation. > > I am just looking for any wrinkles or tips that anyone has. > > > Thanks > > -- The information contained in this communication and any attachments > is confidential and may be privileged, and is for the sole use of the > intended recipient(s). Any unauthorized review, use, disclosure or > distribution is prohibited. Unless explicitly stated otherwise in the > body of this communication or the attachment thereto (if any), the > information is provided on an AS-IS basis without any express or > implied warranties or liabilities. To the extent you are relying on > this information, you are doing so at your own risk. If you are not > the intended recipient, please notify the sender immediately by > replying to this message and destroy all copies of this message and > any attachments. Neither the sender nor the company/group of companies > he or she represents shall be liable for the proper and complete > transmission of the information contained in this communication, or for any delay in its receipt. > -- -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml.com%7Ce3875dc1def842e88ee308d525e01e80%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=pwti5NtVf7c4SClTUc1PWNz5YW4QHWjM5%2F%2BGLdYHoqQ%3D&reserved=0 -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. 
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. From peter.chase at metoffice.gov.uk Fri Nov 10 16:18:36 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Fri, 10 Nov 2017 16:18:36 +0000 Subject: [gpfsug-discuss] Specifying nodes in commands Message-ID: Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk From stockf at us.ibm.com Fri Nov 10 16:41:19 2017 From: stockf at us.ibm.com (Frederick Stock) Date: Fri, 10 Nov 2017 11:41:19 -0500 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: How do you determine if mmapplypolicy is running on a node? Normally mmapplypolicy as a process runs on a single node but its helper processes, policy-help or something similar, run on all the nodes which are referenced by the -N option. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/10/2017 11:18 AM Subject: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? 
United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=spXDnba2A_tVauiszV7sXhSkn6GeEljABN4lUEB4f8s&s=1Hd1SNkXtfLRcirmeRfg1JuAERuhbyiVqsLEdYlhFsM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Fri Nov 10 16:42:28 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 10 Nov 2017 11:42:28 -0500 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: mmapplypolicy ... -N nodeClass ... will use the nodes in nodeClass as helper nodes to get its work done. mmdsh -N nodeClass command ... will run the SAME command on each of the nodes -- probably not what you want to do with mmapplypolicy. To see more about what mmapplypolicy is doing use options -d 1 (debug info) If you are using -N because you have a lot of files to process, you should also use -g /some-gpfs-temp-directory (see doc) If you are running a small test case, it may happen that you don't see the helper nodes doing anything, because there's not enough time and work to get them going... For test purposes you can coax the helper nodes into action with: options -B 1 -m 1 so that each helper node only does one file at a time. From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/10/2017 11:18 AM Subject: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=WjhVVKkS23BlFGP2KHmkndM0AZ4yB2aC81UUHv8iIZs&s=-dPme1SlhBAqo45xVmtvVWNeAjumd7JrtEksW1U8o5w&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.chase at metoffice.gov.uk Fri Nov 10 17:15:55 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Fri, 10 Nov 2017 17:15:55 +0000 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: Hi Frederick, The ILM active policy (set by mmchpolicy) has an external list rule, the command for the external list runs the mmapplypolicy command. 
/gpfs1/s3upload/policies/migration.policy has external pool & a migration rule in it. The handler script for the external pool writes the hostname of the server running it out to a file, so that's how I'm trapping which server is running the policy, and that mmapplypolicy is being run. Hope that explains things, if not let me know and I'll have another try :) Regards, Peter -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of gpfsug-discuss-request at spectrumscale.org Sent: 10 November 2017 16:43 To: gpfsug-discuss at spectrumscale.org Subject: gpfsug-discuss Digest, Vol 70, Issue 32 Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: Spectrum Scale with NVMe (John Hearns) 2. Specifying nodes in commands (Chase, Peter) 3. Re: Specifying nodes in commands (Frederick Stock) 4. Re: Specifying nodes in commands (Marc A Kaplan) ---------------------------------------------------------------------- Message: 1 Date: Fri, 10 Nov 2017 13:25:40 +0000 From: John Hearns To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Spectrum Scale with NVMe Message-ID: Content-Type: text/plain; charset="us-ascii" Chad, Thankyou for the reply. Indded I had that issue - I only noticed because I looked at the utisation of the NSDs and a set of them were not being filled with data... A set which were coincidentally all connected to the same server (me whistles innocently....) -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Chad Kerner Sent: Tuesday, November 07, 2017 2:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Spectrum Scale with NVMe Hey John, Once you get /var/mmfs/etc/nsddevices set up, it is all straight forward. We have seen times on reboot where the devices were not ready before gpfs started and the file system started with those disks in an offline state. But, that was just a timing issue with the startup. Chad -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign On 11/7/17, John Hearns wrote: > I am looking for anyone with experience of using Spectrum Scale with > nvme devices. > > I could use an offline brain dump... > > > The specific issue I have is with the nsd device discovery and the naming. > > Before anyone replies, I am gettign excellent support from IBM and > have been directed to the correct documentation. > > I am just looking for any wrinkles or tips that anyone has. > > > Thanks > > -- The information contained in this communication and any attachments > is confidential and may be privileged, and is for the sole use of the > intended recipient(s). Any unauthorized review, use, disclosure or > distribution is prohibited. 
Unless explicitly stated otherwise in the > body of this communication or the attachment thereto (if any), the > information is provided on an AS-IS basis without any express or > implied warranties or liabilities. To the extent you are relying on > this information, you are doing so at your own risk. If you are not > the intended recipient, please notify the sender immediately by > replying to this message and destroy all copies of this message and > any attachments. Neither the sender nor the company/group of companies > he or she represents shall be liable for the proper and complete > transmission of the information contained in this communication, or for any delay in its receipt. > -- -- Chad Kerner, Senior Storage Engineer Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml.com%7Ce3875dc1def842e88ee308d525e01e80%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=pwti5NtVf7c4SClTUc1PWNz5YW4QHWjM5%2F%2BGLdYHoqQ%3D&reserved=0 -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ------------------------------ Message: 2 Date: Fri, 10 Nov 2017 16:18:36 +0000 From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Subject: [gpfsug-discuss] Specifying nodes in commands Message-ID: Content-Type: text/plain; charset="iso-8859-1" Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? 
United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk ------------------------------ Message: 3 Date: Fri, 10 Nov 2017 11:41:19 -0500 From: "Frederick Stock" To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Specifying nodes in commands Message-ID: Content-Type: text/plain; charset="iso-8859-1" How do you determine if mmapplypolicy is running on a node? Normally mmapplypolicy as a process runs on a single node but its helper processes, policy-help or something similar, run on all the nodes which are referenced by the -N option. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/10/2017 11:18 AM Subject: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, I'm running a script triggered from an ILM external list rule. The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=spXDnba2A_tVauiszV7sXhSkn6GeEljABN4lUEB4f8s&s=1Hd1SNkXtfLRcirmeRfg1JuAERuhbyiVqsLEdYlhFsM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 4 Date: Fri, 10 Nov 2017 11:42:28 -0500 From: "Marc A Kaplan" To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Specifying nodes in commands Message-ID: Content-Type: text/plain; charset="iso-8859-1" mmapplypolicy ... -N nodeClass ... will use the nodes in nodeClass as helper nodes to get its work done. mmdsh -N nodeClass command ... will run the SAME command on each of the nodes -- probably not what you want to do with mmapplypolicy. To see more about what mmapplypolicy is doing use options -d 1 (debug info) If you are using -N because you have a lot of files to process, you should also use -g /some-gpfs-temp-directory (see doc) If you are running a small test case, it may happen that you don't see the helper nodes doing anything, because there's not enough time and work to get them going... For test purposes you can coax the helper nodes into action with: options -B 1 -m 1 so that each helper node only does one file at a time. From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/10/2017 11:18 AM Subject: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, I'm running a script triggered from an ILM external list rule. 
The script has the following command in, and it isn't work as I'd expect: /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P /gpfs1/s3upload/policies/migration.policy --scope fileset I'd expect the mmapplypolicy command to run the policy on all the nodes in the cloudNode class, but it doesn't, it runs on the node that triggered the script. However, the following command does work as I'd expect: /usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset Can any one shed any light on this? Have I just misconstrued how mmapplypolicy works? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=WjhVVKkS23BlFGP2KHmkndM0AZ4yB2aC81UUHv8iIZs&s=-dPme1SlhBAqo45xVmtvVWNeAjumd7JrtEksW1U8o5w&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 70, Issue 32 ********************************************** From peter.chase at metoffice.gov.uk Mon Nov 13 11:14:56 2017 From: peter.chase at metoffice.gov.uk (Chase, Peter) Date: Mon, 13 Nov 2017 11:14:56 +0000 Subject: [gpfsug-discuss] Specifying nodes in commands Message-ID: Hi Marc, Thanks for your response, there's some handy advice in there that I'll look at further. I'm still struggling a bit with mmapplypolicy and it's -N option. I've changed my external list command to point at a script, that script looks for "LIST" as the first argument, and runs "/usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -d 1 -N cloudNode -P /gpfs1/s3upload/policies/migration.policy >>/gpfs1/s3upload/external-list.log 2>&1". If the script is run from the command line on a node that's not in cloudNode class it works without issue and uses nodes in the cloudNode class as helpers, but if the script is called from the active policy, mmapplypolicy runs, but seems to ignore the -N and doesn't use the cloudNode nodes as helpers and instead seems to run locally (from which ever node started the active policy). So now my questions is: why does the -N option appear to be honoured when run from the command line, but not appear to be honoured when triggered by the active policy? Regards, Peter Chase GPCS Team Met Office? FitzRoy Road? Exeter? Devon? EX1 3PB? United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk From makaplan at us.ibm.com Mon Nov 13 17:44:23 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 13 Nov 2017 12:44:23 -0500 Subject: [gpfsug-discuss] Specifying nodes in commands In-Reply-To: References: Message-ID: My guess is you have some expectation of how things "ought to be" that does not match how things actually are. 
If you haven't already done so, put some diagnostics into your script, such as env hostname echo "my args are: $*" And run mmapplypolicy with an explicit node list: mmapplypolicy /some/small-set-of-files -P /mypolicyfile -N node1,node2,node3 -I test -L 1 -d 1 And see how things go Hmmm... reading your post again... It seems perhaps you've got some things out of order or again, incorrect expectations or model of how the this world works... mmapplypolicy reads your policy rules and scans the files and calls the script(s) you've named in the EXEC options of your EXTERNAL rules The scripts are expected to process file lists -- NOT call mmapplypolicy again... Refer to examples in the documentation, and in samples/ilm - and try them! --marc From: "Chase, Peter" To: "'gpfsug-discuss at spectrumscale.org'" Date: 11/13/2017 06:15 AM Subject: Re: [gpfsug-discuss] Specifying nodes in commands Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Marc, Thanks for your response, there's some handy advice in there that I'll look at further. I'm still struggling a bit with mmapplypolicy and it's -N option. I've changed my external list command to point at a script, that script looks for "LIST" as the first argument, and runs "/usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -d 1 -N cloudNode -P /gpfs1/s3upload/policies/migration.policy >>/gpfs1/s3upload/external-list.log 2>&1". If the script is run from the command line on a node that's not in cloudNode class it works without issue and uses nodes in the cloudNode class as helpers, but if the script is called from the active policy, mmapplypolicy runs, but seems to ignore the -N and doesn't use the cloudNode nodes as helpers and instead seems to run locally (from which ever node started the active policy). So now my questions is: why does the -N option appear to be honoured when run from the command line, but not appear to be honoured when triggered by the active policy? Regards, Peter Chase GPCS Team Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 886921 Email: peter.chase at metoffice.gov.uk Website: www.metoffice.gov.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=tNW4WqkmstX3B3t1dvbenDx32bw3S1FQ4BrpLrs1r4o&s=CBzS6KRLe_hQhI4zpeeuvNaYdraGbc7cCV-JTvCgDcM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From damir.krstic at gmail.com Mon Nov 13 20:49:07 2017 From: damir.krstic at gmail.com (Damir Krstic) Date: Mon, 13 Nov 2017 20:49:07 +0000 Subject: [gpfsug-discuss] verbsRdmaSend yes or no Message-ID: I am missing out on SC17 this year because of some instability with our 2 ESS storage arrays. We have just recently upgraded our ESS to 5.2 and we have a question about verbRdmaSend setting. Per IBM and GPFS guidelines for a large cluster, we have this setting off on all compute nodes. We were able to turn it off on ESS 1 (IO1 and IO2). However, IBM was unable to turn it off on ESS 2 (IO3 and IO4). ESS 1 has following filesystem: projects (1PB) ESS 2 has following filesystems: home and hpc All our client nodes have this setting off. So the question is, should we push through and get it disabled on IO3 and IO4 so that we are consistent across the environment? I assume the answer is yes. 
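For reference, checking and aligning that setting looks roughly like the below -- the node names are placeholders, not our real ones, and this is only a sketch of the idea:

/usr/lpp/mmfs/bin/mmlsconfig verbsRdmaSend
# shows the global value plus any per-node overrides

# and, if we do align the remaining ESS servers with the rest of the cluster:
/usr/lpp/mmfs/bin/mmchconfig verbsRdmaSend=no -N io3,io4
# (as far as I know the change only takes effect once mmfsd is restarted on those nodes)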
But I would also like to know what the impact is of leaving it enabled on IO3 and IO4. Thank you. Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 14 10:16:44 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 14 Nov 2017 10:16:44 +0000 Subject: [gpfsug-discuss] Backing up GPFS config Message-ID: All, A few months ago someone posted to the list all the commands they run to back up their GPFS configuration. Including mmlsfileset -L, the output of mmlsconfig etc, so that in the event of a proper "crap your pants" moment you can not only restore your data, but also your whole configuration. I cannot seem to find this post... does the OP remember and could kindly forward it on to me, or the list again? Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Tue Nov 14 13:35:46 2017 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 14 Nov 2017 14:35:46 +0100 Subject: [gpfsug-discuss] Backing up GPFS config In-Reply-To: References: Message-ID: Plese see https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Back%20Up%20GPFS%20Configuration But also check ?mmcesdr primary backup?. I don't rememner if it included all of mmbackupconfig/mmccr, but I think it did, and it also includes CES config. You don't need to be using CES DR to use it. -jf tir. 14. nov. 2017 kl. 03:16 skrev Sobey, Richard A : > All, > > > > A few months ago someone posted to the list all the commands they run to > back up their GPFS configuration. Including mmlsfileset -L, the output of > mmlsconfig etc, so that in the event of a proper ?crap your pants? moment > you can not only restore your data, but also your whole configuration. > > > > I cannot seem to find this post? does the OP remember and could kindly > forward it on to me, or the list again? > > > > Thanks > > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at u.washington.edu Tue Nov 14 14:41:50 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Tue, 14 Nov 2017 14:41:50 +0000 Subject: [gpfsug-discuss] Backing up GPFS config In-Reply-To: References: Message-ID: <20171114144149.7lmc46poy24of4yi@utumno.gs.washington.edu> I can't remember if I replied to that post or a different one, but these are the commands we capture output for before running mmbackup: mmlsconfig mmlsnsd mmlscluster mmlscluster --cnfs mmlscluster --ces mmlsnode mmlsdisk ${FS_NAME} -L mmlspool ${FS_NAME} all -L mmlslicense -L mmlspolicy ${FS_NAME} -L mmbackupconfig ${FS_NAME} All the commands but mmbackupconfig produce human-readable output, while mmbackupconfig produces machine-readable output suitable for recovering the filesystem in a disaster. On Tue, Nov 14, 2017 at 10:16:44AM +0000, Sobey, Richard A wrote: > All, > > A few months ago someone posted to the list all the commands they run to back up their GPFS configuration. Including mmlsfileset -L, the output of mmlsconfig etc, so that in the event of a proper "crap your pants" moment you can not only restore your data, but also your whole configuration. > > I cannot seem to find this post... does the OP remember and could kindly forward it on to me, or the list again? 
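For completeness, a stripped-down sketch of a wrapper around the commands listed above -- the filesystem name and output paths here are placeholders only:

#!/bin/bash
# Capture human-readable cluster state plus the mmbackupconfig dump
# before running mmbackup. Adjust FS_NAME and OUTDIR to your environment.
FS_NAME=gpfs1
OUTDIR=/root/gpfs-config/$(date +%Y%m%d)
PATH=$PATH:/usr/lpp/mmfs/bin
mkdir -p "$OUTDIR"
{
    mmlsconfig
    mmlsnsd
    mmlscluster
    mmlscluster --cnfs
    mmlscluster --ces
    mmlsnode
    mmlsdisk "$FS_NAME" -L
    mmlspool "$FS_NAME" all -L
    mmlslicense -L
    mmlspolicy "$FS_NAME" -L
} > "$OUTDIR/cluster-state.txt" 2>&1
# machine-readable piece, what a disaster-recovery restore would consume
mmbackupconfig "$FS_NAME" -o "$OUTDIR/$FS_NAME.backupconfig"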
> > Thanks > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From Matthias.Knigge at rohde-schwarz.com Tue Nov 14 15:15:58 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Tue, 14 Nov 2017 16:15:58 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Changing the hostname without FQDN does not help. When I change back that the admin-interface is in the same network as the daemon then it works again. Could it be that for the GUI a daemon-interface must set? If yes, where can I set this interface? Thanks, Matthias Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 13:07 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, what's "hostname" returning on your nodes? 1.) If it is not the one that the GUI has in it's database you can force a refresh by executing the below command on the GUI node: /usr/lpp/mmfs/gui/cli/runtask OS_DETECT --debug 2.) If it is not the one that's shown in the returned performance data you have to restart the pmsensor service on the nodes: systemctl restart pmsensors Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: "Wilson, Neil" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 12:20 PM Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the ?hostname? field (it?s blank by default) of the pmsensors cfg file on that node and restarted pmsensors ? the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It?s probably not the same for you, but might be worth trying out. 
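For the archives, the field in question sits in the pmsensors config on each node. The path and exact layout below are from memory and may differ between releases, so treat this as a sketch rather than gospel:

# /opt/IBM/zimon/ZIMonSensors.cfg on the affected node (excerpt)
hostname = "node01.data.example.com"    # daemon-network FQDN, blank by default
collectors = {
        host = "guihost.admin.example.com"
        port = "4739"
}

# then restart the sensor so it picks the change up
systemctl restart pmsensors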
Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=r2ldt2133nWuT-SD27LvI8nFqC4Kx7f47sYAeLaZH84&s=yDy0znk3CG9PZuuQ9yi81wOOwc48Aw8WbMvOjzW_uZI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Tue Nov 14 15:18:23 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Tue, 14 Nov 2017 16:18:23 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME In-Reply-To: References: Message-ID: mmfind or rather the convert-script is great! Thanks, Matthias Von: "Marc A Kaplan" An: gpfsug main discussion list Datum: 01.11.2017 15:43 Betreff: [Newsletter] Re: [gpfsug-discuss] Combine different rules - tip: use mmfind & co; FOR FILESET; FILESET_NAME Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Thanks Jonathan B for your comments and tips on experience using mmapplypolicy and policy rules. Good to see that some of the features we put into the product are actually useful. For those not quite as familiar, and have come somewhat later to the game, like Matthias K - I have a few remarks and tips that may be helpful: You can think of and use mmapplypolicy as a fast, parallelized version of the classic `find ... | xargs ... ` pipeline. In fact we've added some "sample" scripts with options that make this easy: samples/ilm/mmfind : "understands" the classic find search arguments as well as all the mmapplypolicy options and the recent versions also support an -xargs option so you can write the classic pipepline as one command: mmfind ... -xargs ... There are debug/diagnostic options so you can see the underlying GPFS commands and policy rules that are generated, so if mmfind doesn't do exactly what you were hoping, you can capture the commands and rules that it does do and tweak/hack those. Two of the most crucial and tricky parts of mmfind are available as separate scripts that can be used separately: tr_findToPol.pl : convert classic options to policy rules. mmxargs : 100% correctly deal with the problem of whitespace and/or "special" characters in the pathnames output as file lists by mmapplypolicy. This is somewhat tricky. EVEN IF you've already worked out your own policy rules and use policy RULE ... EXTERNAL ... 
EXEC 'myscript' you may want to use mmxargs or "lift" some of the code there-in -- because it is very likely your 'myscript' is not handling the problem of special characters correctly. FILESETs vs POOLs - yes these are "orthogonal" concepts in GPFS (Spectrum Scale!) BUT some customer/admins may choose to direct GPFS to assign to POOL based on FILESET using policy rules clauses like: FOR FILESET('a_fs', 'b_fs') /* handy to restrict a rule to one or a few filesets */ WHERE ... AND (FILESET_NAME LIKE 'xyz_%') AND ... /* restrict to filesets whose name matches a pattern */ -- marc of GPFS_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 14 16:30:18 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 14 Nov 2017 16:30:18 +0000 Subject: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server In-Reply-To: References: <20171016132932.g5j7vep2frxnsvpf@utumno.gs.washington.edu>, <4B32CB5C696F2849BDEF7DF9EACE884B633F4ACF@SDEB-EXC01.meteo.dz> Message-ID: Hi Scott This looks like what I?m after (thank you Skylar and all others who responded too!) For the uninitiated, what exactly is a User Exit in the context of the following line: ?One way to automate this collection of GPFS configuration data is to use a User Exit. ? Or to put it another way, what is calling the script to be run on the basis of running mmchconfig someparam=someval? I?d like to understand it more. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Scott Fadden Sent: 16 October 2017 16:35 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server There are some comments on this in the wiki: Backup Spectrum Scale configuration Let me know if anything is missing. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Skylar Thompson > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server Date: Mon, Oct 16, 2017 6:29 AM I'm not familiar with GSS, but we have a script that executes the following before backing up a GPFS filesystem so that we have human-readable configuration information: mmlsconfig mmlsnsd mmlscluster mmlsnode mmlsdisk ${FS_NAME} -L mmlsfileset ${FS_NAME} -L mmlspool ${FS_NAME} all -L mmlslicense -L mmlspolicy ${FS_NAME} -L And then executes this for the benefit of GPFS: mmbackupconfig Of course there's quite a bit of overlap for clusters that have more than one filesystem, and even more for filesystems that we backup at the fileset level, but disk is cheap and the hope is it'll make a DR scenario a little bit less harrowing. On Sun, Oct 15, 2017 at 12:44:42PM +0000, atmane khiredine wrote: > Dear All, > > Is there a way to save the GPS configuration? 
> > OR how backup all GSS > > no backup of data or metadata only configuration for disaster recovery > > for example: > stanza > vdisk > pdisk > RAID code > recovery group > array > > Thank you > > Atmane Khiredine > HPC System Administrator | Office National de la M??t??orologie > T??l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= Click here to report this email as spam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 14 16:57:52 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 14 Nov 2017 16:57:52 +0000 Subject: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server In-Reply-To: References: <20171016132932.g5j7vep2frxnsvpf@utumno.gs.washington.edu>, <4B32CB5C696F2849BDEF7DF9EACE884B633F4ACF@SDEB-EXC01.meteo.dz> Message-ID: To answer my own question: https://www.ibm.com/support/knowledgecenter/en/SSFKCN_3.5.0/com.ibm.cluster.gpfs.v3r5.gpfs100.doc/bl1adm_uxtsdrb.htm It?s built in. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 14 November 2017 16:30 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server Hi Scott This looks like what I?m after (thank you Skylar and all others who responded too!) For the uninitiated, what exactly is a User Exit in the context of the following line: ?One way to automate this collection of GPFS configuration data is to use a User Exit. ? Or to put it another way, what is calling the script to be run on the basis of running mmchconfig someparam=someval? I?d like to understand it more. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Scott Fadden Sent: 16 October 2017 16:35 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server There are some comments on this in the wiki: Backup Spectrum Scale configuration Let me know if anything is missing. 
Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale ----- Original message ----- From: Skylar Thompson > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: Re: [gpfsug-discuss] Backup All Cluster GSS GPFS Storage Server Date: Mon, Oct 16, 2017 6:29 AM I'm not familiar with GSS, but we have a script that executes the following before backing up a GPFS filesystem so that we have human-readable configuration information: mmlsconfig mmlsnsd mmlscluster mmlsnode mmlsdisk ${FS_NAME} -L mmlsfileset ${FS_NAME} -L mmlspool ${FS_NAME} all -L mmlslicense -L mmlspolicy ${FS_NAME} -L And then executes this for the benefit of GPFS: mmbackupconfig Of course there's quite a bit of overlap for clusters that have more than one filesystem, and even more for filesystems that we backup at the fileset level, but disk is cheap and the hope is it'll make a DR scenario a little bit less harrowing. On Sun, Oct 15, 2017 at 12:44:42PM +0000, atmane khiredine wrote: > Dear All, > > Is there a way to save the GPS configuration? > > OR how backup all GSS > > no backup of data or metadata only configuration for disaster recovery > > for example: > stanza > vdisk > pdisk > RAID code > recovery group > array > > Thank you > > Atmane Khiredine > HPC System Administrator | Office National de la M??t??orologie > T??l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=WDtkF9zLTGGYqFnVnJ3rywZM6KHROA4FpMYi6cUkkKY&m=7Y7vgnMtYTCD5hcc83ShGW1VdOEzZyzil7mhxM0OUbY&s=yhw_G4t4P9iXSTmJvOyfI8EGWxmWKK74spKlLOpAxOA&e= Click here to report this email as spam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Wed Nov 15 08:43:28 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 15 Nov 2017 09:43:28 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Problem with the gpfsgui - separate networks for daemon and admin In-Reply-To: References: Message-ID: Strange... I think it is the order of configuration changes. Now it works with severed networks and FQDN. I configured the admin-interface with another network and back to the daemon-network. Then again to the admin-interface and it works fine. So the FQDN should be not the problem. Sometimes a linux system needs a reboot too. 
;-) Thanks, Matthias Von: "Andreas Koeninger" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 10.11.2017 13:07 Betreff: [Newsletter] Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi Matthias, what's "hostname" returning on your nodes? 1.) If it is not the one that the GUI has in it's database you can force a refresh by executing the below command on the GUI node: /usr/lpp/mmfs/gui/cli/runtask OS_DETECT --debug 2.) If it is not the one that's shown in the returned performance data you have to restart the pmsensor service on the nodes: systemctl restart pmsensors Mit freundlichen Gr??en / Kind regards Andreas Koeninger Scrum Master and Software Developer / Spectrum Scale GUI and REST API IBM Systems &Technology Group, Integrated Systems Development / M069 ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-7034-643-0867 Mobile: +49-7034-643-0867 E-Mail: andreas.koeninger at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH / Vorsitzende des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ----- Original message ----- From: "Wilson, Neil" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Date: Fri, Nov 10, 2017 12:20 PM Hi Matthias, Not sure if this will help but we had a very similar issue with the GUI not showing performance data, like you we have separate networks for the gpfs data traffic and management/admin traffic. For some reason when we put the full FQDN of the node into the ?hostname? field (it?s blank by default) of the pmsensors cfg file on that node and restarted pmsensors ? the gui started showing performance data for that node. We ended up removing the auto config for pmsensors from all of our client nodes, then manually configured pmsensors with a custom cfg file on each node. It?s probably not the same for you, but might be worth trying out. Thanks Neil From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Matthias.Knigge at rohde-schwarz.com Sent: 10 November 2017 06:23 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Problem with the gpfsgui - separate networks for daemon and admin Hi at all, when I install the gui without a separate network for the admin commands the gui works. But when I split the networks the gui tells me in the brower: Performance collector did not return any data. All the services like pmsensors, pmcollector, postgresql are running. The firewall is disabled. Any idea for me or some information more needed? Many thanks in advance! 
Matthias _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=r2ldt2133nWuT-SD27LvI8nFqC4Kx7f47sYAeLaZH84&s=yDy0znk3CG9PZuuQ9yi81wOOwc48Aw8WbMvOjzW_uZI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Wed Nov 15 16:24:52 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Wed, 15 Nov 2017 17:24:52 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size Message-ID: Hello everybody, together with my colleagues we are actually running some tests on a new DSS G220 system and we see some unexpected behaviour. What we actually see is that write performances (we did not test read yet) decreases with the decrease of filesystem size. I will not go into the details of the tests, but here are some numbers: - with a filesystem using the full 1.2 PB space we get 14 GB/s as the sum of the disk activity on the two IO servers; - with a filesystem using half of the space we get 10 GB/s; - with a filesystem using 1/4 of the space we get 5 GB/s. We also saw that performances are not affected by the vdisks layout, ie. taking the full space with one big vdisk or 2 half-size vdisks per RG gives the same performances. To our understanding the IO should be spread evenly across all the pdisks in the declustered array, and looking at iostat all disks seem to be accessed. But so there must be some other element that affects performances. Am I missing something? Is this an expected behaviour and someone has an explanation for this? Thank you, Ivano From kums at us.ibm.com Wed Nov 15 16:56:36 2017 From: kums at us.ibm.com (Kumaran Rajaram) Date: Wed, 15 Nov 2017 11:56:36 -0500 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Hi, >>Am I missing something? Is this an expected behaviour and someone has an explanation for this? Based on your scenario, write degradation as the file-system is populated is possible if you had formatted the file-system with "-j cluster". For consistent file-system performance, we recommend mmcrfs "-j scatter" layoutMap. Also, we need to ensure the mmcrfs "-n" is set properly. [snip from mmcrfs] # mmlsfs | egrep 'Block allocation| Estimated number' -j scatter Block allocation type -n 128 Estimated number of nodes that will mount file system [/snip] [snip from man mmcrfs] layoutMap={scatter | cluster} Specifies the block allocation map type. When allocating blocks for a given file, GPFS first uses a round?robin algorithm to spread the data across all disks in the storage pool. After a disk is selected, the location of the data block on the disk is determined by the block allocation map type. If cluster is specified, GPFS attempts to allocate blocks in clusters. Blocks that belong to a particular file are kept adjacent to each other within each cluster. If scatter is specified, the location of the block is chosen randomly. The cluster allocation method may provide better disk performance for some disk subsystems in relatively small installations. 
The benefits of clustered block allocation diminish when the number of nodes in the cluster or the number of disks in a file system increases, or when the file system?s free space becomes fragmented. The cluster allocation method is the default for GPFS clusters with eight or fewer nodes and for file systems with eight or fewer disks. The scatter allocation method provides more consistent file system performance by averaging out performance variations due to block location (for many disk subsystems, the location of the data relative to the disk edge has a substantial effect on performance). This allocation method is appropriate in most cases and is the default for GPFS clusters with more than eight nodes or file systems with more than eight disks. The block allocation map type cannot be changed after the storage pool has been created. -n NumNodes The estimated number of nodes that will mount the file system in the local cluster and all remote clusters. This is used as a best guess for the initial size of some file system data structures. The default is 32. This value can be changed after the file system has been created but it does not change the existing data structures. Only the newly created data structure is affected by the new value. For example, new storage pool. When you create a GPFS file system, you might want to overestimate the number of nodes that will mount the file system. GPFS uses this information for creating data structures that are essential for achieving maximum parallelism in file system operations (For more information, see GPFS architecture in IBM Spectrum Scale: Concepts, Planning, and Installation Guide ). If you are sure there will never be more than 64 nodes, allow the default value to be applied. If you are planning to add nodes to your system, you should specify a number larger than the default. [/snip from man mmcrfs] Regards, -Kums From: Ivano Talamo To: Date: 11/15/2017 11:25 AM Subject: [gpfsug-discuss] Write performances and filesystem size Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello everybody, together with my colleagues we are actually running some tests on a new DSS G220 system and we see some unexpected behaviour. What we actually see is that write performances (we did not test read yet) decreases with the decrease of filesystem size. I will not go into the details of the tests, but here are some numbers: - with a filesystem using the full 1.2 PB space we get 14 GB/s as the sum of the disk activity on the two IO servers; - with a filesystem using half of the space we get 10 GB/s; - with a filesystem using 1/4 of the space we get 5 GB/s. We also saw that performances are not affected by the vdisks layout, ie. taking the full space with one big vdisk or 2 half-size vdisks per RG gives the same performances. To our understanding the IO should be spread evenly across all the pdisks in the declustered array, and looking at iostat all disks seem to be accessed. But so there must be some other element that affects performances. Am I missing something? Is this an expected behaviour and someone has an explanation for this? 
Thank you, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Nov 15 18:25:59 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 15 Nov 2017 13:25:59 -0500 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Wed Nov 15 23:48:18 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Wed, 15 Nov 2017 23:48:18 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: Message-ID: My 2c ... Be careful here about mixing up three different possible effects seen in filesystems 1. Performance degradation as the filesystem approaches 100% full, often due to the difficulty of finding the remaining unallocated blocks. GPFS doesn?t noticeably suffer from this effect compared to its competitors. 2. Performance degradation over time as files get fragmented and so cause extra movement of the actuator arm of a HDD. (hence defrag on Windows and the idea of short stroking drives). 3. Performance degradation as blocks are written further from the fastest part of a hard disk drive. SSDs do not show this effect. Benchmarks on newly formatted empty filesystems are often artificially high compared to performance after say 12 months whether or not the filesystem is near 90%+ capacity utilisation. The -j scatter option allows for more realistic performance measurement when designing for the long term usage of the filesystem. But this is due to the distributed location of the blocks not how full the filesystem is. Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales + 44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 15 Nov 2017, at 11:26, Olaf Weiser wrote: > > to add a comment ... .. very simply... depending on how you allocate the physical block storage .... if you - simply - using less physical resources when reducing the capacity (in the same ratio) .. you get , what you see.... > > so you need to tell us, how you allocate your block-storage .. (Do you using RAID controllers , where are your LUNs coming from, are then less RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the hardware can deliver.. if you reduce resource.. ... you'll get less , if you enhance your hardware .. you get more... almost regardless of the total capacity in #blocks .. > > > > > > > From: "Kumaran Rajaram" > To: gpfsug main discussion list > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Hi, > > >>Am I missing something? Is this an expected behaviour and someone has an explanation for this? > > Based on your scenario, write degradation as the file-system is populated is possible if you had formatted the file-system with "-j cluster". > > For consistent file-system performance, we recommend mmcrfs "-j scatter" layoutMap. Also, we need to ensure the mmcrfs "-n" is set properly. 
> > [snip from mmcrfs] > # mmlsfs | egrep 'Block allocation| Estimated number' > -j scatter Block allocation type > -n 128 Estimated number of nodes that will mount file system > [/snip] > > > [snip from man mmcrfs] > layoutMap={scatter| cluster} > Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first > uses a round?robin algorithm to spread the data > across all disks in the storage pool. After a > disk is selected, the location of the data > block on the disk is determined by the block > allocation map type. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular > file are kept adjacent to each other within > each cluster. If scatter is specified, > the location of the block is chosen randomly. > > The cluster allocation method may provide > better disk performance for some disk > subsystems in relatively small installations. > The benefits of clustered block allocation > diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. The cluster > allocation method is the default for GPFS > clusters with eight or fewer nodes and for file > systems with eight or fewer disks. > > The scatter allocation method provides > more consistent file system performance by > averaging out performance variations due to > block location (for many disk subsystems, the > location of the data relative to the disk edge > has a substantial effect on performance).This > allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than > eight disks. > > The block allocation map type cannot be changed > after the storage pool has been created. > > > -n NumNodes > The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. > This is used as a best guess for the initial size of > some file system data structures. The default is 32. > This value can be changed after the file system has been > created but it does not change the existing data > structures. Only the newly created data structure is > affected by the new value. For example, new storage > pool. > > When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the > file system. GPFS uses this information for creating > data structures that are essential for achieving maximum > parallelism in file system operations (For more > information, see GPFS architecture in IBM Spectrum > Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, > allow the default value to be applied. If you are > planning to add nodes to your system, you should specify > a number larger than the default. > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > To: > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. 
> > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone has an > explanation for this? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e= > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=Yu5Gt0RPmbb6KaS_emGivhq5C2A33w5DeecdU2aLViQ&s=K0Mz-y4oBH66YUf1syIXaQ3hxck6WjeEMsM-HNHhqAU&e= > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Thu Nov 16 02:34:57 2017 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 16 Nov 2017 02:34:57 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same number of spindles for any size filesystem, so I would also expect them to perform the same. -jf ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser : > to add a comment ... .. very simply... depending on how you allocate the > physical block storage .... if you - simply - using less physical resources > when reducing the capacity (in the same ratio) .. you get , what you > see.... > > so you need to tell us, how you allocate your block-storage .. (Do you > using RAID controllers , where are your LUNs coming from, are then less > RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the hardware can > deliver.. if you reduce resource.. ... you'll get less , if you enhance > your hardware .. you get more... almost regardless of the total capacity in > #blocks .. > > > > > > > From: "Kumaran Rajaram" > To: gpfsug main discussion list > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and filesystem > size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi, > > >>Am I missing something? 
Is this an expected behaviour and someone has an > explanation for this? > > Based on your scenario, write degradation as the file-system is populated > is possible if you had formatted the file-system with "-j cluster". > > For consistent file-system performance, we recommend *mmcrfs "-j scatter" > layoutMap.* Also, we need to ensure the mmcrfs "-n" is set properly. > > [snip from mmcrfs] > > > *# mmlsfs | egrep 'Block allocation| Estimated number' -j > scatter Block allocation type -n 128 > Estimated number of nodes that will mount file system* > [/snip] > > > [snip from man mmcrfs] > * layoutMap={scatter|** cluster}* > > > > > > > > > > > > * Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first uses > a round?robin algorithm to spread the data across all > disks in the storage pool. After a disk is selected, the > location of the data block on the disk is determined by > the block allocation map type. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular file are > kept adjacent to each other within each cluster. If > scatter is specified, the location of the block is chosen > randomly.* > > > > > > > > > * The cluster allocation method may provide > better disk performance for some disk subsystems in > relatively small installations. The benefits of clustered > block allocation diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. **The cluster* > > > * allocation method is the default for GPFS > clusters with eight or fewer nodes and for file systems > with eight or fewer disks.* > > > > > > > * The scatter allocation method provides > more consistent file system performance by averaging out > performance variations due to block location (for many > disk subsystems, the location of the data relative to the > disk edge has a substantial effect on performance).* > > > > *This allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than eight > disks.* > > > * The block allocation map type cannot be changed > after the storage pool has been created.* > > > *-n** NumNodes* > > > > > > > > > * The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. This is used > as a best guess for the initial size of some file system data > structures. The default is 32. This value can be changed after the > file system has been created but it does not change the existing > data structures. Only the newly created data structure is > affected by the new value. For example, new storage pool.* > > > > > > > > > > > > * When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the file system. > GPFS uses this information for creating data structures that are > essential for achieving maximum parallelism in file system > operations (For more information, see GPFS architecture in IBM > Spectrum Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, allow > the default value to be applied. 
If you are planning to add nodes > to your system, you should specify a number larger than the > default.* > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > To: > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. > > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone has an > explanation for this? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=* > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Thu Nov 16 03:42:05 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 03:42:05 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: Message-ID: Sure... as long we assume that really all physical disk are used .. the fact that was told 1/2 or 1/4 might turn out that one / two complet enclosures 're eliminated ... ? ..that s why I was asking for more details .. I dont see this degration in my environments. . as long the vdisks are big enough to span over all pdisks ( which should be the case for capacity in a range of TB ) ... the performance stays the same Gesendet von IBM Verse Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and filesystem size --- Von:"Jan-Frode Myklebust" An:"gpfsug main discussion list" Datum:Mi. 15.11.2017 21:35Betreff:Re: [gpfsug-discuss] Write performances and filesystem size Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same number of spindles for any size filesystem, so I would also expect them to perform the same. -jf ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser : to add a comment ... .. very simply... 
depending on how you allocate the physical block storage .... if you - simply - using less physical resources when reducing the capacity (in the same ratio) .. you get , what you see.... so you need to tell us, how you allocate your block-storage .. (Do you using RAID controllers , where are your LUNs coming from, are then less RAID groups involved, when reducing the capacity ?...) GPFS can be configured to give you pretty as much as what the hardware can deliver.. if you reduce resource.. ... you'll get less , if you enhance your hardware .. you get more... almost regardless of the total capacity in #blocks .. From: "Kumaran Rajaram" To: gpfsug main discussion list Date: 11/15/2017 11:56 AM Subject: Re: [gpfsug-discuss] Write performances and filesystem size Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, >>Am I missing something? Is this an expected behaviour and someone has an explanation for this? Based on your scenario, write degradation as the file-system is populated is possible if you had formatted the file-system with "-j cluster". For consistent file-system performance, we recommend mmcrfs "-j scatter" layoutMap. Also, we need to ensure the mmcrfs "-n" is set properly. [snip from mmcrfs] # mmlsfs | egrep 'Block allocation| Estimated number' -j scatter Block allocation type -n 128 Estimated number of nodes that will mount file system [/snip] [snip from man mmcrfs] layoutMap={scatter| cluster} Specifies the block allocation map type. When allocating blocks for a given file, GPFS first uses a round?robin algorithm to spread the data across all disks in the storage pool. After a disk is selected, the location of the data block on the disk is determined by the block allocation map type. If cluster is specified, GPFS attempts to allocate blocks in clusters. Blocks that belong to a particular file are kept adjacent to each other within each cluster. If scatter is specified, the location of the block is chosen randomly. The cluster allocation method may provide better disk performance for some disk subsystems in relatively small installations. The benefits of clustered block allocation diminish when the number of nodes in the cluster or the number of disks in a file system increases, or when the file system?s free space becomes fragmented. The cluster allocation method is the default for GPFS clusters with eight or fewer nodes and for file systems with eight or fewer disks. The scatter allocation method provides more consistent file system performance by averaging out performance variations due to block location (for many disk subsystems, the location of the data relative to the disk edge has a substantial effect on performance).This allocation method is appropriate in most cases and is the default for GPFS clusters with more than eight nodes or file systems with more than eight disks. The block allocation map type cannot be changed after the storage pool has been created. -n NumNodes The estimated number of nodes that will mount the file system in the local cluster and all remote clusters. This is used as a best guess for the initial size of some file system data structures. The default is 32. This value can be changed after the file system has been created but it does not change the existing data structures. Only the newly created data structure is affected by the new value. For example, new storage pool. When you create a GPFS file system, you might want to overestimate the number of nodes that will mount the file system. 
GPFS uses this information for creating data structures that are essential for achieving maximum parallelism in file system operations (For more information, see GPFS architecture in IBM Spectrum Scale: Concepts, Planning, and Installation Guide ). If you are sure there will never be more than 64 nodes, allow the default value to be applied. If you are planning to add nodes to your system, you should specify a number larger than the default. [/snip from man mmcrfs] Regards, -Kums From: Ivano Talamo To: Date: 11/15/2017 11:25 AM Subject: [gpfsug-discuss] Write performances and filesystem size Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello everybody, together with my colleagues we are actually running some tests on a new DSS G220 system and we see some unexpected behaviour. What we actually see is that write performances (we did not test read yet) decreases with the decrease of filesystem size. I will not go into the details of the tests, but here are some numbers: - with a filesystem using the full 1.2 PB space we get 14 GB/s as the sum of the disk activity on the two IO servers; - with a filesystem using half of the space we get 10 GB/s; - with a filesystem using 1/4 of the space we get 5 GB/s. We also saw that performances are not affected by the vdisks layout, ie. taking the full space with one big vdisk or 2 half-size vdisks per RG gives the same performances. To our understanding the IO should be spread evenly across all the pdisks in the declustered array, and looking at iostat all disks seem to be accessed. But so there must be some other element that affects performances. Am I missing something? Is this an expected behaviour and someone has an explanation for this? Thank you, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Thu Nov 16 08:44:06 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Thu, 16 Nov 2017 09:44:06 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: <658ae385-ef78-2303-2eef-1b5ac8824c42@psi.ch> Hello Olaf, yes, I confirm that is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total. Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size. Regarding the layout allocation we used scatter. The tests were done on the just created filesystem, so no close-to-full effect. And we run gpfsperf write seq. Thanks, Ivano Il 16/11/17 04:42, Olaf Weiser ha scritto: > Sure... as long we assume that really all physical disk are used .. the > fact that was told 1/2 or 1/4 might turn out that one / two complet > enclosures 're eliminated ... ? ..that s why I was asking for more > details .. > > I dont see this degration in my environments. 
. as long the vdisks are > big enough to span over all pdisks ( which should be the case for > capacity in a range of TB ) ... the performance stays the same > > Gesendet von IBM Verse > > Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and > filesystem size --- > > Von: "Jan-Frode Myklebust" > An: "gpfsug main discussion list" > Datum: Mi. 15.11.2017 21:35 > Betreff: Re: [gpfsug-discuss] Write performances and filesystem size > > ------------------------------------------------------------------------ > > Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same > number of spindles for any size filesystem, so I would also expect them > to perform the same. > > > > -jf > > > ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >: > > to add a comment ... .. very simply... depending on how you > allocate the physical block storage .... if you - simply - using > less physical resources when reducing the capacity (in the same > ratio) .. you get , what you see.... > > so you need to tell us, how you allocate your block-storage .. (Do > you using RAID controllers , where are your LUNs coming from, are > then less RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the > hardware can deliver.. if you reduce resource.. ... you'll get less > , if you enhance your hardware .. you get more... almost regardless > of the total capacity in #blocks .. > > > > > > > From: "Kumaran Rajaram" > > To: gpfsug main discussion list > > > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and > filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hi, > > >>Am I missing something? Is this an expected behaviour and someone > has an explanation for this? > > Based on your scenario, write degradation as the file-system is > populated is possible if you had formatted the file-system with "-j > cluster". > > For consistent file-system performance, we recommend *mmcrfs "-j > scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is > set properly. > > [snip from mmcrfs]/ > # mmlsfs | egrep 'Block allocation| Estimated number' > -j scatter Block allocation type > -n 128 Estimated number of > nodes that will mount file system/ > [/snip] > > > [snip from man mmcrfs]/ > *layoutMap={scatter|*//*cluster}*// > Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first > uses a round?robin algorithm to spread the data > across all disks in the storage pool. After a > disk is selected, the location of the data > block on the disk is determined by the block > allocation map type*. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular > file are kept adjacent to each other within > each cluster. If scatter is specified, > the location of the block is chosen randomly.*/ > / > * The cluster allocation method may provide > better disk performance for some disk > subsystems in relatively small installations. > The benefits of clustered block allocation > diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. 
*//The *cluster*// > allocation method is the default for GPFS > clusters with eight or fewer nodes and for file > systems with eight or fewer disks./ > / > *The scatter allocation method provides > more consistent file system performance by > averaging out performance variations due to > block location (for many disk subsystems, the > location of the data relative to the disk edge > has a substantial effect on performance).*//This > allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than > eight disks./ > / > The block allocation map type cannot be changed > after the storage pool has been created./ > > */ > -n/*/*NumNodes*// > The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. > This is used as a best guess for the initial size of > some file system data structures. The default is 32. > This value can be changed after the file system has been > created but it does not change the existing data > structures. Only the newly created data structure is > affected by the new value. For example, new storage > pool./ > / > When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the > file system. GPFS uses this information for creating > data structures that are essential for achieving maximum > parallelism in file system operations (For more > information, see GPFS architecture in IBM Spectrum > Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, > allow the default value to be applied. If you are > planning to add nodes to your system, you should specify > a number larger than the default./ > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > > To: > > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. > > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, > ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks > seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone > has an > explanation for this? 
> > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org _ > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From olaf.weiser at de.ibm.com Thu Nov 16 12:03:16 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 12:03:16 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: Message-ID: Rjx, that makes it a bit clearer.. as your vdisk is big enough to span over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... should bring the same performance. .. You mean something about vdisk Layout. .. So in your test, for the full capacity test, you use just one vdisk per RG - so 2 in total for 'data' - right? What about Md .. did you create separate vdisk for MD / what size then ? Gesendet von IBM Verse Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem size --- Von:"Ivano Talamo" An:"gpfsug main discussion list" Datum:Do. 16.11.2017 03:49Betreff:Re: [gpfsug-discuss] Write performances and filesystem size Hello Olaf,yes, I confirm that is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total.Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size.Regarding the layout allocation we used scatter.The tests were done on the just created filesystem, so no close-to-full effect. And we run gpfsperf write seq.Thanks,IvanoIl 16/11/17 04:42, Olaf Weiser ha scritto:> Sure... as long we assume that really all physical disk are used .. the> fact that was told 1/2 or 1/4 might turn out that one / two complet> enclosures 're eliminated ... ? ..that s why I was asking for more> details ..>> I dont see this degration in my environments. . as long the vdisks are> big enough to span over all pdisks ( which should be the case for> capacity in a range of TB ) ... the performance stays the same>> Gesendet von IBM Verse>> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and> filesystem size --->> Von: "Jan-Frode Myklebust" > An: "gpfsug main discussion list" > Datum: Mi. 15.11.2017 21:35> Betreff: Re: [gpfsug-discuss] Write performances and filesystem size>> ------------------------------------------------------------------------>> Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same> number of spindles for any size filesystem, so I would also expect them> to perform the same.>>>> -jf>>> ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >:>> to add a comment ... .. very simply... depending on how you> allocate the physical block storage .... if you - simply - using> less physical resources when reducing the capacity (in the same> ratio) .. 
you get , what you see....>> so you need to tell us, how you allocate your block-storage .. (Do> you using RAID controllers , where are your LUNs coming from, are> then less RAID groups involved, when reducing the capacity ?...)>> GPFS can be configured to give you pretty as much as what the> hardware can deliver.. if you reduce resource.. ... you'll get less> , if you enhance your hardware .. you get more... almost regardless> of the total capacity in #blocks ..>>>>>>> From: "Kumaran Rajaram" >> To: gpfsug main discussion list> >> Date: 11/15/2017 11:56 AM> Subject: Re: [gpfsug-discuss] Write performances and> filesystem size> Sent by: gpfsug-discuss-bounces at spectrumscale.org> > ------------------------------------------------------------------------>>>> Hi,>> >>Am I missing something? Is this an expected behaviour and someone> has an explanation for this?>> Based on your scenario, write degradation as the file-system is> populated is possible if you had formatted the file-system with "-j> cluster".>> For consistent file-system performance, we recommend *mmcrfs "-j> scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is> set properly.>> [snip from mmcrfs]/> # mmlsfs | egrep 'Block allocation| Estimated number'> -j scatter Block allocation type> -n 128 Estimated number of> nodes that will mount file system/> [/snip]>>> [snip from man mmcrfs]/> *layoutMap={scatter|*//*cluster}*//> Specifies the block allocation map type. When> allocating blocks for a given file, GPFS first> uses a round?robin algorithm to spread the data> across all disks in the storage pool. After a> disk is selected, the location of the data> block on the disk is determined by the block> allocation map type*. If cluster is> specified, GPFS attempts to allocate blocks in> clusters. Blocks that belong to a particular> file are kept adjacent to each other within> each cluster. If scatter is specified,> the location of the block is chosen randomly.*/> /> * The cluster allocation method may provide> better disk performance for some disk> subsystems in relatively small installations.> The benefits of clustered block allocation> diminish when the number of nodes in the> cluster or the number of disks in a file system> increases, or when the file system?s free space> becomes fragmented. *//The *cluster*//> allocation method is the default for GPFS> clusters with eight or fewer nodes and for file> systems with eight or fewer disks./> /> *The scatter allocation method provides> more consistent file system performance by> averaging out performance variations due to> block location (for many disk subsystems, the> location of the data relative to the disk edge> has a substantial effect on performance).*//This> allocation method is appropriate in most cases> and is the default for GPFS clusters with more> than eight nodes or file systems with more than> eight disks./> /> The block allocation map type cannot be changed> after the storage pool has been created./>> */> -n/*/*NumNodes*//> The estimated number of nodes that will mount the file> system in the local cluster and all remote clusters.> This is used as a best guess for the initial size of> some file system data structures. The default is 32.> This value can be changed after the file system has been> created but it does not change the existing data> structures. Only the newly created data structure is> affected by the new value. 
For example, new storage> pool./> /> When you create a GPFS file system, you might want to> overestimate the number of nodes that will mount the> file system. GPFS uses this information for creating> data structures that are essential for achieving maximum> parallelism in file system operations (For more> information, see GPFS architecture in IBM Spectrum> Scale: Concepts, Planning, and Installation Guide ). If> you are sure there will never be more than 64 nodes,> allow the default value to be applied. If you are> planning to add nodes to your system, you should specify> a number larger than the default./>> [/snip from man mmcrfs]>> Regards,> -Kums>>>>>> From: Ivano Talamo >> To: >> Date: 11/15/2017 11:25 AM> Subject: [gpfsug-discuss] Write performances and filesystem size> Sent by: gpfsug-discuss-bounces at spectrumscale.org> > ------------------------------------------------------------------------>>>> Hello everybody,>> together with my colleagues we are actually running some tests on a new> DSS G220 system and we see some unexpected behaviour.>> What we actually see is that write performances (we did not test read> yet) decreases with the decrease of filesystem size.>> I will not go into the details of the tests, but here are some numbers:>> - with a filesystem using the full 1.2 PB space we get 14 GB/s as the> sum of the disk activity on the two IO servers;> - with a filesystem using half of the space we get 10 GB/s;> - with a filesystem using 1/4 of the space we get 5 GB/s.>> We also saw that performances are not affected by the vdisks layout,> ie.> taking the full space with one big vdisk or 2 half-size vdisks per RG> gives the same performances.>> To our understanding the IO should be spread evenly across all the> pdisks in the declustered array, and looking at iostat all disks> seem to> be accessed. But so there must be some other element that affects> performances.>> Am I missing something? Is this an expected behaviour and someone> has an> explanation for this?>> Thank you,> Ivano> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org _> __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_>>> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss>>>> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss>>>>> _______________________________________________> gpfsug-discuss mailing list> gpfsug-discuss at spectrumscale.org> http://gpfsug.org/mailman/listinfo/gpfsug-discuss>_______________________________________________gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alvise.dorigo at psi.ch Thu Nov 16 12:37:41 2017 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Thu, 16 Nov 2017 12:37:41 +0000 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: , Message-ID: <83A6EEB0EC738F459A39439733AE80451BB738BC@MBX214.d.ethz.ch> Hi Olaf, yes we have separate vdisks for MD: 2 vdisks, each is 100GBytes large, 1MBytes blocksize, 3WayReplication. A ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Olaf Weiser [olaf.weiser at de.ibm.com] Sent: Thursday, November 16, 2017 1:03 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Write performances and filesystem size Rjx, that makes it a bit clearer.. as your vdisk is big enough to span over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... should bring the same performance. .. You mean something about vdisk Layout. .. So in your test, for the full capacity test, you use just one vdisk per RG - so 2 in total for 'data' - right? What about Md .. did you create separate vdisk for MD / what size then ? Gesendet von IBM Verse Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem size --- Von: "Ivano Talamo" An: "gpfsug main discussion list" Datum: Do. 16.11.2017 03:49 Betreff: Re: [gpfsug-discuss] Write performances and filesystem size ________________________________ Hello Olaf, yes, I confirm that is the Lenovo version of the ESS GL2, so 2 enclosures/4 drawers/166 disks in total. Each recovery group has one declustered array with all disks inside, so vdisks use all the physical ones, even in the case of a vdisk that is 1/4 of the total size. Regarding the layout allocation we used scatter. The tests were done on the just created filesystem, so no close-to-full effect. And we run gpfsperf write seq. Thanks, Ivano Il 16/11/17 04:42, Olaf Weiser ha scritto: > Sure... as long we assume that really all physical disk are used .. the > fact that was told 1/2 or 1/4 might turn out that one / two complet > enclosures 're eliminated ... ? ..that s why I was asking for more > details .. > > I dont see this degration in my environments. . as long the vdisks are > big enough to span over all pdisks ( which should be the case for > capacity in a range of TB ) ... the performance stays the same > > Gesendet von IBM Verse > > Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and > filesystem size --- > > Von: "Jan-Frode Myklebust" > An: "gpfsug main discussion list" > Datum: Mi. 15.11.2017 21:35 > Betreff: Re: [gpfsug-discuss] Write performances and filesystem size > > ------------------------------------------------------------------------ > > Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same > number of spindles for any size filesystem, so I would also expect them > to perform the same. > > > > -jf > > > ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >: > > to add a comment ... .. very simply... depending on how you > allocate the physical block storage .... if you - simply - using > less physical resources when reducing the capacity (in the same > ratio) .. you get , what you see.... > > so you need to tell us, how you allocate your block-storage .. (Do > you using RAID controllers , where are your LUNs coming from, are > then less RAID groups involved, when reducing the capacity ?...) > > GPFS can be configured to give you pretty as much as what the > hardware can deliver.. if you reduce resource.. ... 
you'll get less > , if you enhance your hardware .. you get more... almost regardless > of the total capacity in #blocks .. > > > > > > > From: "Kumaran Rajaram" > > To: gpfsug main discussion list > > > Date: 11/15/2017 11:56 AM > Subject: Re: [gpfsug-discuss] Write performances and > filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hi, > > >>Am I missing something? Is this an expected behaviour and someone > has an explanation for this? > > Based on your scenario, write degradation as the file-system is > populated is possible if you had formatted the file-system with "-j > cluster". > > For consistent file-system performance, we recommend *mmcrfs "-j > scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is > set properly. > > [snip from mmcrfs]/ > # mmlsfs | egrep 'Block allocation| Estimated number' > -j scatter Block allocation type > -n 128 Estimated number of > nodes that will mount file system/ > [/snip] > > > [snip from man mmcrfs]/ > *layoutMap={scatter|*//*cluster}*// > Specifies the block allocation map type. When > allocating blocks for a given file, GPFS first > uses a round?robin algorithm to spread the data > across all disks in the storage pool. After a > disk is selected, the location of the data > block on the disk is determined by the block > allocation map type*. If cluster is > specified, GPFS attempts to allocate blocks in > clusters. Blocks that belong to a particular > file are kept adjacent to each other within > each cluster. If scatter is specified, > the location of the block is chosen randomly.*/ > / > * The cluster allocation method may provide > better disk performance for some disk > subsystems in relatively small installations. > The benefits of clustered block allocation > diminish when the number of nodes in the > cluster or the number of disks in a file system > increases, or when the file system?s free space > becomes fragmented. *//The *cluster*// > allocation method is the default for GPFS > clusters with eight or fewer nodes and for file > systems with eight or fewer disks./ > / > *The scatter allocation method provides > more consistent file system performance by > averaging out performance variations due to > block location (for many disk subsystems, the > location of the data relative to the disk edge > has a substantial effect on performance).*//This > allocation method is appropriate in most cases > and is the default for GPFS clusters with more > than eight nodes or file systems with more than > eight disks./ > / > The block allocation map type cannot be changed > after the storage pool has been created./ > > */ > -n/*/*NumNodes*// > The estimated number of nodes that will mount the file > system in the local cluster and all remote clusters. > This is used as a best guess for the initial size of > some file system data structures. The default is 32. > This value can be changed after the file system has been > created but it does not change the existing data > structures. Only the newly created data structure is > affected by the new value. For example, new storage > pool./ > / > When you create a GPFS file system, you might want to > overestimate the number of nodes that will mount the > file system. 
GPFS uses this information for creating > data structures that are essential for achieving maximum > parallelism in file system operations (For more > information, see GPFS architecture in IBM Spectrum > Scale: Concepts, Planning, and Installation Guide ). If > you are sure there will never be more than 64 nodes, > allow the default value to be applied. If you are > planning to add nodes to your system, you should specify > a number larger than the default./ > > [/snip from man mmcrfs] > > Regards, > -Kums > > > > > > From: Ivano Talamo > > To: > > Date: 11/15/2017 11:25 AM > Subject: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hello everybody, > > together with my colleagues we are actually running some tests on a new > DSS G220 system and we see some unexpected behaviour. > > What we actually see is that write performances (we did not test read > yet) decreases with the decrease of filesystem size. > > I will not go into the details of the tests, but here are some numbers: > > - with a filesystem using the full 1.2 PB space we get 14 GB/s as the > sum of the disk activity on the two IO servers; > - with a filesystem using half of the space we get 10 GB/s; > - with a filesystem using 1/4 of the space we get 5 GB/s. > > We also saw that performances are not affected by the vdisks layout, > ie. > taking the full space with one big vdisk or 2 half-size vdisks per RG > gives the same performances. > > To our understanding the IO should be spread evenly across all the > pdisks in the declustered array, and looking at iostat all disks > seem to > be accessed. But so there must be some other element that affects > performances. > > Am I missing something? Is this an expected behaviour and someone > has an > explanation for this? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org _ > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Thu Nov 16 13:51:51 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Thu, 16 Nov 2017 14:51:51 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Hi, as additional information I past the recovery group information in the full and half size cases. In both cases: - data is on sf_g_01_vdisk01 - metadata on sf_g_01_vdisk02 - sf_g_01_vdisk07 is not used in the filesystem. 
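(For reference, listings of this kind can be pulled from the GNR command line; roughly:

# per-recovery-group view of declustered arrays, vdisks and disk group fault tolerance
mmlsrecoverygroup sf-g-01 -L

# summary of all vdisks with RAID code, declustered array and size
mmlsvdisk

where sf-g-01 is the recovery group shown below - the exact columns vary a little between releases.)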
This is with the full-space filesystem:

                     declustered                      current         allowable
 recovery group       arrays      vdisks  pdisks   format version  format version
 -----------------  -----------  ------  ------   --------------  --------------
 sf-g-01              3            6       86         4.2.2.0         4.2.2.0

 declustered   needs                            replace                           scrub     background activity
    array     service  vdisks  pdisks  spares  threshold  free space  duration    task    progress  priority
 -----------  -------  ------  ------  ------  ---------  ----------  --------  -------------------------
 NVR          no        1       2      0,0     1           3632 MiB   14 days   scrub       95%      low
 DA1          no        4      83      2,44    1             57 TiB   14 days   scrub        0%      low
 SSD          no        1       1      0,0     1            372 GiB   14 days   scrub       79%      low

                                           declustered                          checksum
 vdisk                 RAID code            array     vdisk size  block size  granularity  state  remarks
 ------------------   ------------------  -----------  ----------  ----------  -----------  -----  -------
 sf_g_01_logTip        2WayReplication     NVR             48 MiB       2 MiB         4096  ok     logTip
 sf_g_01_logTipBackup  Unreplicated        SSD             48 MiB       2 MiB         4096  ok     logTipBackup
 sf_g_01_logHome       4WayReplication     DA1            144 GiB       2 MiB         4096  ok     log
 sf_g_01_vdisk02       3WayReplication     DA1            103 GiB       1 MiB       32 KiB  ok
 sf_g_01_vdisk07       3WayReplication     DA1            103 GiB       1 MiB       32 KiB  ok
 sf_g_01_vdisk01       8+2p                DA1            540 TiB      16 MiB       32 KiB  ok

 config data         declustered array   spare space    remarks
 ------------------  ------------------  -------------  -------
 rebuild space       DA1                 53 pdisk       increasing VCD spares is suggested

 config data         disk group fault tolerance          remarks
 ------------------  ---------------------------------   -------
 rg descriptor       1 enclosure + 1 drawer + 2 pdisk    limited by rebuild space
 system index        1 enclosure + 1 drawer + 2 pdisk    limited by rebuild space

 vdisk                 disk group fault tolerance          remarks
 ------------------    ---------------------------------   -------
 sf_g_01_logTip        1 pdisk
 sf_g_01_logTipBackup  0 pdisk
 sf_g_01_logHome       1 enclosure + 1 drawer + 1 pdisk    limited by rebuild space
 sf_g_01_vdisk02       1 enclosure + 1 drawer              limited by rebuild space
 sf_g_01_vdisk07       1 enclosure + 1 drawer              limited by rebuild space
 sf_g_01_vdisk01       2 pdisk


This is with the half-space filesystem:

                     declustered                      current         allowable
 recovery group       arrays      vdisks  pdisks   format version  format version
 -----------------  -----------  ------  ------   --------------  --------------
 sf-g-01              3            6       86         4.2.2.0         4.2.2.0

 declustered   needs                            replace                           scrub     background activity
    array     service  vdisks  pdisks  spares  threshold  free space  duration    task    progress  priority
 -----------  -------  ------  ------  ------  ---------  ----------  --------  -------------------------
 NVR          no        1       2      0,0     1           3632 MiB   14 days   scrub        4%      low
 DA1          no        4      83      2,44    1            395 TiB   14 days   scrub        0%      low
 SSD          no        1       1      0,0     1            372 GiB   14 days   scrub       79%      low

                                           declustered                          checksum
 vdisk                 RAID code            array     vdisk size  block size  granularity  state  remarks
 ------------------   ------------------  -----------  ----------  ----------  -----------  -----  -------
 sf_g_01_logTip        2WayReplication     NVR             48 MiB       2 MiB         4096  ok     logTip
 sf_g_01_logTipBackup  Unreplicated        SSD             48 MiB       2 MiB         4096  ok     logTipBackup
 sf_g_01_logHome       4WayReplication     DA1            144 GiB       2 MiB         4096  ok     log
 sf_g_01_vdisk02       3WayReplication     DA1            103 GiB       1 MiB       32 KiB  ok
 sf_g_01_vdisk07       3WayReplication     DA1            103 GiB       1 MiB       32 KiB  ok
 sf_g_01_vdisk01       8+2p                DA1            270 TiB      16 MiB       32 KiB  ok

 config data         declustered array   spare space    remarks
 ------------------  ------------------  -------------  -------
 rebuild space       DA1                 68 pdisk       increasing VCD spares is suggested

 config data         disk group fault tolerance          remarks
 ------------------  ---------------------------------   -------
 rg descriptor       1 node + 3 pdisk                    limited by rebuild space
 system index        1 node + 3 pdisk                    limited by rebuild space

 vdisk
disk group fault tolerance remarks ------------------ --------------------------------- ------- sf_g_01_logTip 1 pdisk sf_g_01_logTipBackup 0 pdisk sf_g_01_logHome 1 node + 2 pdisk limited by rebuild space sf_g_01_vdisk02 1 node + 1 pdisk limited by rebuild space sf_g_01_vdisk07 1 node + 1 pdisk limited by rebuild space sf_g_01_vdisk01 2 pdisk Thanks, Ivano Il 16/11/17 13:03, Olaf Weiser ha scritto: > Rjx, that makes it a bit clearer.. as your vdisk is big enough to span > over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... > should bring the same performance. .. > > You mean something about vdisk Layout. .. > So in your test, for the full capacity test, you use just one vdisk per > RG - so 2 in total for 'data' - right? > > What about Md .. did you create separate vdisk for MD / what size then > ? > > Gesendet von IBM Verse > > Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem > size --- > > Von: "Ivano Talamo" > An: "gpfsug main discussion list" > Datum: Do. 16.11.2017 03:49 > Betreff: Re: [gpfsug-discuss] Write performances and filesystem size > > ------------------------------------------------------------------------ > > Hello Olaf, > > yes, I confirm that is the Lenovo version of the ESS GL2, so 2 > enclosures/4 drawers/166 disks in total. > > Each recovery group has one declustered array with all disks inside, so > vdisks use all the physical ones, even in the case of a vdisk that is > 1/4 of the total size. > > Regarding the layout allocation we used scatter. > > The tests were done on the just created filesystem, so no close-to-full > effect. And we run gpfsperf write seq. > > Thanks, > Ivano > > > Il 16/11/17 04:42, Olaf Weiser ha scritto: >> Sure... as long we assume that really all physical disk are used .. the >> fact that was told 1/2 or 1/4 might turn out that one / two complet >> enclosures 're eliminated ... ? ..that s why I was asking for more >> details .. >> >> I dont see this degration in my environments. . as long the vdisks are >> big enough to span over all pdisks ( which should be the case for >> capacity in a range of TB ) ... the performance stays the same >> >> Gesendet von IBM Verse >> >> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and >> filesystem size --- >> >> Von: "Jan-Frode Myklebust" >> An: "gpfsug main discussion list" >> Datum: Mi. 15.11.2017 21:35 >> Betreff: Re: [gpfsug-discuss] Write performances and filesystem size >> >> ------------------------------------------------------------------------ >> >> Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same >> number of spindles for any size filesystem, so I would also expect them >> to perform the same. >> >> >> >> -jf >> >> >> ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser > >: >> >> to add a comment ... .. very simply... depending on how you >> allocate the physical block storage .... if you - simply - using >> less physical resources when reducing the capacity (in the same >> ratio) .. you get , what you see.... >> >> so you need to tell us, how you allocate your block-storage .. (Do >> you using RAID controllers , where are your LUNs coming from, are >> then less RAID groups involved, when reducing the capacity ?...) >> >> GPFS can be configured to give you pretty as much as what the >> hardware can deliver.. if you reduce resource.. ... you'll get less >> , if you enhance your hardware .. you get more... almost regardless >> of the total capacity in #blocks .. 
>> >> >> >> >> >> >> From: "Kumaran Rajaram" > > >> To: gpfsug main discussion list >> > > >> Date: 11/15/2017 11:56 AM >> Subject: Re: [gpfsug-discuss] Write performances and >> filesystem size >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> > ------------------------------------------------------------------------ >> >> >> >> Hi, >> >> >>Am I missing something? Is this an expected behaviour and someone >> has an explanation for this? >> >> Based on your scenario, write degradation as the file-system is >> populated is possible if you had formatted the file-system with "-j >> cluster". >> >> For consistent file-system performance, we recommend *mmcrfs "-j >> scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is >> set properly. >> >> [snip from mmcrfs]/ >> # mmlsfs | egrep 'Block allocation| Estimated number' >> -j scatter Block allocation type >> -n 128 Estimated number of >> nodes that will mount file system/ >> [/snip] >> >> >> [snip from man mmcrfs]/ >> *layoutMap={scatter|*//*cluster}*// >> Specifies the block allocation map type. When >> allocating blocks for a given file, GPFS first >> uses a round?robin algorithm to spread the data >> across all disks in the storage pool. After a >> disk is selected, the location of the data >> block on the disk is determined by the block >> allocation map type*. If cluster is >> specified, GPFS attempts to allocate blocks in >> clusters. Blocks that belong to a particular >> file are kept adjacent to each other within >> each cluster. If scatter is specified, >> the location of the block is chosen randomly.*/ >> / >> * The cluster allocation method may provide >> better disk performance for some disk >> subsystems in relatively small installations. >> The benefits of clustered block allocation >> diminish when the number of nodes in the >> cluster or the number of disks in a file system >> increases, or when the file system?s free space >> becomes fragmented. *//The *cluster*// >> allocation method is the default for GPFS >> clusters with eight or fewer nodes and for file >> systems with eight or fewer disks./ >> / >> *The scatter allocation method provides >> more consistent file system performance by >> averaging out performance variations due to >> block location (for many disk subsystems, the >> location of the data relative to the disk edge >> has a substantial effect on performance).*//This >> allocation method is appropriate in most cases >> and is the default for GPFS clusters with more >> than eight nodes or file systems with more than >> eight disks./ >> / >> The block allocation map type cannot be changed >> after the storage pool has been created./ >> >> */ >> -n/*/*NumNodes*// >> The estimated number of nodes that will mount the file >> system in the local cluster and all remote clusters. >> This is used as a best guess for the initial size of >> some file system data structures. The default is 32. >> This value can be changed after the file system has been >> created but it does not change the existing data >> structures. Only the newly created data structure is >> affected by the new value. For example, new storage >> pool./ >> / >> When you create a GPFS file system, you might want to >> overestimate the number of nodes that will mount the >> file system. 
GPFS uses this information for creating >> data structures that are essential for achieving maximum >> parallelism in file system operations (For more >> information, see GPFS architecture in IBM Spectrum >> Scale: Concepts, Planning, and Installation Guide ). If >> you are sure there will never be more than 64 nodes, >> allow the default value to be applied. If you are >> planning to add nodes to your system, you should specify >> a number larger than the default./ >> >> [/snip from man mmcrfs] >> >> Regards, >> -Kums >> >> >> >> >> >> From: Ivano Talamo > > >> To: > > >> Date: 11/15/2017 11:25 AM >> Subject: [gpfsug-discuss] Write performances and filesystem > size >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> > ------------------------------------------------------------------------ >> >> >> >> Hello everybody, >> >> together with my colleagues we are actually running some tests on > a new >> DSS G220 system and we see some unexpected behaviour. >> >> What we actually see is that write performances (we did not test read >> yet) decreases with the decrease of filesystem size. >> >> I will not go into the details of the tests, but here are some > numbers: >> >> - with a filesystem using the full 1.2 PB space we get 14 GB/s as the >> sum of the disk activity on the two IO servers; >> - with a filesystem using half of the space we get 10 GB/s; >> - with a filesystem using 1/4 of the space we get 5 GB/s. >> >> We also saw that performances are not affected by the vdisks layout, >> ie. >> taking the full space with one big vdisk or 2 half-size vdisks per RG >> gives the same performances. >> >> To our understanding the IO should be spread evenly across all the >> pdisks in the declustered array, and looking at iostat all disks >> seem to >> be accessed. But so there must be some other element that affects >> performances. >> >> Am I missing something? Is this an expected behaviour and someone >> has an >> explanation for this? 
>> >> Thank you, >> Ivano >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org _ >> > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From sandeep.patil at in.ibm.com Thu Nov 16 14:45:18 2017 From: sandeep.patil at in.ibm.com (Sandeep Ramesh) Date: Thu, 16 Nov 2017 20:15:18 +0530 Subject: [gpfsug-discuss] Latest Technical Blogs on Spectrum Scale Message-ID: Dear User Group members, Here are the Development Blogs in last 3 months on Spectrum Scale Technical Topics. Spectrum Scale Monitoring ? Know More ? https://developer.ibm.com/storage/2017/11/16/spectrum-scale-monitoring-know/ IBM Spectrum Scale 5.0 Release ? What?s coming ! https://developer.ibm.com/storage/2017/11/14/ibm-spectrum-scale-5-0-release-whats-coming/ Four Essentials things to know for managing data ACLs on IBM Spectrum Scale? from Windows https://developer.ibm.com/storage/2017/11/13/four-essentials-things-know-managing-data-acls-ibm-spectrum-scale-windows/ GSSUTILS: A new way of running SSR, Deploying or Upgrading ESS Server https://developer.ibm.com/storage/2017/11/13/gssutils/ IBM Spectrum Scale Object Authentication https://developer.ibm.com/storage/2017/11/02/spectrum-scale-object-authentication/ Video Surveillance ? Choosing the right storage https://developer.ibm.com/storage/2017/11/02/video-surveillance-choosing-right-storage/ IBM Spectrum scale object deep dive training with problem determination https://www.slideshare.net/SmitaRaut/ibm-spectrum-scale-object-deep-dive-training Spectrum Scale as preferred software defined storage for Ubuntu OpenStack https://developer.ibm.com/storage/2017/09/29/spectrum-scale-preferred-software-defined-storage-ubuntu-openstack/ IBM Elastic Storage Server 2U24 Storage ? an All-Flash offering, a performance workhorse https://developer.ibm.com/storage/2017/10/06/ess-5-2-flash-storage/ A Complete Guide to Configure LDAP-based authentication with IBM Spectrum Scale? 
for File Access https://developer.ibm.com/storage/2017/09/21/complete-guide-configure-ldap-based-authentication-ibm-spectrum-scale-file-access/ Deploying IBM Spectrum Scale on AWS Quick Start https://developer.ibm.com/storage/2017/09/18/deploy-ibm-spectrum-scale-on-aws-quick-start/ Monitoring Spectrum Scale Object metrics https://developer.ibm.com/storage/2017/09/14/monitoring-spectrum-scale-object-metrics/ Tier your data with ease to Spectrum Scale Private Cloud(s) using Moonwalk Universal https://developer.ibm.com/storage/2017/09/14/tier-data-ease-spectrum-scale-private-clouds-using-moonwalk-universal/ Why do I see owner as ?Nobody? for my export mounted using NFSV4 Protocol on IBM Spectrum Scale?? https://developer.ibm.com/storage/2017/09/08/see-owner-nobody-export-mounted-using-nfsv4-protocol-ibm-spectrum-scale/ IBM Spectrum Scale? Authentication using Active Directory and LDAP https://developer.ibm.com/storage/2017/09/01/ibm-spectrum-scale-authentication-using-active-directory-ldap/ IBM Spectrum Scale? Authentication using Active Directory and RFC2307 https://developer.ibm.com/storage/2017/09/01/ibm-spectrum-scale-authentication-using-active-directory-rfc2307/ High Availability Implementation with IBM Spectrum Virtualize and IBM Spectrum Scale https://developer.ibm.com/storage/2017/08/30/high-availability-implementation-ibm-spectrum-virtualize-ibm-spectrum-scale/ 10 Frequently asked Questions on configuring Authentication using AD + AUTO ID mapping on IBM Spectrum Scale?. https://developer.ibm.com/storage/2017/08/04/10-frequently-asked-questions-configuring-authentication-using-ad-auto-id-mapping-ibm-spectrum-scale/ IBM Spectrum Scale? Authentication using Active Directory https://developer.ibm.com/storage/2017/07/30/ibm-spectrum-scale-auth-using-active-directory/ Five cool things that you didn?t know Transparent Cloud Tiering on Spectrum Scale can do https://developer.ibm.com/storage/2017/07/29/five-cool-things-didnt-know-transparent-cloud-tiering-spectrum-scale-can/ IBM Spectrum Scale GUI videos https://developer.ibm.com/storage/2017/07/25/ibm-spectrum-scale-gui-videos/ IBM Spectrum Scale? Authentication ? Planning for NFS Access https://developer.ibm.com/storage/2017/07/24/ibm-spectrum-scale-planning-nfs-access/ For more : Search /browse here: https://developer.ibm.com/storage/blog Consolidation list: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/White%20Papers%20%26%20Media -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Thu Nov 16 16:08:18 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 16 Nov 2017 11:08:18 -0500 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From aelkhouly at sidra.org Thu Nov 16 18:40:51 2017 From: aelkhouly at sidra.org (Ahmad El Khouly) Date: Thu, 16 Nov 2017 18:40:51 +0000 Subject: [gpfsug-discuss] GPFS long waiter Message-ID: <66C328F7-94E9-474F-8AE4-7A4A50DF70E7@sidra.org> Hello all I?m facing long waiter issue and I could not find any way to clear it, I can see all filesystems are responsive and look normal but I can not perform any GPFS commands like mmdf or adding or removing any vdisk, could you please advise how to show more details about this waiter and which pool it is talking about? and any workaround to clear it. 
0x7FA0446BF1A0 ( 27706) waiting 20634.654553503 seconds, TSDFCmdThread: on ThCond 0x1803173EE10 (0xFFFFC9003173EE10) (AllocManagerCond), reason 'waiting for pool freeSpace recovery'

Ahmed M. Elkhouly
Systems Administrator, Scientific Computing
Bioinformatics Division

Disclaimer: This email and its attachments may be confidential and are intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient, any reading, printing, storage, disclosure, copying or any other action taken in respect of this e-mail is prohibited and may be unlawful. If you are not the intended recipient, please notify the sender immediately by using the reply function and then permanently delete what you have received. Any views or opinions expressed are solely those of the author and do not necessarily represent those of Sidra Medical and Research Center.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From olaf.weiser at de.ibm.com  Thu Nov 16 23:51:39 2017
From: olaf.weiser at de.ibm.com (Olaf Weiser)
Date: Thu, 16 Nov 2017 18:51:39 -0500
Subject: [gpfsug-discuss] GPFS long waiter
In-Reply-To: 
References: 
Message-ID: 

An HTML attachment was scrubbed...
URL: 

From Robert.Oesterlin at nuance.com  Fri Nov 17 13:03:48 2017
From: Robert.Oesterlin at nuance.com (Oesterlin, Robert)
Date: Fri, 17 Nov 2017 13:03:48 +0000
Subject: [gpfsug-discuss] GPFS long waiter
Message-ID: 

Hi Ahmed

You might take a look at the file system manager nodes (mmlsmgr) and see if any of them are having problems. It looks like some previous "mmdf" command was launched and got hung up (and perhaps was terminated by ctrl-c) and the helper process is still running.

I have seen mmdf get hung up before, and it's (almost always) associated with the file system manager node in some way. And I've had a few PMRs open on this (vers 4.1, early 4.2) - I have not seen this on any of the latest code levels.

But, as Olaf states, getting a mmsnap and opening a PMR might be worthwhile - what level of GPFS are you running on?

Bob Oesterlin
Sr Principal Storage Engineer, Nuance


From: on behalf of Ahmad El Khouly
Reply-To: gpfsug main discussion list
Date: Thursday, November 16, 2017 at 12:41 PM
To: "gpfsug-discuss at spectrumscale.org"
Subject: [EXTERNAL] [gpfsug-discuss] GPFS long waiter

I'm facing long waiter issue and I could not find any way to clear it, I can see all filesystems are responsive and look normal but I can not perform any GPFS commands like mmdf or adding or removing any vdisk, could you please advise how to show more details about this waiter and which pool it is talking about? and any workaround to clear it.

0x7FA0446BF1A0 ( 27706) waiting 20634.654553503 seconds, TSDFCmdThread: on ThCond 0x1803173EE10 (0xFFFFC9003173EE10) (AllocManagerCond), reason 'waiting for pool freeSpace recovery'
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Matthias.Knigge at rohde-schwarz.com  Fri Nov 17 13:39:47 2017
From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com)
Date: Fri, 17 Nov 2017 14:39:47 +0100
Subject: [gpfsug-discuss] gpfs.so vfs samba module is missing
Message-ID: 

Hello at all,

anyone know in which package I can find the gpfs vfs module? Currently I am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module is still missing.

Any ideas for me?
Thanks, Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Fri Nov 17 16:51:02 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Fri, 17 Nov 2017 11:51:02 -0500 Subject: [gpfsug-discuss] gpfs.so vfs samba module is missing In-Reply-To: References: Message-ID: <8805.1510937462@turing-police.cc.vt.edu> On Fri, 17 Nov 2017 14:39:47 +0100, Matthias.Knigge at rohde-schwarz.com said: > anyone know in which package I can find the gpfs vfs module? Currently I > am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package > provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module > is still missing. If you're running the IBM-supported protocols server config, you want the rpm 'gpfs.smb'. If you're trying to build your own, your best bet is to punt and install the IBM code. ;) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 17 19:04:03 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 17 Nov 2017 20:04:03 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** In-Reply-To: <8805.1510937462@turing-police.cc.vt.edu> References: <8805.1510937462@turing-police.cc.vt.edu> Message-ID: https://manpages.debian.org/testing/samba-vfs-modules/vfs_gpfs.8.en.html I do not think so, the module is a part of samba. I installed the package gpfs.smb too but with the same result. Before I use the normal version of samba I used the version of sernet. There was the module available. Now I am working with CentOS 7.3 and samba of the offical repository of CentOS. Thanks, Matthias Von: valdis.kletnieks at vt.edu An: gpfsug main discussion list Datum: 17.11.2017 17:51 Betreff: [Newsletter] Re: [gpfsug-discuss] gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** Gesendet von: gpfsug-discuss-bounces at spectrumscale.org On Fri, 17 Nov 2017 14:39:47 +0100, Matthias.Knigge at rohde-schwarz.com said: > anyone know in which package I can find the gpfs vfs module? Currently I > am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package > provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module > is still missing. If you're running the IBM-supported protocols server config, you want the rpm 'gpfs.smb'. If you're trying to build your own, your best bet is to punt and install the IBM code. ;) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss [Anhang "RohdeSchwarzSecure_E-Mail.html" gel?scht von Matthias Knigge/DVS] -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Nov 17 19:45:30 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 17 Nov 2017 19:45:30 +0000 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** In-Reply-To: References: , <8805.1510937462@turing-police.cc.vt.edu> Message-ID: An HTML attachment was scrubbed... 
URL: From Matthias.Knigge at rohde-schwarz.com Fri Nov 17 19:50:27 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Fri, 17 Nov 2017 20:50:27 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing In-Reply-To: References: , <8805.1510937462@turing-police.cc.vt.edu> Message-ID: That helps me! Thanks! Von: "Christof Schmitt" An: gpfsug-discuss at spectrumscale.org Kopie: gpfsug-discuss at spectrumscale.org Datum: 17.11.2017 20:45 Betreff: [Newsletter] Re: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Whether the gpfs.so module is included depends on each Samba build. Samba provided by Linux distributions typically does not include the gpfs.so module. Sernet package include it. The gpfs.smb Samba build we use in Spectrum Scale also obviously includes the gpfs.so module: # rpm -ql gpfs.smb | grep gpfs.so /usr/lpp/mmfs/lib64/samba/vfs/gpfs.so The main point from a Spectrum Scale point of view: Spectrum Scale only supports the Samba from the gpfs.smb package that was provided with the product. Using any other Samba version is outside of the scope of Spectrum Scale support. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: Matthias.Knigge at rohde-schwarz.com Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: [gpfsug-discuss] Antwort: [Newsletter] Re: gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** Date: Fri, Nov 17, 2017 12:04 PM https://manpages.debian.org/testing/samba-vfs-modules/vfs_gpfs.8.en.html I do not think so, the module is a part of samba. I installed the package gpfs.smb too but with the same result. Before I use the normal version of samba I used the version of sernet. There was the module available. Now I am working with CentOS 7.3 and samba of the offical repository of CentOS. Thanks, Matthias Von: valdis.kletnieks at vt.edu An: gpfsug main discussion list Datum: 17.11.2017 17:51 Betreff: [Newsletter] Re: [gpfsug-discuss] gpfs.so vfs samba module is missing ***CAUTION_Invalid_Signature*** Gesendet von: gpfsug-discuss-bounces at spectrumscale.org On Fri, 17 Nov 2017 14:39:47 +0100, Matthias.Knigge at rohde-schwarz.com said: > anyone know in which package I can find the gpfs vfs module? Currently I > am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package > provides the vfs module. I updated Samba to 4.6.2 but the gpfs-vfs-module > is still missing. If you're running the IBM-supported protocols server config, you want the rpm 'gpfs.smb'. If you're trying to build your own, your best bet is to punt and install the IBM code. 
;) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss [Anhang "RohdeSchwarzSecure_E-Mail.html" gel?scht von Matthias Knigge/DVS] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=M1Ebd4GVVmaCFs3t0xgGUpgZUM9CzrxWR9I6cvzUqns&s=ONPhff8MP60AoglpZvh9xBAPlV98nW-SmuWoN4EVzUk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcus at koenighome.de Wed Nov 22 03:13:18 2017 From: marcus at koenighome.de (Marcus Koenig) Date: Wed, 22 Nov 2017 16:13:18 +1300 Subject: [gpfsug-discuss] setxattr via policy Message-ID: Hi there, I've got a question around setting userdefined extended attributes. I have played around a bit with setting certain attributes via mmchattr - but now I want to run a policy to do this for me for certain filesets or file sizes. How would I write my policy to set an attribute like user.testflag1=projectX on a number of files in a fileset that are bigger than 1G for example? Thanks folks. Cheers, Marcus -------------- next part -------------- An HTML attachment was scrubbed... URL: From Matthias.Knigge at rohde-schwarz.com Wed Nov 22 06:23:08 2017 From: Matthias.Knigge at rohde-schwarz.com (Matthias.Knigge at rohde-schwarz.com) Date: Wed, 22 Nov 2017 07:23:08 +0100 Subject: [gpfsug-discuss] Antwort: [Newsletter] setxattr via policy In-Reply-To: References: Message-ID: Good morning, take a look in this directory: cd /usr/lpp/mmfs/samples/ilm/ mmfind or rather tr_findToPol.pl could help you to create a rule/policy. Regards, Matthias Von: Marcus Koenig An: gpfsug-discuss at spectrumscale.org Datum: 22.11.2017 04:13 Betreff: [Newsletter] [gpfsug-discuss] setxattr via policy Gesendet von: gpfsug-discuss-bounces at spectrumscale.org Hi there, I've got a question around setting userdefined extended attributes. I have played around a bit with setting certain attributes via mmchattr - but now I want to run a policy to do this for me for certain filesets or file sizes. How would I write my policy to set an attribute like user.testflag1=projectX on a number of files in a fileset that are bigger than 1G for example? Thanks folks. Cheers, Marcus_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ivano.Talamo at psi.ch Wed Nov 22 08:23:22 2017 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Wed, 22 Nov 2017 09:23:22 +0100 Subject: [gpfsug-discuss] Write performances and filesystem size In-Reply-To: References: Message-ID: Hello Olaf, thank you for your reply and for confirming that this is not expected, as we also thought. We did repeat the test with 2 vdisks only without dedicated ones for metadata but the result did not change. We now opened a PMR. Thanks, Ivano Il 16/11/17 17:08, Olaf Weiser ha scritto: > Hi Ivano, > so from this output, the performance degradation is not explainable .. > in my current environments.. 
, having multiple file systems (so vdisks > on one BB) .. and it works fine .. > > as said .. just open a PMR.. I would'nt consider this as the "expected > behavior" > the only thing is.. the MD disks are a bit small.. so maybe redo your > tests and for a simple compare between 1/2 1/1 or 1/4 capacity test > with 2 vdisks only and /dataAndMetadata/ > cheers > > > > > > From: Ivano Talamo > To: gpfsug main discussion list > Date: 11/16/2017 08:52 AM > Subject: Re: [gpfsug-discuss] Write performances and filesystem size > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > Hi, > > as additional information I past the recovery group information in the > full and half size cases. > In both cases: > - data is on sf_g_01_vdisk01 > - metadata on sf_g_01_vdisk02 > - sf_g_01_vdisk07 is not used in the filesystem. > > This is with the full-space filesystem: > > declustered current allowable > recovery group arrays vdisks pdisks format version format > version > ----------------- ----------- ------ ------ -------------- > -------------- > sf-g-01 3 6 86 4.2.2.0 4.2.2.0 > > > declustered needs replace > scrub background activity > array service vdisks pdisks spares threshold free space > duration task progress priority > ----------- ------- ------ ------ ------ --------- ---------- > -------- ------------------------- > NVR no 1 2 0,0 1 3632 MiB > 14 days scrub 95% low > DA1 no 4 83 2,44 1 57 TiB > 14 days scrub 0% low > SSD no 1 1 0,0 1 372 GiB > 14 days scrub 79% low > > declustered > checksum > vdisk RAID code array vdisk size block > size granularity state remarks > ------------------ ------------------ ----------- ---------- > ---------- ----------- ----- ------- > sf_g_01_logTip 2WayReplication NVR 48 MiB 2 > MiB 4096 ok logTip > sf_g_01_logTipBackup Unreplicated SSD 48 MiB > 2 MiB 4096 ok logTipBackup > sf_g_01_logHome 4WayReplication DA1 144 GiB 2 > MiB 4096 ok log > sf_g_01_vdisk02 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk07 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk01 8+2p DA1 540 TiB 16 > MiB 32 KiB ok > > config data declustered array spare space remarks > ------------------ ------------------ ------------- ------- > rebuild space DA1 53 pdisk > increasing VCD spares is suggested > > config data disk group fault tolerance remarks > ------------------ --------------------------------- ------- > rg descriptor 1 enclosure + 1 drawer + 2 pdisk limited by > rebuild space > system index 1 enclosure + 1 drawer + 2 pdisk limited by > rebuild space > > vdisk disk group fault tolerance remarks > ------------------ --------------------------------- ------- > sf_g_01_logTip 1 pdisk > sf_g_01_logTipBackup 0 pdisk > sf_g_01_logHome 1 enclosure + 1 drawer + 1 pdisk limited by > rebuild space > sf_g_01_vdisk02 1 enclosure + 1 drawer limited by > rebuild space > sf_g_01_vdisk07 1 enclosure + 1 drawer limited by > rebuild space > sf_g_01_vdisk01 2 pdisk > > > This is with the half-space filesystem: > > declustered current allowable > recovery group arrays vdisks pdisks format version format > version > ----------------- ----------- ------ ------ -------------- > -------------- > sf-g-01 3 6 86 4.2.2.0 4.2.2.0 > > > declustered needs replace > scrub background activity > array service vdisks pdisks spares threshold free space > duration task progress priority > ----------- ------- ------ ------ ------ --------- ---------- > -------- ------------------------- > NVR no 1 2 0,0 1 3632 MiB > 
14 days scrub 4% low > DA1 no 4 83 2,44 1 395 TiB > 14 days scrub 0% low > SSD no 1 1 0,0 1 372 GiB > 14 days scrub 79% low > > declustered > checksum > vdisk RAID code array vdisk size block > size granularity state remarks > ------------------ ------------------ ----------- ---------- > ---------- ----------- ----- ------- > sf_g_01_logTip 2WayReplication NVR 48 MiB 2 > MiB 4096 ok logTip > sf_g_01_logTipBackup Unreplicated SSD 48 MiB > 2 MiB 4096 ok logTipBackup > sf_g_01_logHome 4WayReplication DA1 144 GiB 2 > MiB 4096 ok log > sf_g_01_vdisk02 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk07 3WayReplication DA1 103 GiB 1 > MiB 32 KiB ok > sf_g_01_vdisk01 8+2p DA1 270 TiB 16 > MiB 32 KiB ok > > config data declustered array spare space remarks > ------------------ ------------------ ------------- ------- > rebuild space DA1 68 pdisk > increasing VCD spares is suggested > > config data disk group fault tolerance remarks > ------------------ --------------------------------- ------- > rg descriptor 1 node + 3 pdisk limited by > rebuild space > system index 1 node + 3 pdisk limited by > rebuild space > > vdisk disk group fault tolerance remarks > ------------------ --------------------------------- ------- > sf_g_01_logTip 1 pdisk > sf_g_01_logTipBackup 0 pdisk > sf_g_01_logHome 1 node + 2 pdisk limited by > rebuild space > sf_g_01_vdisk02 1 node + 1 pdisk limited by > rebuild space > sf_g_01_vdisk07 1 node + 1 pdisk limited by > rebuild space > sf_g_01_vdisk01 2 pdisk > > > Thanks, > Ivano > > > > > Il 16/11/17 13:03, Olaf Weiser ha scritto: >> Rjx, that makes it a bit clearer.. as your vdisk is big enough to span >> over all pdisks in each of your test 1/1 or 1/2 or 1/4 of capacity... >> should bring the same performance. .. >> >> You mean something about vdisk Layout. .. >> So in your test, for the full capacity test, you use just one vdisk per >> RG - so 2 in total for 'data' - right? >> >> What about Md .. did you create separate vdisk for MD / what size then >> ? >> >> Gesendet von IBM Verse >> >> Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem >> size --- >> >> Von: "Ivano Talamo" >> An: "gpfsug main discussion list" > >> Datum: Do. 16.11.2017 03:49 >> Betreff: Re: [gpfsug-discuss] Write performances and > filesystem size >> >> ------------------------------------------------------------------------ >> >> Hello Olaf, >> >> yes, I confirm that is the Lenovo version of the ESS GL2, so 2 >> enclosures/4 drawers/166 disks in total. >> >> Each recovery group has one declustered array with all disks inside, so >> vdisks use all the physical ones, even in the case of a vdisk that is >> 1/4 of the total size. >> >> Regarding the layout allocation we used scatter. >> >> The tests were done on the just created filesystem, so no close-to-full >> effect. And we run gpfsperf write seq. >> >> Thanks, >> Ivano >> >> >> Il 16/11/17 04:42, Olaf Weiser ha scritto: >>> Sure... as long we assume that really all physical disk are used .. the >>> fact that was told 1/2 or 1/4 might turn out that one / two complet >>> enclosures 're eliminated ... ? ..that s why I was asking for more >>> details .. >>> >>> I dont see this degration in my environments. . as long the vdisks are >>> big enough to span over all pdisks ( which should be the case for >>> capacity in a range of TB ) ... 
the performance stays the same >>> >>> Gesendet von IBM Verse >>> >>> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and >>> filesystem size --- >>> >>> Von: "Jan-Frode Myklebust" >>> An: "gpfsug main discussion list" >>> Datum: Mi. 15.11.2017 21:35 >>> Betreff: Re: [gpfsug-discuss] Write performances and filesystem size >>> >>> ------------------------------------------------------------------------ >>> >>> Olaf, this looks like a Lenovo ?ESS GLxS? version. Should be using same >>> number of spindles for any size filesystem, so I would also expect them >>> to perform the same. >>> >>> >>> >>> -jf >>> >>> >>> ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser >> >: >>> >>> to add a comment ... .. very simply... depending on how you >>> allocate the physical block storage .... if you - simply - using >>> less physical resources when reducing the capacity (in the same >>> ratio) .. you get , what you see.... >>> >>> so you need to tell us, how you allocate your block-storage .. (Do >>> you using RAID controllers , where are your LUNs coming from, are >>> then less RAID groups involved, when reducing the capacity ?...) >>> >>> GPFS can be configured to give you pretty as much as what the >>> hardware can deliver.. if you reduce resource.. ... you'll get less >>> , if you enhance your hardware .. you get more... almost regardless >>> of the total capacity in #blocks .. >>> >>> >>> >>> >>> >>> >>> From: "Kumaran Rajaram" >> > >>> To: gpfsug main discussion list >>> >> > >>> Date: 11/15/2017 11:56 AM >>> Subject: Re: [gpfsug-discuss] Write performances and >>> filesystem size >>> Sent by: gpfsug-discuss-bounces at spectrumscale.org >>> >>> >> ------------------------------------------------------------------------ >>> >>> >>> >>> Hi, >>> >>> >>Am I missing something? Is this an expected behaviour and someone >>> has an explanation for this? >>> >>> Based on your scenario, write degradation as the file-system is >>> populated is possible if you had formatted the file-system with "-j >>> cluster". >>> >>> For consistent file-system performance, we recommend *mmcrfs "-j >>> scatter" layoutMap.* Also, we need to ensure the mmcrfs "-n" is >>> set properly. >>> >>> [snip from mmcrfs]/ >>> # mmlsfs | egrep 'Block allocation| Estimated number' >>> -j scatter Block allocation type >>> -n 128 Estimated number of >>> nodes that will mount file system/ >>> [/snip] >>> >>> >>> [snip from man mmcrfs]/ >>> *layoutMap={scatter|*//*cluster}*// >>> Specifies the block allocation map type. When >>> allocating blocks for a given file, GPFS first >>> uses a round?robin algorithm to spread the data >>> across all disks in the storage pool. After a >>> disk is selected, the location of the data >>> block on the disk is determined by the block >>> allocation map type*. If cluster is >>> specified, GPFS attempts to allocate blocks in >>> clusters. Blocks that belong to a particular >>> file are kept adjacent to each other within >>> each cluster. If scatter is specified, >>> the location of the block is chosen randomly.*/ >>> / >>> * The cluster allocation method may provide >>> better disk performance for some disk >>> subsystems in relatively small installations. >>> The benefits of clustered block allocation >>> diminish when the number of nodes in the >>> cluster or the number of disks in a file system >>> increases, or when the file system?s free space >>> becomes fragmented. 
*//The *cluster*// >>> allocation method is the default for GPFS >>> clusters with eight or fewer nodes and for file >>> systems with eight or fewer disks./ >>> / >>> *The scatter allocation method provides >>> more consistent file system performance by >>> averaging out performance variations due to >>> block location (for many disk subsystems, the >>> location of the data relative to the disk edge >>> has a substantial effect on performance).*//This >>> allocation method is appropriate in most cases >>> and is the default for GPFS clusters with more >>> than eight nodes or file systems with more than >>> eight disks./ >>> / >>> The block allocation map type cannot be changed >>> after the storage pool has been created./ >>> >>> */ >>> -n/*/*NumNodes*// >>> The estimated number of nodes that will mount the file >>> system in the local cluster and all remote clusters. >>> This is used as a best guess for the initial size of >>> some file system data structures. The default is 32. >>> This value can be changed after the file system has been >>> created but it does not change the existing data >>> structures. Only the newly created data structure is >>> affected by the new value. For example, new storage >>> pool./ >>> / >>> When you create a GPFS file system, you might want to >>> overestimate the number of nodes that will mount the >>> file system. GPFS uses this information for creating >>> data structures that are essential for achieving maximum >>> parallelism in file system operations (For more >>> information, see GPFS architecture in IBM Spectrum >>> Scale: Concepts, Planning, and Installation Guide ). If >>> you are sure there will never be more than 64 nodes, >>> allow the default value to be applied. If you are >>> planning to add nodes to your system, you should specify >>> a number larger than the default./ >>> >>> [/snip from man mmcrfs] >>> >>> Regards, >>> -Kums >>> >>> >>> >>> >>> >>> From: Ivano Talamo >> > >>> To: >> > >>> Date: 11/15/2017 11:25 AM >>> Subject: [gpfsug-discuss] Write performances and filesystem >> size >>> Sent by: gpfsug-discuss-bounces at spectrumscale.org >>> >>> >> ------------------------------------------------------------------------ >>> >>> >>> >>> Hello everybody, >>> >>> together with my colleagues we are actually running some tests on >> a new >>> DSS G220 system and we see some unexpected behaviour. >>> >>> What we actually see is that write performances (we did not test read >>> yet) decreases with the decrease of filesystem size. >>> >>> I will not go into the details of the tests, but here are some >> numbers: >>> >>> - with a filesystem using the full 1.2 PB space we get 14 GB/s as the >>> sum of the disk activity on the two IO servers; >>> - with a filesystem using half of the space we get 10 GB/s; >>> - with a filesystem using 1/4 of the space we get 5 GB/s. >>> >>> We also saw that performances are not affected by the vdisks layout, >>> ie. >>> taking the full space with one big vdisk or 2 half-size vdisks per RG >>> gives the same performances. >>> >>> To our understanding the IO should be spread evenly across all the >>> pdisks in the declustered array, and looking at iostat all disks >>> seem to >>> be accessed. But so there must be some other element that affects >>> performances. >>> >>> Am I missing something? Is this an expected behaviour and someone >>> has an >>> explanation for this? 
>>> >>> Thank you, >>> Ivano >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >_ >>> >> > __https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=_ >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org > >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org > >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From jtucker at pixitmedia.com Wed Nov 22 09:20:55 2017 From: jtucker at pixitmedia.com (Jez Tucker) Date: Wed, 22 Nov 2017 09:20:55 +0000 Subject: [gpfsug-discuss] setxattr via policy In-Reply-To: References: Message-ID: <7b426e0a-2096-ff6a-f9f1-8eeda7114f11@pixitmedia.com> Hi Marcus, ? Something like this should do you: RULE 'setxattr' LIST 'do_setxattr' FOR FILESET ('xattrfileset') WEIGHT(DIRECTORY_HASH) ACTION(SETXATTR('user.testflag1','projectX')) WHERE ??? KB_ALLOCATED >? [insert required file size limit] Then with one file larger and another file smaller than the limit: root at elmo:/mmfs1/policies# getfattr -n user.testflag1 /mmfs1/data/xattrfileset/* getfattr: Removing leading '/' from absolute path names # file: mmfs1/data/xattrfileset/file.1 user.testflag1="projectX" /mmfs1/data/xattrfileset/file.2: user.testflag1: No such attribute As xattrs are a superb way of automating data operations, for those of you with our Python API have a look over the xattr examples in the git repo: https://github.com/arcapix/gpfsapi-examples as an alternative Pythonic way to achieve this. Cheers, Jez On 22/11/17 03:13, Marcus Koenig wrote: > Hi there, > > I've got a question around setting userdefined extended attributes. I > have played around a bit with setting certain attributes via mmchattr > - but now I want to run a policy to do this for me for certain > filesets or file sizes. > > How would I write my policy to set an attribute like > user.testflag1=projectX on a number of files in a fileset that are > bigger than 1G for example? > > Thanks folks. 
> > Cheers, > Marcus > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- *Jez Tucker* Head of Research and Development, Pixit Media 07764193820 | jtucker at pixitmedia.com www.pixitmedia.com | Tw:@pixitmedia.com -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcus at koenighome.de Wed Nov 22 09:28:56 2017 From: marcus at koenighome.de (Marcus Koenig) Date: Wed, 22 Nov 2017 22:28:56 +1300 Subject: [gpfsug-discuss] setxattr via policy In-Reply-To: <7b426e0a-2096-ff6a-f9f1-8eeda7114f11@pixitmedia.com> References: <7b426e0a-2096-ff6a-f9f1-8eeda7114f11@pixitmedia.com> Message-ID: Thanks guys - will test it now - much appreciated. On Wed, Nov 22, 2017 at 10:20 PM, Jez Tucker wrote: > Hi Marcus, > > Something like this should do you: > > RULE 'setxattr' LIST 'do_setxattr' > FOR FILESET ('xattrfileset') > WEIGHT(DIRECTORY_HASH) > ACTION(SETXATTR('user.testflag1','projectX')) > WHERE > KB_ALLOCATED > [insert required file size limit] > > > Then with one file larger and another file smaller than the limit: > > root at elmo:/mmfs1/policies# getfattr -n user.testflag1 > /mmfs1/data/xattrfileset/* > getfattr: Removing leading '/' from absolute path names > # file: mmfs1/data/xattrfileset/file.1 > user.testflag1="projectX" > > /mmfs1/data/xattrfileset/file.2: user.testflag1: No such attribute > > > As xattrs are a superb way of automating data operations, for those of you > with our Python API have a look over the xattr examples in the git repo: > https://github.com/arcapix/gpfsapi-examples as an alternative Pythonic > way to achieve this. > > Cheers, > > Jez > > > > > On 22/11/17 03:13, Marcus Koenig wrote: > > Hi there, > > I've got a question around setting userdefined extended attributes. I have > played around a bit with setting certain attributes via mmchattr - but now > I want to run a policy to do this for me for certain filesets or file sizes. > > How would I write my policy to set an attribute like > user.testflag1=projectX on a number of files in a fileset that are bigger > than 1G for example? > > Thanks folks. > > Cheers, > Marcus > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > -- > *Jez Tucker* > Head of Research and Development, Pixit Media > 07764193820 <07764%20193820> | jtucker at pixitmedia.com > www.pixitmedia.com | Tw:@pixitmedia.com > > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. 
Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From makaplan at us.ibm.com Wed Nov 22 16:51:27 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 22 Nov 2017 11:51:27 -0500 Subject: [gpfsug-discuss] setxattr via policy - extended attributes - tips and hints In-Reply-To: References: Message-ID:

Assuming you have a recent version of Spectrum Scale... You can use ACTION(SetXattr(...)) in mmapplypolicy {MIGRATE,LIST} rules and/or in {SET POOL} rules that are evaluated at file creation time.

Later... You can use WHERE ... Xattr(...) in any policy rules to test/compare an extended attribute. But watch out for NULL! See the "tips" section of the ILM chapter of the admin guide for some ways to deal with NULL (hints: COALESCE, expr IS NULL, expr IS NOT NULL, CASE ...).

See also mm{ch|ls}attr -d -X --hex-attr and so forth. Also can be used compatibly with {set|get}fattr on Linux.

--marc -------------- next part -------------- An HTML attachment was scrubbed... URL:
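To make those hints concrete, here is a minimal sketch of the two halves (the fileset name 'projects', the attribute 'user.testflag1', the value 'projectX', the policy file name and the 1 GiB threshold are illustrative assumptions, not taken from any particular message above):

    /* tag large files during a scan; run with: mmapplypolicy /gpfs/fs1/projects -P tag.pol -I yes */
    RULE 'tag_big' LIST 'tagged'
        FOR FILESET ('projects')
        ACTION(SETXATTR('user.testflag1','projectX'))
        WHERE KB_ALLOCATED > 1048576

    /* later, test the attribute; COALESCE guards against NULL on files that were never tagged */
    RULE 'find_projx' LIST 'projx'
        WHERE COALESCE(XATTR('user.testflag1'),'') = 'projectX'

The SETXATTR action and the COALESCE/NULL guard are exactly the pieces mentioned in the tips above; the surrounding rule names and thresholds are only a sketch, and the result can be cross-checked from a node with getfattr -n user.testflag1 on a tagged file.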
From aaron.s.knister at nasa.gov Thu Nov 23 06:21:10 2017 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Thu, 23 Nov 2017 06:21:10 +0000 Subject: [gpfsug-discuss] tar sparse file data loss Message-ID:

Somehow this nugget of joy (that's most definitely sarcasm, this really sucks) slipped past my radar: http://www-01.ibm.com/support/docview.wss?uid=isg1IV96475

Anyone know if there's a fix in the 4.1 stream? In my opinion this is 100% a tar bug as the APAR suggests, but GPFS has implemented a workaround. See this post from the tar mailing list: https://www.mail-archive.com/bug-tar at gnu.org/msg04209.html

It looks like the troublesome code may still exist upstream: http://git.savannah.gnu.org/cgit/tar.git/tree/src/sparse.c#n273

No better way to ensure you'll hit a problem than to assume you won't :)

-Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL:

From Greg.Lehmann at csiro.au Thu Nov 23 23:02:46 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Thu, 23 Nov 2017 23:02:46 +0000 Subject: [gpfsug-discuss] tar sparse file data loss In-Reply-To: References: Message-ID: <61aa823e50ad4cf3a59de063528e6d12@exch1-cdc.nexus.csiro.au>

I logged perhaps the original service request on this, but must admit we haven't tried it of late as we have worked around the issue. -------------- next part -------------- An HTML attachment was scrubbed... URL:

From aaron.knister at gmail.com Sun Nov 26 18:00:37 2017 From: aaron.knister at gmail.com (Aaron Knister) Date: Sun, 26 Nov 2017 13:00:37 -0500 Subject: [gpfsug-discuss] Online data migration tool Message-ID:

With the release of Scale 5.0 it's no secret that some of the performance features of 5.0 require a new disk format, and existing filesystems cannot be migrated in place to get these features.

There's also an issue for long time customers who have had Scale since before the 4.1 days, where filesystems created prior to, I think, 4.1 are not 4K aligned and thus cannot use 4K sector LUNs to hold metadata. At some point we're not going to be able to buy storage that's not got 4K sectors.

In both situations IBM has hamstrung its customer base with large filesystems by requiring them to undergo extremely disruptive and expensive filesystem migrations to either keep using their filesystem with new hardware or take advantage of new features. The expensive part comes from having to purchase new storage hardware in order to migrate the data.

My question is this: I know filesystem migration tools are complicated (I believe that's why customers purchase support), but why on earth are there no migration tools for these features? How are customers supposed to take the product seriously as a platform for long term storage when IBM is so willing to break the on disk format and leave customers stuck unable to replace aging storage hardware or leverage new features? What message does this send to customers who have had the product on site for over a decade?

There is at least one open RFE on this issue, and it has been open for some time with no movement. That speaks volumes. Frankly I'm a little tired of bringing problems to the mailing list, being told to open RFEs, then having the RFEs denied or just sit there stagnant.

-Aaron

From Greg.Lehmann at csiro.au Sun Nov 26 22:33:45 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Sun, 26 Nov 2017 22:33:45 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID:

I personally don't think lack of a migration tool is a problem. I do think that 2 format changes in such quick succession is a problem. I am willing to migrate occasionally, but then the amount of data we have in GPFS is still small. I do value my data, so I'd trust a manual migration using standard tools that have been around for a while over a custom migration tool any day. This last format change seems fairly major to me, so doubly so in this case.

Trying to find a plus in this, maybe use it to test DR procedures at the same time. Apologies in advance to those that simply can't. I hope you get your migration tool.

To IBM, thank you for making GPFS faster.
Greg

_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From S.J.Thompson at bham.ac.uk Sun Nov 26 22:39:48 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Sun, 26 Nov 2017 22:39:48 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID:

I agree that migration is not easy. We thought we might be able to accomplish it using SOBAR, but the block size has to match in the old and new file-systems. In fact mmfsd asserts if you try. I had a PMR open on this and was told SOBAR can only be used to restore to the same block size and they aren't going to fix it. (Seriously, how many people using SOBAR for DR are likely to be able to restore to identical hardware?)

Second, we thought maybe AFM would help, but we use IFS and child dependent filesets and we can't replicate the structure in the AFM cache.

Given there is no other supported way of moving data or converting file-systems, like you we are stuck with significant disruption when we want to replace some aging hardware next year.
Simon

_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From abeattie at au1.ibm.com Sun Nov 26 22:46:13 2017 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Sun, 26 Nov 2017 22:46:13 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL:

From jonathan.buzzard at strath.ac.uk Mon Nov 27 14:56:56 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 27 Nov 2017 14:56:56 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: <1511794616.18554.121.camel@strath.ac.uk>

On Sun, 2017-11-26 at 13:00 -0500, Aaron Knister wrote: > With the release of Scale 5.0 it's no secret that some of the > performance features of 5.0 require a new disk format, and existing > filesystems cannot be migrated in place to get these features. > > There's also an issue for long time customers who have had Scale > since before the 4.1 days, where filesystems created prior to, I think, > 4.1 are not 4K aligned and thus cannot use 4K sector LUNs to hold > metadata. At some point we're not going to be able to buy storage > that's not got 4K sectors.

This has been going on since forever. We have had the change to 64-bit inodes for more than 2 billion files, and the ability to mount on Windows. They are like 2.3 and 3.0 changes from memory, going back around a decade now.
I have a feeling there was another change for mounting HSM'ed file systems on Windows too. I just don't think IBM care. The answer has always been well just start again.

JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From yguvvala at cambridgecomputer.com Wed Nov 29 16:00:33 2017 From: yguvvala at cambridgecomputer.com (Yugendra Guvvala) Date: Wed, 29 Nov 2017 11:00:33 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1511794616.18554.121.camel@strath.ac.uk> References: <1511794616.18554.121.camel@strath.ac.uk> Message-ID:

Hi,

I am trying to understand the technical challenges of migrating to GPFS 5.0 from GPFS 4.3. We currently run GPFS 4.3 and I was all excited to see the 5.0 release and hear about some promising features available, but I am not sure about the complexity involved in migrating.

Thanks, Yugi -------------- next part -------------- An HTML attachment was scrubbed... URL:

From jonathan.buzzard at strath.ac.uk Wed Nov 29 16:35:04 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 29 Nov 2017 16:35:04 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> Message-ID: <1511973304.18554.133.camel@strath.ac.uk>

On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: > Hi, > > I am trying to understand the technical challenges of migrating to GPFS > 5.0 from GPFS 4.3. We currently run GPFS 4.3 and I was all excited to > see the 5.0 release and hear about some promising features available, > but I am not sure about the complexity involved in migrating. >

Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more), then reformat your file system with the new disk format, then restore all your data to your shiny new file system.

Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles, because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive.

JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From makaplan at us.ibm.com Wed Nov 29 16:37:02 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 29 Nov 2017 11:37:02 -0500 Subject: [gpfsug-discuss] SOBAR restore with new blocksize and/or inodesize In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> Message-ID:

This redbook http://w3-03.ibm.com/support/techdocs/atsmastr.nsf/3af3af29ce1f19cf86256c7100727a9f/335d8a48048ea78d85258059006dad33/$FILE/SOBAR_Migration_SpectrumScale_v1.0.pdf has these and other hints:

-B blocksize, should match the file system block size of the source system, but can also be larger (not smaller). To obtain the file system block size on the source system use the command: mmlsfs -B

-i inodesize, should match the file system inode size of the source system, but can also be larger (not smaller). To obtain the inode size on the source system use the command: mmlsfs -i. Note, in Spectrum Scale it is recommended to use an inodesize of 4K because this aligns well to disk I/O.

Our tests have shown that having a greater inode size on the target than on the source works as well. If you really want to shrink the blocksize, some internal testing indicates that works also. Shrinking the inodesize also works, although this will impact the efficiency of small file and extended attributes in-the-inode support. -------------- next part -------------- An HTML attachment was scrubbed... URL:
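As a rough illustration of those checks (the device names sourcefs and targetfs and the stanza file name are made-up placeholders; the mmlsfs -B / -i commands come from the redbook text above, while the mmcrfs flags shown are standard but the values are only examples):

    # on the source cluster: record the block size and inode size
    mmlsfs sourcefs -B
    mmlsfs sourcefs -i

    # create the SOBAR restore target with equal or larger values, e.g.
    mmcrfs targetfs -F nsd.stanza -B 4M -i 4096

The 4M and 4096 values are placeholders; the point is simply that the target file system's -B and -i must not be smaller than what mmlsfs reports on the source.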
From r.sobey at imperial.ac.uk Wed Nov 29 16:39:25 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 29 Nov 2017 16:39:25 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1511973304.18554.133.camel@strath.ac.uk> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID:

Could we utilise free capacity in the existing filesystem and empty NSDs, create a new FS and AFM migrate data in stages? Terribly long winded and fraught with danger and peril... do not pass go... ah, answered my own question.

Richard

_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From scottg at emailhosting.com Wed Nov 29 16:38:07 2017 From: scottg at emailhosting.com (scott) Date: Wed, 29 Nov 2017 11:38:07 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1511973304.18554.133.camel@strath.ac.uk> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID:

Question: Who at IBM is going to reach out to ESPN - a 24/7 online user - with >15 PETABYTES of content?

Asking customers to copy, reformat, copy back will just cause IBM to have to support the older version for a longer period of time.

Just my $.03 (adjusted for inflation)
From jonathan.buzzard at strath.ac.uk Wed Nov 29 16:47:27 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 29 Nov 2017 16:47:27 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: <1511974047.18554.135.camel@strath.ac.uk>

On Wed, 2017-11-29 at 11:38 -0500, scott wrote: > Question: Who at IBM is going to reach out to ESPN - a 24/7 online > user - with >15 PETABYTES of content? > > Asking customers to copy, reformat, copy back will just cause IBM to > have to support the older version for a longer period of time. > > Just my $.03 (adjusted for inflation) >

Oh you can upgrade to 5.0, it's just that if your file system was created with a previous version then you won't get to use all the new features. I would imagine if you still had a file system created under 2.3 you could mount it on 5.0. Just you would be missing a bunch of features like support for more than 2 billion files, or the ability to mount it on Windows or ...

JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From Kevin.Buterbaugh at Vanderbilt.Edu Wed Nov 29 16:51:51 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 29 Nov 2017 16:51:51 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu>

Hi All,

Well, actually a year ago we started the process of doing pretty much what Richard describes below - the exception being that we rsync'd data over to the new filesystem group by group. It was no fun but it worked. And now GPFS (and it will always be GPFS - it will never be Spectrum Scale) version 5 is coming and there are compelling reasons to want to do the same thing over again - despite the pain.

Having said all that, I think it would be interesting to have someone from IBM give an explanation of why Apple can migrate millions of devices to a new filesystem with 99.999999% of the users never even knowing they did it - but IBM can't provide a way to migrate to a new filesystem "in place."

And to be fair to IBM, they do ship AIX with root having a password and Apple doesn't, so we all have our strengths and weaknesses! ;-)

Kevin --
Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 29 November 2017 16:35 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Online data migration tool On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote: Hi, I am trying to understand the technical challenges to migrate to GPFS 5.0 from GPFS 4.3. We currently run GPFS 4.3 and i was all exited to see 5.0 release and hear about some promising features available. But not sure about complexity involved to migrate. Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more) then reformat your files system with the new disk format then restore all your data to your shiny new file system. Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Nov 29 16:55:46 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 29 Nov 2017 16:55:46 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu> Message-ID: <1511974546.18554.138.camel@strath.ac.uk> On Wed, 2017-11-29 at 16:51 +0000, Buterbaugh, Kevin L wrote: [SNIP] > And now GPFS (and it will always be GPFS ? it will never be > Spectrum Scale) Splitter, its Tiger Shark forever ;-) JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From makaplan at us.ibm.com Wed Nov 29 17:37:29 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 29 Nov 2017 12:37:29 -0500 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk><1511973304.18554.133.camel@strath.ac.uk> Message-ID: Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfosburg at mdanderson.org Wed Nov 29 17:40:51 2017 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Wed, 29 Nov 2017 17:40:51 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. From: on behalf of Marc A Kaplan Reply-To: gpfsug main discussion list Date: Wednesday, November 29, 2017 at 11:38 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 5.0 features? Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. 
This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL:

From S.J.Thompson at bham.ac.uk Wed Nov 29 17:43:11 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Wed, 29 Nov 2017 17:43:11 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID:

You can in place upgrade. I think what people are referring to is likely things like the new sub block sizing for **new** filesystems.

Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of jfosburg at mdanderson.org [jfosburg at mdanderson.org] Sent: 29 November 2017 17:40 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 5.0 features?

I haven't even heard it's been released or has been announced. I've requested a roadmap discussion.

From: on behalf of Marc A Kaplan Reply-To: gpfsug main discussion list Date: Wednesday, November 29, 2017 at 11:38 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 5.0 features?

Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published?

The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems.

From Kevin.Buterbaugh at Vanderbilt.Edu Wed Nov 29 17:50:50 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 29 Nov 2017 17:50:50 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> Message-ID: <4FB50580-B5E2-45AD-BABB-C2BE9E99012F@vanderbilt.edu>

Simon is correct - I'd love to be able to support a larger block size for my users who have sane workflows while still not wasting a ton of space for the biomedical folks... ;-)

A question - will the new, much improved, much faster mmrestripefs that was touted at SC17 require a filesystem that was created with GPFS / Tiger Shark / Spectrum Scale / Multi-media filesystem () version 5, or simply one that has been "upgraded" to that format?

Thanks...

Kevin

> On Nov 29, 2017, at 11:43 AM, Simon Thompson (IT Research Support) wrote: > You can in place upgrade.
> > I think what people are referring to is likely things like the new sub block sizing for **new** filesystems. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of jfosburg at mdanderson.org [jfosburg at mdanderson.org] > Sent: 29 November 2017 17:40 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. > > From: on behalf of Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Wednesday, November 29, 2017 at 11:38 AM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? > > > The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C755e8b13215f48e4e21508d53750ac45%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636475741979446614&sdata=RpfsLbGTRtlZQ06Winrn65jXQlDYjFHdWuKMvEyZwBI%3D&reserved=0 From knop at us.ibm.com Wed Nov 29 18:27:40 2017 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 29 Nov 2017 13:27:40 -0500 Subject: [gpfsug-discuss] 5.0 features? -- mmrestripefs -b In-Reply-To: References: <1511794616.18554.121.camel@strath.ac.uk><1511973304.18554.133.camel@strath.ac.uk> Message-ID: Kevin, The improved rebalance function (mmrestripefs -b) only depends on the cluster level being (at least) 5.0.0, and will work with older file system formats as well. This particular improvement did not require a change in the format/structure of the file system. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 (845) 433-9314 T/L 293-9314 From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 11/29/2017 12:51 PM Subject: Re: [gpfsug-discuss] 5.0 features? Sent by: gpfsug-discuss-bounces at spectrumscale.org Simon in correct ? I?d love to be able to support a larger block size for my users who have sane workflows while still not wasting a ton of space for the biomedical folks?. ;-) A question ? will the new, much improved, much faster mmrestripefs that was touted at SC17 require a filesystem that was created with GPFS / Tiger Shark / Spectrum Scale / Multi-media filesystem () version 5 or simply one that has been ?upgraded? to that format? Thanks? Kevin > On Nov 29, 2017, at 11:43 AM, Simon Thompson (IT Research Support) wrote: > > You can in place upgrade. 
> > I think what people are referring to is likely things like the new sub block sizing for **new** filesystems. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of jfosburg at mdanderson.org [jfosburg at mdanderson.org] > Sent: 29 November 2017 17:40 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > I haven?t even heard it?s been released or has been announced. I?ve requested a roadmap discussion. > > From: on behalf of Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Wednesday, November 29, 2017 at 11:38 AM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] 5.0 features? > > Which features of 5.0 require a not-in-place upgrade of a file system? Where has this information been published? > > > The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=https-3A__na01.safelinks.protection.outlook.com_-3Furl-3Dhttp-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss-26data-3D02-257C01-257CKevin.Buterbaugh-2540vanderbilt.edu-257C755e8b13215f48e4e21508d53750ac45-257Cba5a7f39e3be4ab3b45067fa80faecad-257C0-257C0-257C636475741979446614-26sdata-3DRpfsLbGTRtlZQ06Winrn65jXQlDYjFHdWuKMvEyZwBI-253D-26reserved-3D0&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=oNT2koCZX0xmWlSlLblR9Q&m=T_wlNQsuQkBDoQhdS2fe4nbIoDOo5oywJRYfJ6849M8&s=C6m8yyvkVEqEmpozrpgGHNidk4SwpbgpCWO1fvYKffA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=oNT2koCZX0xmWlSlLblR9Q&m=T_wlNQsuQkBDoQhdS2fe4nbIoDOo5oywJRYfJ6849M8&s=JFaXBwXQ8aaDrZ1mdCvsZ6siAktHtOVvZr7vqiy_Tp4&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From nikhilk at us.ibm.com Wed Nov 29 19:08:11 2017 From: nikhilk at us.ibm.com (Nikhil Khandelwal) Date: Wed, 29 Nov 2017 12:08:11 -0700 Subject: [gpfsug-discuss] Online data migration tool Message-ID: Hi, I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. 
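As a minimal sketch of that in-place path (the device name gpfs0 is illustrative, and both steps below are one-way once committed):

    # after every node in the cluster is running the new code level
    mmchconfig release=LATEST
    mmlsconfig minReleaseLevel

    # then raise the file system format version in place
    mmlsfs gpfs0 -V        # current file system format version
    mmchfs gpfs0 -V full   # use 'compat' instead if older remote clusters still mount this file system

No data is copied or moved by these steps; they only update cluster and file system metadata so that the new function can be enabled.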
That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same as it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems.

Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline.

I hope that clarifies things a little and makes the upgrade path more accessible.

Please let me know if there are any other questions or concerns.

Thank you,
Nikhil Khandelwal
Spectrum Scale Development
Client Adoption

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ulmer at ulmer.org Wed Nov 29 19:19:11 2017
From: ulmer at ulmer.org (Stephen Ulmer)
Date: Wed, 29 Nov 2017 14:19:11 -0500
Subject: [gpfsug-discuss] Online data migration tool
In-Reply-To: <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu>
References: <1511794616.18554.121.camel@strath.ac.uk> <1511973304.18554.133.camel@strath.ac.uk> <0546D23D-6D81-49C7-92E5-141078C680A8@vanderbilt.edu>
Message-ID: <49425FCD-D1CA-46FE-B1F1-98E5F464707C@ulmer.org>

About five years ago (I think) Apple slipped a volume manager[1] in on the unsuspecting. :) If you have a Mac, you might have noticed that the mount type/pattern changed with Lion. CoreStorage was the beginning of building the infrastructure to change a million(?) Macs and several hundred million iPhones and iPads under the users' noses. :)

Has anyone seen a list of the features that would require the on-disk upgrade? If there isn't one yet, I think that the biggest failing is not publishing it - the natives are restless and it's not like IBM wouldn't know...

[1] This is what Apple calls it. If you've ever used AIX or Linux you'll just chuckle when you look at the limitations.

--
Stephen

> On Nov 29, 2017, at 11:51 AM, Buterbaugh, Kevin L wrote:
>
> Hi All,
>
> Well, actually a year ago we started the process of doing pretty much what Richard describes below - the exception being that we rsync'd data over to the new filesystem group by group. It was no fun but it worked. And now GPFS (and it will always be GPFS - it will never be Spectrum Scale) version 5 is coming and there are compelling reasons to want to do the same thing over again... despite the pain.
>
> Having said all that, I think it would be interesting to have someone from IBM give an explanation of why Apple can migrate millions of devices to a new filesystem with 99.999999% of the users never even knowing they did it - but IBM can't provide a way to migrate to a new filesystem "in place."
>
> And to be fair to IBM, they do ship AIX with root having a password and Apple doesn't, so we all have our strengths and weaknesses! ;-)
>
> Kevin
> -
> Kevin Buterbaugh - Senior System Administrator
> Vanderbilt University - Advanced Computing Center for Research and Education
> Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633
>
>> On Nov 29, 2017, at 10:39 AM, Sobey, Richard A wrote:
>>
>> Could we utilise free capacity in the existing filesystem and empty NSDs, create a new FS and AFM migrate data in stages? Terribly long winded and fraught with danger and peril... do not pass go... ah, answered my own question.
>>
>> Richard
>>
>> -----Original Message-----
>> From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard
>> Sent: 29 November 2017 16:35
>> To: gpfsug main discussion list
>> Subject: Re: [gpfsug-discuss] Online data migration tool
>>
>> On Wed, 2017-11-29 at 11:00 -0500, Yugendra Guvvala wrote:
>>> Hi,
>>>
>>> I am trying to understand the technical challenges to migrate to GPFS 5.0 from GPFS 4.3. We currently run GPFS 4.3 and I was all excited to see the 5.0 release and hear about some promising features available. But I am not sure about the complexity involved in migrating.
>>>
>>
>> Oh that's simple. You copy all your data somewhere else (good luck if you happen to have a few hundred TB or maybe a PB or more), then reformat your file system with the new disk format, then restore all your data to your shiny new file system.
>>
>> Over the years there have been a number of these "reformats" to get all the new shiny features, which is the cause of the grumbles, because it is not funny and most people don't have the disk space to just hold another copy of the data, and even if they did it is extremely disruptive.
>>
>> JAB.
>>
>> --
>> Jonathan A. Buzzard Tel: +44141-5483420
>> HPC System Administrator, ARCHIE-WeSt.
>> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From ulmer at ulmer.org Wed Nov 29 19:21:00 2017
From: ulmer at ulmer.org (Stephen Ulmer)
Date: Wed, 29 Nov 2017 14:21:00 -0500
Subject: [gpfsug-discuss] Online data migration tool
In-Reply-To: 
References: 
Message-ID: 

Thank you.

--
Stephen

> On Nov 29, 2017, at 2:08 PM, Nikhil Khandelwal > wrote:
>
> Hi,
>
> I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0.
>
> That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems.
>
> Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline.
>
> I hope that clarifies things a little and makes the upgrade path more accessible.
>
> Please let me know if there are any other questions or concerns.
>
> Thank you,
> Nikhil Khandelwal
> Spectrum Scale Development
> Client Adoption
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From aaron.knister at gmail.com Wed Nov 29 22:41:48 2017
From: aaron.knister at gmail.com (Aaron Knister)
Date: Wed, 29 Nov 2017 17:41:48 -0500
Subject: [gpfsug-discuss] Online data migration tool
In-Reply-To: 
References: 
Message-ID: 

Thanks, Nikhil. Most of that was consistent with my understanding, however I was under the impression that the >32 subblocks code is required to achieve the touted 50k file creates/second that Sven has talked about a bunch of times:

http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf
http://files.gpfsug.org/presentations/2017/Ehningen/31_-_SSUG17DE_-_Sven_Oehme_-_News_from_Research.pdf
http://files.gpfsug.org/presentations/2016/SC16/12_-_Sven_Oehme_Dean_Hildebrand_-_News_from_IBM_Research.pdf

from those presentations regarding 32 subblocks:

"It has a significant performance penalty for small files in large block size filesystems"

although I'm not clear on the specific definition of "large". Many filesystems I encounter only have a 1M block size so it may not matter there, although that same presentation clearly shows the benefit of larger block sizes which is yet *another* thing for which a migration tool would be helpful.

-Aaron

On Wed, Nov 29, 2017 at 2:08 PM, Nikhil Khandelwal wrote:
> Hi,
>
> I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0.
>
> That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems.
>
> Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline.
>
> I hope that clarifies things a little and makes the upgrade path more accessible.
>
> Please let me know if there are any other questions or concerns.
> > Thank you, > Nikhil Khandelwal > Spectrum Scale Development > Client Adoption > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nikhilk at us.ibm.com Thu Nov 30 00:00:23 2017 From: nikhilk at us.ibm.com (Nikhil Khandelwal) Date: Wed, 29 Nov 2017 17:00:23 -0700 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: Hi Aaron, By large block size we are primarily talking about block sizes 4 MB and greater. You are correct, in my previous message I neglected to mention the file create performance for small files on these larger block sizes due to the subblock change. In addition to the added space efficiency, small file creation (for example 32kB files) on large block size filesystems will improve. In the case of a 1 MB block size, there would be no real difference in file creates. For a 16 MB block size, however there will be a performance improvement for small file creation as a part of the subblock change for new filesystems. For users who are upgrading from 4.X.X to 5.0.0, the file creation speed will remain the same after the upgrade. I hope that helps, sorry for the confusion. Thank you, Nikhil Khandelwal Spectrum Scale Development Client Adoption From: Aaron Knister To: gpfsug main discussion list Date: 11/29/2017 03:42 PM Subject: Re: [gpfsug-discuss] Online data migration tool Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks, Nikhil. Most of that was consistent with my understnading, however I was under the impression that the >32 subblocks code is required to achieve the touted 50k file creates/second that Sven has talked about a bunch of times: http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf http://files.gpfsug.org/presentations/2017/Ehningen/31_-_SSUG17DE_-_Sven_Oehme_-_News_from_Research.pdf http://files.gpfsug.org/presentations/2016/SC16/12_-_Sven_Oehme_Dean_Hildebrand_-_News_from_IBM_Research.pdf from those presentations regarding 32 subblocks: "It has a significant performance penalty for small files in large block size filesystems" although I'm not clear on the specific definition of "large". Many filesystems I encounter only have a 1M block size so it may not matter there, although that same presentation clearly shows the benefit of larger block sizes which is yet *another* thing for which a migration tool would be helpful. -Aaron On Wed, Nov 29, 2017 at 2:08 PM, Nikhil Khandelwal wrote: Hi, I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems. 
Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline. I hope that clarifies things a little and makes the upgrade path more accessible. Please let me know if there are any other questions or concerns. Thank you, Nikhil Khandelwal Spectrum Scale Development Client Adoption _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WUJ15T9xHCCIfLm1wqC74jhfu28fXGLotYoHQvJlMCg&m=GNrHjCLvQL1u_WHVimX2lAlYOGPzciCFrYHGlae3h_E&s=VtVgCRl7kxNRgcl5QeHdZJ0Rz6jCA-jfQXyLztbr5TY&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From abeattie at au1.ibm.com Thu Nov 30 01:55:54 2017 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Thu, 30 Nov 2017 01:55:54 +0000 Subject: [gpfsug-discuss] 5.0 features? In-Reply-To: References: , <1511794616.18554.121.camel@strath.ac.uk><1511973304.18554.133.camel@strath.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From aaron.knister at gmail.com Thu Nov 30 15:35:32 2017 From: aaron.knister at gmail.com (Aaron Knister) Date: Thu, 30 Nov 2017 10:35:32 -0500 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: References: Message-ID: Oh? I specifically remember Sven talking about the >32 subblocks on the context of file creation speed in addition to space efficiency. If what you?re saying is true, then why do those charts show that feature in the context of file creation performance and specifically mention it as a performance bottleneck? Are the slides incorrect or am I just reading them wrong? Sent from my iPhone > On Nov 30, 2017, at 10:05, Lyle Gayne wrote: > > Aaron, > that is a misunderstanding. The new feature for larger numbers of sub-blocks (varying by block size) has nothing to do with the 50K creates per second or many other performance patterns in GPFS. > > The improved create (and other metadata ops) rates came from identifying and mitigating various locking bottlenecks and optimizing the code paths specifically involved in those ops. > > Thanks > Lyle > > > Aaron Knister ---11/29/2017 05:42:26 PM---Thanks, Nikhil. Most of that was consistent with my understnading, however I was under the impressio > > From: Aaron Knister > To: gpfsug main discussion list > Date: 11/29/2017 05:42 PM > Subject: Re: [gpfsug-discuss] Online data migration tool > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > Thanks, Nikhil. 
Most of that was consistent with my understnading, however I was under the impression that the >32 subblocks code is required to achieve the touted 50k file creates/second that Sven has talked about a bunch of times: > > http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf > http://files.gpfsug.org/presentations/2017/Ehningen/31_-_SSUG17DE_-_Sven_Oehme_-_News_from_Research.pdf > http://files.gpfsug.org/presentations/2016/SC16/12_-_Sven_Oehme_Dean_Hildebrand_-_News_from_IBM_Research.pdf > > from those presentations regarding 32 subblocks: > > "It has a significant performance penalty for small files in large block size filesystems" > > although I'm not clear on the specific definition of "large". Many filesystems I encounter only have a 1M block size so it may not matter there, although that same presentation clearly shows the benefit of larger block sizes which is yet *another* thing for which a migration tool would be helpful. > > -Aaron > > > On Wed, Nov 29, 2017 at 2:08 PM, Nikhil Khandelwal wrote: > Hi, > > I would like to clarify migration path to 5.0.0 from 4.X.X clusters. For all Spectrum Scale clusters that are currently at 4.X.X, it is possible to migrate to 5.0.0 with no offline data migration and no need to move data. Once these clusters are at 5.0.0, they will benefit from the performance improvements, new features (such as file audit logging), and various enhancements that are included in 5.0.0. > > That being said, there is one enhancement that will not be applied to these clusters, and that is the increased number of sub-blocks per block for small file allocation. This means that for file systems with a large block size and a lot of small files, the overall space utilization will be the same it currently is in 4.X.X. Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems. > > Outside of that one particular function, the remainder of the performance improvements, metadata improvements, updated compatibility, new functionality, and all of the other enhancements will be immediately available to you once you complete the upgrade to 5.0.0 -- with no need to reformat, move data, or take your data offline. > > I hope that clarifies things a little and makes the upgrade path more accessible. > > Please let me know if there are any other questions or concerns. > > Thank you, > Nikhil Khandelwal > Spectrum Scale Development > Client Adoption > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=irBRNHjLNBazoPW27vuMTJGyZjdo_8yqZZNkY7RRh5I&s=8nZVi2Wp8LPbXo0Pg6ItJv6GEOk5jINHR05MY_H7a4w&e= > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From jonathan.buzzard at strath.ac.uk Thu Nov 30 16:13:30 2017
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Thu, 30 Nov 2017 16:13:30 +0000
Subject: Re: [gpfsug-discuss] Online data migration tool
In-Reply-To: 
References: 
Message-ID: <1512058410.18554.151.camel@strath.ac.uk>

On Wed, 2017-11-29 at 12:08 -0700, Nikhil Khandelwal wrote:

[SNIP]

> Since file systems created at 4.X.X and earlier used a block size that kept this allocation in mind, there should be very little impact on existing file systems.

That is quite a presumption. I would say that file systems created at 4.X.X and earlier potentially used a block size that was the best *compromise*, and the new options would work a lot better.

So for example supporting a larger block size for users who have sane workflows while still not wasting a ton of space for the biomedical folks who abuse the file system as a database.

Though I have come to the conclusion that the way to stop them using the file system as a database (no, don't do ls in that directory, there are 200,000 files and it takes minutes to come back) is to put your BOFH hat on, quota them on maximum file numbers, and suggest to them that they use a database even if it is just sticking it all in SQLite :-D

JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From valdis.kletnieks at vt.edu Thu Nov 30 16:27:39 2017
From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu)
Date: Thu, 30 Nov 2017 11:27:39 -0500
Subject: [gpfsug-discuss] mmauth/mmremotecluster wonkyness?
Message-ID: <20014.1512059259@turing-police.cc.vt.edu>

We have a 10-node cluster running gpfs 4.2.2.3, where 8 nodes are GPFS contact nodes for 2 filesystems, and 2 are protocol nodes doing NFS exports of the filesystems. But we see some nodes in remote clusters trying to GPFS connect to the 2 protocol nodes anyhow.

My reading of the manpages is that the remote cluster is responsible for setting '-n contactNodes' when they do the 'mmremotecluster add', and there's no way to sanity check or enforce that at the local end, and fail/flag connections to unintended non-contact nodes if the remote admin forgets/botches the -n.

Is that actually correct? If so, is it time for an RFE?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 486 bytes
Desc: not available
URL: 

From S.J.Thompson at bham.ac.uk Thu Nov 30 16:31:48 2017
From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support))
Date: Thu, 30 Nov 2017 16:31:48 +0000
Subject: Re: [gpfsug-discuss] mmauth/mmremotecluster wonkyness?
In-Reply-To: <20014.1512059259@turing-police.cc.vt.edu>
References: <20014.1512059259@turing-police.cc.vt.edu>
Message-ID: 

Um no, you are talking GPFS protocol between cluster nodes still in multicluster. Contact nodes are where the remote cluster goes to start with, but after that it's just normal node to node gpfs traffic (not just the contact nodes). At least that is my understanding.

If you want traffic separation, you need something like AFM.

Simon
________________________________________
From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of valdis.kletnieks at vt.edu [valdis.kletnieks at vt.edu]
Sent: 30 November 2017 16:27
To: gpfsug-discuss at spectrumscale.org
Subject: [gpfsug-discuss] mmauth/mmremotecluster wonkyness?
We have a 10-node cluster running gpfs 4.2.2.3, where 8 nodes are GPFS contact nodes for 2 filesystems, and 2 are protocol nodes doing NFS exports of the filesystems. But we see some nodes in remote clusters trying to GPFS connect to the 2 protocol nodes anyhow.

My reading of the manpages is that the remote cluster is responsible for setting '-n contactNodes' when they do the 'mmremotecluster add', and there's no way to sanity check or enforce that at the local end, and fail/flag connections to unintended non-contact nodes if the remote admin forgets/botches the -n.

Is that actually correct? If so, is it time for an RFE?

From aaron.s.knister at nasa.gov Thu Nov 30 16:35:04 2017
From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP])
Date: Thu, 30 Nov 2017 16:35:04 +0000
Subject: Re: [gpfsug-discuss] mmauth/mmremotecluster wonkyness?
In-Reply-To: <20014.1512059259@turing-police.cc.vt.edu>
References: <20014.1512059259@turing-police.cc.vt.edu>
Message-ID: 

It's my understanding and experience that all member nodes of two clusters that are multi-clustered must be able to (and will eventually, given enough time/activity) make connections to any and all nodes in both clusters. Even if you don't designate the 2 protocol nodes as contact nodes I would expect to see connections from remote clusters to the protocol nodes just because of the nature of the beast.

If you don't want remote nodes to make connections to the protocol nodes then I believe you would need to put the protocol nodes in their own cluster. CES/CNFS hasn't always supported this but I think it is now supported, at least with NFS.

On November 30, 2017 at 11:28:03 EST, valdis.kletnieks at vt.edu wrote:
We have a 10-node cluster running gpfs 4.2.2.3, where 8 nodes are GPFS contact nodes for 2 filesystems, and 2 are protocol nodes doing NFS exports of the filesystems. But we see some nodes in remote clusters trying to GPFS connect to the 2 protocol nodes anyhow.

My reading of the manpages is that the remote cluster is responsible for setting '-n contactNodes' when they do the 'mmremotecluster add', and there's no way to sanity check or enforce that at the local end, and fail/flag connections to unintended non-contact nodes if the remote admin forgets/botches the -n.

Is that actually correct? If so, is it time for an RFE?
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nikhilk at us.ibm.com Thu Nov 30 17:00:08 2017
From: nikhilk at us.ibm.com (Nikhil Khandelwal)
Date: Thu, 30 Nov 2017 10:00:08 -0700
Subject: Re: [gpfsug-discuss] Online data migration tool
In-Reply-To: 
References: 
Message-ID: 

That is fair, there certainly are compromises that have to be made with regards to file space/size/performance when choosing a block size, especially with varied workloads or users who may create 200,000 files at a time :). With an increased number of subblocks, the compromises and parameters going into this choice change. However, I just didn't want to lose sight of the fact that the remainder of the 5.0.0 features and enhancements (and there are a lot :-) ) are available to all systems, with no need to go through painful data movement or recreating of filesystems.
Thanks, Nikhil Khandelwal Spectrum Scale Development Client Adoption From: Jonathan Buzzard To: gpfsug main discussion list Date: 11/30/2017 09:13 AM Subject: Re: [gpfsug-discuss] Online data migration tool Sent by: gpfsug-discuss-bounces at spectrumscale.org On Wed, 2017-11-29 at 12:08 -0700, Nikhil Khandelwal wrote: [SNIP] > Since file systems created at 4.X.X and earlier used a block size > that kept this allocation in mind, there should be very little impact > on existing file systems. That is quite a presumption. I would say that file systems created at 4.X.X and earlier potentially used a block size that was the best *compromise*, and the new options would work a lot better. So for example supporting a larger block size for users who have sane workflows while still not wasting a ton of space for the biomedical folks who abuse the file system as a database. Though I have come to the conclusion to stop them using the file system as a database (no don't do ls in that directory there is 200,000 files and takes minutes to come back) is to put your BOFH hat on quota them on maximum file numbers and suggest to them that they use a database even if it is just sticking it all in SQLite :-D JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=WUJ15T9xHCCIfLm1wqC74jhfu28fXGLotYoHQvJlMCg&m=RrwCj4KWyu_ykACVG1SYu8EJiDZnH6edu-2rnoalOg4&s=p7xlojuTYL5csXYA94NyL-R5hk7OgLH0qKGTN0peGFk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From skylar2 at u.washington.edu Thu Nov 30 18:01:48 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Thu, 30 Nov 2017 18:01:48 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <1512058410.18554.151.camel@strath.ac.uk> References: <1512058410.18554.151.camel@strath.ac.uk> Message-ID: <20171130180148.jlarxyjgoc4mvre3@utumno.gs.washington.edu> On Thu, Nov 30, 2017 at 04:13:30PM +0000, Jonathan Buzzard wrote: > On Wed, 2017-11-29 at 12:08 -0700, Nikhil Khandelwal wrote: > > [SNIP] > > > Since file systems created at 4.X.X and earlier used a block size > > that kept this allocation in mind, there should be very little impact > > on existing file systems. > > That is quite a presumption. I would say that file systems created at > 4.X.X and earlier potentially used a block size that was the best > *compromise*, and the new options would work a lot better. > > So for example supporting a larger block size for users who have sane > workflows while still not wasting a ton of space for the biomedical > folks who abuse the file system as a database. 
> > Though I have come to the conclusion to stop them using the file system > as a database (no don't do ls in that directory there is 200,000 files > and takes minutes to come back) is to put your BOFH hat on quota them > on maximum file numbers and suggest to them that they use a database > even if it is just sticking it all in SQLite :-D To be fair, a lot of our biomedical/informatics folks have no choice in the matter because the vendors are imposing a filesystem-as-a-database paradigm on them. Each of our Illumina sequencers, for instance, generates a few million files per run, many of which are images containing raw data from the sequencers that are used to justify refunds for defective reagents. Sure, we could turn them off, but then we're eating $$$ we could be getting back from the vendor. At least SSD prices have come down far enough that we can put our metadata on fast disks now, even if we can't take advantage of the more efficient small file allocation yet. -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From makaplan at us.ibm.com Thu Nov 30 18:34:05 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 30 Nov 2017 13:34:05 -0500 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: It would be interesting to know how well Spectrum Scale large directory and small file features work in these sort of DB-ish applications. You might want to optimize by creating a file system provisioned and tuned for such application... Regardless of file system, `ls -1 | grep ...` in a huge directory is not going to be a good idea. But stats and/or opens on a huge directory to look for a particular file should work pretty well... -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at u.washington.edu Thu Nov 30 18:41:52 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Thu, 30 Nov 2017 18:41:52 +0000 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: <20171130184152.ivvduyzjlp7etys2@utumno.gs.washington.edu> On Thu, Nov 30, 2017 at 01:34:05PM -0500, Marc A Kaplan wrote: > It would be interesting to know how well Spectrum Scale large directory > and small file features work in these sort of DB-ish applications. > > You might want to optimize by creating a file system provisioned and tuned > for such application... > > Regardless of file system, `ls -1 | grep ...` in a huge directory is not > going to be a good idea. But stats and/or opens on a huge directory to > look for a particular file should work pretty well... I've wondered if it would be worthwhile having POSIX look-alike commands like ls and find that plug into the GPFS API rather than making VFS calls. That's of course a project for my Copious Free Time... -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From makaplan at us.ibm.com Thu Nov 30 20:52:09 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 30 Nov 2017 15:52:09 -0500 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: Generally the GPFS API will give you access to some information and functionality that are not available via the Posix API. 
But I don't think you'll find significant performance difference in cases where there is functional overlap. Going either way (Posix or GPFS-specific) - for each API call the execution path drops into the kernel - and then if required - an inter-process call to the mmfsd daemon process. From: Skylar Thompson To: gpfsug-discuss at spectrumscale.org Date: 11/30/2017 01:42 PM Subject: Re: [gpfsug-discuss] FIle system vs Database Sent by: gpfsug-discuss-bounces at spectrumscale.org On Thu, Nov 30, 2017 at 01:34:05PM -0500, Marc A Kaplan wrote: > It would be interesting to know how well Spectrum Scale large directory > and small file features work in these sort of DB-ish applications. > > You might want to optimize by creating a file system provisioned and tuned > for such application... > > Regardless of file system, `ls -1 | grep ...` in a huge directory is not > going to be a good idea. But stats and/or opens on a huge directory to > look for a particular file should work pretty well... I've wondered if it would be worthwhile having POSIX look-alike commands like ls and find that plug into the GPFS API rather than making VFS calls. That's of course a project for my Copious Free Time... -- -- Skylar Thompson (skylar2 at u.washington.edu) -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at u.washington.edu Thu Nov 30 21:42:21 2017 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Thu, 30 Nov 2017 21:42:21 +0000 Subject: [gpfsug-discuss] FIle system vs Database In-Reply-To: References: Message-ID: <20171130214220.pqtizt2q6ysu6cds@utumno.gs.washington.edu> Interesting, thanks for the information Marc. Could there be an improvement for something like "ls -l some-dir" using the API, though? Instead of getdents + stat for every file (entering and leaving kernel mode many times), could it be done in one operation with one context switch? On Thu, Nov 30, 2017 at 03:52:09PM -0500, Marc A Kaplan wrote: > Generally the GPFS API will give you access to some information and > functionality that are not available via the Posix API. > > But I don't think you'll find significant performance difference in cases > where there is functional overlap. > > Going either way (Posix or GPFS-specific) - for each API call the > execution path drops into the kernel - and then if required - an > inter-process call to the mmfsd daemon process. -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From jonathan.buzzard at strath.ac.uk Thu Nov 30 22:02:35 2017 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 30 Nov 2017 22:02:35 +0000 Subject: [gpfsug-discuss] Online data migration tool In-Reply-To: <20171130180148.jlarxyjgoc4mvre3@utumno.gs.washington.edu> References: <1512058410.18554.151.camel@strath.ac.uk> <20171130180148.jlarxyjgoc4mvre3@utumno.gs.washington.edu> Message-ID: <17e108bf-67af-78af-3e2d-e4a4b99c178d@strath.ac.uk> On 30/11/17 18:01, Skylar Thompson wrote: [SNIP] > To be fair, a lot of our biomedical/informatics folks have no choice in the > matter because the vendors are imposing a filesystem-as-a-database paradigm > on them. Each of our Illumina sequencers, for instance, generates a few > million files per run, many of which are images containing raw data from > the sequencers that are used to justify refunds for defective reagents. 
> Sure, we could turn them off, but then we're eating $$$ we could be getting back from the vendor.
>

Been there too. What worked was having a find script that ran through their files, found directories that had not been accessed for a week and zipped them all up, before nuking the original files.

The other thing I would suggest is that if they want to buy sequencers from vendors who are brain dead, then that's fine, but they are going to have to pay extra for the storage because they are costing way more than the average to store their files. Far too much buying of kit goes on without any thought of the consequences of how to deal with the data it generates.

Then there was the proteomics bunch who basically just needed a good thrashing with a very large clue stick, because the zillions of files were the result of their own Perl scripts.

JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
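For anyone who wants a concrete starting point, here is a minimal sketch of the kind of find-and-archive cron job Jonathan describes. The top-level path, the seven-day threshold, and the choice of tar/gzip are illustrative assumptions, not details from the thread, so adjust them for your own filesystem:

#!/bin/bash
# Sketch: compress project sub-directories whose files have not been
# accessed for AGE_DAYS days, then remove the originals.
# TOP and AGE_DAYS are assumptions for illustration only.
TOP=/gpfs/projects
AGE_DAYS=7

# -print0 / read -d '' so odd characters in names (spaces, newlines,
# backticks, UTF-8) do not break the loop.
find "$TOP" -mindepth 1 -maxdepth 1 -type d -print0 |
while IFS= read -r -d '' dir; do
    # Skip this directory if any file inside it was accessed recently.
    if find "$dir" -type f -atime -"$AGE_DAYS" -print -quit | grep -q .; then
        continue
    fi
    # Create the archive next to the directory, and only nuke the
    # original if tar reported success.
    tar -czf "${dir}.tar.gz" -C "$TOP" "$(basename "$dir")" \
        && rm -rf -- "$dir"
done

On a filesystem with many millions of files, generating the candidate list with an mmapplypolicy LIST rule will be far quicker than letting find crawl the directory tree, and per-fileset inode quotas (see the mmsetquota man page) are the usual way to cap how many files a group can create; both are left out here to keep the sketch short.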