From michael.meier at fau.de Tue Feb 1 11:20:37 2022 From: michael.meier at fau.de (Michael Meier) Date: Tue, 1 Feb 2022 12:20:37 +0100 Subject: [gpfsug-discuss] Spectrum Scale and vfs_fruit Message-ID: Hi, A bunch of security updates for Samba were released yesterday, most importantly among them CVE-2021-44142 (https://www.samba.org/samba/security/CVE-2021-44142.html) in the vfs_fruit VFS-module that adds extended support for Apple Clients. Spectrum Scale supports that, so Spectrum Scale might be affected, and I'm trying to find out if we're affected or not. Now we never enabled this via "mmsmb config change --vfs-fruit-enable", and I would expect this to be disabled by default - however, I cannot find an explicit statement like "by default this is disabled" in https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=services-support-vfs-fruit-smb-protocol Am I correct in assuming that it is indeed disabled by default? And how would I verify that? Am I correct in assuming that _if_ it was enabled, then 'fruit' would show up under the 'vfs objects' in 'mmsmb config list'? Regards, -- Michael Meier, HPC Services Friedrich-Alexander-Universitaet Erlangen-Nuernberg Regionales Rechenzentrum Erlangen Martensstrasse 1, 91058 Erlangen, Germany Tel.: +49 9131 85-20994, Fax: +49 9131 302941 michael.meier at fau.de hpc.fau.de From p.ward at nhm.ac.uk Tue Feb 1 12:28:09 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Tue, 1 Feb 2022 12:28:09 +0000 Subject: [gpfsug-discuss] mmbackup file selections In-Reply-To: <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> References: <20220124153631.oxu4ytbq4vqcotr3@utumno.gs.washington.edu> <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> Message-ID: Not currently set. I'll look into them. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Skylar Thompson Sent: 26 January 2022 16:50 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmbackup file selections Awesome, glad that you found them (I missed them the first time too). As for the anomalous changed files, do you have these options set in your client option file? skipacl yes skipaclupdatecheck yes updatectime yes We had similar problems where metadata and ACL updates were interpreted as data changes by mmbackup/dsmc. We also have a case open with IBM where mmbackup will both expire and backup a file in the same run, even in the absence of mtime changes, but it's unclear whether that's program error or something with our include/exclude rules. I'd be curious if you're running into that as well. On Wed, Jan 26, 2022 at 03:55:48PM +0000, Paul Ward wrote: > Good call! > > Yes they are dot files. > > > New issue. > > Mmbackup seems to be backup up the same files over and over without them changing: > areas are being backed up multiple times. > The example below is a co-resident file, the only thing that has changed since it was created 20/10/21, is the file has been accessed for backup. > This file is in the 'changed' list in mmbackup: > > This list has just been created: > -rw-r--r--. 
1 root root 6591914 Jan 26 11:12 > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > Listing the last few files in the file (selecting the last one) > 11:17:52 [root at scale-sk-pn-1 .mmbackupCfg]# tail > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604556977.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557039.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557102.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557164.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557226.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557288.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557351.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557413.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557476.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > Check the file stats (access time just before last backup) > 11:18:05 [root at scale-sk-pn-1 .mmbackupCfg]# stat "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > File: '/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png' > Size: 545 Blocks: 32 IO Block: 4194304 regular file > Device: 2bh/43d Inode: 212618897 Links: 1 > Access: (0644/-rw-r--r--) Uid: (1399613896/NHM\edwab) Gid: (1399647564/NHM\dg-mbl-urban-nature-project-rw) > Context: unconfined_u:object_r:unlabeled_t:s0 > Access: 2022-01-25 06:40:58.334961446 +0000 > Modify: 2020-12-01 15:20:40.122053000 +0000 > Change: 2021-10-20 17:55:18.265746459 +0100 > Birth: - > > Check if migrated > 11:18:16 [root at scale-sk-pn-1 .mmbackupCfg]# dsmls "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > File name : /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > On-line size : 545 > Used blocks : 16 > Data Version : 1 > Meta Version : 1 > State : Co-resident > Container Index : 1 > Base Name : 34C0B77D20194B0B.EACEB2055F6CAA58.56D56C5F140C8C9D.0000000000000000.2197396D.000000000CAC4E91 > > Check if immutable > 11:18:26 [root at scale-sk-pn-1 .mmbackupCfg]# mstat "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > file name: /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > metadata replication: 2 max 2 > data replication: 2 max 2 > immutable: no > appendOnly: no > flags: > storage pool name: data > fileset name: hpc-workspaces-fset > snapshot name: > creation time: Wed Oct 20 17:55:18 2021 > Misc attributes: ARCHIVE > 
Encrypted: no > > Check active and inactive backups (it was backed up yesterday) > 11:18:52 [root at scale-sk-pn-1 .mmbackupCfg]# dsmcqbi "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 11:19:02 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. > > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 11:19:02 Last access: 01/26/2022 > 11:07:05 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/25/2022 06:41:17 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > It will be backed up again shortly, why? > > And it was backed up again: > # dsmcqbi > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature- > project/audiowaveform/300_40/unp-grounds-01-1604557538.png > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 15:54:09 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. > > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 15:54:10 Last access: 01/26/2022 > 15:30:03 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/26/2022 12:23:02 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/25/2022 06:41:17 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Skylar > Thompson > Sent: 24 January 2022 15:37 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Hi Paul, > > Did you look for dot files? 
At least for us on 5.0.5 there's a .list.1. file while the backups are running: > > /gpfs/grc6/.mmbackupCfg/updatedFiles/: > -r-------- 1 root nickers 6158526821 Jan 23 18:28 .list.1.gpfs-grc6 > /gpfs/grc6/.mmbackupCfg/expiredFiles/: > -r-------- 1 root nickers 85862211 Jan 23 18:28 .list.1.gpfs-grc6 > > On Mon, Jan 24, 2022 at 02:31:54PM +0000, Paul Ward wrote: > > Those directories are empty > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: gpfsug-discuss-bounces at spectrumscale.org > > On Behalf Of IBM Spectrum > > Scale > > Sent: 22 January 2022 00:35 > > To: gpfsug main discussion list > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > Hi Paul, > > > > Instead of calculating *.ix.* files, please look at a list file in these directories. > > > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked.]"Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked. > > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/21/2022 09:38 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > Thank you Right in the command line seems to have worked. At the end > > of the script I now copy the contents of the .mmbackupCfg folder to > > a date stamped logging folder Checking how many entries in these files compared to the Summary: ???????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you > > > > Right in the command line seems to have worked. > > At the end of the script I now copy the contents of the .mmbackupCfg > > folder to a date stamped logging folder > > > > Checking how many entries in these files compared to the Summary: > > wc -l mmbackup* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 754 total > > From Summary > > Total number of objects inspected: 755 > > I can live with a discrepancy of 1. 
> > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > From Summary > > Total number of objects expired: 2 > > That matches > > > > wc -l mmbackupC* mmbackupS* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 752 total > > Summary: > > Total number of objects backed up: 751 > > > > A difference of 1 I can live with. > > > > What does Statech stand for? > > > > Just this to sort out: > > Total number of objects failed: 1 > > I will add: > > --tsm-errorlog TSMErrorLogFile > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 19 January 2022 15:09 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > This is to set environment for mmbackup. > > If mmbackup is invoked within a script, you can set "export DEBUGmmbackup=2" right above mmbackup command. > > e.g) in your script > > .... > > export DEBUGmmbackup=2 > > mmbackup .... > > > > Or, you can set it in the same command line like > > DEBUGmmbackup=2 mmbackup .... > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > [Inactive hide details for "Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to se]"Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to see if they are the cluster manager. > > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/19/2022 06:04 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > Thank you. We run a script on all our nodes that checks to see if > > they are the cluster manager. If they are, then they take > > responsibility to start the backup script. The script then randomly selects one of the available backup nodes and uses ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you. > > > > We run a script on all our nodes that checks to see if they are the cluster manager. > > If they are, then they take responsibility to start the backup script. > > The script then randomly selects one of the available backup nodes and uses dsmsh mmbackup on it. > > > > Where does this command belong? 
> > I have seen it listed as a export command, again where should that be run ? on all backup nodes, or all nodes? > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 18 January 2022 22:54 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > Hi Paul, > > > > If you run mmbackup with "DEBUGmmbackup=2", it keeps all working files even after successful backup. They are available at MMBACKUP_RECORD_ROOT (default is FSroot or FilesetRoot directory). > > In .mmbackupCfg directory, there are 3 directories: > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > [Inactive hide details for "Paul Ward" ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have been sent to back]"Paul Ward" ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have been sent to backup using mmbackup. > > > > From: "Paul Ward" > > > To: > > "gpfsug-discuss at spectrumscale.org > org>" > > > org>> > > Date: 01/18/2022 11:56 AM > > Subject: [EXTERNAL] [gpfsug-discuss] mmbackup file selections Sent by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > > > Hi, I am trying to work out what files have been sent to backup > > using mmbackup. I have increased the -L value from 3 up to 6 but > > only seem to see the files that are in scope, not the ones that are selected. I can see the three file lists generated ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Hi, > > > > I am trying to work out what files have been sent to backup using mmbackup. > > I have increased the -L value from 3 up to 6 but only seem to see the files that are in scope, not the ones that are selected. > > > > I can see the three file lists generated during a backup, but can?t seem to find a list of what files were backed up. > > > > It should be the diff of the shadow and shadow-old, but the wc -l of the diff doesn?t match the number of files in the backup summary. > > Wrong assumption? > > > > Where should I be looking ? surely it shouldn?t be this hard to see what files are selected? 
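Tying the advice in this thread together (DEBUGmmbackup=2 exported in the same shell as the mmbackup call, then keeping the .mmbackupCfg candidate lists), a wrapper of roughly this shape is one way to record what each run selected. It is a sketch only: the filesystem root, the log location and the mmbackup options shown are placeholders to adapt.

#!/bin/bash
# run mmbackup so its .mmbackupCfg working files survive the run, then archive them
FSROOT=/gpfs/fs0                                      # placeholder: filesystem (or fileset) root being backed up
LOGDIR=/var/log/mmbackup-lists/$(date +%Y%m%d-%H%M)   # placeholder: date-stamped logging folder
mkdir -p "$LOGDIR"
export DEBUGmmbackup=2                                # keep the working files even after a successful run
/usr/lpp/mmfs/bin/mmbackup "$FSROOT" -t incremental --tsm-errorlog "$LOGDIR/tsm-error.log"
# the .list.1.* files in these directories are the actual backup/update/expire selections
for d in updatedFiles statechFiles expiredFiles; do
    [ -d "$FSROOT/.mmbackupCfg/$d" ] && cp -a "$FSROOT/.mmbackupCfg/$d" "$LOGDIR/"
done
wc -l "$LOGDIR"/*/.list.1.* 2>/dev/null               # compare these counts with the mmbackup summary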
> > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From dehaan at us.ibm.com Tue Feb 1 16:14:07 2022 From: dehaan at us.ibm.com (David DeHaan) Date: Tue, 1 Feb 2022 09:14:07 -0700 Subject: [gpfsug-discuss] Spectrum Scale and vfs_fruit In-Reply-To: References: Message-ID: Yes, it is disabled by default. And yes, you can tell if it has been enabled by looking at the smb config list.
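If you want to script that check on a CES protocol node, something along these lines should work (a sketch only; it assumes "mmsmb config list" prints the "vfs objects" line in the format shown just below):

# check whether the fruit module is loaded into the SMB vfs objects chain
if /usr/lpp/mmfs/bin/mmsmb config list | grep -i 'vfs objects' | grep -qw fruit; then
    echo "vfs_fruit appears to be ENABLED"
else
    echo "vfs_fruit not present in vfs objects (the default, disabled, state)"
fi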
This is what a non-fruit vfs-object line looks like vfs objects = shadow_copy2 syncops gpfs fileid time_audit This is one that has been "fruitified" vfs objects = shadow_copy2 syncops fruit streams_xattr gpfs fileid time_audit *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* David DeHaan Spectrum Scale Test *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From: "Michael Meier" To: gpfsug-discuss at spectrumscale.org Date: 02/01/2022 04:26 AM Subject: [EXTERNAL] [gpfsug-discuss] Spectrum Scale and vfs_fruit Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, A bunch of security updates for Samba were released yesterday, most importantly among them CVE-2021-44142 ( https://www.samba.org/samba/security/CVE-2021-44142.html ) in the vfs_fruit VFS-module that adds extended support for Apple Clients. Spectrum Scale supports that, so Spectrum Scale might be affected, and I'm trying to find out if we're affected or not. Now we never enabled this via "mmsmb config change --vfs-fruit-enable", and I would expect this to be disabled by default - however, I cannot find an explicit statement like "by default this is disabled" in https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=services-support-vfs-fruit-smb-protocol Am I correct in assuming that it is indeed disabled by default? And how would I verify that? Am I correct in assuming that _if_ it was enabled, then 'fruit' would show up under the 'vfs objects' in 'mmsmb config list'? Regards, -- Michael Meier, HPC Services Friedrich-Alexander-Universitaet Erlangen-Nuernberg Regionales Rechenzentrum Erlangen Martensstrasse 1, 91058 Erlangen, Germany Tel.: +49 9131 85-20994, Fax: +49 9131 302941 michael.meier at fau.de hpc.fau.de _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From ivano.talamo at psi.ch Wed Feb 2 09:07:13 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 09:07:13 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce Message-ID: Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From abeattie at au1.ibm.com Wed Feb 2 09:33:25 2022 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Wed, 2 Feb 2022 09:33:25 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: Message-ID: Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com > On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote: > > ? > This Message Is From an External Sender > This message came from outside your organization. > Dear all, > > Since a while we are experiencing an issue when dealing with snapshots. > Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). > > The clients and the storage are on two different clusters, using remote cluster mount for the access. > > On the log files many lines like the following appear (on both clusters): > Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec > > By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. > > If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. > > Thanks, > Ivano -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at hpe.com Wed Feb 2 10:07:25 2022 From: daniel.kidger at hpe.com (Kidger, Daniel) Date: Wed, 2 Feb 2022 10:07:25 +0000 Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? Message-ID: Hi all, Since the subject of snapshots has come up, I also have a question ... Snapshots can be created from the command line with mmcrsnapshot, and hence can be automated via con jobs etc. Snapshots can also be created from the Scale GUI. The GUI also provides its own automation for the creation, retention, and deletion of snapshots. My question is: do most customers use the former or the latter for automation? (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do exactly the same as what the GUI does it terms of creating automated snapshots. It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. How many customers also use the commands found in /usr/lpp/mmfs/gui/cli/ ? ) Daniel Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:548be828-dcc2-4a88-ac2e-ff5106b3f802] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Outlook-iity4nk4 Type: application/octet-stream Size: 2541 bytes Desc: Outlook-iity4nk4 URL: From ivano.talamo at psi.ch Wed Feb 2 10:45:26 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 10:45:26 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: , Message-ID: <4326cfae883b4378bcb284b6daecb05e@psi.ch> Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Andrew Beattie Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano -------------- next part -------------- An HTML attachment was scrubbed... URL: From sthompson2 at lenovo.com Wed Feb 2 10:52:27 2022 From: sthompson2 at lenovo.com (Simon Thompson2) Date: Wed, 2 Feb 2022 10:52:27 +0000 Subject: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: References: Message-ID: I always used the GUI for automating snapshots that were tagged with the YYMMDD format so that they were accessible via the previous versions tab from CES access. This requires no locking if you have multiple GUI servers running, so in theory the snapshots creation is "HA". 
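For anyone who would rather drive the same thing from cron, a minimal sketch is below (the filesystem and fileset names are placeholders; the @GMT-style snapshot name is the format the previous-versions view expects, and retention would need a matching mmdelsnapshot loop):

#!/bin/bash
# create snapshots for several filesets in a single mmcrsnapshot call,
# so there is only one quiesce, named so SMB "previous versions" can list them
FS=gpfs0                        # placeholder filesystem name
FILESETS="projects homes"       # placeholder fileset names
SNAP=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S)
PAIRS=$(for f in $FILESETS; do printf '%s:%s,' "$f" "$SNAP"; done)
/usr/lpp/mmfs/bin/mmcrsnapshot "$FS" "${PAIRS%,}"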
BUT if you shutdown the GUI servers (say you are waiting for a log4j patch ...) then you have no snapshot automation. Due to the way we structured independent filesets, this could be 50 or so to automate and we wanted to set a say 4 day retention policy. So clicking in the GUI was pretty simple to do this for. What we did found is it a snapshot failed to delete for some reason (quiesce etc), then the GUI never tried again to clean it up so we have monitoring to look for unexpected snapshots that needed cleaning up. Simon ________________________________ Simon Thompson He/Him/His Senior Storage Performance WW HPC Customer Solutions Lenovo UK [Phone]+44 7788 320635 [Email]sthompson2 at lenovo.com Lenovo.com Twitter | Instagram | Facebook | Linkedin | YouTube | Privacy [cid:image003.png at 01D81822.F63BAB90] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Kidger, Daniel Sent: 02 February 2022 10:07 To: gpfsug-discuss at spectrumscale.org Subject: [External] [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? Hi all, Since the subject of snapshots has come up, I also have a question ... Snapshots can be created from the command line with mmcrsnapshot, and hence can be automated via con jobs etc. Snapshots can also be created from the Scale GUI. The GUI also provides its own automation for the creation, retention, and deletion of snapshots. My question is: do most customers use the former or the latter for automation? (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do exactly the same as what the GUI does it terms of creating automated snapshots. It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. How many customers also use the commands found in /usr/lpp/mmfs/gui/cli/ ? ) Daniel Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:image004.png at 01D81822.F63BAB90] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 92 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 128 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.png Type: image/png Size: 20109 bytes Desc: image003.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.png Type: image/png Size: 2541 bytes Desc: image004.png URL: From jordi.caubet at es.ibm.com Wed Feb 2 11:07:37 2022 From: jordi.caubet at es.ibm.com (Jordi Caubet Serrabou) Date: Wed, 2 Feb 2022 11:07:37 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: <4326cfae883b4378bcb284b6daecb05e@psi.ch> References: <4326cfae883b4378bcb284b6daecb05e@psi.ch>, , Message-ID: An HTML attachment was scrubbed... URL: From janfrode at tanso.net Wed Feb 2 11:53:50 2022 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Wed, 2 Feb 2022 12:53:50 +0100 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> Message-ID: Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. 
do: snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,fileset2:$snapname,fileset3:$snapname instead of: mmcrsnapshot gpfs0 fileset1:$snapname mmcrsnapshot gpfs0 fileset2:$snapname mmcrsnapshot gpfs0 fileset3:$snapname -jf On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou < jordi.caubet at es.ibm.com> wrote: > Ivano, > > if it happens frequently, I would recommend to open a support case. > > The creation or deletion of a snapshot requires a quiesce of the nodes to > obtain a consistent point-in-time image of the file system and/or update > some internal structures afaik. Quiesce is required for nodes at the > storage cluster but also remote clusters. Quiesce means stop activities > (incl. I/O) for a short period of time to get such consistent image. Also > waiting to flush any data in-flight to disk that does not allow a > consistent point-in-time image. > > Nodes receive a quiesce request and acknowledge when ready. When all nodes > acknowledge, snapshot operation can proceed and immediately I/O can resume. > It usually takes few seconds at most and the operation performed is short > but time I/O is stopped depends of how long it takes to quiesce the nodes. > If some node take longer to agree stop the activities, such node will > be delay the completion of the quiesce and keep I/O paused on the rest. > There could many things while some nodes delay quiesce ack. > > The larger the cluster, the more difficult it gets. The more network > congestion or I/O load, the more difficult it gets. I recommend to open a > ticket for support to try to identify the root cause of which nodes not > acknowledge the quiesce and maybe find the root cause. If I recall some > previous thread, default timeout was 60 seconds which match your log > message. After such timeout, snapshot is considered failed to complete. > > Support might help you understand the root cause and provide some > recommendations if it happens frequently. > > Best Regards, > -- > Jordi Caubet Serrabou > IBM Storage Client Technical Specialist (IBM Spain) > > > ----- Original message ----- > From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: "gpfsug main discussion list" > Cc: > Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem > quiesce > Date: Wed, Feb 2, 2022 11:45 AM > > > Hello Andrew, > > > > Thanks for your questions. > > > > We're not experiencing any other issue/slowness during normal activity. > > The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool > for metadata only. > > > > The two NSD servers have 750GB of RAM and 618 are configured as pagepool.
> > > > The issue we see is happening on both the two filesystems we have: > > > > - perf filesystem: > > - 1.8 PB size (71% in use) > > - 570 milions of inodes (24% in use) > > > > - tiered filesystem: > > - 400 TB size (34% in use) > > - 230 Milions of files (60% in use) > > > > Cheers, > > Ivano > > > > > > > __________________________________________ > Paul Scherrer Institut > Ivano Talamo > WHGA/038 > Forschungsstrasse 111 > 5232 Villigen PSI > Schweiz > > Telefon: +41 56 310 47 11 > E-Mail: ivano.talamo at psi.ch > > > > > ------------------------------ > *From:* gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> on behalf of Andrew Beattie < > abeattie at au1.ibm.com> > *Sent:* Wednesday, February 2, 2022 10:33 AM > *To:* gpfsug main discussion list > *Subject:* Re: [gpfsug-discuss] snapshots causing filesystem quiesce > > Ivano, > > How big is the filesystem in terms of number of files? > How big is the filesystem in terms of capacity? > Is the Metadata on Flash or Spinning disk? > Do you see issues when users do an LS of the filesystem or only when you > are doing snapshots. > > How much memory do the NSD servers have? > How much is allocated to the OS / Spectrum > Scale Pagepool > > Regards > > Andrew Beattie > Technical Specialist - Storage for Big Data & AI > IBM Technology Group > IBM Australia & New Zealand > P. +61 421 337 927 > E. abeattie at au1.IBM.com > > > > > On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: > > > ? > > > Dear all, > > Since a while we are experiencing an issue when dealing with snapshots. > Basically what happens is that when deleting a fileset snapshot (and maybe > also when creating new ones) the filesystem becomes inaccessible on the > clients for the duration of the operation (can take a few minutes). > > The clients and the storage are on two different clusters, using remote > cluster mount for the access. > > On the log files many lines like the following appear (on both clusters): > Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 > msec > > By looking around I see we're not the first one. I am wondering if that's > considered an unavoidable part of the snapshotting and if there's any > tunable that can improve the situation. Since when this occurs all the > clients are stuck and users are very quick to complain. > > If it can help, the clients are running GPFS 5.1.2-1 while the storage > cluster is on 5.1.1-0. > > Thanks, > Ivano > > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: > > International Business Machines, S.A. > > Santa Hortensia, 26-28, 28002 Madrid > > Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 > > CIF A28-010791 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Feb 2 12:09:24 2022 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 2 Feb 2022 12:09:24 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: , <4326cfae883b4378bcb284b6daecb05e@psi.ch> Message-ID: An HTML attachment was scrubbed... 
URL: From daniel.kidger at hpe.com Wed Feb 2 12:08:54 2022 From: daniel.kidger at hpe.com (Kidger, Daniel) Date: Wed, 2 Feb 2022 12:08:54 +0000 Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: References: Message-ID: Simon, Thanks - that is a good insight. The HA 'feature' of the snapshot automation is perhaps a key feature as Linux still lacks a decent 'cluster cron' Also, If "HA" do we know where the state is centrally kept? On the point of snapshots being left undeleted, do you ever use /usr/lpp/mmfs/gui/cli/lssnapops to see what the queue of outstanding actions is like? (There is also a notification tool: lssnapnotify in that directory that is supposed to alert on failed snapshot actions, although personally I have never used it) Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:fce0ce85-6ae4-44ce-aa94-d7d099e68acb] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Simon Thompson2 Sent: 02 February 2022 10:52 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ? I always used the GUI for automating snapshots that were tagged with the YYMMDD format so that they were accessible via the previous versions tab from CES access. This requires no locking if you have multiple GUI servers running, so in theory the snapshots creation is ?HA?. BUT if you shutdown the GUI servers (say you are waiting for a log4j patch ?) then you have no snapshot automation. Due to the way we structured independent filesets, this could be 50 or so to automate and we wanted to set a say 4 day retention policy. So clicking in the GUI was pretty simple to do this for. What we did found is it a snapshot failed to delete for some reason (quiesce etc), then the GUI never tried again to clean it up so we have monitoring to look for unexpected snapshots that needed cleaning up. Simon ________________________________ Simon Thompson He/Him/His Senior Storage Performance WW HPC Customer Solutions Lenovo UK [Phone]+44 7788 320635 [Email]sthompson2 at lenovo.com Lenovo.com Twitter | Instagram | Facebook | Linkedin | YouTube | Privacy [cid:image003.png at 01D81822.F63BAB90] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Kidger, Daniel Sent: 02 February 2022 10:07 To: gpfsug-discuss at spectrumscale.org Subject: [External] [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? Hi all, Since the subject of snapshots has come up, I also have a question ... Snapshots can be created from the command line with mmcrsnapshot, and hence can be automated via con jobs etc. Snapshots can also be created from the Scale GUI. The GUI also provides its own automation for the creation, retention, and deletion of snapshots. My question is: do most customers use the former or the latter for automation? (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do exactly the same as what the GUI does it terms of creating automated snapshots. It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. How many customers also use the commands found in /usr/lpp/mmfs/gui/cli/ ? ) Daniel Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:image004.png at 01D81822.F63BAB90] -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 92 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 128 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.png Type: image/png Size: 20109 bytes Desc: image003.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.png Type: image/png Size: 2541 bytes Desc: image004.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Outlook-axuecxph Type: application/octet-stream Size: 2541 bytes Desc: Outlook-axuecxph URL: From anacreo at gmail.com Wed Feb 2 12:41:07 2022 From: anacreo at gmail.com (Alec) Date: Wed, 2 Feb 2022 04:41:07 -0800 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> Message-ID: Might it be a case of being over built? In the old days you could really mess up an Oracle DW by giving it too much RAM... It would spend all day reading in and out data to the ram that it didn't really need, because it had the SGA available to load the whole table. Perhaps the pagepool is so large that the time it takes to clear that much RAM is the actual time out? My environment has only a million files but has quite a bit more storage and has only an 8gb pagepool. Seems you are saying you have 618gb of RAM for pagepool... Even at 8GB/second that would take 77 seconds to flush it out.. Perhaps drop the pagepool in half and see if your timeout adjusts accordingly? Alec On Wed, Feb 2, 2022, 4:09 AM Olaf Weiser wrote: > keep in mind... creating many snapshots... means ;-) .. you'll have to > delete many snapshots.. > at a certain level, which depends on #files, #directories, ~workload, > #nodes, #networks etc.... we ve seen cases, where generating just full > snapshots (whole file system) is the better approach instead of > maintaining snapshots for each file set individually .. > > sure. this has other side effects , like space consumption etc... > so as always.. it depends.. > > > > > ----- Urspr?ngliche Nachricht ----- > Von: "Jan-Frode Myklebust" > Gesendet von: gpfsug-discuss-bounces at spectrumscale.org > An: "gpfsug main discussion list" > CC: > Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem > quiesce > Datum: Mi, 2. Feb 2022 12:54 > > Also, if snapshotting multiple filesets, it's important to group these > into a single mmcrsnapshot command. Then you get a single quiesce, > instead of one per fileset. > > i.e. do: > > snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) > mmcrsnapshot gpfs0 > fileset1:$snapname,filset2:snapname,fileset3:snapname > > instead of: > > mmcrsnapshot gpfs0 fileset1:$snapname > mmcrsnapshot gpfs0 fileset2:$snapname > mmcrsnapshot gpfs0 fileset3:$snapname > > > -jf > > > On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou < > jordi.caubet at es.ibm.com> wrote: > > Ivano, > > if it happens frequently, I would recommend to open a support case. > > The creation or deletion of a snapshot requires a quiesce of the nodes to > obtain a consistent point-in-time image of the file system and/or update > some internal structures afaik. Quiesce is required for nodes at the > storage cluster but also remote clusters. Quiesce means stop activities > (incl. 
I/O) for a short period of time to get such consistent image. Also > waiting to flush any data in-flight to disk that does not allow a > consistent point-in-time image. > > Nodes receive a quiesce request and acknowledge when ready. When all nodes > acknowledge, snapshot operation can proceed and immediately I/O can resume. > It usually takes few seconds at most and the operation performed is short > but time I/O is stopped depends of how long it takes to quiesce the nodes. > If some node take longer to agree stop the activities, such node will > be delay the completion of the quiesce and keep I/O paused on the rest. > There could many things while some nodes delay quiesce ack. > > The larger the cluster, the more difficult it gets. The more network > congestion or I/O load, the more difficult it gets. I recommend to open a > ticket for support to try to identify the root cause of which nodes not > acknowledge the quiesce and maybe find the root cause. If I recall some > previous thread, default timeout was 60 seconds which match your log > message. After such timeout, snapshot is considered failed to complete. > > Support might help you understand the root cause and provide some > recommendations if it happens frequently. > > Best Regards, > -- > Jordi Caubet Serrabou > IBM Storage Client Technical Specialist (IBM Spain) > > > ----- Original message ----- > From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: "gpfsug main discussion list" > Cc: > Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem > quiesce > Date: Wed, Feb 2, 2022 11:45 AM > > > Hello Andrew, > > > > Thanks for your questions. > > > > We're not experiencing any other issue/slowness during normal activity. > > The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool > for metadata only. > > > > The two NSD servers have 750GB of RAM and 618 are configured as pagepool. > > > > The issue we see is happening on both the two filesystems we have: > > > > - perf filesystem: > > - 1.8 PB size (71% in use) > > - 570 milions of inodes (24% in use) > > > > - tiered filesystem: > > - 400 TB size (34% in use) > > - 230 Milions of files (60% in use) > > > > Cheers, > > Ivano > > > > > > > __________________________________________ > Paul Scherrer Institut > Ivano Talamo > WHGA/038 > Forschungsstrasse 111 > 5232 Villigen PSI > Schweiz > > Telefon: +41 56 310 47 11 > E-Mail: ivano.talamo at psi.ch > > > > > ------------------------------ > *From:* gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> on behalf of Andrew Beattie < > abeattie at au1.ibm.com> > *Sent:* Wednesday, February 2, 2022 10:33 AM > *To:* gpfsug main discussion list > *Subject:* Re: [gpfsug-discuss] snapshots causing filesystem quiesce > > Ivano, > > How big is the filesystem in terms of number of files? > How big is the filesystem in terms of capacity? > Is the Metadata on Flash or Spinning disk? > Do you see issues when users do an LS of the filesystem or only when you > are doing snapshots. > > How much memory do the NSD servers have? > How much is allocated to the OS / Spectrum > Scale Pagepool > > Regards > > Andrew Beattie > Technical Specialist - Storage for Big Data & AI > IBM Technology Group > IBM Australia & New Zealand > P. +61 421 337 927 > E. abeattie at au1.IBM.com > > > > > On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: > > > ? > > > Dear all, > > Since a while we are experiencing an issue when dealing with snapshots. 
> Basically what happens is that when deleting a fileset snapshot (and maybe > also when creating new ones) the filesystem becomes inaccessible on the > clients for the duration of the operation (can take a few minutes). > > The clients and the storage are on two different clusters, using remote > cluster mount for the access. > > On the log files many lines like the following appear (on both clusters): > Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 > msec > > By looking around I see we're not the first one. I am wondering if that's > considered an unavoidable part of the snapshotting and if there's any > tunable that can improve the situation. Since when this occurs all the > clients are stuck and users are very quick to complain. > > If it can help, the clients are running GPFS 5.1.2-1 while the storage > cluster is on 5.1.1-0. > > Thanks, > Ivano > > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: > > International Business Machines, S.A. > > Santa Hortensia, 26-28, 28002 Madrid > > Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 > > CIF A28-010791 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 12:55:52 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 12:55:52 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch>, , , Message-ID: <8d51042ed95b461fb2be3dc33dac030a@psi.ch> Hi Jordi, thanks for the explanation, I can now see better why something like that would happen. Indeed the cluster has a lot of clients, coming via different clusters and even some NFS/SMB via protocol nodes. So I think opening a case makes a lot of sense to track it down. Not sure how we can make the debug transparent to the users, but we'll see. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Jordi Caubet Serrabou Sent: Wednesday, February 2, 2022 12:07 PM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. 
I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. There could many things while some nodes delay quiesce ack. The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Andrew Beattie Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. 
On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 12:57:32 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 12:57:32 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> , Message-ID: Sure, that makes a lot of sense and we were already doing in that way. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Jan-Frode Myklebust Sent: Wednesday, February 2, 2022 12:53 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. do: snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname instead of: mmcrsnapshot gpfs0 fileset1:$snapname mmcrsnapshot gpfs0 fileset2:$snapname mmcrsnapshot gpfs0 fileset3:$snapname -jf On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou > wrote: Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. There could many things while some nodes delay quiesce ack. The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. 
I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Andrew Beattie > Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. 
Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 12:59:30 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 12:59:30 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: , <4326cfae883b4378bcb284b6daecb05e@psi.ch>, Message-ID: Ok that sounds a good candidate for an improvement. Thanks. We didn't want to do a full filesystem snapshot for the space consumption indeed. But we may consider it, keeping an eye on the space. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Olaf Weiser Sent: Wednesday, February 2, 2022 1:09 PM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce keep in mind... creating many snapshots... means ;-) .. you'll have to delete many snapshots.. at a certain level, which depends on #files, #directories, ~workload, #nodes, #networks etc.... we ve seen cases, where generating just full snapshots (whole file system) is the better approach instead of maintaining snapshots for each file set individually .. sure. this has other side effects , like space consumption etc... so as always.. it depends.. ----- Urspr?ngliche Nachricht ----- Von: "Jan-Frode Myklebust" Gesendet von: gpfsug-discuss-bounces at spectrumscale.org An: "gpfsug main discussion list" CC: Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Datum: Mi, 2. Feb 2022 12:54 Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. do: snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname instead of: mmcrsnapshot gpfs0 fileset1:$snapname mmcrsnapshot gpfs0 fileset2:$snapname mmcrsnapshot gpfs0 fileset3:$snapname -jf On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou > wrote: Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. 
There could many things while some nodes delay quiesce ack. The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Andrew Beattie > Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. 
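To illustrate the grouping advice quoted earlier in this thread (one mmcrsnapshot call for all filesets, so the file system is quiesced only once), a minimal cron-able sketch might look like the following. The device name gpfs0 and the fileset names are placeholders, not taken from this thread:

    #!/bin/bash
    # Sketch: one snapshot per fileset with a single quiesce.
    fs=gpfs0                                  # placeholder device name
    filesets="fileset1 fileset2 fileset3"     # placeholder fileset names
    snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S)

    # Build "fileset1:snap,fileset2:snap,..." so mmcrsnapshot is called once.
    list=""
    for f in $filesets; do
        list="${list:+$list,}${f}:${snapname}"
    done

    mmcrsnapshot "$fs" "$list"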
Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 13:03:13 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 13:03:13 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> , Message-ID: That's true, although I would not expect the memory to be flushed for just snapshots deletion. But it could well be a problem at snapshot creation time. Anyway for changing the pagepool we should contact the vendor, since this is configured by their installation scripts, so we better have them to agree. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Alec Sent: Wednesday, February 2, 2022 1:41 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Might it be a case of being over built? In the old days you could really mess up an Oracle DW by giving it too much RAM... It would spend all day reading in and out data to the ram that it didn't really need, because it had the SGA available to load the whole table. Perhaps the pagepool is so large that the time it takes to clear that much RAM is the actual time out? My environment has only a million files but has quite a bit more storage and has only an 8gb pagepool. Seems you are saying you have 618gb of RAM for pagepool... Even at 8GB/second that would take 77 seconds to flush it out.. Perhaps drop the pagepool in half and see if your timeout adjusts accordingly? Alec On Wed, Feb 2, 2022, 4:09 AM Olaf Weiser > wrote: keep in mind... creating many snapshots... means ;-) .. you'll have to delete many snapshots.. at a certain level, which depends on #files, #directories, ~workload, #nodes, #networks etc.... we ve seen cases, where generating just full snapshots (whole file system) is the better approach instead of maintaining snapshots for each file set individually .. sure. this has other side effects , like space consumption etc... so as always.. it depends.. ----- Urspr?ngliche Nachricht ----- Von: "Jan-Frode Myklebust" > Gesendet von: gpfsug-discuss-bounces at spectrumscale.org An: "gpfsug main discussion list" > CC: Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Datum: Mi, 2. Feb 2022 12:54 Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. 
do: snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname instead of: mmcrsnapshot gpfs0 fileset1:$snapname mmcrsnapshot gpfs0 fileset2:$snapname mmcrsnapshot gpfs0 fileset3:$snapname -jf On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou > wrote: Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. There could many things while some nodes delay quiesce ack. The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Andrew Beattie > Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? 
How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jordi.caubet at es.ibm.com Wed Feb 2 13:34:20 2022 From: jordi.caubet at es.ibm.com (Jordi Caubet Serrabou) Date: Wed, 2 Feb 2022 13:34:20 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: Message-ID: Maybe some colleagues at IBM devel can correct me, but pagepool size should not make much difference. Afaik, it is mostly read cache data. Another think could be if using HAWC function, I am not sure in such case. Anyhow, looking at your node name, your system seems a DSS from Lenovo so you NSD servers are running GPFS Native RAID and the reason why the pagepool is large there, not for the NSD server role itself, it is for the GNR role that caches disk tracks. Lowering will impact performance. -- Jordi Caubet Serrabou IBM Software Defined Infrastructure (SDI) and Flash Technical Sales Specialist Technical Computing and HPC IT Specialist and Architect > On 2 Feb 2022, at 14:03, Talamo Ivano Giuseppe (PSI) wrote: > > ? > That's true, although I would not expect the memory to be flushed for just snapshots deletion. But it could well be a problem at snapshot creation time. > > Anyway for changing the pagepool we should contact the vendor, since this is configured by their installation scripts, so we better have them to agree. 
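To illustrate the pagepool point above: the current value can be inspected, and - only if the vendor agrees it is safe on a DSS-G building block - changed, roughly as sketched below. The node names are placeholders and the exact mmchconfig behaviour for pagepool should be checked against your Scale release:

    # Show the configured and in-memory pagepool values
    mmlsconfig pagepool
    mmdiag --config | grep -i pagepool

    # Example only (placeholder node names): halve the pagepool on the
    # NSD servers; -i is intended to apply the change immediately where
    # the attribute supports it.
    mmchconfig pagepool=300G -N dssio1,dssio2 -i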
> > > > Cheers, > > Ivano > > > > __________________________________________ > Paul Scherrer Institut > Ivano Talamo > WHGA/038 > Forschungsstrasse 111 > 5232 Villigen PSI > Schweiz > > Telefon: +41 56 310 47 11 > E-Mail: ivano.talamo at psi.ch > > > > From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Alec > Sent: Wednesday, February 2, 2022 1:41 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce > > Might it be a case of being over built? In the old days you could really mess up an Oracle DW by giving it too much RAM... It would spend all day reading in and out data to the ram that it didn't really need, because it had the SGA available to load the whole table. > > Perhaps the pagepool is so large that the time it takes to clear that much RAM is the actual time out? > > My environment has only a million files but has quite a bit more storage and has only an 8gb pagepool. Seems you are saying you have 618gb of RAM for pagepool... Even at 8GB/second that would take 77 seconds to flush it out.. > > Perhaps drop the pagepool in half and see if your timeout adjusts accordingly? > > Alec > > >> On Wed, Feb 2, 2022, 4:09 AM Olaf Weiser wrote: >> keep in mind... creating many snapshots... means ;-) .. you'll have to delete many snapshots.. >> at a certain level, which depends on #files, #directories, ~workload, #nodes, #networks etc.... we ve seen cases, where generating just full snapshots (whole file system) is the better approach instead of maintaining snapshots for each file set individually .. >> >> sure. this has other side effects , like space consumption etc... >> so as always.. it depends.. >> >> >> >> ----- Urspr?ngliche Nachricht ----- >> Von: "Jan-Frode Myklebust" >> Gesendet von: gpfsug-discuss-bounces at spectrumscale.org >> An: "gpfsug main discussion list" >> CC: >> Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce >> Datum: Mi, 2. Feb 2022 12:54 >> >> Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. >> >> i.e. do: >> >> snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) >> mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname >> >> instead of: >> >> mmcrsnapshot gpfs0 fileset1:$snapname >> mmcrsnapshot gpfs0 fileset2:$snapname >> mmcrsnapshot gpfs0 fileset3:$snapname >> >> >> -jf >> >> >> On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou wrote: >> Ivano, >> >> if it happens frequently, I would recommend to open a support case. >> >> The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. >> >> Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. 
>> There could many things while some nodes delay quiesce ack. >> >> The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. >> >> Support might help you understand the root cause and provide some recommendations if it happens frequently. >> >> Best Regards, >> -- >> Jordi Caubet Serrabou >> IBM Storage Client Technical Specialist (IBM Spain) >> >> ----- Original message ----- >> From: "Talamo Ivano Giuseppe (PSI)" >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: "gpfsug main discussion list" >> Cc: >> Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce >> Date: Wed, Feb 2, 2022 11:45 AM >> >> Hello Andrew, >> >> >> >> Thanks for your questions. >> >> >> >> We're not experiencing any other issue/slowness during normal activity. >> >> The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. >> >> >> >> The two NSD servers have 750GB of RAM and 618 are configured as pagepool. >> >> >> >> The issue we see is happening on both the two filesystems we have: >> >> >> >> - perf filesystem: >> >> - 1.8 PB size (71% in use) >> >> - 570 milions of inodes (24% in use) >> >> >> >> - tiered filesystem: >> >> - 400 TB size (34% in use) >> >> - 230 Milions of files (60% in use) >> >> >> >> Cheers, >> >> Ivano >> >> >> >> >> >> >> >> __________________________________________ >> Paul Scherrer Institut >> Ivano Talamo >> WHGA/038 >> Forschungsstrasse 111 >> 5232 Villigen PSI >> Schweiz >> >> Telefon: +41 56 310 47 11 >> E-Mail: ivano.talamo at psi.ch >> >> >> >> >> From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Andrew Beattie >> Sent: Wednesday, February 2, 2022 10:33 AM >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce >> >> Ivano, >> >> How big is the filesystem in terms of number of files? >> How big is the filesystem in terms of capacity? >> Is the Metadata on Flash or Spinning disk? >> Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. >> >> How much memory do the NSD servers have? >> How much is allocated to the OS / Spectrum >> Scale Pagepool >> >> Regards >> >> Andrew Beattie >> Technical Specialist - Storage for Big Data & AI >> IBM Technology Group >> IBM Australia & New Zealand >> P. +61 421 337 927 >> E. abeattie at au1.IBM.com >> >> >> >>> >>> On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote: >>> >>> ? >>> >>> >>> Dear all, >>> >>> Since a while we are experiencing an issue when dealing with snapshots. >>> Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). >>> >>> The clients and the storage are on two different clusters, using remote cluster mount for the access. >>> >>> On the log files many lines like the following appear (on both clusters): >>> Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec >>> >>> By looking around I see we're not the first one. 
I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. >>> >>> If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. >>> >>> Thanks, >>> Ivano >>> >>> >>> >>> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: >> >> International Business Machines, S.A. >> >> Santa Hortensia, 26-28, 28002 Madrid >> >> Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 >> >> CIF A28-010791 >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 -------------- next part -------------- An HTML attachment was scrubbed... URL: From juergen.hannappel at desy.de Wed Feb 2 15:04:24 2022 From: juergen.hannappel at desy.de (Hannappel, Juergen) Date: Wed, 2 Feb 2022 16:04:24 +0100 (CET) Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: References: Message-ID: <679823632.5186930.1643814264071.JavaMail.zimbra@desy.de> Hi, I use a python script via cron job, it checks how many snapshots exist and removes those that exceed a configurable limit, then creates a new one. Deployed via puppet it's much less hassle than click around in a GUI/ > From: "Kidger, Daniel" > To: "gpfsug main discussion list" > Sent: Wednesday, 2 February, 2022 11:07:25 > Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? > Hi all, > Since the subject of snapshots has come up, I also have a question ... > Snapshots can be created from the command line with mmcrsnapshot, and hence can > be automated via con jobs etc. > Snapshots can also be created from the Scale GUI. The GUI also provides its own > automation for the creation, retention, and deletion of snapshots. > My question is: do most customers use the former or the latter for automation? > (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do > exactly the same as what the GUI does it terms of creating automated snapshots. > It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. > How many customers also use the commands found in /usr/lpp/mmfs/gui/cli / ? 
) > Daniel > Daniel Kidger > HPC Storage Solutions Architect, EMEA > [ mailto:daniel.kidger at hpe.com | daniel.kidger at hpe.com ] > +44 (0)7818 522266 > [ http://www.hpe.com/ | hpe.com ] > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Outlook-iity4nk4 Type: image/png Size: 2541 bytes Desc: Outlook-iity4nk4 URL: From mark.bergman at uphs.upenn.edu Wed Feb 2 16:09:02 2022 From: mark.bergman at uphs.upenn.edu (mark.bergman at uphs.upenn.edu) Date: Wed, 02 Feb 2022 11:09:02 -0500 Subject: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: Your message of "Wed, 02 Feb 2022 10:07:25 +0000." References: Message-ID: <1971435-1643818142.818836@ATIP.bjhn.uBcv> Big vote for cron jobs. Our snapshot are created by a script, installed on each GPFS node. The script handles naming, removing old snapshots, checking that sufficient disk space exists before creating a snapshot, etc. We do snapshots every 15 minutes, keeping them with lower frequency over longer intervals. For example: current hour: keep 4 snapshots hours -2 .. -8 keep 3 snapshots per hour hours -8 .. -24 keep 2 snapshots per hour days -1 .. -5 keep 1 snapshot per hour days -5 .. -15 keep 4 snapshots per day days -15 .. -30 keep 1 snapshot per day the duration & frequency & minimum disk space can be adjusted per-filesystem. The automation is done through a cronjob that runs on each GPFS (DSS-G) server to create the snapshot only if the node is currently the cluster master, as in: */15 * * * * root mmlsmgr -Y | grep -q "clusterManager.*:$(hostname --long):" && /path/to/snapshotter This requires no locking and ensures that only a single instance of snapshots is created at each time interval. We use the same trick to gather GPFS health stats, etc., ensuring that the data collection only runs on a single node (the cluster manager). -- Mark Bergman voice: 215-746-4061 mark.bergman at pennmedicine.upenn.edu fax: 215-614-0266 http://www.med.upenn.edu/cbica/ IT Technical Director, Center for Biomedical Image Computing and Analytics Department of Radiology University of Pennsylvania From info at odina.nl Wed Feb 2 16:22:47 2022 From: info at odina.nl (Jaap Jan Ouwehand) Date: Wed, 02 Feb 2022 17:22:47 +0100 Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: <679823632.5186930.1643814264071.JavaMail.zimbra@desy.de> References: <679823632.5186930.1643814264071.JavaMail.zimbra@desy.de> Message-ID: <9CD60B1D-5BF8-4BBD-9F9D-A872D89EE9C4@odina.nl> Hi, I also used a custom script (database driven) via cron which creates many fileset snapshots during the day via the "default helper nodes". Because of the iops, the oldest snapshots are deleted at night. Perhaps it's a good idea to take one global filesystem snapshot and make it available to the filesets with mmsnapdir. Kind regards, Jaap Jan Ouwehand "Hannappel, Juergen" schreef op 2 februari 2022 16:04:24 CET: >Hi, >I use a python script via cron job, it checks how many snapshots exist and removes those that >exceed a configurable limit, then creates a new one. 
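For what it's worth, the rotate-then-create idea described here can also be sketched in plain shell around the Scale commands. This is only an illustration - the device, fileset name, retention count and the exact mmlssnapshot/mmdelsnapshot options are assumptions to verify against your release:

    #!/bin/bash
    # Keep the newest $keep fileset snapshots, delete older ones, then
    # create a new one. Run from cron on every node; only the cluster
    # manager proceeds (same trick as the crontab shown above).
    fs=gpfs0          # placeholder device
    fset=fileset1     # placeholder fileset
    keep=24

    mmlsmgr -Y | grep -q "clusterManager.*:$(hostname --long):" || exit 0

    # Existing @GMT- snapshots of this fileset, oldest first
    # (the timestamped names sort chronologically).
    snaps=$(mmlssnapshot "$fs" -j "$fset" | awk '/@GMT-/ {print $1}' | sort)
    count=$(echo "$snaps" | grep -c .)

    # Delete the oldest snapshots until only $keep remain.
    if [ "$count" -gt "$keep" ]; then
        echo "$snaps" | head -n $((count - keep)) | while read -r s; do
            mmdelsnapshot "$fs" "$s" -j "$fset"
        done
    fi

    mmcrsnapshot "$fs" "${fset}:$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S)"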
>Deployed via puppet it's much less hassle than click around in a GUI/ > >> From: "Kidger, Daniel" >> To: "gpfsug main discussion list" >> Sent: Wednesday, 2 February, 2022 11:07:25 >> Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? > >> Hi all, > >> Since the subject of snapshots has come up, I also have a question ... > >> Snapshots can be created from the command line with mmcrsnapshot, and hence can >> be automated via con jobs etc. >> Snapshots can also be created from the Scale GUI. The GUI also provides its own >> automation for the creation, retention, and deletion of snapshots. > >> My question is: do most customers use the former or the latter for automation? > >> (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do >> exactly the same as what the GUI does it terms of creating automated snapshots. >> It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. >> How many customers also use the commands found in /usr/lpp/mmfs/gui/cli / ? ) > >> Daniel > >> Daniel Kidger >> HPC Storage Solutions Architect, EMEA >> [ mailto:daniel.kidger at hpe.com | daniel.kidger at hpe.com ] > >> +44 (0)7818 522266 > >> [ http://www.hpe.com/ | hpe.com ] > >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.ward at nhm.ac.uk Mon Feb 7 16:39:25 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Mon, 7 Feb 2022 16:39:25 +0000 Subject: [gpfsug-discuss] mmbackup file selections In-Reply-To: References: <20220124153631.oxu4ytbq4vqcotr3@utumno.gs.washington.edu> <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> Message-ID: Backups seem to have settled down. A workshop with our partner and IBM is in the pipeline. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Paul Ward Sent: 01 February 2022 12:28 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmbackup file selections Not currently set. I'll look into them. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Skylar Thompson Sent: 26 January 2022 16:50 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmbackup file selections Awesome, glad that you found them (I missed them the first time too). As for the anomalous changed files, do you have these options set in your client option file? skipacl yes skipaclupdatecheck yes updatectime yes We had similar problems where metadata and ACL updates were interpreted as data changes by mmbackup/dsmc. We also have a case open with IBM where mmbackup will both expire and backup a file in the same run, even in the absence of mtime changes, but it's unclear whether that's program error or something with our include/exclude rules. I'd be curious if you're running into that as well. On Wed, Jan 26, 2022 at 03:55:48PM +0000, Paul Ward wrote: > Good call! > > Yes they are dot files. > > > New issue. > > Mmbackup seems to be backup up the same files over and over without them changing: > areas are being backed up multiple times. 
> The example below is a co-resident file, the only thing that has changed since it was created 20/10/21, is the file has been accessed for backup. > This file is in the 'changed' list in mmbackup: > > This list has just been created: > -rw-r--r--. 1 root root 6591914 Jan 26 11:12 > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > Listing the last few files in the file (selecting the last one) > 11:17:52 [root at scale-sk-pn-1 .mmbackupCfg]# tail > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604556977.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557039.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557102.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557164.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557226.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557288.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557351.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557413.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557476.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > Check the file stats (access time just before last backup) > 11:18:05 [root at scale-sk-pn-1 .mmbackupCfg]# stat "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > File: '/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png' > Size: 545 Blocks: 32 IO Block: 4194304 regular file > Device: 2bh/43d Inode: 212618897 Links: 1 > Access: (0644/-rw-r--r--) Uid: (1399613896/NHM\edwab) Gid: (1399647564/NHM\dg-mbl-urban-nature-project-rw) > Context: unconfined_u:object_r:unlabeled_t:s0 > Access: 2022-01-25 06:40:58.334961446 +0000 > Modify: 2020-12-01 15:20:40.122053000 +0000 > Change: 2021-10-20 17:55:18.265746459 +0100 > Birth: - > > Check if migrated > 11:18:16 [root at scale-sk-pn-1 .mmbackupCfg]# dsmls "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > File name : /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > On-line size : 545 > Used blocks : 16 > Data Version : 1 > Meta Version : 1 > State : Co-resident > Container Index : 1 > Base Name : 34C0B77D20194B0B.EACEB2055F6CAA58.56D56C5F140C8C9D.0000000000000000.2197396D.000000000CAC4E91 > > Check if immutable > 11:18:26 [root at scale-sk-pn-1 .mmbackupCfg]# mstat "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > file name: /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 
metadata replication: 2 max 2 > data replication: 2 max 2 > immutable: no > appendOnly: no > flags: > storage pool name: data > fileset name: hpc-workspaces-fset > snapshot name: > creation time: Wed Oct 20 17:55:18 2021 > Misc attributes: ARCHIVE > Encrypted: no > > Check active and inactive backups (it was backed up yesterday) > 11:18:52 [root at scale-sk-pn-1 .mmbackupCfg]# dsmcqbi "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 11:19:02 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. > > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 11:19:02 Last access: 01/26/2022 > 11:07:05 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/25/2022 06:41:17 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > It will be backed up again shortly, why? > > And it was backed up again: > # dsmcqbi > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature- > project/audiowaveform/300_40/unp-grounds-01-1604557538.png > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 15:54:09 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. 
> > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 15:54:10 Last access: 01/26/2022 > 15:30:03 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/26/2022 12:23:02 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/25/2022 06:41:17 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Skylar > Thompson > Sent: 24 January 2022 15:37 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Hi Paul, > > Did you look for dot files? At least for us on 5.0.5 there's a .list.1. file while the backups are running: > > /gpfs/grc6/.mmbackupCfg/updatedFiles/: > -r-------- 1 root nickers 6158526821 Jan 23 18:28 .list.1.gpfs-grc6 > /gpfs/grc6/.mmbackupCfg/expiredFiles/: > -r-------- 1 root nickers 85862211 Jan 23 18:28 .list.1.gpfs-grc6 > > On Mon, Jan 24, 2022 at 02:31:54PM +0000, Paul Ward wrote: > > Those directories are empty > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: gpfsug-discuss-bounces at spectrumscale.org > > On Behalf Of IBM Spectrum > > Scale > > Sent: 22 January 2022 00:35 > > To: gpfsug main discussion list > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > Hi Paul, > > > > Instead of calculating *.ix.* files, please look at a list file in these directories. > > > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked.]"Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked. 
> > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/21/2022 09:38 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > Thank you Right in the command line seems to have worked. At the end > > of the script I now copy the contents of the .mmbackupCfg folder to > > a date stamped logging folder Checking how many entries in these files compared to the Summary: ???????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you > > > > Right in the command line seems to have worked. > > At the end of the script I now copy the contents of the .mmbackupCfg > > folder to a date stamped logging folder > > > > Checking how many entries in these files compared to the Summary: > > wc -l mmbackup* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 754 total > > From Summary > > Total number of objects inspected: 755 > > I can live with a discrepancy of 1. > > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > From Summary > > Total number of objects expired: 2 > > That matches > > > > wc -l mmbackupC* mmbackupS* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 752 total > > Summary: > > Total number of objects backed up: 751 > > > > A difference of 1 I can live with. > > > > What does Statech stand for? > > > > Just this to sort out: > > Total number of objects failed: 1 > > I will add: > > --tsm-errorlog TSMErrorLogFile > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 19 January 2022 15:09 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > This is to set environment for mmbackup. > > If mmbackup is invoked within a script, you can set "export DEBUGmmbackup=2" right above mmbackup command. > > e.g) in your script > > .... > > export DEBUGmmbackup=2 > > mmbackup .... > > > > Or, you can set it in the same command line like > > DEBUGmmbackup=2 mmbackup .... 
> > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > [Inactive hide details for "Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to se]"Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to see if they are the cluster manager. > > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/19/2022 06:04 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > Thank you. We run a script on all our nodes that checks to see if > > they are the cluster manager. If they are, then they take > > responsibility to start the backup script. The script then randomly selects one of the available backup nodes and uses ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you. > > > > We run a script on all our nodes that checks to see if they are the cluster manager. > > If they are, then they take responsibility to start the backup script. > > The script then randomly selects one of the available backup nodes and uses dsmsh mmbackup on it. > > > > Where does this command belong? > > I have seen it listed as a export command, again where should that be run ? on all backup nodes, or all nodes? > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 18 January 2022 22:54 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > Hi Paul, > > > > If you run mmbackup with "DEBUGmmbackup=2", it keeps all working files even after successful backup. They are available at MMBACKUP_RECORD_ROOT (default is FSroot or FilesetRoot directory). > > In .mmbackupCfg directory, there are 3 directories: > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. 
> > > > [Inactive hide details for "Paul Ward" ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have been sent to back]"Paul Ward" ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have been sent to backup using mmbackup. > > > > From: "Paul Ward" > > > To: > > "gpfsug-discuss at spectrumscale.org > org>" > > > org>> > > Date: 01/18/2022 11:56 AM > > Subject: [EXTERNAL] [gpfsug-discuss] mmbackup file selections Sent by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > > > Hi, I am trying to work out what files have been sent to backup > > using mmbackup. I have increased the -L value from 3 up to 6 but > > only seem to see the files that are in scope, not the ones that are selected. I can see the three file lists generated ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Hi, > > > > I am trying to work out what files have been sent to backup using mmbackup. > > I have increased the -L value from 3 up to 6 but only seem to see the files that are in scope, not the ones that are selected. > > > > I can see the three file lists generated during a backup, but can?t seem to find a list of what files were backed up. > > > > It should be the diff of the shadow and shadow-old, but the wc -l of the diff doesn?t match the number of files in the backup summary. > > Wrong assumption? > > > > Where should I be looking ? surely it shouldn?t be this hard to see what files are selected? > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf > > su > > g.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.war > > d% > > 40nhm.ac.uk%7Cd4c22f3c612c4cb6deb908d9df4fd706%7C73a29c014e78437fa0d > > 4c > > 8553e1960c1%7C1%7C0%7C637786356879087616%7CUnknown%7CTWFpbGZsb3d8eyJ > > WI > > joiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000 > > &a > > mp;sdata=72gqmRJEgZ97s3%2BjmFD12PpfcJJKUVJuyvyJf4beXS8%3D&reserv > > ed > > =0 > gp > > fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp. > > wa > > rd%40nhm.ac.uk%7Cd4c22f3c612c4cb6deb908d9df4fd706%7C73a29c014e78437f > > a0 > > d4c8553e1960c1%7C1%7C0%7C637786356879087616%7CUnknown%7CTWFpbGZsb3d8 > > ey > > JWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2 > > 00 > > 0&sdata=72gqmRJEgZ97s3%2BjmFD12PpfcJJKUVJuyvyJf4beXS8%3D&res > > er > > ved=0> > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf > > su > > g.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.war > > d% > > 40nhm.ac.uk%7Cd4c22f3c612c4cb6deb908d9df4fd706%7C73a29c014e78437fa0d > > 4c > > 8553e1960c1%7C1%7C0%7C637786356879087616%7CUnknown%7CTWFpbGZsb3d8eyJ > > WI > > joiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000 > > &a > > mp;sdata=72gqmRJEgZ97s3%2BjmFD12PpfcJJKUVJuyvyJf4beXS8%3D&reserv > > ed > > =0 > gp > > fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp. 
> >
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
> --
> -- Skylar Thompson (skylar2 at u.washington.edu)
> -- Genome Sciences Department (UW Medicine), System Administrator
> -- Foege Building S046, (206)-685-7354
> -- Pronouns: He/Him/His
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

--
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.ward%40nhm.ac.uk%7C6d97e9a0e37c471cae7308d9e57e53d5%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637793154323249334%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=LAVGUD2z%2BD2BcOJkan%2FLiOOlDyH44D5m2YHjIFk62HI%3D&reserved=0 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.ward%40nhm.ac.uk%7C6d97e9a0e37c471cae7308d9e57e53d5%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637793154323249334%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=LAVGUD2z%2BD2BcOJkan%2FLiOOlDyH44D5m2YHjIFk62HI%3D&reserved=0 From anacreo at gmail.com Mon Feb 7 17:42:36 2022 From: anacreo at gmail.com (Alec) Date: Mon, 7 Feb 2022 09:42:36 -0800 Subject: [gpfsug-discuss] mmbackup file selections In-Reply-To: References: <20220124153631.oxu4ytbq4vqcotr3@utumno.gs.washington.edu> <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> Message-ID: I'll share something we do when working with the GPFS policy engine so we don't blow out our backups... So we use a different backup in solution and have our file system broken down into multiple concurrent streams. In my policy engine when making major changes to the file system such as encrypting or compressing data I use a where clause such as: MOD(INODE, 7)<=dayofweek When we call mmpolicy I add -M dayofweek=NN. In this case I'd use cron and pass day of the week. What this achieves is that on each day I only work on 1/7th of each file system... So that no one backup stream is blown out. It is cumulative so 7+ will work on 100% of the file system. It's a nifty trick so figured I'd share it out. In production we do something more like 40, and set shares to increment by 1 on weekdays and 3 on weekends to distribute workload out over the whole month with more work on the weekends. Alec On Mon, Feb 7, 2022, 8:39 AM Paul Ward wrote: > Backups seem to have settled down. > A workshop with our partner and IBM is in the pipeline. > > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Paul Ward > Sent: 01 February 2022 12:28 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Not currently set. I'll look into them. > > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Skylar Thompson > Sent: 26 January 2022 16:50 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Awesome, glad that you found them (I missed them the first time too). > > As for the anomalous changed files, do you have these options set in your > client option file? 
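A rough sketch of the mechanics Alec describes above ("mmpolicy" presumably meaning mmapplypolicy): the rule below only LISTs the matching files so the inode bucketing is visible on its own; in practice the action would be whatever migration, compression or backup-stream job is being spread out. The rule names, device name and paths are placeholders rather than anything from the thread; only the MOD(INODE, 7) <= dayofweek test and the -M dayofweek=NN substitution come from Alec's note.

    /* spread.pol - illustrative policy fragment */
    /* An EXTERNAL LIST rule with an empty EXEC makes mmapplypolicy write the
       matching pathnames into list files instead of invoking a script. */
    RULE 'todoList' EXTERNAL LIST 'todo' EXEC ''
    /* dayofweek is substituted from the command line via -M; inodes fall into
       buckets 0..6 and the <= test is cumulative, so dayofweek=7 selects everything. */
    RULE 'oneSeventh' LIST 'todo' WHERE MOD(INODE, 7) <= dayofweek

    # run daily (Alec mentions cron), passing the ISO weekday 1..7; 'gpfs1' is a placeholder device
    mmapplypolicy gpfs1 -P /root/spread.pol -M dayofweek=$(date +%u) -I defer -f /tmp/spread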
> > skipacl yes > skipaclupdatecheck yes > updatectime yes > > We had similar problems where metadata and ACL updates were interpreted as > data changes by mmbackup/dsmc. > > We also have a case open with IBM where mmbackup will both expire and > backup a file in the same run, even in the absence of mtime changes, but > it's unclear whether that's program error or something with our > include/exclude rules. I'd be curious if you're running into that as well. > > On Wed, Jan 26, 2022 at 03:55:48PM +0000, Paul Ward wrote: > > Good call! > > > > Yes they are dot files. > > > > > > New issue. > > > > Mmbackup seems to be backup up the same files over and over without them > changing: > > areas are being backed up multiple times. > > The example below is a co-resident file, the only thing that has changed > since it was created 20/10/21, is the file has been accessed for backup. > > This file is in the 'changed' list in mmbackup: > > > > This list has just been created: > > -rw-r--r--. 1 root root 6591914 Jan 26 11:12 > > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > > > Listing the last few files in the file (selecting the last one) > > 11:17:52 [root at scale-sk-pn-1 .mmbackupCfg]# tail > > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604556977.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557039.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557102.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557164.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557226.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557288.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557351.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557413.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557476.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > > > Check the file stats (access time just before last backup) > > 11:18:05 [root at scale-sk-pn-1 .mmbackupCfg]# stat > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > File: > '/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png' > > Size: 545 Blocks: 32 IO Block: 4194304 regular file > > Device: 2bh/43d Inode: 212618897 Links: 1 > > Access: (0644/-rw-r--r--) Uid: (1399613896/NHM\edwab) Gid: > (1399647564/NHM\dg-mbl-urban-nature-project-rw) > > Context: unconfined_u:object_r:unlabeled_t:s0 > > Access: 2022-01-25 06:40:58.334961446 +0000 > > Modify: 2020-12-01 15:20:40.122053000 +0000 > > Change: 2021-10-20 17:55:18.265746459 +0100 > > Birth: - > > > > Check if migrated > > 11:18:16 [root at scale-sk-pn-1 .mmbackupCfg]# dsmls > 
"/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > File name : > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > On-line size : 545 > > Used blocks : 16 > > Data Version : 1 > > Meta Version : 1 > > State : Co-resident > > Container Index : 1 > > Base Name : > 34C0B77D20194B0B.EACEB2055F6CAA58.56D56C5F140C8C9D.0000000000000000.2197396D.000000000CAC4E91 > > > > Check if immutable > > 11:18:26 [root at scale-sk-pn-1 .mmbackupCfg]# mstat > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > file name: > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > metadata replication: 2 max 2 > > data replication: 2 max 2 > > immutable: no > > appendOnly: no > > flags: > > storage pool name: data > > fileset name: hpc-workspaces-fset > > snapshot name: > > creation time: Wed Oct 20 17:55:18 2021 > > Misc attributes: ARCHIVE > > Encrypted: no > > > > Check active and inactive backups (it was backed up yesterday) > > 11:18:52 [root at scale-sk-pn-1 .mmbackupCfg]# dsmcqbi > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > IBM Spectrum Protect > > Command Line Backup-Archive Client Interface > > Client Version 8, Release 1, Level 10.0 > > Client date/time: 01/26/2022 11:19:02 > > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights > Reserved. > > > > Node Name: SC-PN-SK-01 > > Session established with server TSM-JERSEY: Windows > > Server Version 8, Release 1, Level 10.100 > > Server date/time: 01/26/2022 11:19:02 Last access: 01/26/2022 > > 11:07:05 > > > > Accessing as node: SCALE > > Size Backup Date Mgmt Class > A/I File > > ---- ----------- ---------- > --- ---- > > 545 B 01/25/2022 06:41:17 DEFAULT > A > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 12/28/2021 21:19:18 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:17:35 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:18:05 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > > > > It will be backed up again shortly, why? > > > > And it was backed up again: > > # dsmcqbi > > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature- > > project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > IBM Spectrum Protect > > Command Line Backup-Archive Client Interface > > Client Version 8, Release 1, Level 10.0 > > Client date/time: 01/26/2022 15:54:09 > > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights > Reserved. 
> > > > Node Name: SC-PN-SK-01 > > Session established with server TSM-JERSEY: Windows > > Server Version 8, Release 1, Level 10.100 > > Server date/time: 01/26/2022 15:54:10 Last access: 01/26/2022 > > 15:30:03 > > > > Accessing as node: SCALE > > Size Backup Date Mgmt Class > A/I File > > ---- ----------- ---------- > --- ---- > > 545 B 01/26/2022 12:23:02 DEFAULT > A > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 12/28/2021 21:19:18 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:17:35 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:18:05 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/25/2022 06:41:17 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > > > > > -----Original Message----- > > From: gpfsug-discuss-bounces at spectrumscale.org > > On Behalf Of Skylar > > Thompson > > Sent: 24 January 2022 15:37 > > To: gpfsug main discussion list > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > Hi Paul, > > > > Did you look for dot files? At least for us on 5.0.5 there's a > .list.1. file while the backups are running: > > > > /gpfs/grc6/.mmbackupCfg/updatedFiles/: > > -r-------- 1 root nickers 6158526821 Jan 23 18:28 .list.1.gpfs-grc6 > > /gpfs/grc6/.mmbackupCfg/expiredFiles/: > > -r-------- 1 root nickers 85862211 Jan 23 18:28 .list.1.gpfs-grc6 > > > > On Mon, Jan 24, 2022 at 02:31:54PM +0000, Paul Ward wrote: > > > Those directories are empty > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > > > > From: gpfsug-discuss-bounces at spectrumscale.org > > > On Behalf Of IBM Spectrum > > > Scale > > > Sent: 22 January 2022 00:35 > > > To: gpfsug main discussion list > > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > > > > Hi Paul, > > > > > > Instead of calculating *.ix.* files, please look at a list file in > these directories. > > > > > > updatedFiles : contains a file that lists all candidates for backup > > > statechFiles : cantains a file that lists all candidates for meta > > > info update expiredFiles : cantains a file that lists all > > > candidates for expiration > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > -------------------------------------------------------------------- > > > -- > > > -------------------------------------------- > > > > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. 
> > > > > > > > > [Inactive hide details for "Paul Ward" ---01/21/2022 09:38:49 > AM---Thank you Right in the command line seems to have worked.]"Paul Ward" > ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to > have worked. > > > > > > From: "Paul Ward" > > > > To: "gpfsug main discussion list" > > > > > org>> > > > Cc: > > > "gpfsug-discuss-bounces at spectrumscale.org > > ce > > > s at spectrumscale.org>" > > > > > ce > > > s at spectrumscale.org>> > > > Date: 01/21/2022 09:38 AM > > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > > Sent > > > by: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > Thank you Right in the command line seems to have worked. At the end > > > of the script I now copy the contents of the .mmbackupCfg folder to > > > a date stamped logging folder Checking how many entries in these files > compared to the Summary: ???????ZjQcmQRYFpfptBannerStart This Message Is > From an External Sender This message came from outside your organization. > > > ZjQcmQRYFpfptBannerEnd > > > Thank you > > > > > > Right in the command line seems to have worked. > > > At the end of the script I now copy the contents of the .mmbackupCfg > > > folder to a date stamped logging folder > > > > > > Checking how many entries in these files compared to the Summary: > > > wc -l mmbackup* > > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > > 754 total > > > From Summary > > > Total number of objects inspected: 755 > > > I can live with a discrepancy of 1. > > > > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > > From Summary > > > Total number of objects expired: 2 > > > That matches > > > > > > wc -l mmbackupC* mmbackupS* > > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > > 752 total > > > Summary: > > > Total number of objects backed up: 751 > > > > > > A difference of 1 I can live with. > > > > > > What does Statech stand for? > > > > > > Just this to sort out: > > > Total number of objects failed: 1 > > > I will add: > > > --tsm-errorlog TSMErrorLogFile > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > > > > From: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > > Sent: 19 January 2022 15:09 > > > To: gpfsug main discussion list > > > > > org>> > > > Cc: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > > > > This is to set environment for mmbackup. 
> > > If mmbackup is invoked within a script, you can set "export > DEBUGmmbackup=2" right above mmbackup command. > > > e.g) in your script > > > .... > > > export DEBUGmmbackup=2 > > > mmbackup .... > > > > > > Or, you can set it in the same command line like > > > DEBUGmmbackup=2 mmbackup .... > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > -------------------------------------------------------------------- > > > -- > > > -------------------------------------------- > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/19/2022 06:04:03 > AM---Thank you. We run a script on all our nodes that checks to se]"Paul > Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our > nodes that checks to see if they are the cluster manager. > > > > > > From: "Paul Ward" > > > > To: "gpfsug main discussion list" > > > > > org>> > > > Cc: > > > "gpfsug-discuss-bounces at spectrumscale.org > > ce > > > s at spectrumscale.org>" > > > > > ce > > > s at spectrumscale.org>> > > > Date: 01/19/2022 06:04 AM > > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > > Sent > > > by: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > > > > Thank you. We run a script on all our nodes that checks to see if > > > they are the cluster manager. If they are, then they take > > > responsibility to start the backup script. The script then randomly > selects one of the available backup nodes and uses ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender This message came from outside your > organization. > > > ZjQcmQRYFpfptBannerEnd > > > Thank you. > > > > > > We run a script on all our nodes that checks to see if they are the > cluster manager. > > > If they are, then they take responsibility to start the backup script. > > > The script then randomly selects one of the available backup nodes and > uses dsmsh mmbackup on it. > > > > > > Where does this command belong? > > > I have seen it listed as a export command, again where should that be > run ? on all backup nodes, or all nodes? > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > > > > From: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > > Sent: 18 January 2022 22:54 > > > To: gpfsug main discussion list > > > > > org>> > > > Cc: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > Hi Paul, > > > > > > If you run mmbackup with "DEBUGmmbackup=2", it keeps all working files > even after successful backup. They are available at MMBACKUP_RECORD_ROOT > (default is FSroot or FilesetRoot directory). 
> > > In .mmbackupCfg directory, there are 3 directories: > > > updatedFiles : contains a file that lists all candidates for backup > > > statechFiles : cantains a file that lists all candidates for meta > > > info update expiredFiles : cantains a file that lists all > > > candidates for expiration > > > > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > -------------------------------------------------------------------- > > > -- > > > -------------------------------------------- > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/18/2022 11:56:40 AM---Hi, > I am trying to work out what files have been sent to back]"Paul Ward" > ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have > been sent to backup using mmbackup. > > > > > > From: "Paul Ward" > > > > To: > > > "gpfsug-discuss at spectrumscale.org > > org>" > > > > > org>> > > > Date: 01/18/2022 11:56 AM > > > Subject: [EXTERNAL] [gpfsug-discuss] mmbackup file selections Sent by: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > > > > > > > Hi, I am trying to work out what files have been sent to backup > > > using mmbackup. I have increased the -L value from 3 up to 6 but > > > only seem to see the files that are in scope, not the ones that are > selected. I can see the three file lists generated ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender This message came from outside your > organization. > > > ZjQcmQRYFpfptBannerEnd > > > Hi, > > > > > > I am trying to work out what files have been sent to backup using > mmbackup. > > > I have increased the -L value from 3 up to 6 but only seem to see the > files that are in scope, not the ones that are selected. > > > > > > I can see the three file lists generated during a backup, but can?t > seem to find a list of what files were backed up. > > > > > > It should be the diff of the shadow and shadow-old, but the wc -l of > the diff doesn?t match the number of files in the backup summary. > > > Wrong assumption? > > > > > > Where should I be looking ? surely it shouldn?t be this hard to see > what files are selected? > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf > > > su > > > g.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.war > > > d% > > > 40nhm.ac.uk%7Cd4c22f3c612c4cb6deb908d9df4fd706%7C73a29c014e78437fa0d > > > 4c > > > 8553e1960c1%7C1%7C0%7C637786356879087616%7CUnknown%7CTWFpbGZsb3d8eyJ > > > WI > > > joiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000 > > > &a > > > mp;sdata=72gqmRJEgZ97s3%2BjmFD12PpfcJJKUVJuyvyJf4beXS8%3D&reserv > > > ed > > > =0 > > gp > > > fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp. 
> > >
> > > _______________________________________________
> > > gpfsug-discuss mailing list
> > > gpfsug-discuss at spectrumscale.org
> > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> >
> > --
> > -- Skylar Thompson (skylar2 at u.washington.edu)
> > -- Genome Sciences Department (UW Medicine), System Administrator
> > -- Foege Building S046, (206)-685-7354
> > -- Pronouns: He/Him/His
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
> --
> -- Skylar Thompson (skylar2 at u.washington.edu)
> -- Genome Sciences Department (UW Medicine), System Administrator
> -- Foege Building S046, (206)-685-7354
> -- Pronouns: He/Him/His
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
>
https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.ward%40nhm.ac.uk%7C6d97e9a0e37c471cae7308d9e57e53d5%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637793154323249334%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=LAVGUD2z%2BD2BcOJkan%2FLiOOlDyH44D5m2YHjIFk62HI%3D&reserved=0 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.ward at nhm.ac.uk Mon Feb 21 12:30:15 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Mon, 21 Feb 2022 12:30:15 +0000 Subject: [gpfsug-discuss] immutable folder Message-ID: HI, I have a folder that I can't delete. IAM mode - non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So can't leave it unchanged... Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From scale at us.ibm.com Mon Feb 21 16:11:37 2022 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Mon, 21 Feb 2022 12:11:37 -0400 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: Hi Paul, Have you tried mmunlinkfileset first? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. 
The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug-discuss at spectrumscale.org" Date: 02/21/2022 07:31 AM Subject: [EXTERNAL] [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 ??????????????????????????????????????????????????????????????????????????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So can?t leave it unchanged? Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: From p.ward at nhm.ac.uk Tue Feb 22 10:30:36 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Tue, 22 Feb 2022 10:30:36 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: Thank you for the suggestion? The fileset is in active use and is backed up using spectrum protect. This is therefore advised against. Was this option suggested to ?close open files? ? The issue is a directory not files. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: 21 February 2022 16:12 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Hi Paul, Have you tried mmunlinkfileset first? 
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug-discuss at spectrumscale.org" > Date: 02/21/2022 07:31 AM Subject: [EXTERNAL] [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 ??????????????????????????????????????????????????????????????????????????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So can?t leave it unchanged? Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From scale at us.ibm.com Tue Feb 22 14:17:00 2022 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 22 Feb 2022 10:17:00 -0400 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. 
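What that suggestion amounts to on the command line is roughly the pair below. This is only a sketch: it applies only if the directory really is the fileset's junction, the device name and junction path are placeholders (the thread never names them), and Paul's concern about unlinking a fileset that is in active use and being backed up still stands.

    # detach the fileset at its junction (fails if files in the fileset are in use unless forced)
    mmunlinkfileset gpfs1 bulk-fset
    # ...do whatever maintenance is needed on the unlinked fileset...
    # re-attach it afterwards; the junction path shown is hypothetical
    mmlinkfileset gpfs1 bulk-fset -J /gpfs1/bulk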
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug main discussion list" Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org Thank you for the suggestion? The fileset is in active use and is backed up using spectrum protect. This is therefore advised against. Was this option suggested to ?close open files? ? The issue is a directory not files. ???????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Thank you for the suggestion? The fileset is in active use and is backed up using spectrum protect. This is therefore advised against. Was this option suggested to ?close open files? ? The issue is a directory not files. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: 21 February 2022 16:12 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Hi Paul, Have you tried mmunlinkfileset first? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug-discuss at spectrumscale.org" < gpfsug-discuss at spectrumscale.org> Date: 02/21/2022 07:31 AM Subject: [EXTERNAL] [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 ??????????????????????????????????????????????????????????????????????????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd HI, I have a folder that I can?t delete. IAM mode ? 
non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So can?t leave it unchanged? Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: From cantrell at astro.gsu.edu Tue Feb 22 17:24:09 2022 From: cantrell at astro.gsu.edu (Justin Cantrell) Date: Tue, 22 Feb 2022 12:24:09 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS Message-ID: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> We're trying to mount multiple mounts at boot up via gpfs. We can mount the main gpfs mount /gpfs1, but would like to mount things like: /home /gpfs1/home /other /gpfs1/other /stuff /gpfs1/stuff But adding that to fstab doesn't work, because from what I understand, that's not how gpfs works with mounts. What's the standard way to accomplish something like this? We've used systemd timers/mounts to accomplish it, but that's not ideal. Is there a way to do this natively with gpfs or does this have to be done through symlinks or gpfs over nfs? From skylar2 at uw.edu Tue Feb 22 17:37:27 2022 From: skylar2 at uw.edu (Skylar Thompson) Date: Tue, 22 Feb 2022 09:37:27 -0800 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> Message-ID: <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> Assuming this is on Linux, you ought to be able to use bind mounts for that, something like this in fstab or equivalent: /home /gpfs1/home bind defaults 0 0 On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > We're trying to mount multiple mounts at boot up via gpfs. 
> We can mount the main gpfs mount /gpfs1, but would like to mount things > like: > /home /gpfs1/home > /other /gpfs1/other > /stuff /gpfs1/stuff > > But adding that to fstab doesn't work, because from what I understand, > that's not how gpfs works with mounts. > What's the standard way to accomplish something like this? > We've used systemd timers/mounts to accomplish it, but that's not ideal. > Is there a way to do this natively with gpfs or does this have to be done > through symlinks or gpfs over nfs? -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His From ulmer at ulmer.org Tue Feb 22 17:50:13 2022 From: ulmer at ulmer.org (Stephen Ulmer) Date: Tue, 22 Feb 2022 12:50:13 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> Message-ID: <3DE42AF3-34F0-4E3D-8813-813ADF85477A@ulmer.org> > On Feb 22, 2022, at 12:24 PM, Justin Cantrell wrote: > > We're trying to mount multiple mounts at boot up via gpfs. > We can mount the main gpfs mount /gpfs1, but would like to mount things like: > /home /gpfs1/home > /other /gpfs1/other > /stuff /gpfs1/stuff > > But adding that to fstab doesn't work, because from what I understand, that's not how gpfs works with mounts. > What's the standard way to accomplish something like this? > We've used systemd timers/mounts to accomplish it, but that's not ideal. > Is there a way to do this natively with gpfs or does this have to be done through symlinks or gpfs over nfs? > What are you really trying to accomplish? Backward compatibility with an older user experience? Making it shorter to type? Matching the path on non-GPFS nodes? -- Stephen From tina.friedrich at it.ox.ac.uk Tue Feb 22 18:12:23 2022 From: tina.friedrich at it.ox.ac.uk (Tina Friedrich) Date: Tue, 22 Feb 2022 18:12:23 +0000 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> Message-ID: <7b8fa26b-bb70-2ba4-0fe4-639ffede6943@it.ox.ac.uk> Bind mounts would definitely work; you can also use the automounter to bind-mount things into place. That's how we do that. E.g. [ ~]$ cat /etc/auto.data /data localhost://mnt/gpfs/bulk/data [ ~]$ cat /etc/auto.master | grep data # data /- /etc/auto.data works very well :) (That automatically bind-mounts it.) You could then also only mount user home directories if they're logged in, instead of showing all of them under /home/. Autofs can do pretty nice wildcarding and such. I would call bind mounting things - regardless of how - a better solution than symlinks, but that might just be my opinion :) Tina On 22/02/2022 17:37, Skylar Thompson wrote: > Assuming this is on Linux, you ought to be able to use bind mounts for > that, something like this in fstab or equivalent: > > /home /gpfs1/home bind defaults 0 0 > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: >> We're trying to mount multiple mounts at boot up via gpfs. 
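[Editor's note: a minimal sketch of the direct-map automounter approach Tina describes above, applied to the /home, /other and /stuff paths from the original question. The map file name /etc/auto.gpfs and the -fstype=bind entries are assumptions for illustration, not her actual configuration; check auto.master(5) for the syntax your autofs version supports.]

cat > /etc/auto.gpfs <<'EOF'
/home   -fstype=bind   :/gpfs1/home
/other  -fstype=bind   :/gpfs1/other
/stuff  -fstype=bind   :/gpfs1/stuff
EOF
# reference the direct map from auto.master, then reload autofs
echo '/-   /etc/auto.gpfs' >> /etc/auto.master
systemctl reload autofs

With autofs the bind happens on first access to each path, so nothing has to be ordered against GPFS start-up at boot.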
>> We can mount the main gpfs mount /gpfs1, but would like to mount things >> like: >> /home /gpfs1/home >> /other /gpfs1/other >> /stuff /gpfs1/stuff >> >> But adding that to fstab doesn't work, because from what I understand, >> that's not how gpfs works with mounts. >> What's the standard way to accomplish something like this? >> We've used systemd timers/mounts to accomplish it, but that's not ideal. >> Is there a way to do this natively with gpfs or does this have to be done >> through symlinks or gpfs over nfs? > -- Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator Research Computing and Support Services IT Services, University of Oxford http://www.arc.ox.ac.uk http://www.it.ox.ac.uk From anacreo at gmail.com Tue Feb 22 18:56:44 2022 From: anacreo at gmail.com (Alec) Date: Tue, 22 Feb 2022 10:56:44 -0800 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> Message-ID: There is a sample script I believe it's called mmfsup. It's a hook that's called at startup of GPFS cluster node. We modify that script to do things such as configure backup ignore lists, update pagepool, and mount GPFS filesystem nodes as appropriate. We basically have a case statement based on class of the node, ie master, client, or primary backup node. Advantage of this is if you do an gpfs stop/start on an already running node things work right... Great in a fire situation... Or if you modify mounts or filesystems... You can call mmfsup say with mmdsh, send verify startup would be right. We started on this path because our backup software default policy would backup GPFS mounts from each node.. so simply adding the ignores at startup from the non backup primary was our solution. We also have mounts that should not be mounted on some nodes, and this handles that very elegantly. Alec On Tue, Feb 22, 2022, 9:37 AM Skylar Thompson wrote: > Assuming this is on Linux, you ought to be able to use bind mounts for > that, something like this in fstab or equivalent: > > /home /gpfs1/home bind defaults 0 0 > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > We're trying to mount multiple mounts at boot up via gpfs. > > We can mount the main gpfs mount /gpfs1, but would like to mount things > > like: > > /home /gpfs1/home > > /other /gpfs1/other > > /stuff /gpfs1/stuff > > > > But adding that to fstab doesn't work, because from what I understand, > > that's not how gpfs works with mounts. > > What's the standard way to accomplish something like this? > > We've used systemd timers/mounts to accomplish it, but that's not ideal. > > Is there a way to do this natively with gpfs or does this have to be done > > through symlinks or gpfs over nfs? > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
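[Editor's note: a minimal sketch of the startup-hook approach Alec describes; he gives the path /var/mmfs/etc/mmfsup later in the thread. The node-class test and the bind mounts themselves are illustrative assumptions rather than his actual script, and because the hook fires as the daemon comes up (possibly before the file systems are mounted) a short wait loop is included.]

#!/bin/bash
# /var/mmfs/etc/mmfsup - run by GPFS on this node once the daemon is up.
# Wait up to ~5 minutes for the file system to appear, then bind-mount.
for i in $(seq 60); do
    mountpoint -q /gpfs1 && break
    sleep 5
done
case $(hostname -s) in
    backup01)   # hypothetical backup primary: skip the bind mounts
        ;;
    *)
        mountpoint -q /home  || mount --bind /gpfs1/home  /home
        mountpoint -q /other || mount --bind /gpfs1/other /other
        mountpoint -q /stuff || mount --bind /gpfs1/stuff /stuff
        ;;
esac
exit 0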
URL: From cantrell at astro.gsu.edu Tue Feb 22 19:23:53 2022 From: cantrell at astro.gsu.edu (Justin Cantrell) Date: Tue, 22 Feb 2022 14:23:53 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> Message-ID: <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> I tried a bind mount, but perhaps I'm doing it wrong. The system fails to boot because gpfs doesn't start until too late in the boot process. In fact, the system boots and the gpfs1 partition isn't available for a good 20-30 seconds. /gfs1/home??? /home??? none???? bind I've tried adding mount options of x-systemd-requires=gpfs1, noauto. The noauto lets it boot, but the mount is never mounted properly. Doing a manual mount -a mounts it. On 2/22/22 12:37, Skylar Thompson wrote: > Assuming this is on Linux, you ought to be able to use bind mounts for > that, something like this in fstab or equivalent: > > /home /gpfs1/home bind defaults 0 0 > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: >> We're trying to mount multiple mounts at boot up via gpfs. >> We can mount the main gpfs mount /gpfs1, but would like to mount things >> like: >> /home /gpfs1/home >> /other /gpfs1/other >> /stuff /gpfs1/stuff >> >> But adding that to fstab doesn't work, because from what I understand, >> that's not how gpfs works with mounts. >> What's the standard way to accomplish something like this? >> We've used systemd timers/mounts to accomplish it, but that's not ideal. >> Is there a way to do this natively with gpfs or does this have to be done >> through symlinks or gpfs over nfs? From skylar2 at uw.edu Tue Feb 22 19:42:45 2022 From: skylar2 at uw.edu (Skylar Thompson) Date: Tue, 22 Feb 2022 11:42:45 -0800 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> Message-ID: <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> Like Tina, we're doing bind mounts in autofs. I forgot that there might be a race condition if you're doing it in fstab. If you're on system with systemd, another option might be to do this directly with systemd.mount rather than let the fstab generator make the systemd.mount units: https://www.freedesktop.org/software/systemd/man/systemd.mount.html You could then set RequiresMountFor=gpfs1.mount in the bind mount unit. On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > I tried a bind mount, but perhaps I'm doing it wrong. The system fails > to boot because gpfs doesn't start until too late in the boot process. > In fact, the system boots and the gpfs1 partition isn't available for a > good 20-30 seconds. > > /gfs1/home??? /home??? none???? bind > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > The noauto lets it boot, but the mount is never mounted properly. Doing > a manual mount -a mounts it. 
> > On 2/22/22 12:37, Skylar Thompson wrote: > > Assuming this is on Linux, you ought to be able to use bind mounts for > > that, something like this in fstab or equivalent: > > > > /home /gpfs1/home bind defaults 0 0 > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > We're trying to mount multiple mounts at boot up via gpfs. > > > We can mount the main gpfs mount /gpfs1, but would like to mount things > > > like: > > > /home /gpfs1/home > > > /other /gpfs1/other > > > /stuff /gpfs1/stuff > > > > > > But adding that to fstab doesn't work, because from what I understand, > > > that's not how gpfs works with mounts. > > > What's the standard way to accomplish something like this? > > > We've used systemd timers/mounts to accomplish it, but that's not ideal. > > > Is there a way to do this natively with gpfs or does this have to be done > > > through symlinks or gpfs over nfs? > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His From cantrell at astro.gsu.edu Tue Feb 22 20:05:58 2022 From: cantrell at astro.gsu.edu (Justin Cantrell) Date: Tue, 22 Feb 2022 15:05:58 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> Message-ID: This is how we're currently solving this problem, with systemd timer and mount. None of the requires seem to work with gpfs since it starts so late. I would like a better solution. Is it normal for gpfs to start so late?? I think it doesn't mount until after the gpfs.service starts, and even then it's 20-30 seconds. On 2/22/22 14:42, Skylar Thompson wrote: > Like Tina, we're doing bind mounts in autofs. I forgot that there might be > a race condition if you're doing it in fstab. If you're on system with systemd, > another option might be to do this directly with systemd.mount rather than > let the fstab generator make the systemd.mount units: > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.freedesktop.org%2Fsoftware%2Fsystemd%2Fman%2Fsystemd.mount.html&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=%2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4%3D&reserved=0 > > You could then set RequiresMountFor=gpfs1.mount in the bind mount unit. > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: >> I tried a bind mount, but perhaps I'm doing it wrong. The system fails >> to boot because gpfs doesn't start until too late in the boot process. >> In fact, the system boots and the gpfs1 partition isn't available for a >> good 20-30 seconds. >> >> /gfs1/home??? /home??? none???? bind >> I've tried adding mount options of x-systemd-requires=gpfs1, noauto. >> The noauto lets it boot, but the mount is never mounted properly. Doing >> a manual mount -a mounts it. 
>> >> On 2/22/22 12:37, Skylar Thompson wrote: >>> Assuming this is on Linux, you ought to be able to use bind mounts for >>> that, something like this in fstab or equivalent: >>> >>> /home /gpfs1/home bind defaults 0 0 >>> >>> On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: >>>> We're trying to mount multiple mounts at boot up via gpfs. >>>> We can mount the main gpfs mount /gpfs1, but would like to mount things >>>> like: >>>> /home /gpfs1/home >>>> /other /gpfs1/other >>>> /stuff /gpfs1/stuff >>>> >>>> But adding that to fstab doesn't work, because from what I understand, >>>> that's not how gpfs works with mounts. >>>> What's the standard way to accomplish something like this? >>>> We've used systemd timers/mounts to accomplish it, but that's not ideal. >>>> Is there a way to do this natively with gpfs or does this have to be done >>>> through symlinks or gpfs over nfs? >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=F4oXAT0zdY%2BS1mR784ZGghUt0G%2F6Ofu36MfJ9WnPsPM%3D&reserved=0 From skylar2 at uw.edu Tue Feb 22 20:12:03 2022 From: skylar2 at uw.edu (Skylar Thompson) Date: Tue, 22 Feb 2022 12:12:03 -0800 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> Message-ID: <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> The problem might be that the service indicates success when mmstartup returns rather than when the mount is actually active (requires quorum checking, arbitration, etc.). A couple tricks I can think of would be using ConditionPathIsMountPoint from systemd.unit[1], or maybe adding a callback[2] that triggers on the mount condition for your filesystem that makes the bind mount rather than systemd. [1] https://www.freedesktop.org/software/systemd/man/systemd.unit.html#ConditionPathIsMountPoint= [2] https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=reference-mmaddcallback-command These are both on our todo list for improving our own GPFS mounting as we have problems with our job scheduler not starting reliably on reboot, but for us we can have Puppet start it on the next run so it just means nodes might not return to service for 30 minutes or so. On Tue, Feb 22, 2022 at 03:05:58PM -0500, Justin Cantrell wrote: > This is how we're currently solving this problem, with systemd timer and > mount. None of the requires seem to work with gpfs since it starts so late. > I would like a better solution. > > Is it normal for gpfs to start so late?? I think it doesn't mount until > after the gpfs.service starts, and even then it's 20-30 seconds. > > > On 2/22/22 14:42, Skylar Thompson wrote: > > Like Tina, we're doing bind mounts in autofs. I forgot that there might be > > a race condition if you're doing it in fstab. 
If you're on system with systemd, > > another option might be to do this directly with systemd.mount rather than > > let the fstab generator make the systemd.mount units: > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.freedesktop.org%2Fsoftware%2Fsystemd%2Fman%2Fsystemd.mount.html&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=%2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4%3D&reserved=0 > > > > You could then set RequiresMountFor=gpfs1.mount in the bind mount unit. > > > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > > > I tried a bind mount, but perhaps I'm doing it wrong. The system fails > > > to boot because gpfs doesn't start until too late in the boot process. > > > In fact, the system boots and the gpfs1 partition isn't available for a > > > good 20-30 seconds. > > > > > > /gfs1/home??? /home??? none???? bind > > > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > > > The noauto lets it boot, but the mount is never mounted properly. Doing > > > a manual mount -a mounts it. > > > > > > On 2/22/22 12:37, Skylar Thompson wrote: > > > > Assuming this is on Linux, you ought to be able to use bind mounts for > > > > that, something like this in fstab or equivalent: > > > > > > > > /home /gpfs1/home bind defaults 0 0 > > > > > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > > > We're trying to mount multiple mounts at boot up via gpfs. > > > > > We can mount the main gpfs mount /gpfs1, but would like to mount things > > > > > like: > > > > > /home /gpfs1/home > > > > > /other /gpfs1/other > > > > > /stuff /gpfs1/stuff > > > > > > > > > > But adding that to fstab doesn't work, because from what I understand, > > > > > that's not how gpfs works with mounts. > > > > > What's the standard way to accomplish something like this? > > > > > We've used systemd timers/mounts to accomplish it, but that's not ideal. > > > > > Is there a way to do this natively with gpfs or does this have to be done > > > > > through symlinks or gpfs over nfs? 
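[Editor's note: a sketch of the callback idea referenced as [2] above. The helper path, the callback name and the use of the %fsName parameter are assumptions; check the mmaddcallback documentation for the exact events and parameter variables supported by your release before copying this.]

cat > /usr/local/sbin/gpfs-bind-mounts <<'EOF'
#!/bin/bash
# Run by the GPFS mount callback; $1 is the name of the file system just mounted.
[ "$1" = "gpfs1" ] || exit 0
mountpoint -q /home  || mount --bind /gpfs1/home  /home
mountpoint -q /other || mount --bind /gpfs1/other /other
mountpoint -q /stuff || mount --bind /gpfs1/stuff /stuff
EOF
chmod +x /usr/local/sbin/gpfs-bind-mounts

# register it so GPFS runs it whenever a file system is mounted on the node
mmaddcallback bindMounts --command /usr/local/sbin/gpfs-bind-mounts --event mount --parms "%fsName"

The helper has to exist at the same path on every node where the callback can run.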
> > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=F4oXAT0zdY%2BS1mR784ZGghUt0G%2F6Ofu36MfJ9WnPsPM%3D&reserved=0 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His From anacreo at gmail.com Tue Feb 22 20:29:29 2022 From: anacreo at gmail.com (Alec) Date: Tue, 22 Feb 2022 12:29:29 -0800 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> Message-ID: The trick for us on AIX in the inittab I have a script fswait.ksh and monitors for the cluster mount point to be available before allowing the cluster dependent startup item (lower in the inittab) I'm pretty sure Linux has a way to define a dependent service.. define a cluster ready service and mark everything else as dependent on that or one of it's descendents. You could simply put the wait on FS in your dependent services start script as an option as well. Lookup systemd and then After= or Part of= if memory serves me right on Linux. For the mmfsup script it goes into /var/mmfs/etc/mmfsup The cluster will call it if present when the node is ready. On Tue, Feb 22, 2022, 12:13 PM Skylar Thompson wrote: > The problem might be that the service indicates success when mmstartup > returns rather than when the mount is actually active (requires quorum > checking, arbitration, etc.). A couple tricks I can think of would be using > ConditionPathIsMountPoint from systemd.unit[1], or maybe adding a > callback[2] that triggers on the mount condition for your filesystem that > makes the bind mount rather than systemd. > > [1] > https://www.freedesktop.org/software/systemd/man/systemd.unit.html#ConditionPathIsMountPoint= > [2] > https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=reference-mmaddcallback-command > > These are both on our todo list for improving our own GPFS mounting as we > have problems with our job scheduler not starting reliably on reboot, but > for us we can have Puppet start it on the next run so it just means nodes > might not return to service for 30 minutes or so. > > On Tue, Feb 22, 2022 at 03:05:58PM -0500, Justin Cantrell wrote: > > This is how we're currently solving this problem, with systemd timer and > > mount. None of the requires seem to work with gpfs since it starts so > late. > > I would like a better solution. > > > > Is it normal for gpfs to start so late?? I think it doesn't mount until > > after the gpfs.service starts, and even then it's 20-30 seconds. 
> > > > > > On 2/22/22 14:42, Skylar Thompson wrote: > > > Like Tina, we're doing bind mounts in autofs. I forgot that there > might be > > > a race condition if you're doing it in fstab. If you're on system with > systemd, > > > another option might be to do this directly with systemd.mount rather > than > > > let the fstab generator make the systemd.mount units: > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.freedesktop.org%2Fsoftware%2Fsystemd%2Fman%2Fsystemd.mount.html&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=%2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4%3D&reserved=0 > > > > > > You could then set RequiresMountFor=gpfs1.mount in the bind mount unit. > > > > > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > > > > I tried a bind mount, but perhaps I'm doing it wrong. The system > fails > > > > to boot because gpfs doesn't start until too late in the boot > process. > > > > In fact, the system boots and the gpfs1 partition isn't available > for a > > > > good 20-30 seconds. > > > > > > > > /gfs1/home??? /home??? none???? bind > > > > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > > > > The noauto lets it boot, but the mount is never mounted properly. > Doing > > > > a manual mount -a mounts it. > > > > > > > > On 2/22/22 12:37, Skylar Thompson wrote: > > > > > Assuming this is on Linux, you ought to be able to use bind mounts > for > > > > > that, something like this in fstab or equivalent: > > > > > > > > > > /home /gpfs1/home bind defaults 0 0 > > > > > > > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > > > > We're trying to mount multiple mounts at boot up via gpfs. > > > > > > We can mount the main gpfs mount /gpfs1, but would like to mount > things > > > > > > like: > > > > > > /home /gpfs1/home > > > > > > /other /gpfs1/other > > > > > > /stuff /gpfs1/stuff > > > > > > > > > > > > But adding that to fstab doesn't work, because from what I > understand, > > > > > > that's not how gpfs works with mounts. > > > > > > What's the standard way to accomplish something like this? > > > > > > We've used systemd timers/mounts to accomplish it, but that's > not ideal. > > > > > > Is there a way to do this natively with gpfs or does this have > to be done > > > > > > through symlinks or gpfs over nfs? 
> > > > _______________________________________________ > > > > gpfsug-discuss mailing list > > > > gpfsug-discuss at spectrumscale.org > > > > > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=F4oXAT0zdY%2BS1mR784ZGghUt0G%2F6Ofu36MfJ9WnPsPM%3D&reserved=0 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From malone12 at illinois.edu Tue Feb 22 20:21:43 2022 From: malone12 at illinois.edu (Maloney, J.D.) Date: Tue, 22 Feb 2022 20:21:43 +0000 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> Message-ID: Our Puppet/Ansible GPFS modules/playbooks handle this sequencing for us (we use bind mounts for things like u, projects, and scratch also). Like Skylar mentioned page pool allocation, quorum checking, and cluster arbitration have to come before a mount of the FS so that time you mentioned doesn?t seem totally off to me. We just make the creation of the bind mounts dependent on the actual GPFS mount occurring in the configuration management tooling which has worked out well for us in that regard. Best, J.D. Maloney Sr. HPC Storage Engineer | Storage Enabling Technologies Group National Center for Supercomputing Applications (NCSA) From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Skylar Thompson Date: Tuesday, February 22, 2022 at 2:13 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] How to do multiple mounts via GPFS The problem might be that the service indicates success when mmstartup returns rather than when the mount is actually active (requires quorum checking, arbitration, etc.). A couple tricks I can think of would be using ConditionPathIsMountPoint from systemd.unit[1], or maybe adding a callback[2] that triggers on the mount condition for your filesystem that makes the bind mount rather than systemd. 
[1] https://urldefense.com/v3/__https://www.freedesktop.org/software/systemd/man/systemd.unit.html*ConditionPathIsMountPoint=__;Iw!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv4xJQwzZ$ [2] https://urldefense.com/v3/__https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=reference-mmaddcallback-command__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv3f90Gia$ These are both on our todo list for improving our own GPFS mounting as we have problems with our job scheduler not starting reliably on reboot, but for us we can have Puppet start it on the next run so it just means nodes might not return to service for 30 minutes or so. On Tue, Feb 22, 2022 at 03:05:58PM -0500, Justin Cantrell wrote: > This is how we're currently solving this problem, with systemd timer and > mount. None of the requires seem to work with gpfs since it starts so late. > I would like a better solution. > > Is it normal for gpfs to start so late?? I think it doesn't mount until > after the gpfs.service starts, and even then it's 20-30 seconds. > > > On 2/22/22 14:42, Skylar Thompson wrote: > > Like Tina, we're doing bind mounts in autofs. I forgot that there might be > > a race condition if you're doing it in fstab. If you're on system with systemd, > > another option might be to do this directly with systemd.mount rather than > > let the fstab generator make the systemd.mount units: > > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=https*3A*2F*2Fwww.freedesktop.org*2Fsoftware*2Fsystemd*2Fman*2Fsystemd.mount.html&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=*2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv0tqF9rU$ > > > > You could then set RequiresMountFor=gpfs1.mount in the bind mount unit. > > > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > > > I tried a bind mount, but perhaps I'm doing it wrong. The system fails > > > to boot because gpfs doesn't start until too late in the boot process. > > > In fact, the system boots and the gpfs1 partition isn't available for a > > > good 20-30 seconds. > > > > > > /gfs1/home??? /home??? none???? bind > > > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > > > The noauto lets it boot, but the mount is never mounted properly. Doing > > > a manual mount -a mounts it. > > > > > > On 2/22/22 12:37, Skylar Thompson wrote: > > > > Assuming this is on Linux, you ought to be able to use bind mounts for > > > > that, something like this in fstab or equivalent: > > > > > > > > /home /gpfs1/home bind defaults 0 0 > > > > > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > > > We're trying to mount multiple mounts at boot up via gpfs. > > > > > We can mount the main gpfs mount /gpfs1, but would like to mount things > > > > > like: > > > > > /home /gpfs1/home > > > > > /other /gpfs1/other > > > > > /stuff /gpfs1/stuff > > > > > > > > > > But adding that to fstab doesn't work, because from what I understand, > > > > > that's not how gpfs works with mounts. > > > > > What's the standard way to accomplish something like this? > > > > > We've used systemd timers/mounts to accomplish it, but that's not ideal. 
> > > > > Is there a way to do this natively with gpfs or does this have to be done > > > > > through symlinks or gpfs over nfs? > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=http*3A*2F*2Fgpfsug.org*2Fmailman*2Flistinfo*2Fgpfsug-discuss&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=F4oXAT0zdY*2BS1mR784ZGghUt0G*2F6Ofu36MfJ9WnPsPM*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv5uX7C9S$ > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From cantrell at astro.gsu.edu Tue Feb 22 22:07:47 2022 From: cantrell at astro.gsu.edu (Justin Cantrell) Date: Tue, 22 Feb 2022 17:07:47 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> Message-ID: I'd love to see your fstab to see how you're doing that bind mount. Do you use systemd? What cluster manager are you using? On 2/22/22 15:21, Maloney, J.D. wrote: > > Our Puppet/Ansible GPFS modules/playbooks handle this sequencing for > us (we use bind mounts for things like u, projects, and scratch > also).? Like Skylar mentioned page pool allocation, quorum checking, > and cluster arbitration have to come before a mount of the FS so that > time you mentioned doesn?t seem totally off to me. ?We just make the > creation of the bind mounts dependent on the actual GPFS mount > occurring in the configuration management tooling which has worked out > well for us in that regard. > > Best, > > J.D. Maloney > > Sr. HPC Storage Engineer | Storage Enabling Technologies Group > > National Center for Supercomputing Applications (NCSA) > > *From: *gpfsug-discuss-bounces at spectrumscale.org > on behalf of Skylar > Thompson > *Date: *Tuesday, February 22, 2022 at 2:13 PM > *To: *gpfsug-discuss at spectrumscale.org > *Subject: *Re: [gpfsug-discuss] How to do multiple mounts via GPFS > > The problem might be that the service indicates success when mmstartup > returns rather than when the mount is actually active (requires quorum > checking, arbitration, etc.). 
A couple tricks I can think of would be > using > ConditionPathIsMountPoint from systemd.unit[1], or maybe adding a > callback[2] that triggers on the mount condition for your filesystem that > makes the bind mount rather than systemd. > > [1] > https://urldefense.com/v3/__https://www.freedesktop.org/software/systemd/man/systemd.unit.html*ConditionPathIsMountPoint=__;Iw!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv4xJQwzZ$ > > > [2] > https://urldefense.com/v3/__https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=reference-mmaddcallback-command__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv3f90Gia$ > > > > These are both on our todo list for improving our own GPFS mounting as we > have problems with our job scheduler not starting reliably on reboot, but > for us we can have Puppet start it on the next run so it just means nodes > might not return to service for 30 minutes or so. > > On Tue, Feb 22, 2022 at 03:05:58PM -0500, Justin Cantrell wrote: > > This is how we're currently solving this problem, with systemd timer and > > mount. None of the requires seem to work with gpfs since it starts > so late. > > I would like a better solution. > > > > Is it normal for gpfs to start so late?? I think it doesn't mount until > > after the gpfs.service starts, and even then it's 20-30 seconds. > > > > > > On 2/22/22 14:42, Skylar Thompson wrote: > > > Like Tina, we're doing bind mounts in autofs. I forgot that there > might be > > > a race condition if you're doing it in fstab. If you're on system > with systemd, > > > another option might be to do this directly with systemd.mount > rather than > > > let the fstab generator make the systemd.mount units: > > > > > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=https*3A*2F*2Fwww.freedesktop.org*2Fsoftware*2Fsystemd*2Fman*2Fsystemd.mount.html&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=*2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv0tqF9rU$ > > > > > > > > You could then set RequiresMountFor=gpfs1.mount in the bind mount > unit. > > > > > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > > > > I tried a bind mount, but perhaps I'm doing it wrong. The system > fails > > > > to boot because gpfs doesn't start until too late in the boot > process. > > > > In fact, the system boots and the gpfs1 partition isn't > available for a > > > > good 20-30 seconds. > > > > > > > > /gfs1/home??? /home??? none???? bind > > > > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > > > > The noauto lets it boot, but the mount is never mounted > properly. Doing > > > > a manual mount -a mounts it. > > > > > > > > On 2/22/22 12:37, Skylar Thompson wrote: > > > > > Assuming this is on Linux, you ought to be able to use bind > mounts for > > > > > that, something like this in fstab or equivalent: > > > > > > > > > > /home /gpfs1/home bind defaults 0 0 > > > > > > > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > > > > We're trying to mount multiple mounts at boot up via gpfs. 
> > > > > > We can mount the main gpfs mount /gpfs1, but would like to > mount things > > > > > > like: > > > > > > /home /gpfs1/home > > > > > > /other /gpfs1/other > > > > > > /stuff /gpfs1/stuff > > > > > > > > > > > > But adding that to fstab doesn't work, because from what I > understand, > > > > > > that's not how gpfs works with mounts. > > > > > > What's the standard way to accomplish something like this? > > > > > > We've used systemd timers/mounts to accomplish it, but > that's not ideal. > > > > > > Is there a way to do this natively with gpfs or does this > have to be done > > > > > > through symlinks or gpfs over nfs? > > > > _______________________________________________ > > > > gpfsug-discuss mailing list > > > > gpfsug-discuss at spectrumscale.org > > > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=http*3A*2F*2Fgpfsug.org*2Fmailman*2Flistinfo*2Fgpfsug-discuss&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=F4oXAT0zdY*2BS1mR784ZGghUt0G*2F6Ofu36MfJ9WnPsPM*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv5uX7C9S$ > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ > > > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From NSCHULD at de.ibm.com Wed Feb 23 07:01:45 2022 From: NSCHULD at de.ibm.com (Norbert Schuld) Date: Wed, 23 Feb 2022 09:01:45 +0200 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu><20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu><34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu><20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> Message-ID: May I point out some additional systemd targets documented here: https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=gpfs-planning-systemd Depending on the need the gpfs-wait-mount.service could be helpful as an "after" clause for other units. An example is provided in /usr/lpp/mmfs/samples/systemd.service.sample Kind regards Norbert Schuld IBM Spectrum Scale Software Development -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
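[Editor's note: a sketch of how the gpfs-wait-mount.service ordering Norbert mentions could drive the bind mount discussed earlier in the thread. The unit below is an assumption modelled on the idea of the shipped sample, not a copy of /usr/lpp/mmfs/samples/systemd.service.sample; systemd requires the unit file name to match the mount point, so /home is served by home.mount.]

cat > /etc/systemd/system/home.mount <<'EOF'
[Unit]
Description=Bind /gpfs1/home onto /home once GPFS file systems are mounted
Requires=gpfs-wait-mount.service
After=gpfs-wait-mount.service

[Mount]
What=/gpfs1/home
Where=/home
Type=none
Options=bind

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now home.mount

Repeating the pattern for /other and /stuff just means two more .mount units with the matching names.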
From p.ward at nhm.ac.uk Wed Feb 23 11:03:37 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 11:03:37 +0000 Subject: Re: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: It's not a fileset, it's just a folder, well a subfolder... [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It's the "Nick Foster's sample" folder I want to delete, but it says it is immutable and I can't disable that. I suspect it's the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thank you for the suggestion... The fileset is in active use and is backed up using Spectrum Protect. This is therefore advised against. Was this option suggested to 'close open files'? The issue is a directory, not files. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale Sent: 21 February 2022 16:12 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder Hi Paul, Have you tried mmunlinkfileset first? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.
If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug-discuss at spectrumscale.org" > Date: 02/21/2022 07:31 AM Subject: [EXTERNAL] [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I have a folder that I can't delete. IAM mode - non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So I can't leave it unchanged... Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From juergen.hannappel at desy.de Wed Feb 23 11:49:09 2022 From: juergen.hannappel at desy.de (Hannappel, Juergen) Date: Wed, 23 Feb 2022 12:49:09 +0100 (CET) Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name > From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 > Subject: Re: [gpfsug-discuss] immutable folder > Its not a fileset, its just a folder, well a subfolder? > [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact > experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick > Foster's sample > It?s the ?Nick Foster's sample? folder I want to delete, but it says it is > immutable and I can?t disable that. > I suspect it?s the apostrophe confusing things. > Kindest regards, > Paul > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: [ mailto:p.ward at nhm.ac.uk | p.ward at nhm.ac.uk ] > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale > Sent: 22 February 2022 14:17 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder > Scale disallows deleting fileset junction using rmdir, so I suggested > mmunlinkfileset. > Regards, The Spectrum Scale (GPFS) team > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale (GPFS), > then please post it to the public IBM developerWroks Forum at [ > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=04%7C01%7Cp.ward%40nhm.ac.uk%7Cbd72c8c2ee3d49f619c908d9f60e0732%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637811363409593169%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=XoY%2BAbA5%2FNBwuoJrY12MNurjJrp8KMsV1t63hdItfiM%3D&reserved=0 > | > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > ] . > If your query concerns a potential software error in Spectrum Scale (GPFS) and > you have an IBM software maintenance contract please contact 1-800-237-5511 in > the United States or your local IBM Service Center in other countries. > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > From: "Paul Ward" < [ mailto:p.ward at nhm.ac.uk | p.ward at nhm.ac.uk ] > > To: "gpfsug main discussion list" < [ mailto:gpfsug-discuss at spectrumscale.org | > gpfsug-discuss at spectrumscale.org ] > > Date: 02/22/2022 05:31 AM > Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder > Sent by: [ mailto:gpfsug-discuss-bounces at spectrumscale.org | > gpfsug-discuss-bounces at spectrumscale.org ] > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From p.ward at nhm.ac.uk Wed Feb 23 12:17:15 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 12:17:15 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> References: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: Thanks, I couldn't recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory 'it/stu'pid name': No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name ________________________________ From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder... [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It's the "Nick Foster's sample" folder I want to delete, but it says it is immutable and I can't disable that. I suspect it's the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. 
From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From stockf at us.ibm.com Wed Feb 23 12:51:26 2022 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 23 Feb 2022 12:51:26 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: , <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.image001.jpg at 01D828AF.49A09C40.jpg Type: image/jpeg Size: 5356 bytes Desc: not available URL: From p.ward at nhm.ac.uk Wed Feb 23 13:52:20 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 13:52:20 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: , <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: 5.1.1-1 Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Frederick Stock Sent: 23 February 2022 12:51 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] immutable folder Paul, what version of Spectrum Scale are you using? Fred _______________________________________________________ Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 stockf at us.ibm.com ----- Original message ----- From: "Paul Ward" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Date: Wed, Feb 23, 2022 7:17 AM Thanks, I couldn't recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory 'it/stu'pid name': No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawingDescription automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name ________________________________ From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder... 
[filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It's the "Nick Foster's sample" folder I want to delete, but it says it is immutable and I can't disable that. I suspect it's the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawingDescription automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From julian.jakobs at cec.mpg.de Wed Feb 23 13:48:10 2022 From: julian.jakobs at cec.mpg.de (Jakobs, Julian) Date: Wed, 23 Feb 2022 13:48:10 +0000 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> Message-ID: <67f997e15dc040d2900b2e1f9295dec0@cec.mpg.de> I've ran into the same problem some time ago. What worked for me was this shell script I run as a @reboot cronjob: #!/bin/bash while [ ! -d /gpfs1/home ] do sleep 5 done mount --bind /gpfs1/home /home -----Urspr?ngliche Nachricht----- Von: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] Im Auftrag von Justin Cantrell Gesendet: Dienstag, 22. Februar 2022 20:24 An: gpfsug-discuss at spectrumscale.org Betreff: Re: [gpfsug-discuss] How to do multiple mounts via GPFS I tried a bind mount, but perhaps I'm doing it wrong. The system fails to boot because gpfs doesn't start until too late in the boot process. 
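(One way around this start-up ordering problem that is sometimes suggested - untested here, so treat it as a sketch - is to let systemd create an automount point and perform the bind mount lazily on first access instead of at boot time, e.g. an /etc/fstab entry along the lines of:

/gpfs1/home   /home   none   bind,nofail,x-systemd.automount,x-systemd.requires=gpfs.service   0 0

The x-systemd.* options are standard systemd fstab extensions; the automount defers the actual bind mount until /home is first touched, by which time the GPFS mount is normally up.)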
In fact, the system boots and the gpfs1 partition isn't available for a good 20-30 seconds.

/gfs1/home /home none bind

I've tried adding mount options of x-systemd-requires=gpfs1, noauto. The noauto lets it boot, but the mount is never mounted properly. Doing a manual mount -a mounts it.

On 2/22/22 12:37, Skylar Thompson wrote:
> Assuming this is on Linux, you ought to be able to use bind mounts for
> that, something like this in fstab or equivalent:
>
> /home /gpfs1/home bind defaults 0 0
>
> On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote:
>> We're trying to mount multiple mounts at boot up via gpfs.
>> We can mount the main gpfs mount /gpfs1, but would like to mount things like:
>> /home /gpfs1/home
>> /other /gpfs1/other
>> /stuff /gpfs1/stuff
>>
>> But adding that to fstab doesn't work, because from what I understand, that's not how gpfs works with mounts.
>> What's the standard way to accomplish something like this?
>> We've used systemd timers/mounts to accomplish it, but that's not ideal.
>> Is there a way to do this natively with gpfs or does this have to be done through symlinks or gpfs over nfs?

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 6777 bytes
Desc: not available
URL:

From scale at us.ibm.com Wed Feb 23 14:57:24 2022
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Wed, 23 Feb 2022 10:57:24 -0400
Subject: [gpfsug-discuss] immutable folder
In-Reply-To:
References: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de>
Message-ID:

Your directory is under a fileset with a non-compliant IAM mode. With the fileset in that mode, it follows the SnapLock protocol - it disallows changing a subdirectory to immutable, but allows changing a subdirectory back to mutable.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From: "Paul Ward"
To: "gpfsug main discussion list"
Date: 02/23/2022 07:17 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Thanks, I couldn't recreate that test:

# mkdir "it/stu'pid name"
mkdir: cannot create directory 'it/stu'pid name': No such file or directory

[Removing the / ]
# mkdir "itstu'pid name"
# mmchattr -i yes itstu\'pid\ name/
itstu'pid name/: Change immutable flag failed: Invalid argument.
Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name From: "Paul Ward" To: "gpfsug main discussion list" Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder? [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It?s the ?Nick Foster's sample? folder I want to delete, but it says it is immutable and I can?t disable that. I suspect it?s the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk From: gpfsug-discuss-bounces at spectrumscale.org < gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug main discussion list" Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: From p.ward at nhm.ac.uk Wed Feb 23 16:35:14 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 16:35:14 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: Its not allowing me! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: 23 February 2022 14:57 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Your directory is under a fileset with non-compliant iam mode. With fileset in that mode, it follows snapLock protocol - it disallows changing subdir to immutable, but allows changing subdir to mutable. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/23/2022 07:17 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thanks, I couldn?t recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory ?it/stu'pid name?: No such file or directory [Removing the / ] # mkdir "itstu'pid name" ??????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Thanks, I couldn?t recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory ?it/stu'pid name?: No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From:gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name ________________________________ From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder? 
[filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample

It's the "Nick Foster's sample" folder I want to delete, but it says it is immutable and I can't disable that. I suspect it's the apostrophe confusing things.

Kindest regards,
Paul

Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.ward at nhm.ac.uk

From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale
Sent: 22 February 2022 14:17
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] immutable folder

Scale disallows deleting a fileset junction using rmdir, so I suggested mmunlinkfileset.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From: "Paul Ward"
To: "gpfsug main discussion list"
Date: 02/22/2022 05:31 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder
Sent by: gpfsug-discuss-bounces at spectrumscale.org

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 5356 bytes
Desc: image001.jpg
URL:

From uwe.falke at kit.edu Wed Feb 23 18:26:50 2022
From: uwe.falke at kit.edu (Uwe Falke)
Date: Wed, 23 Feb 2022 19:26:50 +0100
Subject: [gpfsug-discuss] IO sizes
Message-ID:

Dear all,

sorry for asking a question which seems not directly GPFS related:

In a setup with 4 NSD servers (old-style, with storage controllers in the back end), 12 clients and 10 Seagate storage systems, I do see in benchmark tests that just one of the NSD servers sends smaller IO requests to the storage than the other 3 (that is, both reads and writes are smaller).

The NSD servers form 2 pairs; each pair is connected to 5 Seagate boxes (one server to the A controllers, the other one to the B controllers of the Seagates, resp.).

All 4 NSD servers are set up similarly:

kernel: 3.10.0-1160.el7.x86_64 #1 SMP

HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx

driver: mpt3sas 31.100.01.00

max_sectors_kb=8192 (max_hw_sectors_kb=16383, not 16384, as limited by mpt3sas) for all sd devices and all multipath (dm) devices built on top.
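(For reference, these limits were read straight out of sysfs, with a small loop roughly like the following - a sketch, the device globs need adapting to the local naming:

for q in /sys/block/sd*/queue /sys/block/dm-*/queue; do
    # print the configured and the hardware limit for each block device queue
    echo "$q: max_sectors_kb=$(cat $q/max_sectors_kb) max_hw_sectors_kb=$(cat $q/max_hw_sectors_kb)"
done
)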
scheduler: deadline multipath (actually we do have 3 paths to each volume, so there is some asymmetry, but that should not affect the IOs, shouldn't it?, and if it did we would see the same effect in both pairs of NSD servers, but we do not). All 4 storage systems are also configured the same way (2 disk groups / pools / declustered arrays, one managed by? ctrl A, one by ctrl B,? and 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. The first question I have - but that is not my main one: I do see, both in iostat and on the storage systems, that the default IO requests are about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb is really in terms of kiB, not sectors, cf. https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. I have to suspect that this will (in writing to the storage) cause incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as the controller is not able to re-coalesce the data properly; and it seems it cannot do it completely at least) If someone of you has seen that already and/or knows a potential explanation I'd be glad to learn about. And if some of you wonder: yes, I (was) moved away from IBM and am now at KIT. Many thanks in advance Uwe -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5814 bytes Desc: S/MIME Cryptographic Signature URL: From alex at calicolabs.com Wed Feb 23 18:39:07 2022 From: alex at calicolabs.com (Alex Chekholko) Date: Wed, 23 Feb 2022 10:39:07 -0800 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: > Dear all, > > sorry for asking a question which seems not directly GPFS related: > > In a setup with 4 NSD servers (old-style, with storage controllers in > the back end), 12 clients and 10 Seagate storage systems, I do see in > benchmark tests that just one of the NSD servers does send smaller IO > requests to the storage than the other 3 (that is, both reads and > writes are smaller). > > The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes > ( one server to the controllers A, the other one to controllers B of the > Seagates, resp.). > > All 4 NSD servers are set up similarly: > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > driver : mpt3sas 31.100.01.00 > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by > mpt3sas) for all sd devices and all multipath (dm) devices built on top. 
> > scheduler: deadline > > multipath (actually we do have 3 paths to each volume, so there is some > asymmetry, but that should not affect the IOs, shouldn't it?, and if it > did we would see the same effect in both pairs of NSD servers, but we do > not). > > All 4 storage systems are also configured the same way (2 disk groups / > pools / declustered arrays, one managed by ctrl A, one by ctrl B, and > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > The first question I have - but that is not my main one: I do see, both > in iostat and on the storage systems, that the default IO requests are > about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb > is really in terms of kiB, not sectors, cf. > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). > > But what puzzles me even more: one of the server compiles IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and > writes ... I just cannot see why. > > I have to suspect that this will (in writing to the storage) cause > incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as > the controller is not able to re-coalesce the data properly; and it > seems it cannot do it completely at least) > > > If someone of you has seen that already and/or knows a potential > explanation I'd be glad to learn about. > > > And if some of you wonder: yes, I (was) moved away from IBM and am now > at KIT. > > Many thanks in advance > > Uwe > > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > > Uwe Falke > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > > Tel: +49 721 608 28024 > Email: uwe.falke at kit.edu > www.scc.kit.edu > > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > KIT ? The Research University in the Helmholtz Association > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Wed Feb 23 21:20:11 2022 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Wed, 23 Feb 2022 21:20:11 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: Message-ID: Alex, Metadata will be 4Kib Depending on the filesystem version you will also have subblocks to consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata and data block size is the same) My first question would be is ? Are you sure that Linux OS is configured the same on all 4 NSD servers?. My second question would be do you know what your average file size is if most of your files are smaller than your filesystem block size, then you are always going to be performing writes using groups of subblocks rather than a full block writes. Regards, Andrew > On 24 Feb 2022, at 04:39, Alex Chekholko wrote: > > ? > This Message Is From an External Sender > This message came from outside your organization. > Hi, > > Metadata I/Os will always be smaller than the usual data block size, right? > Which version of GPFS? 
> > Regards, > Alex > >> On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: >> Dear all, >> >> sorry for asking a question which seems not directly GPFS related: >> >> In a setup with 4 NSD servers (old-style, with storage controllers in >> the back end), 12 clients and 10 Seagate storage systems, I do see in >> benchmark tests that just one of the NSD servers does send smaller IO >> requests to the storage than the other 3 (that is, both reads and >> writes are smaller). >> >> The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes >> ( one server to the controllers A, the other one to controllers B of the >> Seagates, resp.). >> >> All 4 NSD servers are set up similarly: >> >> kernel: 3.10.0-1160.el7.x86_64 #1 SMP >> >> HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx >> >> driver : mpt3sas 31.100.01.00 >> >> max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by >> mpt3sas) for all sd devices and all multipath (dm) devices built on top. >> >> scheduler: deadline >> >> multipath (actually we do have 3 paths to each volume, so there is some >> asymmetry, but that should not affect the IOs, shouldn't it?, and if it >> did we would see the same effect in both pairs of NSD servers, but we do >> not). >> >> All 4 storage systems are also configured the same way (2 disk groups / >> pools / declustered arrays, one managed by ctrl A, one by ctrl B, and >> 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). >> >> >> GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO >> requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. >> >> The first question I have - but that is not my main one: I do see, both >> in iostat and on the storage systems, that the default IO requests are >> about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb >> is really in terms of kiB, not sectors, cf. >> https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). >> >> But what puzzles me even more: one of the server compiles IOs even >> smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and >> writes ... I just cannot see why. >> >> I have to suspect that this will (in writing to the storage) cause >> incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as >> the controller is not able to re-coalesce the data properly; and it >> seems it cannot do it completely at least) >> >> >> If someone of you has seen that already and/or knows a potential >> explanation I'd be glad to learn about. >> >> >> And if some of you wonder: yes, I (was) moved away from IBM and am now >> at KIT. >> >> Many thanks in advance >> >> Uwe >> >> >> -- >> Karlsruhe Institute of Technology (KIT) >> Steinbuch Centre for Computing (SCC) >> Scientific Data Management (SDM) >> >> Uwe Falke >> >> Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 >> D-76344 Eggenstein-Leopoldshafen >> >> Tel: +49 721 608 28024 >> Email: uwe.falke at kit.edu >> www.scc.kit.edu >> >> Registered office: >> Kaiserstra?e 12, 76131 Karlsruhe, Germany >> >> KIT ? The Research University in the Helmholtz Association >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From uwe.falke at kit.edu Thu Feb 24 01:03:32 2022 From: uwe.falke at kit.edu (Uwe Falke) Date: Thu, 24 Feb 2022 02:03:32 +0100 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi, the test bench is gpfsperf running on up to 12 clients with 1...64 threads doing sequential reads and writes , file size per gpfsperf process is 12TB (with 6TB I saw caching effects in particular for large thread numbers ...) As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data disks, as expected in that case. Interesting thing though: I have rebooted the suspicious node. Now, it does not issue smaller IOs than the others, but -- unbelievable -- larger ones (up to about 4.7MiB). This is still harmful as also that size is incompatible with full stripe writes on the storage ( 8+2 disk groups, i.e. logically RAID6) Currently, I draw this information from the storage boxes; I have not yet checked iostat data for that benchmark test after the reboot (before, when IO sizes were smaller, we saw that both in iostat and in the perf data retrieved from the storage controllers). And: we have a separate data pool , hence dataOnly NSDs, I am just talking about these ... As for "Are you sure that Linux OS is configured the same on all 4 NSD servers?." - of course there are not two boxes identical in the world. I have actually not installed those machines, and, yes, i also considered reinstalling them (or at least the disturbing one). However, I do not have reason to assume or expect a difference, the supplier has just implemented these systems? recently from scratch. In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs smaller than 4MiB again at some point, so that is not a nice solution. Thanks Uwe On 23.02.22 22:20, Andrew Beattie wrote: > Alex, > > Metadata will be 4Kib > > Depending on the filesystem version you will also have subblocks to > consider V4 filesystems have 1/32 subblocks, V5 filesystems have > 1/1024 subblocks (assuming metadata and data block size is the same) > > My first question would be is ? Are you sure that Linux OS is > configured the same on all 4 NSD servers?. > > My second question would be do you know what your average file size is > if most of your files are smaller than your filesystem block size, > then you are always going to be performing writes using groups of > subblocks rather than a full block writes. > > Regards, > > Andrew > > >> On 24 Feb 2022, at 04:39, Alex Chekholko wrote: >> >> ? Hi, Metadata I/Os will always be smaller than the usual data block >> size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, >> 2022 at 10:26 AM Uwe Falke wrote: Dear all, sorry >> for asking a question which seems ZjQcmQRYFpfptBannerStart >> This Message Is From an External Sender >> This message came from outside your organization. >> ZjQcmQRYFpfptBannerEnd >> Hi, >> >> Metadata I/Os will always be smaller than the usual data block size, >> right? >> Which version of GPFS? >> >> Regards, >> Alex >> >> On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: >> >> Dear all, >> >> sorry for asking a question which seems not directly GPFS related: >> >> In a setup with 4 NSD servers (old-style, with storage >> controllers in >> the back end), 12 clients and 10 Seagate storage systems, I do >> see in >> benchmark tests that? just one of the NSD servers does send >> smaller IO >> requests to the storage? 
than the other 3 (that is, both reads and >> writes are smaller). >> >> The NSD servers form 2 pairs, each pair is connected to 5 seagate >> boxes >> ( one server to the controllers A, the other one to controllers B >> of the >> Seagates, resp.). >> >> All 4 NSD servers are set up similarly: >> >> kernel: 3.10.0-1160.el7.x86_64 #1 SMP >> >> HBA:?Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx >> >> driver : mpt3sas 31.100.01.00 >> >> max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as >> limited by >> mpt3sas) for all sd devices and all multipath (dm) devices built >> on top. >> >> scheduler: deadline >> >> multipath (actually we do have 3 paths to each volume, so there >> is some >> asymmetry, but that should not affect the IOs, shouldn't it?, and >> if it >> did we would see the same effect in both pairs of NSD servers, >> but we do >> not). >> >> All 4 storage systems are also configured the same way (2 disk >> groups / >> pools / declustered arrays, one managed by? ctrl A, one by ctrl >> B,? and >> 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). >> >> >> GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO >> requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. >> >> The first question I have - but that is not my main one: I do >> see, both >> in iostat and on the storage systems, that the default IO >> requests are >> about 4MiB, not 8MiB as I'd expect from above settings >> (max_sectors_kb >> is really in terms of kiB, not sectors, cf. >> https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). >> >> But what puzzles me even more: one of the server compiles IOs even >> smaller, varying between 3.2MiB and 3.6MiB mostly - both for >> reads and >> writes ... I just cannot see why. >> >> I have to suspect that this will (in writing to the storage) cause >> incomplete stripe writes on our erasure-coded volumes (8+2p)(as >> long as >> the controller is not able to re-coalesce the data properly; and it >> seems it cannot do it completely at least) >> >> >> If someone of you has seen that already and/or knows a potential >> explanation I'd be glad to learn about. >> >> >> And if some of you wonder: yes, I (was) moved away from IBM and >> am now >> at KIT. >> >> Many thanks in advance >> >> Uwe >> >> >> -- >> Karlsruhe Institute of Technology (KIT) >> Steinbuch Centre for Computing (SCC) >> Scientific Data Management (SDM) >> >> Uwe Falke >> >> Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 >> D-76344 Eggenstein-Leopoldshafen >> >> Tel: +49 721 608 28024 >> Email: uwe.falke at kit.edu >> www.scc.kit.edu >> >> Registered office: >> Kaiserstra?e 12, 76131 Karlsruhe, Germany >> >> KIT ? The Research University in the Helmholtz Association >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email:uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? 
The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5814 bytes Desc: S/MIME Cryptographic Signature URL: From Achim.Rehor at de.ibm.com Thu Feb 24 12:41:11 2022 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Thu, 24 Feb 2022 14:41:11 +0200 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi Uwe, first of all, glad to see you back in the GPFS space ;) agreed, groups of subblocks being written will end up in IO sizes, being smaller than the 8MB filesystem blocksize, also agreed, this cannot be metadata, since their size is MUCH smaller, like 4k or less, mostly. But why would these grouped subblock reads/writes all end up on the same NSD server, while the others do full block writes ? How is your NSD server setup per NSD ? did you 'round-robin' set the preferred NSD server per NSD ? are the client nodes transferring the data in anyway doing specifics ? Sorry for not having a solution for you, jsut sharing a few ideas ;) Mit freundlichen Gr??en / Kind regards Achim Rehor Technical Support Specialist Spectrum Scale and ESS (SME) Advisory Product Services Professional IBM Systems Storage Support - EMEA gpfsug-discuss-bounces at spectrumscale.org wrote on 23/02/2022 22:20:11: > From: "Andrew Beattie" > To: "gpfsug main discussion list" > Date: 23/02/2022 22:20 > Subject: [EXTERNAL] Re: [gpfsug-discuss] IO sizes > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > Alex, Metadata will be 4Kib Depending on the filesystem version you > will also have subblocks to consider V4 filesystems have 1/32 > subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata > and data block size is the same) ???????????ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ZjQcmQRYFpfptBannerEnd > Alex, > > Metadata will be 4Kib > > Depending on the filesystem version you will also have subblocks to > consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/ > 1024 subblocks (assuming metadata and data block size is the same) > > My first question would be is ? Are you sure that Linux OS is > configured the same on all 4 NSD servers?. > > My second question would be do you know what your average file size > is if most of your files are smaller than your filesystem block > size, then you are always going to be performing writes using groups > of subblocks rather than a full block writes. > > Regards, > > Andrew > > On 24 Feb 2022, at 04:39, Alex Chekholko wrote: > ? Hi, Metadata I/Os will always be smaller than the usual data block > size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, > 2022 at 10:26 AM Uwe Falke wrote: Dear all, > sorry for asking a question which seems ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ZjQcmQRYFpfptBannerEnd > Hi, > > Metadata I/Os will always be smaller than the usual data block size, right? > Which version of GPFS? 
> > Regards, > Alex > > On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: > Dear all, > > sorry for asking a question which seems not directly GPFS related: > > In a setup with 4 NSD servers (old-style, with storage controllers in > the back end), 12 clients and 10 Seagate storage systems, I do see in > benchmark tests that just one of the NSD servers does send smaller IO > requests to the storage than the other 3 (that is, both reads and > writes are smaller). > > The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes > ( one server to the controllers A, the other one to controllers B of the > Seagates, resp.). > > All 4 NSD servers are set up similarly: > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > driver : mpt3sas 31.100.01.00 > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by > mpt3sas) for all sd devices and all multipath (dm) devices built on top. > > scheduler: deadline > > multipath (actually we do have 3 paths to each volume, so there is some > asymmetry, but that should not affect the IOs, shouldn't it?, and if it > did we would see the same effect in both pairs of NSD servers, but we do > not). > > All 4 storage systems are also configured the same way (2 disk groups / > pools / declustered arrays, one managed by ctrl A, one by ctrl B, and > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > The first question I have - but that is not my main one: I do see, both > in iostat and on the storage systems, that the default IO requests are > about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb > is really in terms of kiB, not sectors, cf. > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). > > But what puzzles me even more: one of the server compiles IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and > writes ... I just cannot see why. > > I have to suspect that this will (in writing to the storage) cause > incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as > the controller is not able to re-coalesce the data properly; and it > seems it cannot do it completely at least) > > > If someone of you has seen that already and/or knows a potential > explanation I'd be glad to learn about. > > > And if some of you wonder: yes, I (was) moved away from IBM and am now > at KIT. > > Many thanks in advance > > Uwe > > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > > Uwe Falke > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > > Tel: +49 721 608 28024 > Email: uwe.falke at kit.edu > www.scc.kit.edu > > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > KIT ? 
The Research University in the Helmholtz Association > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > INVALID URI REMOVED > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=RGTETs2tk0Kz_VOpznDVDkqChhnfLapOTkxLvgmR2-M&m=- > FdZvYBvHDPnBTu2FtPkLT09ahlYp2QsMutqNV2jWaY&s=S4C2D3_h4FJLAw0PUYLKhKE242vn_fwn-1_EJmHNpE8&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: From olaf.weiser at de.ibm.com Thu Feb 24 12:47:59 2022 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 24 Feb 2022 12:47:59 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.1__=4EBB0D60DFD775728f9e8a93df938690 at ibm.com.gif Type: image/gif Size: 45 bytes Desc: not available URL: From krajaram at geocomputing.net Thu Feb 24 14:32:35 2022 From: krajaram at geocomputing.net (Kumaran Rajaram) Date: Thu, 24 Feb 2022 14:32:35 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi Uwe, >> But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. IMHO, If GPFS on this particular NSD server was restarted often during the setup, then it is possible that the GPFS pagepool may not be contiguous. As a result, GPFS 8MiB buffer in the pagepool might be a scatter-gather (SG) list with many small entries (in the memory) resulting in smaller I/O when these buffers are issued to the disks. The fix would be to reboot the server and start GPFS so that pagepool is contiguous resulting in 8MiB buffer to be comprised of 1 (or fewer) SG entries. >>In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs >>smaller than 4MiB again at some point, so that is not a nice solution. It will be advised not to restart GPFS often in the NSD servers (in production) to keep the pagepool contiguous. Ensure that there is enough free memory in NSD server and not run any memory intensive jobs so that pagepool is not impacted (e.g. swapped out). Also, enable GPFS numaMemoryInterleave=yes and verify that pagepool is equally distributed across the NUMA domains for good performance. GPFS numaMemoryInterleave=yes requires that numactl packages are installed and then GPFS restarted. # mmfsadm dump config | egrep "numaMemory|pagepool " ! numaMemoryInterleave yes ! 
pagepool 282394099712 # pgrep mmfsd | xargs numastat -p Per-node process memory usage (in MBs) for PID 2120821 (mmfsd) Node 0 Node 1 Total --------------- --------------- --------------- Huge 0.00 0.00 0.00 Heap 1.26 3.26 4.52 Stack 0.01 0.01 0.02 Private 137710.43 137709.96 275420.39 ---------------- --------------- --------------- --------------- Total 137711.70 137713.23 275424.92 My two cents, -Kums Kumaran Rajaram [cid:image001.png at 01D82960.6A9860C0] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Uwe Falke Sent: Wednesday, February 23, 2022 8:04 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] IO sizes Hi, the test bench is gpfsperf running on up to 12 clients with 1...64 threads doing sequential reads and writes , file size per gpfsperf process is 12TB (with 6TB I saw caching effects in particular for large thread numbers ...) As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data disks, as expected in that case. Interesting thing though: I have rebooted the suspicious node. Now, it does not issue smaller IOs than the others, but -- unbelievable -- larger ones (up to about 4.7MiB). This is still harmful as also that size is incompatible with full stripe writes on the storage ( 8+2 disk groups, i.e. logically RAID6) Currently, I draw this information from the storage boxes; I have not yet checked iostat data for that benchmark test after the reboot (before, when IO sizes were smaller, we saw that both in iostat and in the perf data retrieved from the storage controllers). And: we have a separate data pool , hence dataOnly NSDs, I am just talking about these ... As for "Are you sure that Linux OS is configured the same on all 4 NSD servers?." - of course there are not two boxes identical in the world. I have actually not installed those machines, and, yes, i also considered reinstalling them (or at least the disturbing one). However, I do not have reason to assume or expect a difference, the supplier has just implemented these systems recently from scratch. In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs smaller than 4MiB again at some point, so that is not a nice solution. Thanks Uwe On 23.02.22 22:20, Andrew Beattie wrote: Alex, Metadata will be 4Kib Depending on the filesystem version you will also have subblocks to consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata and data block size is the same) My first question would be is ? Are you sure that Linux OS is configured the same on all 4 NSD servers?. My second question would be do you know what your average file size is if most of your files are smaller than your filesystem block size, then you are always going to be performing writes using groups of subblocks rather than a full block writes. Regards, Andrew On 24 Feb 2022, at 04:39, Alex Chekholko wrote: ? Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: Dear all, sorry for asking a question which seems ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? 
Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: Dear all, sorry for asking a question which seems not directly GPFS related: In a setup with 4 NSD servers (old-style, with storage controllers in the back end), 12 clients and 10 Seagate storage systems, I do see in benchmark tests that just one of the NSD servers does send smaller IO requests to the storage than the other 3 (that is, both reads and writes are smaller). The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes ( one server to the controllers A, the other one to controllers B of the Seagates, resp.). All 4 NSD servers are set up similarly: kernel: 3.10.0-1160.el7.x86_64 #1 SMP HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx driver : mpt3sas 31.100.01.00 max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by mpt3sas) for all sd devices and all multipath (dm) devices built on top. scheduler: deadline multipath (actually we do have 3 paths to each volume, so there is some asymmetry, but that should not affect the IOs, shouldn't it?, and if it did we would see the same effect in both pairs of NSD servers, but we do not). All 4 storage systems are also configured the same way (2 disk groups / pools / declustered arrays, one managed by ctrl A, one by ctrl B, and 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. The first question I have - but that is not my main one: I do see, both in iostat and on the storage systems, that the default IO requests are about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb is really in terms of kiB, not sectors, cf. https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. I have to suspect that this will (in writing to the storage) cause incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as the controller is not able to re-coalesce the data properly; and it seems it cannot do it completely at least) If someone of you has seen that already and/or knows a potential explanation I'd be glad to learn about. And if some of you wonder: yes, I (was) moved away from IBM and am now at KIT. Many thanks in advance Uwe -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? 
The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6469 bytes Desc: image001.png URL: From uwe.falke at kit.edu Fri Feb 25 14:29:23 2022 From: uwe.falke at kit.edu (Uwe Falke) Date: Fri, 25 Feb 2022 15:29:23 +0100 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: <3fc68f40-8b3a-be33-3451-09a04fdc83a0@kit.edu> Hi, and thanks, Achim and Olaf, mmdiag --iohist on the NSD servers (on all 4 of them) shows IO sizes in IOs to/from the data NSDs (i.e. to/from storage) of 16384 512-byte-sectors? throughout, i.e. 8MiB, agreeing with the FS block size. (Having that information i do not need to ask the clients ...) iostat on NSD servers as well as the? storage system counters say the IOs crafted by the OS layer are 4MiB except for the one suspicious NSD server where they were somewhat smaller than 4MiB before the reboot, but are now somewhat larger than 4MiB (but by a distinct amount). The data piped through the NSD servers are well balanced between the 4 NSD servers, the IO system of the suspicious NSD server just issued a higher rate of IO requests when running smaller IOs and now, with larger IOs it has a lower IO rate than the other three NSD servers. So I am pretty sure it is not GPFS (see my initial post :-); but still some people using GPFS might have encounterd that as well, or might have an idea ;-) Cheers Uwe On 24.02.22 13:47, Olaf Weiser wrote: > in addition, to Achim, > where do you see those "smaller IO"... > have you checked IO sizes with mmfsadm dump iohist on each > NSDclient/Server ?... If ok on that level.. it's not GPFS > Mit freundlichen Gr??en / Kind regards > > Olaf Weiser > > ----- Urspr?ngliche Nachricht ----- > Von: "Achim Rehor" > Gesendet von: gpfsug-discuss-bounces at spectrumscale.org > An: "gpfsug main discussion list" > CC: > Betreff: [EXTERNAL] Re: [gpfsug-discuss] IO sizes > Datum: Do, 24. Feb 2022 13:41 > > Hi Uwe, > > first of all, glad to see you back in the GPFS space ;) > > agreed, groups of subblocks being written will end up in IO sizes, > being smaller than the 8MB filesystem blocksize, > also agreed, this cannot be metadata, since their size is MUCH > smaller, like 4k or less, mostly. > > But why would these grouped subblock reads/writes all end up on > the same NSD server, while the others do full block writes ? > > How is your NSD server setup per NSD ? did you 'round-robin' set > the preferred NSD server per NSD ? > are the client nodes transferring the data in anyway doing > specifics ?? 
> > Sorry for not having a solution for you, jsut sharing a few ideas ;) > > > Mit freundlichen Gr??en / Kind regards > > *Achim Rehor* > > Technical Support Specialist Spectrum Scale and ESS (SME) > Advisory Product Services Professional > IBM Systems Storage Support - EMEA > > > > > > > gpfsug-discuss-bounces at spectrumscale.org wrote on 23/02/2022 22:20:11: > > > From: "Andrew Beattie" > > To: "gpfsug main discussion list" > > Date: 23/02/2022 22:20 > > Subject: [EXTERNAL] Re: [gpfsug-discuss] IO sizes > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Alex, Metadata will be 4Kib Depending on the filesystem version you > > will also have subblocks to consider V4 filesystems have 1/32 > > subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata > > and data block size is the same) > ???????????ZjQcmQRYFpfptBannerStart > > This Message Is From an External Sender > > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Alex, > > > > Metadata will be 4Kib > > > > Depending on the filesystem version you will also have subblocks to > > consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/ > > 1024 subblocks (assuming metadata and data block size is the same) > > > > My first question would be is ? Are you sure that Linux OS is > > configured the same on all 4 NSD servers?. > > > > My second question would be do you know what your average file size > > is if most of your files are smaller than your filesystem block > > size, then you are always going to be performing writes using groups > > of subblocks rather than a full block writes. > > > > Regards, > > > > Andrew > > > > On 24 Feb 2022, at 04:39, Alex Chekholko > wrote: > > > ? Hi, Metadata I/Os will always be smaller than the usual data block > > size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, > > 2022 at 10:26 AM Uwe Falke wrote: Dear all, > > sorry for asking a question which seems ZjQcmQRYFpfptBannerStart > > This Message Is From an External Sender > > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Hi, > > > > Metadata I/Os will always be smaller than the usual data block > size, right? > > Which version of GPFS? > > > > Regards, > > Alex > > > > On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: > > Dear all, > > > > sorry for asking a question which seems not directly GPFS related: > > > > In a setup with 4 NSD servers (old-style, with storage > controllers in > > the back end), 12 clients and 10 Seagate storage systems, I do > see in > > benchmark tests that ?just one of the NSD servers does send > smaller IO > > requests to the storage ?than the other 3 (that is, both reads and > > writes are smaller). > > > > The NSD servers form 2 pairs, each pair is connected to 5 > seagate boxes > > ( one server to the controllers A, the other one to controllers > B of the > > Seagates, resp.). > > > > All 4 NSD servers are set up similarly: > > > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > > > HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > > > driver : mpt3sas 31.100.01.00 > > > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as > limited by > > mpt3sas) for all sd devices and all multipath (dm) devices built > on top. > > > > scheduler: deadline > > > > multipath (actually we do have 3 paths to each volume, so there > is some > > asymmetry, but that should not affect the IOs, shouldn't it?, > and if it > > did we would see the same effect in both pairs of NSD servers, > but we do > > not). 
> > > > All 4 storage systems are also configured the same way (2 disk > groups / > > pools / declustered arrays, one managed by ?ctrl A, one by ctrl > B, ?and > > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). > > > > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO > > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > > > The first question I have - but that is not my main one: I do > see, both > > in iostat and on the storage systems, that the default IO > requests are > > about 4MiB, not 8MiB as I'd expect from above settings > (max_sectors_kb > > is really in terms of kiB, not sectors, cf. > > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). > > > > But what puzzles me even more: one of the server compiles IOs even > > smaller, varying between 3.2MiB and 3.6MiB mostly - both for > reads and > > writes ... I just cannot see why. > > > > I have to suspect that this will (in writing to the storage) cause > > incomplete stripe writes on our erasure-coded volumes (8+2p)(as > long as > > the controller is not able to re-coalesce the data properly; and it > > seems it cannot do it completely at least) > > > > > > If someone of you has seen that already and/or knows a potential > > explanation I'd be glad to learn about. > > > > > > And if some of you wonder: yes, I (was) moved away from IBM and > am now > > at KIT. > > > > Many thanks in advance > > > > Uwe > > > > > > -- > > Karlsruhe Institute of Technology (KIT) > > Steinbuch Centre for Computing (SCC) > > Scientific Data Management (SDM) > > > > Uwe Falke > > > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > > D-76344 Eggenstein-Leopoldshafen > > > > Tel: +49 721 608 28024 > > Email: uwe.falke at kit.edu > > www.scc.kit.edu > > > > Registered office: > > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > > > KIT ? The Research University in the Helmholtz Association > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > INVALID URI REMOVED > > > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > > siA1ZOg&r=RGTETs2tk0Kz_VOpznDVDkqChhnfLapOTkxLvgmR2-M&m=- > > > FdZvYBvHDPnBTu2FtPkLT09ahlYp2QsMutqNV2jWaY&s=S4C2D3_h4FJLAw0PUYLKhKE242vn_fwn-1_EJmHNpE8&e= > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email:uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Image.1__%3D4EBB0D60DFD775728f9e8a93df938690%40ibm.com.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5814 bytes Desc: S/MIME Cryptographic Signature URL: From uwe.falke at kit.edu Mon Feb 28 09:17:26 2022 From: uwe.falke at kit.edu (Uwe Falke) Date: Mon, 28 Feb 2022 10:17:26 +0100 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: <72c6ea70-6d00-5cc1-7f26-f5cb1aabbd7a@kit.edu> Hi, Kumaran, that would explain the smaller IOs before the reboot, but not the larger-than-4MiB IOs afterwards on that machine. Then, I already saw that the numaMemoryInterleave setting seems to have no effect (on that very installation), I just have not yet requested a PMR for it. I'd checked memory usage of course and saw that regardless of this setting always one socket's memory is almost completely consumed while the other one's is rather empty - looks like a bug to me, but that needs further investigation. Uwe On 24.02.22 15:32, Kumaran Rajaram wrote: > > Hi Uwe, > > >> But what puzzles me even more: one of the server compiles IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and > writes ... I just cannot see why. > > IMHO, If GPFS on this particular NSD server was restarted often during > the setup, then it is possible that the GPFS pagepool may not be > contiguous. As a result, GPFS 8MiB buffer in the pagepool might be a > scatter-gather (SG) list with many small entries (in the memory) > resulting in smaller I/O when these buffers are issued to the disks. > The fix would be to reboot the server and start GPFS so that pagepool > is contiguous resulting in 8MiB buffer to be comprised of 1 (or fewer) > SG entries. > > >>In the current situation (i.e. with IOs bit larger than 4MiB) > setting max_sectors_kB to 4096 might do the trick, but as I do not > know the cause for that behaviour it might well start to issue IOs > >>smaller than 4MiB again at some point, so that is not a nice solution. > > It will be advised not to restart GPFS often in the NSD servers (in > production) to keep the pagepool contiguous. Ensure that there is > enough free memory in NSD server and not run any memory intensive jobs > so that pagepool is not impacted (e.g. swapped out). > > Also, enable GPFS numaMemoryInterleave=yes and verify that pagepool is > equally distributed across the NUMA domains for good performance. GPFS > numaMemoryInterleave=yes requires that numactl packages are installed > and then GPFS restarted. > > # mmfsadm dump config | egrep "numaMemory|pagepool " > > ! numaMemoryInterleave yes > > ! pagepool 282394099712 > > # pgrep mmfsd | xargs numastat -p > > Per-node process memory usage (in MBs) for PID 2120821 (mmfsd) > > ?????????????????????????? Node 0 Node 1?????????? Total > > ????????????????? --------------- --------------- --------------- > > Huge???????????????????????? 0.00 0.00??????????? 0.00 > > Heap???????????????????????? 1.26 3.26???????? ???4.52 > > Stack??????????????????????? 0.01 0.01??????????? 0.02 > > Private???????????????? 137710.43 137709.96?????? 275420.39 > > ----------------? --------------- --------------- --------------- > > Total?????????????????? 
137711.70 137713.23 ??????275424.92 > > My two cents, > > -Kums > > Kumaran Rajaram > > *From:* gpfsug-discuss-bounces at spectrumscale.org > *On Behalf Of *Uwe Falke > *Sent:* Wednesday, February 23, 2022 8:04 PM > *To:* gpfsug-discuss at spectrumscale.org > *Subject:* Re: [gpfsug-discuss] IO sizes > > Hi, > > the test bench is gpfsperf running on up to 12 clients with 1...64 > threads doing sequential reads and writes , file size per gpfsperf > process is 12TB (with 6TB I saw caching effects in particular for > large thread numbers ...) > > As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data > disks, as expected in that case. > > Interesting thing though: > > I have rebooted the suspicious node. Now, it does not issue smaller > IOs than the others, but -- unbelievable -- larger ones (up to about > 4.7MiB). This is still harmful as also that size is incompatible with > full stripe writes on the storage ( 8+2 disk groups, i.e. logically RAID6) > > Currently, I draw this information from the storage boxes; I have not > yet checked iostat data for that benchmark test after the reboot > (before, when IO sizes were smaller, we saw that both in iostat and in > the perf data retrieved from the storage controllers). > > And: we have a separate data pool , hence dataOnly NSDs, I am just > talking about these ... > > As for "Are you sure that Linux OS is configured the same on all 4 NSD > servers?." - of course there are not two boxes identical in the world. > I have actually not installed those machines, and, yes, i also > considered reinstalling them (or at least the disturbing one). > > However, I do not have reason to assume or expect a difference, the > supplier has just implemented these systems recently from scratch. > > In the current situation (i.e. with IOs bit larger than 4MiB) setting > max_sectors_kB to 4096 might do the trick, but as I do not know the > cause for that behaviour it might well start to issue IOs smaller than > 4MiB again at some point, so that is not a nice solution. > > Thanks > > Uwe > > On 23.02.22 22:20, Andrew Beattie wrote: > > Alex, > > Metadata will be 4Kib > > Depending on the filesystem version you will also have subblocks > to consider V4 filesystems have 1/32 subblocks, V5 filesystems > have 1/1024 subblocks (assuming metadata and data block size is > the same) > > > My first question would be is ? Are you sure that Linux OS is > configured the same on all 4 NSD servers?. > > My second question would be do you know what your average file > size is if most of your files are smaller than your filesystem > block size, then you are always going to be performing writes > using groups of subblocks rather than a full block writes. > > Regards, > > Andrew > > > > On 24 Feb 2022, at 04:39, Alex Chekholko > wrote: > > ? Hi, Metadata I/Os will always be smaller than the usual data > block size, right? Which version of GPFS? Regards, Alex On > Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: Dear all, sorry for asking a > question which seems ZjQcmQRYFpfptBannerStart > > This Message Is From an External Sender > > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Hi, > > Metadata I/Os will always be smaller than the usual data block > size, right? > > Which version of GPFS? 
> > Regards, > > Alex > > On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: > > Dear all, > > sorry for asking a question which seems not directly GPFS > related: > > In a setup with 4 NSD servers (old-style, with storage > controllers in > the back end), 12 clients and 10 Seagate storage systems, > I do see in > benchmark tests that? just one of the NSD servers does > send smaller IO > requests to the storage? than the other 3 (that is, both > reads and > writes are smaller). > > The NSD servers form 2 pairs, each pair is connected to 5 > seagate boxes > ( one server to the controllers A, the other one to > controllers B of the > Seagates, resp.). > > All 4 NSD servers are set up similarly: > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > HBA:?Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > driver : mpt3sas 31.100.01.00 > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, > as limited by > mpt3sas) for all sd devices and all multipath (dm) devices > built on top. > > scheduler: deadline > > multipath (actually we do have 3 paths to each volume, so > there is some > asymmetry, but that should not affect the IOs, shouldn't > it?, and if it > did we would see the same effect in both pairs of NSD > servers, but we do > not). > > All 4 storage systems are also configured the same way (2 > disk groups / > pools / declustered arrays, one managed by? ctrl A, one by > ctrl B,? and > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 > NSDs). > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do > see clean IO > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > The first question I have - but that is not my main one: I > do see, both > in iostat and on the storage systems, that the default IO > requests are > about 4MiB, not 8MiB as I'd expect from above settings > (max_sectors_kb > is really in terms of kiB, not sectors, cf. > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt > ). > > But what puzzles me even more: one of the server compiles > IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both > for reads and > writes ... I just cannot see why. > > I have to suspect that this will (in writing to the > storage) cause > incomplete stripe writes on our erasure-coded volumes > (8+2p)(as long as > the controller is not able to re-coalesce the data > properly; and it > seems it cannot do it completely at least) > > > If someone of you has seen that already and/or knows a > potential > explanation I'd be glad to learn about. > > > And if some of you wonder: yes, I (was) moved away from > IBM and am now > at KIT. > > Many thanks in advance > > Uwe > > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > > Uwe Falke > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > > Tel: +49 721 608 28024 > Email: uwe.falke at kit.edu > www.scc.kit.edu > > > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > KIT ? 
The Research University in the Helmholtz Association > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > Uwe Falke > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > Tel: +49 721 608 28024 > Email:uwe.falke at kit.edu > www.scc.kit.edu > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > KIT ? The Research University in the Helmholtz Association > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email:uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6469 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5814 bytes Desc: S/MIME Cryptographic Signature URL: From Renar.Grunenberg at huk-coburg.de Mon Feb 28 12:23:55 2022 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Mon, 28 Feb 2022 12:23:55 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: <72c6ea70-6d00-5cc1-7f26-f5cb1aabbd7a@kit.edu> References: <72c6ea70-6d00-5cc1-7f26-f5cb1aabbd7a@kit.edu> Message-ID: <7a29b404669942d193ad46c2632d6d30@huk-coburg.de> Hallo Uwe, are numactl already installed on that affected node? If it missed the numa scale stuff is not working. Renar Grunenberg Abteilung Informatik - Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. Helen Reck, Dr. J?rg Rheinl?nder, Thomas Sehn, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. 
If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ Von: gpfsug-discuss-bounces at spectrumscale.org Im Auftrag von Uwe Falke Gesendet: Montag, 28. Februar 2022 10:17 An: gpfsug-discuss at spectrumscale.org Betreff: Re: [gpfsug-discuss] IO sizes Hi, Kumaran, that would explain the smaller IOs before the reboot, but not the larger-than-4MiB IOs afterwards on that machine. Then, I already saw that the numaMemoryInterleave setting seems to have no effect (on that very installation), I just have not yet requested a PMR for it. I'd checked memory usage of course and saw that regardless of this setting always one socket's memory is almost completely consumed while the other one's is rather empty - looks like a bug to me, but that needs further investigation. Uwe On 24.02.22 15:32, Kumaran Rajaram wrote: Hi Uwe, >> But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. IMHO, If GPFS on this particular NSD server was restarted often during the setup, then it is possible that the GPFS pagepool may not be contiguous. As a result, GPFS 8MiB buffer in the pagepool might be a scatter-gather (SG) list with many small entries (in the memory) resulting in smaller I/O when these buffers are issued to the disks. The fix would be to reboot the server and start GPFS so that pagepool is contiguous resulting in 8MiB buffer to be comprised of 1 (or fewer) SG entries. >>In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs >>smaller than 4MiB again at some point, so that is not a nice solution. It will be advised not to restart GPFS often in the NSD servers (in production) to keep the pagepool contiguous. Ensure that there is enough free memory in NSD server and not run any memory intensive jobs so that pagepool is not impacted (e.g. swapped out). Also, enable GPFS numaMemoryInterleave=yes and verify that pagepool is equally distributed across the NUMA domains for good performance. GPFS numaMemoryInterleave=yes requires that numactl packages are installed and then GPFS restarted. # mmfsadm dump config | egrep "numaMemory|pagepool " ! numaMemoryInterleave yes ! pagepool 282394099712 # pgrep mmfsd | xargs numastat -p Per-node process memory usage (in MBs) for PID 2120821 (mmfsd) Node 0 Node 1 Total --------------- --------------- --------------- Huge 0.00 0.00 0.00 Heap 1.26 3.26 4.52 Stack 0.01 0.01 0.02 Private 137710.43 137709.96 275420.39 ---------------- --------------- --------------- --------------- Total 137711.70 137713.23 275424.92 My two cents, -Kums Kumaran Rajaram [cid:image001.png at 01D82CA6.6F82DC70] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Uwe Falke Sent: Wednesday, February 23, 2022 8:04 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] IO sizes Hi, the test bench is gpfsperf running on up to 12 clients with 1...64 threads doing sequential reads and writes , file size per gpfsperf process is 12TB (with 6TB I saw caching effects in particular for large thread numbers ...) 
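Renar's numactl question and Kumaran's pagepool-distribution check from earlier in this thread can be run together as a quick sanity check on the affected server - a sketch, assuming a RHEL 7 NSD server as described above:

  # numaMemoryInterleave relies on the numactl packages being installed
  rpm -q numactl numactl-libs
  # confirm the interleave setting and pagepool size mmfsd is running with
  mmfsadm dump config | egrep "numaMemory|pagepool "
  # per-NUMA-node memory usage of mmfsd; the nodes should be roughly balanced
  pgrep mmfsd | xargs numastat -p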
As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data disks, as expected in that case. Interesting thing though: I have rebooted the suspicious node. Now, it does not issue smaller IOs than the others, but -- unbelievable -- larger ones (up to about 4.7MiB). This is still harmful as also that size is incompatible with full stripe writes on the storage ( 8+2 disk groups, i.e. logically RAID6) Currently, I draw this information from the storage boxes; I have not yet checked iostat data for that benchmark test after the reboot (before, when IO sizes were smaller, we saw that both in iostat and in the perf data retrieved from the storage controllers). And: we have a separate data pool , hence dataOnly NSDs, I am just talking about these ... As for "Are you sure that Linux OS is configured the same on all 4 NSD servers?." - of course there are not two boxes identical in the world. I have actually not installed those machines, and, yes, i also considered reinstalling them (or at least the disturbing one). However, I do not have reason to assume or expect a difference, the supplier has just implemented these systems recently from scratch. In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs smaller than 4MiB again at some point, so that is not a nice solution. Thanks Uwe On 23.02.22 22:20, Andrew Beattie wrote: Alex, Metadata will be 4Kib Depending on the filesystem version you will also have subblocks to consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata and data block size is the same) My first question would be is ? Are you sure that Linux OS is configured the same on all 4 NSD servers?. My second question would be do you know what your average file size is if most of your files are smaller than your filesystem block size, then you are always going to be performing writes using groups of subblocks rather than a full block writes. Regards, Andrew On 24 Feb 2022, at 04:39, Alex Chekholko wrote: ? Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: Dear all, sorry for asking a question which seems ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: Dear all, sorry for asking a question which seems not directly GPFS related: In a setup with 4 NSD servers (old-style, with storage controllers in the back end), 12 clients and 10 Seagate storage systems, I do see in benchmark tests that just one of the NSD servers does send smaller IO requests to the storage than the other 3 (that is, both reads and writes are smaller). The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes ( one server to the controllers A, the other one to controllers B of the Seagates, resp.). All 4 NSD servers are set up similarly: kernel: 3.10.0-1160.el7.x86_64 #1 SMP HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx driver : mpt3sas 31.100.01.00 max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by mpt3sas) for all sd devices and all multipath (dm) devices built on top. 
scheduler: deadline multipath (actually we do have 3 paths to each volume, so there is some asymmetry, but that should not affect the IOs, shouldn't it?, and if it did we would see the same effect in both pairs of NSD servers, but we do not). All 4 storage systems are also configured the same way (2 disk groups / pools / declustered arrays, one managed by ctrl A, one by ctrl B, and 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. The first question I have - but that is not my main one: I do see, both in iostat and on the storage systems, that the default IO requests are about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb is really in terms of kiB, not sectors, cf. https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. I have to suspect that this will (in writing to the storage) cause incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as the controller is not able to re-coalesce the data properly; and it seems it cannot do it completely at least) If someone of you has seen that already and/or knows a potential explanation I'd be glad to learn about. And if some of you wonder: yes, I (was) moved away from IBM and am now at KIT. Many thanks in advance Uwe -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png Type: image/png Size: 6469 bytes Desc: image001.png URL: From p.ward at nhm.ac.uk Mon Feb 28 16:40:08 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Mon, 28 Feb 2022 16:40:08 +0000 Subject: [gpfsug-discuss] Interoperability of Transparent cloud tiering with other IBM Spectrum Scale features Message-ID: I am used to a SCALE solution with space management to a tape tier. Files can not be migrated unless they are backed up. Once migrated and are a stub file they are not backed up as a stub, and they are not excluded from backup. We used the Spectrum Protect BA client, not mmbackup. We have a new SCALE solution with COS, setup with TCT. I am expecting it to operate in the same way. Files can't be migrated unless backed up. Once migrated they are a stub and a don't get backed up again. We are using mmbackup. I migrated files before backup was setup. When backup was turned on, it pulled the files back. The migration policy was set to migrate files not accessed for 2 days. All data met this requirement. Migrations is set to run every 15 minutes, so was pushing them back quite quickly. The cluster was a mess of files going back and forth from COS. To stop this I changed the policy to 14 days. I set mmbackup to exclude migrated files. Things calmed down. I have now almost run out of space on my hot tier, but anything I migrate will expire from backup. The statement below is a bit confusing. HSM and TCT are completely different. I thought TCT was for cloud, and HSM for tape? Both can exist in a cluster but operate on different areas. This suggest to have mmbackup work with data migrated to a cloud tier, we should be using HSM not TCT? Can mmbackup with TCT do what HSM does? https://www.ibm.com/docs/en/spectrum-scale/5.0.5?topic=ics-interoperability-transparent-cloud-tiering-other-spectrum-scale-features Spectrum Protect (TSM) For the file systems that are managed by an HSM system, ensure that hot data is backed up to TSM by using the mmbackup command, and as the data gets cooler, migrate them to the cloud storage tier. This ensures that the mmbackup command has already backed up the cooler files that are migrated to the cloud. Has anyone set something up similar? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From michael.meier at fau.de Tue Feb 1 11:20:37 2022 From: michael.meier at fau.de (Michael Meier) Date: Tue, 1 Feb 2022 12:20:37 +0100 Subject: [gpfsug-discuss] Spectrum Scale and vfs_fruit Message-ID: Hi, A bunch of security updates for Samba were released yesterday, most importantly among them CVE-2021-44142 (https://www.samba.org/samba/security/CVE-2021-44142.html) in the vfs_fruit VFS-module that adds extended support for Apple Clients. Spectrum Scale supports that, so Spectrum Scale might be affected, and I'm trying to find out if we're affected or not. 
Now we never enabled this via "mmsmb config change --vfs-fruit-enable", and I would expect this to be disabled by default - however, I cannot find an explicit statement like "by default this is disabled" in https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=services-support-vfs-fruit-smb-protocol Am I correct in assuming that it is indeed disabled by default? And how would I verify that? Am I correct in assuming that _if_ it was enabled, then 'fruit' would show up under the 'vfs objects' in 'mmsmb config list'? Regards, -- Michael Meier, HPC Services Friedrich-Alexander-Universitaet Erlangen-Nuernberg Regionales Rechenzentrum Erlangen Martensstrasse 1, 91058 Erlangen, Germany Tel.: +49 9131 85-20994, Fax: +49 9131 302941 michael.meier at fau.de hpc.fau.de From p.ward at nhm.ac.uk Tue Feb 1 12:28:09 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Tue, 1 Feb 2022 12:28:09 +0000 Subject: [gpfsug-discuss] mmbackup file selections In-Reply-To: <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> References: <20220124153631.oxu4ytbq4vqcotr3@utumno.gs.washington.edu> <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> Message-ID: Not currently set. I'll look into them. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Skylar Thompson Sent: 26 January 2022 16:50 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmbackup file selections Awesome, glad that you found them (I missed them the first time too). As for the anomalous changed files, do you have these options set in your client option file? skipacl yes skipaclupdatecheck yes updatectime yes We had similar problems where metadata and ACL updates were interpreted as data changes by mmbackup/dsmc. We also have a case open with IBM where mmbackup will both expire and backup a file in the same run, even in the absence of mtime changes, but it's unclear whether that's program error or something with our include/exclude rules. I'd be curious if you're running into that as well. On Wed, Jan 26, 2022 at 03:55:48PM +0000, Paul Ward wrote: > Good call! > > Yes they are dot files. > > > New issue. > > Mmbackup seems to be backup up the same files over and over without them changing: > areas are being backed up multiple times. > The example below is a co-resident file, the only thing that has changed since it was created 20/10/21, is the file has been accessed for backup. > This file is in the 'changed' list in mmbackup: > > This list has just been created: > -rw-r--r--. 
1 root root 6591914 Jan 26 11:12 > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > Listing the last few files in the file (selecting the last one) > 11:17:52 [root at scale-sk-pn-1 .mmbackupCfg]# tail > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604556977.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557039.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557102.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557164.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557226.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557288.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557351.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557413.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557476.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > Check the file stats (access time just before last backup) > 11:18:05 [root at scale-sk-pn-1 .mmbackupCfg]# stat "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > File: '/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png' > Size: 545 Blocks: 32 IO Block: 4194304 regular file > Device: 2bh/43d Inode: 212618897 Links: 1 > Access: (0644/-rw-r--r--) Uid: (1399613896/NHM\edwab) Gid: (1399647564/NHM\dg-mbl-urban-nature-project-rw) > Context: unconfined_u:object_r:unlabeled_t:s0 > Access: 2022-01-25 06:40:58.334961446 +0000 > Modify: 2020-12-01 15:20:40.122053000 +0000 > Change: 2021-10-20 17:55:18.265746459 +0100 > Birth: - > > Check if migrated > 11:18:16 [root at scale-sk-pn-1 .mmbackupCfg]# dsmls "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > File name : /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > On-line size : 545 > Used blocks : 16 > Data Version : 1 > Meta Version : 1 > State : Co-resident > Container Index : 1 > Base Name : 34C0B77D20194B0B.EACEB2055F6CAA58.56D56C5F140C8C9D.0000000000000000.2197396D.000000000CAC4E91 > > Check if immutable > 11:18:26 [root at scale-sk-pn-1 .mmbackupCfg]# mstat "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > file name: /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > metadata replication: 2 max 2 > data replication: 2 max 2 > immutable: no > appendOnly: no > flags: > storage pool name: data > fileset name: hpc-workspaces-fset > snapshot name: > creation time: Wed Oct 20 17:55:18 2021 > Misc attributes: ARCHIVE > 
Encrypted: no > > Check active and inactive backups (it was backed up yesterday) > 11:18:52 [root at scale-sk-pn-1 .mmbackupCfg]# dsmcqbi "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 11:19:02 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. > > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 11:19:02 Last access: 01/26/2022 > 11:07:05 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/25/2022 06:41:17 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > It will be backed up again shortly, why? > > And it was backed up again: > # dsmcqbi > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature- > project/audiowaveform/300_40/unp-grounds-01-1604557538.png > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 15:54:09 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. > > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 15:54:10 Last access: 01/26/2022 > 15:30:03 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/26/2022 12:23:02 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/25/2022 06:41:17 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Skylar > Thompson > Sent: 24 January 2022 15:37 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Hi Paul, > > Did you look for dot files? 
At least for us on 5.0.5 there's a .list.1. file while the backups are running: > > /gpfs/grc6/.mmbackupCfg/updatedFiles/: > -r-------- 1 root nickers 6158526821 Jan 23 18:28 .list.1.gpfs-grc6 > /gpfs/grc6/.mmbackupCfg/expiredFiles/: > -r-------- 1 root nickers 85862211 Jan 23 18:28 .list.1.gpfs-grc6 > > On Mon, Jan 24, 2022 at 02:31:54PM +0000, Paul Ward wrote: > > Those directories are empty > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: gpfsug-discuss-bounces at spectrumscale.org > > On Behalf Of IBM Spectrum > > Scale > > Sent: 22 January 2022 00:35 > > To: gpfsug main discussion list > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > Hi Paul, > > > > Instead of calculating *.ix.* files, please look at a list file in these directories. > > > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked.]"Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked. > > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/21/2022 09:38 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > Thank you Right in the command line seems to have worked. At the end > > of the script I now copy the contents of the .mmbackupCfg folder to > > a date stamped logging folder Checking how many entries in these files compared to the Summary: ???????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you > > > > Right in the command line seems to have worked. > > At the end of the script I now copy the contents of the .mmbackupCfg > > folder to a date stamped logging folder > > > > Checking how many entries in these files compared to the Summary: > > wc -l mmbackup* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 754 total > > From Summary > > Total number of objects inspected: 755 > > I can live with a discrepancy of 1. 
> > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > From Summary > > Total number of objects expired: 2 > > That matches > > > > wc -l mmbackupC* mmbackupS* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 752 total > > Summary: > > Total number of objects backed up: 751 > > > > A difference of 1 I can live with. > > > > What does Statech stand for? > > > > Just this to sort out: > > Total number of objects failed: 1 > > I will add: > > --tsm-errorlog TSMErrorLogFile > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 19 January 2022 15:09 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > This is to set environment for mmbackup. > > If mmbackup is invoked within a script, you can set "export DEBUGmmbackup=2" right above mmbackup command. > > e.g) in your script > > .... > > export DEBUGmmbackup=2 > > mmbackup .... > > > > Or, you can set it in the same command line like > > DEBUGmmbackup=2 mmbackup .... > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > [Inactive hide details for "Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to se]"Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to see if they are the cluster manager. > > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/19/2022 06:04 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > Thank you. We run a script on all our nodes that checks to see if > > they are the cluster manager. If they are, then they take > > responsibility to start the backup script. The script then randomly selects one of the available backup nodes and uses ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you. > > > > We run a script on all our nodes that checks to see if they are the cluster manager. > > If they are, then they take responsibility to start the backup script. > > The script then randomly selects one of the available backup nodes and uses dsmsh mmbackup on it. > > > > Where does this command belong? 
> > I have seen it listed as a export command, again where should that be run ? on all backup nodes, or all nodes? > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 18 January 2022 22:54 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > Hi Paul, > > > > If you run mmbackup with "DEBUGmmbackup=2", it keeps all working files even after successful backup. They are available at MMBACKUP_RECORD_ROOT (default is FSroot or FilesetRoot directory). > > In .mmbackupCfg directory, there are 3 directories: > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > [Inactive hide details for "Paul Ward" ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have been sent to back]"Paul Ward" ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have been sent to backup using mmbackup. > > > > From: "Paul Ward" > > > To: > > "gpfsug-discuss at spectrumscale.org > org>" > > > org>> > > Date: 01/18/2022 11:56 AM > > Subject: [EXTERNAL] [gpfsug-discuss] mmbackup file selections Sent by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > > > Hi, I am trying to work out what files have been sent to backup > > using mmbackup. I have increased the -L value from 3 up to 6 but > > only seem to see the files that are in scope, not the ones that are selected. I can see the three file lists generated ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Hi, > > > > I am trying to work out what files have been sent to backup using mmbackup. > > I have increased the -L value from 3 up to 6 but only seem to see the files that are in scope, not the ones that are selected. > > > > I can see the three file lists generated during a backup, but can?t seem to find a list of what files were backed up. > > > > It should be the diff of the shadow and shadow-old, but the wc -l of the diff doesn?t match the number of files in the backup summary. > > Wrong assumption? > > > > Where should I be looking ? surely it shouldn?t be this hard to see what files are selected? 
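Pulling the DEBUGmmbackup advice from this thread together, a minimal sketch of such a backup wrapper might look like the following - the filesystem path, mmbackup options and error-log location are illustrative, not taken from the original script:

  # keep mmbackup's candidate lists for inspection after the run
  export DEBUGmmbackup=2
  mmbackup /gpfs/nhmfsa -t incremental -L 3 --tsm-errorlog /var/log/mmbackup-tsm.err
  # with DEBUGmmbackup=2 the working lists remain under the filesystem root
  ls -l /gpfs/nhmfsa/.mmbackupCfg/updatedFiles /gpfs/nhmfsa/.mmbackupCfg/statechFiles /gpfs/nhmfsa/.mmbackupCfg/expiredFiles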
> > > > > > Kindest regards,
> > Paul
> >
> > Paul Ward
> > TS Infrastructure Architect
> > Natural History Museum
> > T: 02079426450
> > E: p.ward at nhm.ac.uk
> > [A picture containing drawing Description automatically generated]
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> >
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> >
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> >
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
> --
> -- Skylar Thompson (skylar2 at u.washington.edu)
> -- Genome Sciences Department (UW Medicine), System Administrator
> -- Foege Building S046, (206)-685-7354
> -- Pronouns: He/Him/His
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

--
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From dehaan at us.ibm.com Tue Feb 1 16:14:07 2022
From: dehaan at us.ibm.com (David DeHaan)
Date: Tue, 1 Feb 2022 09:14:07 -0700
Subject: [gpfsug-discuss] Spectrum Scale and vfs_fruit
In-Reply-To: References: Message-ID:

Yes, it is disabled by default. And yes, you can tell if it has been enabled by looking at the smb config list.
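For instance, a check along these lines (a sketch; the grep pattern is only illustrative) shows whether the module is active:

  # 'fruit' (and usually 'streams_xattr') under 'vfs objects' means vfs_fruit is enabled
  mmsmb config list | grep -i "vfs objects"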
This is what a non-fruit vfs-object line looks like vfs objects = shadow_copy2 syncops gpfs fileid time_audit This is one that has been "fruitified" vfs objects = shadow_copy2 syncops fruit streams_xattr gpfs fileid time_audit *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* David DeHaan Spectrum Scale Test *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From: "Michael Meier" To: gpfsug-discuss at spectrumscale.org Date: 02/01/2022 04:26 AM Subject: [EXTERNAL] [gpfsug-discuss] Spectrum Scale and vfs_fruit Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, A bunch of security updates for Samba were released yesterday, most importantly among them CVE-2021-44142 ( https://www.samba.org/samba/security/CVE-2021-44142.html ) in the vfs_fruit VFS-module that adds extended support for Apple Clients. Spectrum Scale supports that, so Spectrum Scale might be affected, and I'm trying to find out if we're affected or not. Now we never enabled this via "mmsmb config change --vfs-fruit-enable", and I would expect this to be disabled by default - however, I cannot find an explicit statement like "by default this is disabled" in https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=services-support-vfs-fruit-smb-protocol Am I correct in assuming that it is indeed disabled by default? And how would I verify that? Am I correct in assuming that _if_ it was enabled, then 'fruit' would show up under the 'vfs objects' in 'mmsmb config list'? Regards, -- Michael Meier, HPC Services Friedrich-Alexander-Universitaet Erlangen-Nuernberg Regionales Rechenzentrum Erlangen Martensstrasse 1, 91058 Erlangen, Germany Tel.: +49 9131 85-20994, Fax: +49 9131 302941 michael.meier at fau.de hpc.fau.de _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From ivano.talamo at psi.ch Wed Feb 2 09:07:13 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 09:07:13 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce Message-ID: Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From abeattie at au1.ibm.com Wed Feb 2 09:33:25 2022 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Wed, 2 Feb 2022 09:33:25 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: Message-ID: Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com > On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote: > > ? > This Message Is From an External Sender > This message came from outside your organization. > Dear all, > > Since a while we are experiencing an issue when dealing with snapshots. > Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). > > The clients and the storage are on two different clusters, using remote cluster mount for the access. > > On the log files many lines like the following appear (on both clusters): > Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec > > By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. > > If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. > > Thanks, > Ivano -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at hpe.com Wed Feb 2 10:07:25 2022 From: daniel.kidger at hpe.com (Kidger, Daniel) Date: Wed, 2 Feb 2022 10:07:25 +0000 Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? Message-ID: Hi all, Since the subject of snapshots has come up, I also have a question ... Snapshots can be created from the command line with mmcrsnapshot, and hence can be automated via con jobs etc. Snapshots can also be created from the Scale GUI. The GUI also provides its own automation for the creation, retention, and deletion of snapshots. My question is: do most customers use the former or the latter for automation? (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do exactly the same as what the GUI does it terms of creating automated snapshots. It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. How many customers also use the commands found in /usr/lpp/mmfs/gui/cli/ ? ) Daniel Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:548be828-dcc2-4a88-ac2e-ff5106b3f802] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Outlook-iity4nk4 Type: application/octet-stream Size: 2541 bytes Desc: Outlook-iity4nk4 URL: From ivano.talamo at psi.ch Wed Feb 2 10:45:26 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 10:45:26 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: , Message-ID: <4326cfae883b4378bcb284b6daecb05e@psi.ch> Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Andrew Beattie Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano -------------- next part -------------- An HTML attachment was scrubbed... URL: From sthompson2 at lenovo.com Wed Feb 2 10:52:27 2022 From: sthompson2 at lenovo.com (Simon Thompson2) Date: Wed, 2 Feb 2022 10:52:27 +0000 Subject: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: References: Message-ID: I always used the GUI for automating snapshots that were tagged with the YYMMDD format so that they were accessible via the previous versions tab from CES access. This requires no locking if you have multiple GUI servers running, so in theory the snapshots creation is "HA". 
BUT if you shutdown the GUI servers (say you are waiting for a log4j patch ...) then you have no snapshot automation. Due to the way we structured independent filesets, this could be 50 or so to automate and we wanted to set a say 4 day retention policy. So clicking in the GUI was pretty simple to do this for. What we did found is it a snapshot failed to delete for some reason (quiesce etc), then the GUI never tried again to clean it up so we have monitoring to look for unexpected snapshots that needed cleaning up. Simon ________________________________ Simon Thompson He/Him/His Senior Storage Performance WW HPC Customer Solutions Lenovo UK [Phone]+44 7788 320635 [Email]sthompson2 at lenovo.com Lenovo.com Twitter | Instagram | Facebook | Linkedin | YouTube | Privacy [cid:image003.png at 01D81822.F63BAB90] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Kidger, Daniel Sent: 02 February 2022 10:07 To: gpfsug-discuss at spectrumscale.org Subject: [External] [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? Hi all, Since the subject of snapshots has come up, I also have a question ... Snapshots can be created from the command line with mmcrsnapshot, and hence can be automated via con jobs etc. Snapshots can also be created from the Scale GUI. The GUI also provides its own automation for the creation, retention, and deletion of snapshots. My question is: do most customers use the former or the latter for automation? (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do exactly the same as what the GUI does it terms of creating automated snapshots. It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. How many customers also use the commands found in /usr/lpp/mmfs/gui/cli/ ? ) Daniel Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:image004.png at 01D81822.F63BAB90] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 92 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 128 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.png Type: image/png Size: 20109 bytes Desc: image003.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.png Type: image/png Size: 2541 bytes Desc: image004.png URL: From jordi.caubet at es.ibm.com Wed Feb 2 11:07:37 2022 From: jordi.caubet at es.ibm.com (Jordi Caubet Serrabou) Date: Wed, 2 Feb 2022 11:07:37 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: <4326cfae883b4378bcb284b6daecb05e@psi.ch> References: <4326cfae883b4378bcb284b6daecb05e@psi.ch>, , Message-ID: An HTML attachment was scrubbed... URL: From janfrode at tanso.net Wed Feb 2 11:53:50 2022 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Wed, 2 Feb 2022 12:53:50 +0100 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> Message-ID: Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. 
do: snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname instead of: mmcrsnapshot gpfs0 fileset1:$snapname mmcrsnapshot gpfs0 fileset2:$snapname mmcrsnapshot gpfs0 fileset3:$snapname -jf On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou < jordi.caubet at es.ibm.com> wrote: > Ivano, > > if it happens frequently, I would recommend to open a support case. > > The creation or deletion of a snapshot requires a quiesce of the nodes to > obtain a consistent point-in-time image of the file system and/or update > some internal structures afaik. Quiesce is required for nodes at the > storage cluster but also remote clusters. Quiesce means stop activities > (incl. I/O) for a short period of time to get such consistent image. Also > waiting to flush any data in-flight to disk that does not allow a > consistent point-in-time image. > > Nodes receive a quiesce request and acknowledge when ready. When all nodes > acknowledge, snapshot operation can proceed and immediately I/O can resume. > It usually takes few seconds at most and the operation performed is short > but time I/O is stopped depends of how long it takes to quiesce the nodes. > If some node take longer to agree stop the activities, such node will > be delay the completion of the quiesce and keep I/O paused on the rest. > There could many things while some nodes delay quiesce ack. > > The larger the cluster, the more difficult it gets. The more network > congestion or I/O load, the more difficult it gets. I recommend to open a > ticket for support to try to identify the root cause of which nodes not > acknowledge the quiesce and maybe find the root cause. If I recall some > previous thread, default timeout was 60 seconds which match your log > message. After such timeout, snapshot is considered failed to complete. > > Support might help you understand the root cause and provide some > recommendations if it happens frequently. > > Best Regards, > -- > Jordi Caubet Serrabou > IBM Storage Client Technical Specialist (IBM Spain) > > > ----- Original message ----- > From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: "gpfsug main discussion list" > Cc: > Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem > quiesce > Date: Wed, Feb 2, 2022 11:45 AM > > > Hello Andrew, > > > > Thanks for your questions. > > > > We're not experiencing any other issue/slowness during normal activity. > > The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool > for metadata only. > > > > The two NSD servers have 750GB of RAM and 618 are configured as pagepool. 
> > > > The issue we see is happening on both the two filesystems we have: > > > > - perf filesystem: > > - 1.8 PB size (71% in use) > > - 570 milions of inodes (24% in use) > > > > - tiered filesystem: > > - 400 TB size (34% in use) > > - 230 Milions of files (60% in use) > > > > Cheers, > > Ivano > > > > > > > __________________________________________ > Paul Scherrer Institut > Ivano Talamo > WHGA/038 > Forschungsstrasse 111 > 5232 Villigen PSI > Schweiz > > Telefon: +41 56 310 47 11 > E-Mail: ivano.talamo at psi.ch > > > > > ------------------------------ > *From:* gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> on behalf of Andrew Beattie < > abeattie at au1.ibm.com> > *Sent:* Wednesday, February 2, 2022 10:33 AM > *To:* gpfsug main discussion list > *Subject:* Re: [gpfsug-discuss] snapshots causing filesystem quiesce > > Ivano, > > How big is the filesystem in terms of number of files? > How big is the filesystem in terms of capacity? > Is the Metadata on Flash or Spinning disk? > Do you see issues when users do an LS of the filesystem or only when you > are doing snapshots. > > How much memory do the NSD servers have? > How much is allocated to the OS / Spectrum > Scale Pagepool > > Regards > > Andrew Beattie > Technical Specialist - Storage for Big Data & AI > IBM Technology Group > IBM Australia & New Zealand > P. +61 421 337 927 > E. abeattie at au1.IBM.com > > > > > On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: > > > ? > > > Dear all, > > Since a while we are experiencing an issue when dealing with snapshots. > Basically what happens is that when deleting a fileset snapshot (and maybe > also when creating new ones) the filesystem becomes inaccessible on the > clients for the duration of the operation (can take a few minutes). > > The clients and the storage are on two different clusters, using remote > cluster mount for the access. > > On the log files many lines like the following appear (on both clusters): > Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 > msec > > By looking around I see we're not the first one. I am wondering if that's > considered an unavoidable part of the snapshotting and if there's any > tunable that can improve the situation. Since when this occurs all the > clients are stuck and users are very quick to complain. > > If it can help, the clients are running GPFS 5.1.2-1 while the storage > cluster is on 5.1.1-0. > > Thanks, > Ivano > > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: > > International Business Machines, S.A. > > Santa Hortensia, 26-28, 28002 Madrid > > Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 > > CIF A28-010791 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Feb 2 12:09:24 2022 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 2 Feb 2022 12:09:24 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: , <4326cfae883b4378bcb284b6daecb05e@psi.ch> Message-ID: An HTML attachment was scrubbed... 
URL: From daniel.kidger at hpe.com Wed Feb 2 12:08:54 2022 From: daniel.kidger at hpe.com (Kidger, Daniel) Date: Wed, 2 Feb 2022 12:08:54 +0000 Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: References: Message-ID: Simon, Thanks - that is a good insight. The HA 'feature' of the snapshot automation is perhaps a key feature as Linux still lacks a decent 'cluster cron' Also, If "HA" do we know where the state is centrally kept? On the point of snapshots being left undeleted, do you ever use /usr/lpp/mmfs/gui/cli/lssnapops to see what the queue of outstanding actions is like? (There is also a notification tool: lssnapnotify in that directory that is supposed to alert on failed snapshot actions, although personally I have never used it) Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:fce0ce85-6ae4-44ce-aa94-d7d099e68acb] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Simon Thompson2 Sent: 02 February 2022 10:52 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ? I always used the GUI for automating snapshots that were tagged with the YYMMDD format so that they were accessible via the previous versions tab from CES access. This requires no locking if you have multiple GUI servers running, so in theory the snapshots creation is ?HA?. BUT if you shutdown the GUI servers (say you are waiting for a log4j patch ?) then you have no snapshot automation. Due to the way we structured independent filesets, this could be 50 or so to automate and we wanted to set a say 4 day retention policy. So clicking in the GUI was pretty simple to do this for. What we did found is it a snapshot failed to delete for some reason (quiesce etc), then the GUI never tried again to clean it up so we have monitoring to look for unexpected snapshots that needed cleaning up. Simon ________________________________ Simon Thompson He/Him/His Senior Storage Performance WW HPC Customer Solutions Lenovo UK [Phone]+44 7788 320635 [Email]sthompson2 at lenovo.com Lenovo.com Twitter | Instagram | Facebook | Linkedin | YouTube | Privacy [cid:image003.png at 01D81822.F63BAB90] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Kidger, Daniel Sent: 02 February 2022 10:07 To: gpfsug-discuss at spectrumscale.org Subject: [External] [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? Hi all, Since the subject of snapshots has come up, I also have a question ... Snapshots can be created from the command line with mmcrsnapshot, and hence can be automated via con jobs etc. Snapshots can also be created from the Scale GUI. The GUI also provides its own automation for the creation, retention, and deletion of snapshots. My question is: do most customers use the former or the latter for automation? (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do exactly the same as what the GUI does it terms of creating automated snapshots. It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. How many customers also use the commands found in /usr/lpp/mmfs/gui/cli/ ? ) Daniel Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:image004.png at 01D81822.F63BAB90] -------------- next part -------------- An HTML attachment was scrubbed... 
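(Besides lssnapops/lssnapnotify, the monitoring Simon describes for snapshots the GUI failed to clean up could be sketched roughly as follows; the device name, the @GMT- naming and the 4-day cut-off are assumptions.)

# Flag automation-created snapshots older than the intended retention window.
# String comparison is enough here because the @GMT- names embed a UTC timestamp.
cutoff=$(date --utc -d '4 days ago' +@GMT-%Y.%m.%d-%H.%M.%S)
/usr/lpp/mmfs/bin/mmlssnapshot gpfs0 | \
  awk -v c="$cutoff" '$1 ~ /^@GMT-/ && $1 < c {print "stale snapshot:", $1}'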
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 92 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 128 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.png Type: image/png Size: 20109 bytes Desc: image003.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.png Type: image/png Size: 2541 bytes Desc: image004.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Outlook-axuecxph Type: application/octet-stream Size: 2541 bytes Desc: Outlook-axuecxph URL: From anacreo at gmail.com Wed Feb 2 12:41:07 2022 From: anacreo at gmail.com (Alec) Date: Wed, 2 Feb 2022 04:41:07 -0800 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> Message-ID: Might it be a case of being over built? In the old days you could really mess up an Oracle DW by giving it too much RAM... It would spend all day reading in and out data to the ram that it didn't really need, because it had the SGA available to load the whole table. Perhaps the pagepool is so large that the time it takes to clear that much RAM is the actual time out? My environment has only a million files but has quite a bit more storage and has only an 8gb pagepool. Seems you are saying you have 618gb of RAM for pagepool... Even at 8GB/second that would take 77 seconds to flush it out.. Perhaps drop the pagepool in half and see if your timeout adjusts accordingly? Alec On Wed, Feb 2, 2022, 4:09 AM Olaf Weiser wrote: > keep in mind... creating many snapshots... means ;-) .. you'll have to > delete many snapshots.. > at a certain level, which depends on #files, #directories, ~workload, > #nodes, #networks etc.... we ve seen cases, where generating just full > snapshots (whole file system) is the better approach instead of > maintaining snapshots for each file set individually .. > > sure. this has other side effects , like space consumption etc... > so as always.. it depends.. > > > > > ----- Urspr?ngliche Nachricht ----- > Von: "Jan-Frode Myklebust" > Gesendet von: gpfsug-discuss-bounces at spectrumscale.org > An: "gpfsug main discussion list" > CC: > Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem > quiesce > Datum: Mi, 2. Feb 2022 12:54 > > Also, if snapshotting multiple filesets, it's important to group these > into a single mmcrsnapshot command. Then you get a single quiesce, > instead of one per fileset. > > i.e. do: > > snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) > mmcrsnapshot gpfs0 > fileset1:$snapname,filset2:snapname,fileset3:snapname > > instead of: > > mmcrsnapshot gpfs0 fileset1:$snapname > mmcrsnapshot gpfs0 fileset2:$snapname > mmcrsnapshot gpfs0 fileset3:$snapname > > > -jf > > > On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou < > jordi.caubet at es.ibm.com> wrote: > > Ivano, > > if it happens frequently, I would recommend to open a support case. > > The creation or deletion of a snapshot requires a quiesce of the nodes to > obtain a consistent point-in-time image of the file system and/or update > some internal structures afaik. Quiesce is required for nodes at the > storage cluster but also remote clusters. Quiesce means stop activities > (incl. 
I/O) for a short period of time to get such consistent image. Also > waiting to flush any data in-flight to disk that does not allow a > consistent point-in-time image. > > Nodes receive a quiesce request and acknowledge when ready. When all nodes > acknowledge, snapshot operation can proceed and immediately I/O can resume. > It usually takes few seconds at most and the operation performed is short > but time I/O is stopped depends of how long it takes to quiesce the nodes. > If some node take longer to agree stop the activities, such node will > be delay the completion of the quiesce and keep I/O paused on the rest. > There could many things while some nodes delay quiesce ack. > > The larger the cluster, the more difficult it gets. The more network > congestion or I/O load, the more difficult it gets. I recommend to open a > ticket for support to try to identify the root cause of which nodes not > acknowledge the quiesce and maybe find the root cause. If I recall some > previous thread, default timeout was 60 seconds which match your log > message. After such timeout, snapshot is considered failed to complete. > > Support might help you understand the root cause and provide some > recommendations if it happens frequently. > > Best Regards, > -- > Jordi Caubet Serrabou > IBM Storage Client Technical Specialist (IBM Spain) > > > ----- Original message ----- > From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: "gpfsug main discussion list" > Cc: > Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem > quiesce > Date: Wed, Feb 2, 2022 11:45 AM > > > Hello Andrew, > > > > Thanks for your questions. > > > > We're not experiencing any other issue/slowness during normal activity. > > The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool > for metadata only. > > > > The two NSD servers have 750GB of RAM and 618 are configured as pagepool. > > > > The issue we see is happening on both the two filesystems we have: > > > > - perf filesystem: > > - 1.8 PB size (71% in use) > > - 570 milions of inodes (24% in use) > > > > - tiered filesystem: > > - 400 TB size (34% in use) > > - 230 Milions of files (60% in use) > > > > Cheers, > > Ivano > > > > > > > __________________________________________ > Paul Scherrer Institut > Ivano Talamo > WHGA/038 > Forschungsstrasse 111 > 5232 Villigen PSI > Schweiz > > Telefon: +41 56 310 47 11 > E-Mail: ivano.talamo at psi.ch > > > > > ------------------------------ > *From:* gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> on behalf of Andrew Beattie < > abeattie at au1.ibm.com> > *Sent:* Wednesday, February 2, 2022 10:33 AM > *To:* gpfsug main discussion list > *Subject:* Re: [gpfsug-discuss] snapshots causing filesystem quiesce > > Ivano, > > How big is the filesystem in terms of number of files? > How big is the filesystem in terms of capacity? > Is the Metadata on Flash or Spinning disk? > Do you see issues when users do an LS of the filesystem or only when you > are doing snapshots. > > How much memory do the NSD servers have? > How much is allocated to the OS / Spectrum > Scale Pagepool > > Regards > > Andrew Beattie > Technical Specialist - Storage for Big Data & AI > IBM Technology Group > IBM Australia & New Zealand > P. +61 421 337 927 > E. abeattie at au1.IBM.com > > > > > On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: > > > ? > > > Dear all, > > Since a while we are experiencing an issue when dealing with snapshots. 
> Basically what happens is that when deleting a fileset snapshot (and maybe > also when creating new ones) the filesystem becomes inaccessible on the > clients for the duration of the operation (can take a few minutes). > > The clients and the storage are on two different clusters, using remote > cluster mount for the access. > > On the log files many lines like the following appear (on both clusters): > Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 > msec > > By looking around I see we're not the first one. I am wondering if that's > considered an unavoidable part of the snapshotting and if there's any > tunable that can improve the situation. Since when this occurs all the > clients are stuck and users are very quick to complain. > > If it can help, the clients are running GPFS 5.1.2-1 while the storage > cluster is on 5.1.1-0. > > Thanks, > Ivano > > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: > > International Business Machines, S.A. > > Santa Hortensia, 26-28, 28002 Madrid > > Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 > > CIF A28-010791 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 12:55:52 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 12:55:52 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch>, , , Message-ID: <8d51042ed95b461fb2be3dc33dac030a@psi.ch> Hi Jordi, thanks for the explanation, I can now see better why something like that would happen. Indeed the cluster has a lot of clients, coming via different clusters and even some NFS/SMB via protocol nodes. So I think opening a case makes a lot of sense to track it down. Not sure how we can make the debug transparent to the users, but we'll see. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Jordi Caubet Serrabou Sent: Wednesday, February 2, 2022 12:07 PM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. 
I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. There could many things while some nodes delay quiesce ack. The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Andrew Beattie Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. 
On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 12:57:32 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 12:57:32 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> , Message-ID: Sure, that makes a lot of sense and we were already doing in that way. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Jan-Frode Myklebust Sent: Wednesday, February 2, 2022 12:53 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. do: snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname instead of: mmcrsnapshot gpfs0 fileset1:$snapname mmcrsnapshot gpfs0 fileset2:$snapname mmcrsnapshot gpfs0 fileset3:$snapname -jf On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou > wrote: Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. There could many things while some nodes delay quiesce ack. The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. 
I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Andrew Beattie > Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. 
Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 12:59:30 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 12:59:30 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: , <4326cfae883b4378bcb284b6daecb05e@psi.ch>, Message-ID: Ok that sounds a good candidate for an improvement. Thanks. We didn't want to do a full filesystem snapshot for the space consumption indeed. But we may consider it, keeping an eye on the space. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Olaf Weiser Sent: Wednesday, February 2, 2022 1:09 PM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce keep in mind... creating many snapshots... means ;-) .. you'll have to delete many snapshots.. at a certain level, which depends on #files, #directories, ~workload, #nodes, #networks etc.... we ve seen cases, where generating just full snapshots (whole file system) is the better approach instead of maintaining snapshots for each file set individually .. sure. this has other side effects , like space consumption etc... so as always.. it depends.. ----- Urspr?ngliche Nachricht ----- Von: "Jan-Frode Myklebust" Gesendet von: gpfsug-discuss-bounces at spectrumscale.org An: "gpfsug main discussion list" CC: Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Datum: Mi, 2. Feb 2022 12:54 Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. do: snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname instead of: mmcrsnapshot gpfs0 fileset1:$snapname mmcrsnapshot gpfs0 fileset2:$snapname mmcrsnapshot gpfs0 fileset3:$snapname -jf On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou > wrote: Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. 
There could many things while some nodes delay quiesce ack. The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Andrew Beattie > Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. 
Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 13:03:13 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 13:03:13 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> , Message-ID: That's true, although I would not expect the memory to be flushed for just snapshots deletion. But it could well be a problem at snapshot creation time. Anyway for changing the pagepool we should contact the vendor, since this is configured by their installation scripts, so we better have them to agree. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Alec Sent: Wednesday, February 2, 2022 1:41 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Might it be a case of being over built? In the old days you could really mess up an Oracle DW by giving it too much RAM... It would spend all day reading in and out data to the ram that it didn't really need, because it had the SGA available to load the whole table. Perhaps the pagepool is so large that the time it takes to clear that much RAM is the actual time out? My environment has only a million files but has quite a bit more storage and has only an 8gb pagepool. Seems you are saying you have 618gb of RAM for pagepool... Even at 8GB/second that would take 77 seconds to flush it out.. Perhaps drop the pagepool in half and see if your timeout adjusts accordingly? Alec On Wed, Feb 2, 2022, 4:09 AM Olaf Weiser > wrote: keep in mind... creating many snapshots... means ;-) .. you'll have to delete many snapshots.. at a certain level, which depends on #files, #directories, ~workload, #nodes, #networks etc.... we ve seen cases, where generating just full snapshots (whole file system) is the better approach instead of maintaining snapshots for each file set individually .. sure. this has other side effects , like space consumption etc... so as always.. it depends.. ----- Urspr?ngliche Nachricht ----- Von: "Jan-Frode Myklebust" > Gesendet von: gpfsug-discuss-bounces at spectrumscale.org An: "gpfsug main discussion list" > CC: Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Datum: Mi, 2. Feb 2022 12:54 Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. 
do: snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname instead of: mmcrsnapshot gpfs0 fileset1:$snapname mmcrsnapshot gpfs0 fileset2:$snapname mmcrsnapshot gpfs0 fileset3:$snapname -jf On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou > wrote: Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. There could many things while some nodes delay quiesce ack. The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Andrew Beattie > Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? 
How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jordi.caubet at es.ibm.com Wed Feb 2 13:34:20 2022 From: jordi.caubet at es.ibm.com (Jordi Caubet Serrabou) Date: Wed, 2 Feb 2022 13:34:20 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: Message-ID: Maybe some colleagues at IBM devel can correct me, but pagepool size should not make much difference. Afaik, it is mostly read cache data. Another think could be if using HAWC function, I am not sure in such case. Anyhow, looking at your node name, your system seems a DSS from Lenovo so you NSD servers are running GPFS Native RAID and the reason why the pagepool is large there, not for the NSD server role itself, it is for the GNR role that caches disk tracks. Lowering will impact performance. -- Jordi Caubet Serrabou IBM Software Defined Infrastructure (SDI) and Flash Technical Sales Specialist Technical Computing and HPC IT Specialist and Architect > On 2 Feb 2022, at 14:03, Talamo Ivano Giuseppe (PSI) wrote: > > ? > That's true, although I would not expect the memory to be flushed for just snapshots deletion. But it could well be a problem at snapshot creation time. > > Anyway for changing the pagepool we should contact the vendor, since this is configured by their installation scripts, so we better have them to agree. 
> > > > Cheers, > > Ivano > > > > __________________________________________ > Paul Scherrer Institut > Ivano Talamo > WHGA/038 > Forschungsstrasse 111 > 5232 Villigen PSI > Schweiz > > Telefon: +41 56 310 47 11 > E-Mail: ivano.talamo at psi.ch > > > > From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Alec > Sent: Wednesday, February 2, 2022 1:41 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce > > Might it be a case of being over built? In the old days you could really mess up an Oracle DW by giving it too much RAM... It would spend all day reading in and out data to the ram that it didn't really need, because it had the SGA available to load the whole table. > > Perhaps the pagepool is so large that the time it takes to clear that much RAM is the actual time out? > > My environment has only a million files but has quite a bit more storage and has only an 8gb pagepool. Seems you are saying you have 618gb of RAM for pagepool... Even at 8GB/second that would take 77 seconds to flush it out.. > > Perhaps drop the pagepool in half and see if your timeout adjusts accordingly? > > Alec > > >> On Wed, Feb 2, 2022, 4:09 AM Olaf Weiser wrote: >> keep in mind... creating many snapshots... means ;-) .. you'll have to delete many snapshots.. >> at a certain level, which depends on #files, #directories, ~workload, #nodes, #networks etc.... we ve seen cases, where generating just full snapshots (whole file system) is the better approach instead of maintaining snapshots for each file set individually .. >> >> sure. this has other side effects , like space consumption etc... >> so as always.. it depends.. >> >> >> >> ----- Urspr?ngliche Nachricht ----- >> Von: "Jan-Frode Myklebust" >> Gesendet von: gpfsug-discuss-bounces at spectrumscale.org >> An: "gpfsug main discussion list" >> CC: >> Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce >> Datum: Mi, 2. Feb 2022 12:54 >> >> Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. >> >> i.e. do: >> >> snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) >> mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname >> >> instead of: >> >> mmcrsnapshot gpfs0 fileset1:$snapname >> mmcrsnapshot gpfs0 fileset2:$snapname >> mmcrsnapshot gpfs0 fileset3:$snapname >> >> >> -jf >> >> >> On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou wrote: >> Ivano, >> >> if it happens frequently, I would recommend to open a support case. >> >> The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. >> >> Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. 
>> There could many things while some nodes delay quiesce ack. >> >> The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. >> >> Support might help you understand the root cause and provide some recommendations if it happens frequently. >> >> Best Regards, >> -- >> Jordi Caubet Serrabou >> IBM Storage Client Technical Specialist (IBM Spain) >> >> ----- Original message ----- >> From: "Talamo Ivano Giuseppe (PSI)" >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: "gpfsug main discussion list" >> Cc: >> Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce >> Date: Wed, Feb 2, 2022 11:45 AM >> >> Hello Andrew, >> >> >> >> Thanks for your questions. >> >> >> >> We're not experiencing any other issue/slowness during normal activity. >> >> The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. >> >> >> >> The two NSD servers have 750GB of RAM and 618 are configured as pagepool. >> >> >> >> The issue we see is happening on both the two filesystems we have: >> >> >> >> - perf filesystem: >> >> - 1.8 PB size (71% in use) >> >> - 570 milions of inodes (24% in use) >> >> >> >> - tiered filesystem: >> >> - 400 TB size (34% in use) >> >> - 230 Milions of files (60% in use) >> >> >> >> Cheers, >> >> Ivano >> >> >> >> >> >> >> >> __________________________________________ >> Paul Scherrer Institut >> Ivano Talamo >> WHGA/038 >> Forschungsstrasse 111 >> 5232 Villigen PSI >> Schweiz >> >> Telefon: +41 56 310 47 11 >> E-Mail: ivano.talamo at psi.ch >> >> >> >> >> From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Andrew Beattie >> Sent: Wednesday, February 2, 2022 10:33 AM >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce >> >> Ivano, >> >> How big is the filesystem in terms of number of files? >> How big is the filesystem in terms of capacity? >> Is the Metadata on Flash or Spinning disk? >> Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. >> >> How much memory do the NSD servers have? >> How much is allocated to the OS / Spectrum >> Scale Pagepool >> >> Regards >> >> Andrew Beattie >> Technical Specialist - Storage for Big Data & AI >> IBM Technology Group >> IBM Australia & New Zealand >> P. +61 421 337 927 >> E. abeattie at au1.IBM.com >> >> >> >>> >>> On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote: >>> >>> ? >>> >>> >>> Dear all, >>> >>> Since a while we are experiencing an issue when dealing with snapshots. >>> Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). >>> >>> The clients and the storage are on two different clusters, using remote cluster mount for the access. >>> >>> On the log files many lines like the following appear (on both clusters): >>> Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec >>> >>> By looking around I see we're not the first one. 
I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. >>> >>> If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. >>> >>> Thanks, >>> Ivano >>> >>> >>> >>> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: >> >> International Business Machines, S.A. >> >> Santa Hortensia, 26-28, 28002 Madrid >> >> Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 >> >> CIF A28-010791 >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 -------------- next part -------------- An HTML attachment was scrubbed... URL: From juergen.hannappel at desy.de Wed Feb 2 15:04:24 2022 From: juergen.hannappel at desy.de (Hannappel, Juergen) Date: Wed, 2 Feb 2022 16:04:24 +0100 (CET) Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: References: Message-ID: <679823632.5186930.1643814264071.JavaMail.zimbra@desy.de> Hi, I use a python script via cron job, it checks how many snapshots exist and removes those that exceed a configurable limit, then creates a new one. Deployed via puppet it's much less hassle than click around in a GUI/ > From: "Kidger, Daniel" > To: "gpfsug main discussion list" > Sent: Wednesday, 2 February, 2022 11:07:25 > Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? > Hi all, > Since the subject of snapshots has come up, I also have a question ... > Snapshots can be created from the command line with mmcrsnapshot, and hence can > be automated via con jobs etc. > Snapshots can also be created from the Scale GUI. The GUI also provides its own > automation for the creation, retention, and deletion of snapshots. > My question is: do most customers use the former or the latter for automation? > (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do > exactly the same as what the GUI does it terms of creating automated snapshots. > It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. > How many customers also use the commands found in /usr/lpp/mmfs/gui/cli / ? 
) > Daniel > Daniel Kidger > HPC Storage Solutions Architect, EMEA > [ mailto:daniel.kidger at hpe.com | daniel.kidger at hpe.com ] > +44 (0)7818 522266 > [ http://www.hpe.com/ | hpe.com ] > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Outlook-iity4nk4 Type: image/png Size: 2541 bytes Desc: Outlook-iity4nk4 URL: From mark.bergman at uphs.upenn.edu Wed Feb 2 16:09:02 2022 From: mark.bergman at uphs.upenn.edu (mark.bergman at uphs.upenn.edu) Date: Wed, 02 Feb 2022 11:09:02 -0500 Subject: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: Your message of "Wed, 02 Feb 2022 10:07:25 +0000." References: Message-ID: <1971435-1643818142.818836@ATIP.bjhn.uBcv> Big vote for cron jobs. Our snapshot are created by a script, installed on each GPFS node. The script handles naming, removing old snapshots, checking that sufficient disk space exists before creating a snapshot, etc. We do snapshots every 15 minutes, keeping them with lower frequency over longer intervals. For example: current hour: keep 4 snapshots hours -2 .. -8 keep 3 snapshots per hour hours -8 .. -24 keep 2 snapshots per hour days -1 .. -5 keep 1 snapshot per hour days -5 .. -15 keep 4 snapshots per day days -15 .. -30 keep 1 snapshot per day the duration & frequency & minimum disk space can be adjusted per-filesystem. The automation is done through a cronjob that runs on each GPFS (DSS-G) server to create the snapshot only if the node is currently the cluster master, as in: */15 * * * * root mmlsmgr -Y | grep -q "clusterManager.*:$(hostname --long):" && /path/to/snapshotter This requires no locking and ensures that only a single instance of snapshots is created at each time interval. We use the same trick to gather GPFS health stats, etc., ensuring that the data collection only runs on a single node (the cluster manager). -- Mark Bergman voice: 215-746-4061 mark.bergman at pennmedicine.upenn.edu fax: 215-614-0266 http://www.med.upenn.edu/cbica/ IT Technical Director, Center for Biomedical Image Computing and Analytics Department of Radiology University of Pennsylvania From info at odina.nl Wed Feb 2 16:22:47 2022 From: info at odina.nl (Jaap Jan Ouwehand) Date: Wed, 02 Feb 2022 17:22:47 +0100 Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: <679823632.5186930.1643814264071.JavaMail.zimbra@desy.de> References: <679823632.5186930.1643814264071.JavaMail.zimbra@desy.de> Message-ID: <9CD60B1D-5BF8-4BBD-9F9D-A872D89EE9C4@odina.nl> Hi, I also used a custom script (database driven) via cron which creates many fileset snapshots during the day via the "default helper nodes". Because of the iops, the oldest snapshots are deleted at night. Perhaps it's a good idea to take one global filesystem snapshot and make it available to the filesets with mmsnapdir. Kind regards, Jaap Jan Ouwehand "Hannappel, Juergen" schreef op 2 februari 2022 16:04:24 CET: >Hi, >I use a python script via cron job, it checks how many snapshots exist and removes those that >exceed a configurable limit, then creates a new one. 
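As a concrete illustration of the retention logic Juergen and Mark describe above (create regularly, keep only the newest N), here is a rough pruning sketch. It is only a sketch: bash is assumed, gpfs0 and fileset1 are placeholders, the parsing of mmlssnapshot output is simplified, and the exact mmlssnapshot/mmdelsnapshot option spellings should be checked against your Scale release. It relies on @GMT-style snapshot names sorting chronologically.

#!/bin/bash
# sketch: keep only the newest $keep snapshots of one fileset
fs=gpfs0          # placeholder
fileset=fileset1  # placeholder
keep=7

# list this fileset's snapshot names; assumes @GMT-style names (sortable by date)
snaps=$(mmlssnapshot "$fs" -j "$fileset" | awk '/@GMT-/ {print $1}' | sort)
total=$(echo "$snaps" | grep -c .)

if [ "$total" -gt "$keep" ]; then
    echo "$snaps" | head -n $(( total - keep )) | while read -r s; do
        # delete one old fileset snapshot (verify syntax with mmdelsnapshot -h)
        mmdelsnapshot "$fs" "$s" -j "$fileset"
    done
fi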
>Deployed via puppet it's much less hassle than click around in a GUI/ > >> From: "Kidger, Daniel" >> To: "gpfsug main discussion list" >> Sent: Wednesday, 2 February, 2022 11:07:25 >> Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? > >> Hi all, > >> Since the subject of snapshots has come up, I also have a question ... > >> Snapshots can be created from the command line with mmcrsnapshot, and hence can >> be automated via con jobs etc. >> Snapshots can also be created from the Scale GUI. The GUI also provides its own >> automation for the creation, retention, and deletion of snapshots. > >> My question is: do most customers use the former or the latter for automation? > >> (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do >> exactly the same as what the GUI does it terms of creating automated snapshots. >> It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. >> How many customers also use the commands found in /usr/lpp/mmfs/gui/cli / ? ) > >> Daniel > >> Daniel Kidger >> HPC Storage Solutions Architect, EMEA >> [ mailto:daniel.kidger at hpe.com | daniel.kidger at hpe.com ] > >> +44 (0)7818 522266 > >> [ http://www.hpe.com/ | hpe.com ] > >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.ward at nhm.ac.uk Mon Feb 7 16:39:25 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Mon, 7 Feb 2022 16:39:25 +0000 Subject: [gpfsug-discuss] mmbackup file selections In-Reply-To: References: <20220124153631.oxu4ytbq4vqcotr3@utumno.gs.washington.edu> <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> Message-ID: Backups seem to have settled down. A workshop with our partner and IBM is in the pipeline. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Paul Ward Sent: 01 February 2022 12:28 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmbackup file selections Not currently set. I'll look into them. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Skylar Thompson Sent: 26 January 2022 16:50 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmbackup file selections Awesome, glad that you found them (I missed them the first time too). As for the anomalous changed files, do you have these options set in your client option file? skipacl yes skipaclupdatecheck yes updatectime yes We had similar problems where metadata and ACL updates were interpreted as data changes by mmbackup/dsmc. We also have a case open with IBM where mmbackup will both expire and backup a file in the same run, even in the absence of mtime changes, but it's unclear whether that's program error or something with our include/exclude rules. I'd be curious if you're running into that as well. On Wed, Jan 26, 2022 at 03:55:48PM +0000, Paul Ward wrote: > Good call! > > Yes they are dot files. > > > New issue. > > Mmbackup seems to be backup up the same files over and over without them changing: > areas are being backed up multiple times. 
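For reference, the three client options Skylar lists above normally live in the Spectrum Protect backup-archive client option file on the mmbackup nodes. A minimal sketch (the file path and stanza placement are assumptions; TSM-JERSEY is just the server name that appears elsewhere in this thread):

* dsm.sys (typically /opt/tivoli/tsm/client/ba/bin/dsm.sys - path is an assumption)
* added to the existing server stanza used by mmbackup/dsmc:
SErvername TSM-JERSEY
   skipacl             yes
   skipaclupdatecheck  yes
   updatectime         yes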
> The example below is a co-resident file, the only thing that has changed since it was created 20/10/21, is the file has been accessed for backup. > This file is in the 'changed' list in mmbackup: > > This list has just been created: > -rw-r--r--. 1 root root 6591914 Jan 26 11:12 > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > Listing the last few files in the file (selecting the last one) > 11:17:52 [root at scale-sk-pn-1 .mmbackupCfg]# tail > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604556977.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557039.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557102.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557164.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557226.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557288.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557351.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557413.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557476.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > Check the file stats (access time just before last backup) > 11:18:05 [root at scale-sk-pn-1 .mmbackupCfg]# stat "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > File: '/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png' > Size: 545 Blocks: 32 IO Block: 4194304 regular file > Device: 2bh/43d Inode: 212618897 Links: 1 > Access: (0644/-rw-r--r--) Uid: (1399613896/NHM\edwab) Gid: (1399647564/NHM\dg-mbl-urban-nature-project-rw) > Context: unconfined_u:object_r:unlabeled_t:s0 > Access: 2022-01-25 06:40:58.334961446 +0000 > Modify: 2020-12-01 15:20:40.122053000 +0000 > Change: 2021-10-20 17:55:18.265746459 +0100 > Birth: - > > Check if migrated > 11:18:16 [root at scale-sk-pn-1 .mmbackupCfg]# dsmls "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > File name : /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > On-line size : 545 > Used blocks : 16 > Data Version : 1 > Meta Version : 1 > State : Co-resident > Container Index : 1 > Base Name : 34C0B77D20194B0B.EACEB2055F6CAA58.56D56C5F140C8C9D.0000000000000000.2197396D.000000000CAC4E91 > > Check if immutable > 11:18:26 [root at scale-sk-pn-1 .mmbackupCfg]# mstat "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > file name: /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 
metadata replication: 2 max 2 > data replication: 2 max 2 > immutable: no > appendOnly: no > flags: > storage pool name: data > fileset name: hpc-workspaces-fset > snapshot name: > creation time: Wed Oct 20 17:55:18 2021 > Misc attributes: ARCHIVE > Encrypted: no > > Check active and inactive backups (it was backed up yesterday) > 11:18:52 [root at scale-sk-pn-1 .mmbackupCfg]# dsmcqbi "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 11:19:02 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. > > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 11:19:02 Last access: 01/26/2022 > 11:07:05 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/25/2022 06:41:17 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > It will be backed up again shortly, why? > > And it was backed up again: > # dsmcqbi > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature- > project/audiowaveform/300_40/unp-grounds-01-1604557538.png > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 15:54:09 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. 
> > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 15:54:10 Last access: 01/26/2022 > 15:30:03 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/26/2022 12:23:02 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/25/2022 06:41:17 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Skylar > Thompson > Sent: 24 January 2022 15:37 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Hi Paul, > > Did you look for dot files? At least for us on 5.0.5 there's a .list.1. file while the backups are running: > > /gpfs/grc6/.mmbackupCfg/updatedFiles/: > -r-------- 1 root nickers 6158526821 Jan 23 18:28 .list.1.gpfs-grc6 > /gpfs/grc6/.mmbackupCfg/expiredFiles/: > -r-------- 1 root nickers 85862211 Jan 23 18:28 .list.1.gpfs-grc6 > > On Mon, Jan 24, 2022 at 02:31:54PM +0000, Paul Ward wrote: > > Those directories are empty > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: gpfsug-discuss-bounces at spectrumscale.org > > On Behalf Of IBM Spectrum > > Scale > > Sent: 22 January 2022 00:35 > > To: gpfsug main discussion list > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > Hi Paul, > > > > Instead of calculating *.ix.* files, please look at a list file in these directories. > > > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked.]"Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked. 
> > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/21/2022 09:38 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > Thank you Right in the command line seems to have worked. At the end > > of the script I now copy the contents of the .mmbackupCfg folder to > > a date stamped logging folder Checking how many entries in these files compared to the Summary: ???????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you > > > > Right in the command line seems to have worked. > > At the end of the script I now copy the contents of the .mmbackupCfg > > folder to a date stamped logging folder > > > > Checking how many entries in these files compared to the Summary: > > wc -l mmbackup* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 754 total > > From Summary > > Total number of objects inspected: 755 > > I can live with a discrepancy of 1. > > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > From Summary > > Total number of objects expired: 2 > > That matches > > > > wc -l mmbackupC* mmbackupS* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 752 total > > Summary: > > Total number of objects backed up: 751 > > > > A difference of 1 I can live with. > > > > What does Statech stand for? > > > > Just this to sort out: > > Total number of objects failed: 1 > > I will add: > > --tsm-errorlog TSMErrorLogFile > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 19 January 2022 15:09 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > This is to set environment for mmbackup. > > If mmbackup is invoked within a script, you can set "export DEBUGmmbackup=2" right above mmbackup command. > > e.g) in your script > > .... > > export DEBUGmmbackup=2 > > mmbackup .... > > > > Or, you can set it in the same command line like > > DEBUGmmbackup=2 mmbackup .... 
> > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > [Inactive hide details for "Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to se]"Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to see if they are the cluster manager. > > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/19/2022 06:04 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > Thank you. We run a script on all our nodes that checks to see if > > they are the cluster manager. If they are, then they take > > responsibility to start the backup script. The script then randomly selects one of the available backup nodes and uses ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you. > > > > We run a script on all our nodes that checks to see if they are the cluster manager. > > If they are, then they take responsibility to start the backup script. > > The script then randomly selects one of the available backup nodes and uses dsmsh mmbackup on it. > > > > Where does this command belong? > > I have seen it listed as a export command, again where should that be run ? on all backup nodes, or all nodes? > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 18 January 2022 22:54 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > Hi Paul, > > > > If you run mmbackup with "DEBUGmmbackup=2", it keeps all working files even after successful backup. They are available at MMBACKUP_RECORD_ROOT (default is FSroot or FilesetRoot directory). > > In .mmbackupCfg directory, there are 3 directories: > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. 
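Tying together the DEBUGmmbackup advice quoted above with Paul's note about copying .mmbackupCfg into a date-stamped logging folder, a wrapper along these lines is one way to do it (a sketch only; the filesystem path, the mmbackup options and the log location are placeholders rather than anything from the original posts):

#!/bin/bash
# sketch: run mmbackup keeping its working file lists, then archive them
fsroot=/gpfs/fs0                           # placeholder filesystem/fileset root
logdir=/var/log/mmbackup/$(date +%Y-%m-%d)
mkdir -p "$logdir"

export DEBUGmmbackup=2                     # keep .mmbackupCfg contents after the run
mmbackup "$fsroot" -t incremental --tsm-errorlog "$logdir/tsmerror.log"

# keep the candidate/expire lists alongside that day's logs
cp -a "$fsroot/.mmbackupCfg/." "$logdir/"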
> > From: "Paul Ward"
> > To: "gpfsug-discuss at spectrumscale.org"
> > Date: 01/18/2022 11:56 AM
> > Subject: [EXTERNAL] [gpfsug-discuss] mmbackup file selections
> > Sent by: gpfsug-discuss-bounces at spectrumscale.org
> >
> > Hi,
> >
> > I am trying to work out what files have been sent to backup using mmbackup.
> > I have increased the -L value from 3 up to 6 but only seem to see the files that are in scope, not the ones that are selected.
> >
> > I can see the three file lists generated during a backup, but can't seem to find a list of what files were backed up.
> >
> > It should be the diff of the shadow and shadow-old, but the wc -l of the diff doesn't match the number of files in the backup summary.
> > Wrong assumption?
> >
> > Where should I be looking? Surely it shouldn't be this hard to see what files are selected?
> >
> > Kindest regards,
> > Paul
> >
> > Paul Ward
> > TS Infrastructure Architect
> > Natural History Museum
> > T: 02079426450
> > E: p.ward at nhm.ac.uk
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
> --
> -- Skylar Thompson (skylar2 at u.washington.edu)
> -- Genome Sciences Department (UW Medicine), System Administrator
> -- Foege Building S046, (206)-685-7354
> -- Pronouns: He/Him/His
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

--
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.ward%40nhm.ac.uk%7C6d97e9a0e37c471cae7308d9e57e53d5%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637793154323249334%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=LAVGUD2z%2BD2BcOJkan%2FLiOOlDyH44D5m2YHjIFk62HI%3D&reserved=0 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.ward%40nhm.ac.uk%7C6d97e9a0e37c471cae7308d9e57e53d5%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637793154323249334%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=LAVGUD2z%2BD2BcOJkan%2FLiOOlDyH44D5m2YHjIFk62HI%3D&reserved=0 From anacreo at gmail.com Mon Feb 7 17:42:36 2022 From: anacreo at gmail.com (Alec) Date: Mon, 7 Feb 2022 09:42:36 -0800 Subject: [gpfsug-discuss] mmbackup file selections In-Reply-To: References: <20220124153631.oxu4ytbq4vqcotr3@utumno.gs.washington.edu> <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> Message-ID: I'll share something we do when working with the GPFS policy engine so we don't blow out our backups... So we use a different backup in solution and have our file system broken down into multiple concurrent streams. In my policy engine when making major changes to the file system such as encrypting or compressing data I use a where clause such as: MOD(INODE, 7)<=dayofweek When we call mmpolicy I add -M dayofweek=NN. In this case I'd use cron and pass day of the week. What this achieves is that on each day I only work on 1/7th of each file system... So that no one backup stream is blown out. It is cumulative so 7+ will work on 100% of the file system. It's a nifty trick so figured I'd share it out. In production we do something more like 40, and set shares to increment by 1 on weekdays and 3 on weekends to distribute workload out over the whole month with more work on the weekends. Alec On Mon, Feb 7, 2022, 8:39 AM Paul Ward wrote: > Backups seem to have settled down. > A workshop with our partner and IBM is in the pipeline. > > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Paul Ward > Sent: 01 February 2022 12:28 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Not currently set. I'll look into them. > > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Skylar Thompson > Sent: 26 January 2022 16:50 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Awesome, glad that you found them (I missed them the first time too). > > As for the anomalous changed files, do you have these options set in your > client option file? 
> > skipacl yes > skipaclupdatecheck yes > updatectime yes > > We had similar problems where metadata and ACL updates were interpreted as > data changes by mmbackup/dsmc. > > We also have a case open with IBM where mmbackup will both expire and > backup a file in the same run, even in the absence of mtime changes, but > it's unclear whether that's program error or something with our > include/exclude rules. I'd be curious if you're running into that as well. > > On Wed, Jan 26, 2022 at 03:55:48PM +0000, Paul Ward wrote: > > Good call! > > > > Yes they are dot files. > > > > > > New issue. > > > > Mmbackup seems to be backup up the same files over and over without them > changing: > > areas are being backed up multiple times. > > The example below is a co-resident file, the only thing that has changed > since it was created 20/10/21, is the file has been accessed for backup. > > This file is in the 'changed' list in mmbackup: > > > > This list has just been created: > > -rw-r--r--. 1 root root 6591914 Jan 26 11:12 > > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > > > Listing the last few files in the file (selecting the last one) > > 11:17:52 [root at scale-sk-pn-1 .mmbackupCfg]# tail > > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604556977.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557039.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557102.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557164.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557226.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557288.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557351.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557413.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557476.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > > > Check the file stats (access time just before last backup) > > 11:18:05 [root at scale-sk-pn-1 .mmbackupCfg]# stat > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > File: > '/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png' > > Size: 545 Blocks: 32 IO Block: 4194304 regular file > > Device: 2bh/43d Inode: 212618897 Links: 1 > > Access: (0644/-rw-r--r--) Uid: (1399613896/NHM\edwab) Gid: > (1399647564/NHM\dg-mbl-urban-nature-project-rw) > > Context: unconfined_u:object_r:unlabeled_t:s0 > > Access: 2022-01-25 06:40:58.334961446 +0000 > > Modify: 2020-12-01 15:20:40.122053000 +0000 > > Change: 2021-10-20 17:55:18.265746459 +0100 > > Birth: - > > > > Check if migrated > > 11:18:16 [root at scale-sk-pn-1 .mmbackupCfg]# dsmls > 
"/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > File name : > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > On-line size : 545 > > Used blocks : 16 > > Data Version : 1 > > Meta Version : 1 > > State : Co-resident > > Container Index : 1 > > Base Name : > 34C0B77D20194B0B.EACEB2055F6CAA58.56D56C5F140C8C9D.0000000000000000.2197396D.000000000CAC4E91 > > > > Check if immutable > > 11:18:26 [root at scale-sk-pn-1 .mmbackupCfg]# mstat > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > file name: > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > metadata replication: 2 max 2 > > data replication: 2 max 2 > > immutable: no > > appendOnly: no > > flags: > > storage pool name: data > > fileset name: hpc-workspaces-fset > > snapshot name: > > creation time: Wed Oct 20 17:55:18 2021 > > Misc attributes: ARCHIVE > > Encrypted: no > > > > Check active and inactive backups (it was backed up yesterday) > > 11:18:52 [root at scale-sk-pn-1 .mmbackupCfg]# dsmcqbi > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > IBM Spectrum Protect > > Command Line Backup-Archive Client Interface > > Client Version 8, Release 1, Level 10.0 > > Client date/time: 01/26/2022 11:19:02 > > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights > Reserved. > > > > Node Name: SC-PN-SK-01 > > Session established with server TSM-JERSEY: Windows > > Server Version 8, Release 1, Level 10.100 > > Server date/time: 01/26/2022 11:19:02 Last access: 01/26/2022 > > 11:07:05 > > > > Accessing as node: SCALE > > Size Backup Date Mgmt Class > A/I File > > ---- ----------- ---------- > --- ---- > > 545 B 01/25/2022 06:41:17 DEFAULT > A > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 12/28/2021 21:19:18 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:17:35 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:18:05 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > > > > It will be backed up again shortly, why? > > > > And it was backed up again: > > # dsmcqbi > > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature- > > project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > IBM Spectrum Protect > > Command Line Backup-Archive Client Interface > > Client Version 8, Release 1, Level 10.0 > > Client date/time: 01/26/2022 15:54:09 > > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights > Reserved. 
> > > > Node Name: SC-PN-SK-01 > > Session established with server TSM-JERSEY: Windows > > Server Version 8, Release 1, Level 10.100 > > Server date/time: 01/26/2022 15:54:10 Last access: 01/26/2022 > > 15:30:03 > > > > Accessing as node: SCALE > > Size Backup Date Mgmt Class > A/I File > > ---- ----------- ---------- > --- ---- > > 545 B 01/26/2022 12:23:02 DEFAULT > A > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 12/28/2021 21:19:18 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:17:35 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:18:05 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/25/2022 06:41:17 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > > > > > -----Original Message----- > > From: gpfsug-discuss-bounces at spectrumscale.org > > On Behalf Of Skylar > > Thompson > > Sent: 24 January 2022 15:37 > > To: gpfsug main discussion list > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > Hi Paul, > > > > Did you look for dot files? At least for us on 5.0.5 there's a > .list.1. file while the backups are running: > > > > /gpfs/grc6/.mmbackupCfg/updatedFiles/: > > -r-------- 1 root nickers 6158526821 Jan 23 18:28 .list.1.gpfs-grc6 > > /gpfs/grc6/.mmbackupCfg/expiredFiles/: > > -r-------- 1 root nickers 85862211 Jan 23 18:28 .list.1.gpfs-grc6 > > > > On Mon, Jan 24, 2022 at 02:31:54PM +0000, Paul Ward wrote: > > > Those directories are empty > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > > > > From: gpfsug-discuss-bounces at spectrumscale.org > > > On Behalf Of IBM Spectrum > > > Scale > > > Sent: 22 January 2022 00:35 > > > To: gpfsug main discussion list > > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > > > > Hi Paul, > > > > > > Instead of calculating *.ix.* files, please look at a list file in > these directories. > > > > > > updatedFiles : contains a file that lists all candidates for backup > > > statechFiles : cantains a file that lists all candidates for meta > > > info update expiredFiles : cantains a file that lists all > > > candidates for expiration > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > -------------------------------------------------------------------- > > > -- > > > -------------------------------------------- > > > > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. 
> > > > > > > > > [Inactive hide details for "Paul Ward" ---01/21/2022 09:38:49 > AM---Thank you Right in the command line seems to have worked.]"Paul Ward" > ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to > have worked. > > > > > > From: "Paul Ward" > > > > To: "gpfsug main discussion list" > > > > > org>> > > > Cc: > > > "gpfsug-discuss-bounces at spectrumscale.org > > ce > > > s at spectrumscale.org>" > > > > > ce > > > s at spectrumscale.org>> > > > Date: 01/21/2022 09:38 AM > > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > > Sent > > > by: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > Thank you Right in the command line seems to have worked. At the end > > > of the script I now copy the contents of the .mmbackupCfg folder to > > > a date stamped logging folder Checking how many entries in these files > compared to the Summary: ???????ZjQcmQRYFpfptBannerStart This Message Is > From an External Sender This message came from outside your organization. > > > ZjQcmQRYFpfptBannerEnd > > > Thank you > > > > > > Right in the command line seems to have worked. > > > At the end of the script I now copy the contents of the .mmbackupCfg > > > folder to a date stamped logging folder > > > > > > Checking how many entries in these files compared to the Summary: > > > wc -l mmbackup* > > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > > 754 total > > > From Summary > > > Total number of objects inspected: 755 > > > I can live with a discrepancy of 1. > > > > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > > From Summary > > > Total number of objects expired: 2 > > > That matches > > > > > > wc -l mmbackupC* mmbackupS* > > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > > 752 total > > > Summary: > > > Total number of objects backed up: 751 > > > > > > A difference of 1 I can live with. > > > > > > What does Statech stand for? > > > > > > Just this to sort out: > > > Total number of objects failed: 1 > > > I will add: > > > --tsm-errorlog TSMErrorLogFile > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > > > > From: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > > Sent: 19 January 2022 15:09 > > > To: gpfsug main discussion list > > > > > org>> > > > Cc: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > > > > This is to set environment for mmbackup. 
> > > If mmbackup is invoked within a script, you can set "export > DEBUGmmbackup=2" right above mmbackup command. > > > e.g) in your script > > > .... > > > export DEBUGmmbackup=2 > > > mmbackup .... > > > > > > Or, you can set it in the same command line like > > > DEBUGmmbackup=2 mmbackup .... > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > -------------------------------------------------------------------- > > > -- > > > -------------------------------------------- > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/19/2022 06:04:03 > AM---Thank you. We run a script on all our nodes that checks to se]"Paul > Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our > nodes that checks to see if they are the cluster manager. > > > > > > From: "Paul Ward" > > > > To: "gpfsug main discussion list" > > > > > org>> > > > Cc: > > > "gpfsug-discuss-bounces at spectrumscale.org > > ce > > > s at spectrumscale.org>" > > > > > ce > > > s at spectrumscale.org>> > > > Date: 01/19/2022 06:04 AM > > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > > Sent > > > by: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > > > > Thank you. We run a script on all our nodes that checks to see if > > > they are the cluster manager. If they are, then they take > > > responsibility to start the backup script. The script then randomly > selects one of the available backup nodes and uses ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender This message came from outside your > organization. > > > ZjQcmQRYFpfptBannerEnd > > > Thank you. > > > > > > We run a script on all our nodes that checks to see if they are the > cluster manager. > > > If they are, then they take responsibility to start the backup script. > > > The script then randomly selects one of the available backup nodes and > uses dsmsh mmbackup on it. > > > > > > Where does this command belong? > > > I have seen it listed as a export command, again where should that be > run ? on all backup nodes, or all nodes? > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > > > > From: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > > Sent: 18 January 2022 22:54 > > > To: gpfsug main discussion list > > > > > org>> > > > Cc: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > Hi Paul, > > > > > > If you run mmbackup with "DEBUGmmbackup=2", it keeps all working files > even after successful backup. They are available at MMBACKUP_RECORD_ROOT > (default is FSroot or FilesetRoot directory). 
> > > In the .mmbackupCfg directory, there are 3 directories:
> > > updatedFiles : contains a file that lists all candidates for backup
> > > statechFiles : contains a file that lists all candidates for meta info update
> > > expiredFiles : contains a file that lists all candidates for expiration
> > >
> > > Regards, The Spectrum Scale (GPFS) team
> > >
> > > ------------------------------------------------------------------------------------------------------------------
> > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries.
> > >
> > > From: "Paul Ward" <p.ward at nhm.ac.uk>
> > > To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
> > > Date: 01/18/2022 11:56 AM
> > > Subject: [EXTERNAL] [gpfsug-discuss] mmbackup file selections
> > >
> > > Hi,
> > >
> > > I am trying to work out what files have been sent to backup using mmbackup.
> > > I have increased the -L value from 3 up to 6 but only seem to see the files that are in scope, not the ones that are selected.
> > >
> > > I can see the three file lists generated during a backup, but can't seem to find a list of what files were backed up.
> > >
> > > It should be the diff of the shadow and shadow-old, but the wc -l of the diff doesn't match the number of files in the backup summary.
> > > Wrong assumption?
> > >
> > > Where should I be looking - surely it shouldn't be this hard to see what files are selected?
> > >
> > > Kindest regards,
> > > Paul
> > >
> > > Paul Ward
> > > TS Infrastructure Architect
> > > Natural History Museum
> > > T: 02079426450
> > > E: p.ward at nhm.ac.uk
> > > _______________________________________________
> > > gpfsug-discuss mailing list
> > > gpfsug-discuss at spectrumscale.org
> > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss

-- 
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
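For readers who want to reproduce the DEBUGmmbackup workflow described in the quoted thread above, a minimal wrapper sketch follows. The device name, filesystem root and logging location are illustrative assumptions; only DEBUGmmbackup=2, the .mmbackupCfg working files and the --tsm-errorlog option come from the discussion itself.

#!/bin/bash
# Sketch only: run mmbackup with its debug working files kept, then
# archive the .mmbackupCfg selection lists to a date-stamped folder.
# DEVICE, FSROOT and LOGBASE are assumptions for this example.
DEVICE=gpfs1
FSROOT=/gpfs1
LOGBASE=/var/log/mmbackup-runs
RUNDIR="$LOGBASE/$(date +%Y%m%d-%H%M%S)"
mkdir -p "$RUNDIR"

export DEBUGmmbackup=2    # keep mmbackupChanged/Expired/Statech lists after the run
mmbackup "$DEVICE" -t incremental --tsm-errorlog "$RUNDIR/tsm-errors.log"
rc=$?

# Copy the per-run candidate lists next to the error log for later comparison
cp -a "$FSROOT/.mmbackupCfg/." "$RUNDIR/"
exit $rc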
From p.ward at nhm.ac.uk Mon Feb 21 12:30:15 2022
From: p.ward at nhm.ac.uk (Paul Ward)
Date: Mon, 21 Feb 2022 12:30:15 +0000
Subject: [gpfsug-discuss] immutable folder
Message-ID:

HI,

I have a folder that I can't delete.
IAM mode: non-compliant

It is empty:
file name: Nick Foster's sample/
metadata replication: 2 max 2
immutable: yes
appendOnly: no
indefiniteRetention: no
expiration Time: Thu Jan 9 23:10:25 2020
flags:
storage pool name: system
fileset name: bulk-fset
snapshot name:
creation time: Sat Jan 9 04:44:16 2016
Misc attributes: DIRECTORY READONLY
Encrypted: no

Try and turn off immutability:
mmchattr -i no "Nick Foster's sample"
Nick Foster's sample: Change immutable flag failed: Operation not permitted.
Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied!

So can't leave it unchanged...

Tried setting indefiniteRetention no and yes:
mmchattr -i no --indefinite-retention no "Nick Foster's sample"
Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted.
Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied!

mmchattr -i no --indefinite-retention yes "Nick Foster's sample"
Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted.
Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied!

Any ideas?

Kindest regards,
Paul

Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.ward at nhm.ac.uk

From scale at us.ibm.com Mon Feb 21 16:11:37 2022
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Mon, 21 Feb 2022 12:11:37 -0400
Subject: [gpfsug-discuss] immutable folder
Message-ID:

Hi Paul,

Have you tried mmunlinkfileset first?

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.

If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries.
The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From p.ward at nhm.ac.uk Tue Feb 22 10:30:36 2022
From: p.ward at nhm.ac.uk (Paul Ward)
Date: Tue, 22 Feb 2022 10:30:36 +0000
Subject: [gpfsug-discuss] immutable folder
Message-ID:

Thank you for the suggestion.

The fileset is in active use and is backed up using Spectrum Protect. Unlinking it is therefore advised against.

Was this option suggested to 'close open files'? The issue is a directory, not files.

Kindest regards,
Paul

Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.ward at nhm.ac.uk
From scale at us.ibm.com Tue Feb 22 14:17:00 2022
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Tue, 22 Feb 2022 10:17:00 -0400
Subject: [gpfsug-discuss] immutable folder
Message-ID:

Scale disallows deleting a fileset junction using rmdir, so I suggested mmunlinkfileset.

Regards, The Spectrum Scale (GPFS) team
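For reference, a sketch of the unlink/relink sequence being suggested. The device name and junction path are assumptions (only the fileset name comes from the thread), and unlinking makes the fileset's contents inaccessible until it is relinked, which is why it is normally done in a maintenance window rather than on a fileset in active use.

# Sketch only: a fileset junction cannot be removed with rmdir, but the
# fileset can be unlinked and later relinked. The device name (nhmfsa)
# and junction path below are illustrative assumptions.
mmunlinkfileset nhmfsa bulk-fset                       # detach the junction; data goes offline
mmlinkfileset nhmfsa bulk-fset -J /gpfs/nhmfsa/bulk    # relink at the desired junction path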
From cantrell at astro.gsu.edu Tue Feb 22 17:24:09 2022
From: cantrell at astro.gsu.edu (Justin Cantrell)
Date: Tue, 22 Feb 2022 12:24:09 -0500
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu>

We're trying to mount multiple mounts at boot up via gpfs.
We can mount the main gpfs mount /gpfs1, but would like to mount things like:
/home /gpfs1/home
/other /gpfs1/other
/stuff /gpfs1/stuff

But adding that to fstab doesn't work, because from what I understand, that's not how gpfs works with mounts.
What's the standard way to accomplish something like this?
We've used systemd timers/mounts to accomplish it, but that's not ideal.
Is there a way to do this natively with gpfs or does this have to be done through symlinks or gpfs over nfs?

From skylar2 at uw.edu Tue Feb 22 17:37:27 2022
From: skylar2 at uw.edu (Skylar Thompson)
Date: Tue, 22 Feb 2022 09:37:27 -0800
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID: <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu>

Assuming this is on Linux, you ought to be able to use bind mounts for that, something like this in fstab or equivalent:

/home /gpfs1/home bind defaults 0 0

-- 
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
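A sketch of the fstab entry being suggested, written with the conventional field order (source, mount point, type, options, dump, pass); the /gpfs1 paths are the ones from this thread. As the rest of the thread discusses, the bind can only succeed once /gpfs1 itself has been mounted by GPFS.

# /etc/fstab (sketch): bind the GPFS home directory onto /home
/gpfs1/home   /home   none   bind   0 0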
From ulmer at ulmer.org Tue Feb 22 17:50:13 2022
From: ulmer at ulmer.org (Stephen Ulmer)
Date: Tue, 22 Feb 2022 12:50:13 -0500
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID: <3DE42AF3-34F0-4E3D-8813-813ADF85477A@ulmer.org>

What are you really trying to accomplish? Backward compatibility with an older user experience? Making it shorter to type? Matching the path on non-GPFS nodes?

-- 
Stephen

From tina.friedrich at it.ox.ac.uk Tue Feb 22 18:12:23 2022
From: tina.friedrich at it.ox.ac.uk (Tina Friedrich)
Date: Tue, 22 Feb 2022 18:12:23 +0000
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID: <7b8fa26b-bb70-2ba4-0fe4-639ffede6943@it.ox.ac.uk>

Bind mounts would definitely work; you can also use the automounter to bind-mount things into place. That's how we do that.

E.g.

[ ~]$ cat /etc/auto.data
/data   localhost://mnt/gpfs/bulk/data

[ ~]$ cat /etc/auto.master | grep data
# data
/-      /etc/auto.data

works very well :) (That automatically bind-mounts it.)

You could then also only mount user home directories if they're logged in, instead of showing all of them under /home/. Autofs can do pretty nice wildcarding and such.

I would call bind mounting things - regardless of how - a better solution than symlinks, but that might just be my opinion :)

Tina

-- 
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk http://www.it.ox.ac.uk
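A sketch of the wildcard variant Tina mentions for per-user home directories, assuming an indirect autofs map; the GPFS path and map file names are illustrative.

# /etc/auto.master (sketch): hand /home over to an indirect map
/home   /etc/auto.home

# /etc/auto.home (sketch): the wildcard key bind-mounts each user's
# directory from GPFS on first access and expires it when idle
*   -fstype=bind   :/gpfs1/home/&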
From anacreo at gmail.com Tue Feb 22 18:56:44 2022
From: anacreo at gmail.com (Alec)
Date: Tue, 22 Feb 2022 10:56:44 -0800
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID:

There is a sample script, I believe it's called mmfsup. It's a hook that's called at startup of a GPFS cluster node.

We modify that script to do things such as configure backup ignore lists, update pagepool, and mount GPFS filesystems as appropriate. We basically have a case statement based on the class of the node, i.e. master, client, or primary backup node.

Advantage of this is if you do a gpfs stop/start on an already running node, things work right - great in a fire situation - or if you modify mounts or filesystems. You can call mmfsup, say with mmdsh, and verify startup went right.

We started on this path because our backup software's default policy would back up GPFS mounts from each node, so simply adding the ignores at startup from the non backup primary was our solution. We also have mounts that should not be mounted on some nodes, and this handles that very elegantly.

Alec
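A minimal sketch of the startup hook Alec describes; GPFS runs /var/mmfs/etc/mmfsup when the daemon comes up on a node, if the script exists and is executable. The node names and actions shown are illustrative assumptions about one possible layout, not Alec's actual script.

#!/bin/sh
# /var/mmfs/etc/mmfsup (sketch): called by GPFS when the node is ready.
# Node names and actions below are illustrative assumptions.
NODE=$(hostname)

case "$NODE" in
  backup01*)                 # primary backup node: just mount everything
      mmmount all_local
      ;;
  *)                         # other nodes: mount, then apply site-specific
      mmmount all_local      # backup excludes so they do not back up GPFS
      # /usr/local/sbin/update-backup-excludes.sh   # hypothetical helper
      ;;
esac
exit 0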
From cantrell at astro.gsu.edu Tue Feb 22 19:23:53 2022
From: cantrell at astro.gsu.edu (Justin Cantrell)
Date: Tue, 22 Feb 2022 14:23:53 -0500
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID: <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu>

I tried a bind mount, but perhaps I'm doing it wrong. The system fails to boot because gpfs doesn't start until too late in the boot process. In fact, the system boots and the gpfs1 partition isn't available for a good 20-30 seconds.

/gfs1/home    /home    none    bind

I've tried adding mount options of x-systemd-requires=gpfs1, noauto.
The noauto lets it boot, but the mount is never mounted properly. Doing a manual mount -a mounts it.

From skylar2 at uw.edu Tue Feb 22 19:42:45 2022
From: skylar2 at uw.edu (Skylar Thompson)
Date: Tue, 22 Feb 2022 11:42:45 -0800
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID: <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu>

Like Tina, we're doing bind mounts in autofs. I forgot that there might be a race condition if you're doing it in fstab. If you're on a system with systemd, another option might be to do this directly with systemd.mount rather than let the fstab generator make the systemd.mount units:

https://www.freedesktop.org/software/systemd/man/systemd.mount.html

You could then set RequiresMountFor=gpfs1.mount in the bind mount unit.

-- 
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
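A sketch of the native mount unit Skylar points at (note the directive is spelled RequiresMountsFor= in the systemd documentation); unit and path names are illustrative, and whether this ordering is enough depends on GPFS actually getting /gpfs1 mounted early enough, which is the timing problem discussed in the next few messages.

# /etc/systemd/system/home.mount (sketch)
[Unit]
Description=Bind /gpfs1/home onto /home
RequiresMountsFor=/gpfs1
After=gpfs.service

[Mount]
What=/gpfs1/home
Where=/home
Type=none
Options=bind

[Install]
WantedBy=multi-user.target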
From cantrell at astro.gsu.edu Tue Feb 22 20:05:58 2022
From: cantrell at astro.gsu.edu (Justin Cantrell)
Date: Tue, 22 Feb 2022 15:05:58 -0500
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID:

This is how we're currently solving this problem, with a systemd timer and mount. None of the requires seem to work with gpfs since it starts so late. I would like a better solution.

Is it normal for gpfs to start so late? I think it doesn't mount until after the gpfs.service starts, and even then it's 20-30 seconds.
From skylar2 at uw.edu Tue Feb 22 20:12:03 2022
From: skylar2 at uw.edu (Skylar Thompson)
Date: Tue, 22 Feb 2022 12:12:03 -0800
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID: <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu>

The problem might be that the service indicates success when mmstartup returns rather than when the mount is actually active (requires quorum checking, arbitration, etc.). A couple of tricks I can think of would be using ConditionPathIsMountPoint from systemd.unit[1], or maybe adding a callback[2] that triggers on the mount condition for your filesystem and makes the bind mount rather than systemd.

[1] https://www.freedesktop.org/software/systemd/man/systemd.unit.html#ConditionPathIsMountPoint=
[2] https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=reference-mmaddcallback-command

These are both on our todo list for improving our own GPFS mounting, as we have problems with our job scheduler not starting reliably on reboot, but for us we can have Puppet start it on the next run, so it just means nodes might not return to service for 30 minutes or so.

-- 
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
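A sketch of the callback route from [2], registering a script on the mount event; the callback name, script path and the gpfs1 test are illustrative assumptions.

# Sketch only: run a script whenever a filesystem is mounted and let it
# create the bind mount once gpfs1 is up.
mmaddcallback bindHomeOnMount \
    --command /usr/local/sbin/bind-home.sh \
    --event mount \
    --parms "%fsName"

# /usr/local/sbin/bind-home.sh (sketch)
#!/bin/sh
[ "$1" = "gpfs1" ] || exit 0                            # only react to the gpfs1 mount
mountpoint -q /home || mount --bind /gpfs1/home /home   # bind once, idempotently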
From anacreo at gmail.com Tue Feb 22 20:29:29 2022
From: anacreo at gmail.com (Alec)
Date: Tue, 22 Feb 2022 12:29:29 -0800
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID:

The trick for us on AIX: in the inittab I have a script, fswait.ksh, that monitors for the cluster mount point to be available before allowing the cluster-dependent startup items (lower in the inittab) to run.

I'm pretty sure Linux has a way to define a dependent service: define a cluster-ready service and mark everything else as dependent on that or one of its descendents. You could simply put the wait on the FS in your dependent service's start script as an option as well. Look up systemd and then After= or PartOf= if memory serves me right on Linux.

For the mmfsup script, it goes into /var/mmfs/etc/mmfsup. The cluster will call it if present when the node is ready.

Alec
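A sketch of the wait-for-mount idea; the script name fswait.ksh comes from Alec's message, while the loop, timeout and default path are illustrative assumptions.

#!/bin/ksh
# fswait.ksh (sketch): block until a GPFS mount point is actually mounted,
# so entries later in /etc/inittab (or a dependent service) start after it.
MNT=${1:-/gpfs1}
TIMEOUT=600     # seconds to wait before giving up (assumption)
waited=0
until [ "$(df -P "$MNT" 2>/dev/null | awk 'NR==2 {print $NF}')" = "$MNT" ]; do
    sleep 5
    waited=$((waited + 5))
    [ "$waited" -ge "$TIMEOUT" ] && exit 1
done
exit 0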
From malone12 at illinois.edu Tue Feb 22 20:21:43 2022
From: malone12 at illinois.edu (Maloney, J.D.)
Date: Tue, 22 Feb 2022 20:21:43 +0000
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
Message-ID:

Our Puppet/Ansible GPFS modules/playbooks handle this sequencing for us (we use bind mounts for things like u, projects, and scratch also). Like Skylar mentioned, page pool allocation, quorum checking, and cluster arbitration have to come before a mount of the FS, so that time you mentioned doesn't seem totally off to me. We just make the creation of the bind mounts dependent on the actual GPFS mount occurring in the configuration management tooling, which has worked out well for us in that regard.

Best,

J.D. Maloney
Sr. HPC Storage Engineer | Storage Enabling Technologies Group
National Center for Supercomputing Applications (NCSA)
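Maloney's modules are not shown in the thread, but the dependency they describe can be sketched in Puppet roughly like this; the resource names, retry counts and mountpoint path are assumptions for illustration.

# Sketch only: fail (and retry) until /gpfs1 is mounted, then ensure the
# bind mount of /home on top of it.
exec { 'wait-for-gpfs1':
  command   => '/usr/bin/mountpoint -q /gpfs1',
  tries     => 30,
  try_sleep => 10,
}

mount { '/home':
  ensure  => mounted,
  device  => '/gpfs1/home',
  fstype  => 'none',
  options => 'bind',
  require => Exec['wait-for-gpfs1'],
}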
> > > > > Is there a way to do this natively with gpfs or does this have to be done > > > > > through symlinks or gpfs over nfs? > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=http*3A*2F*2Fgpfsug.org*2Fmailman*2Flistinfo*2Fgpfsug-discuss&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=F4oXAT0zdY*2BS1mR784ZGghUt0G*2F6Ofu36MfJ9WnPsPM*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv5uX7C9S$ > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From cantrell at astro.gsu.edu Tue Feb 22 22:07:47 2022 From: cantrell at astro.gsu.edu (Justin Cantrell) Date: Tue, 22 Feb 2022 17:07:47 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> Message-ID: I'd love to see your fstab to see how you're doing that bind mount. Do you use systemd? What cluster manager are you using? On 2/22/22 15:21, Maloney, J.D. wrote: > > Our Puppet/Ansible GPFS modules/playbooks handle this sequencing for > us (we use bind mounts for things like u, projects, and scratch > also).? Like Skylar mentioned page pool allocation, quorum checking, > and cluster arbitration have to come before a mount of the FS so that > time you mentioned doesn?t seem totally off to me. ?We just make the > creation of the bind mounts dependent on the actual GPFS mount > occurring in the configuration management tooling which has worked out > well for us in that regard. > > Best, > > J.D. Maloney > > Sr. HPC Storage Engineer | Storage Enabling Technologies Group > > National Center for Supercomputing Applications (NCSA) > > *From: *gpfsug-discuss-bounces at spectrumscale.org > on behalf of Skylar > Thompson > *Date: *Tuesday, February 22, 2022 at 2:13 PM > *To: *gpfsug-discuss at spectrumscale.org > *Subject: *Re: [gpfsug-discuss] How to do multiple mounts via GPFS > > The problem might be that the service indicates success when mmstartup > returns rather than when the mount is actually active (requires quorum > checking, arbitration, etc.). 
A couple tricks I can think of would be > using > ConditionPathIsMountPoint from systemd.unit[1], or maybe adding a > callback[2] that triggers on the mount condition for your filesystem that > makes the bind mount rather than systemd. > > [1] > https://urldefense.com/v3/__https://www.freedesktop.org/software/systemd/man/systemd.unit.html*ConditionPathIsMountPoint=__;Iw!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv4xJQwzZ$ > > > [2] > https://urldefense.com/v3/__https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=reference-mmaddcallback-command__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv3f90Gia$ > > > > These are both on our todo list for improving our own GPFS mounting as we > have problems with our job scheduler not starting reliably on reboot, but > for us we can have Puppet start it on the next run so it just means nodes > might not return to service for 30 minutes or so. > > On Tue, Feb 22, 2022 at 03:05:58PM -0500, Justin Cantrell wrote: > > This is how we're currently solving this problem, with systemd timer and > > mount. None of the requires seem to work with gpfs since it starts > so late. > > I would like a better solution. > > > > Is it normal for gpfs to start so late?? I think it doesn't mount until > > after the gpfs.service starts, and even then it's 20-30 seconds. > > > > > > On 2/22/22 14:42, Skylar Thompson wrote: > > > Like Tina, we're doing bind mounts in autofs. I forgot that there > might be > > > a race condition if you're doing it in fstab. If you're on system > with systemd, > > > another option might be to do this directly with systemd.mount > rather than > > > let the fstab generator make the systemd.mount units: > > > > > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=https*3A*2F*2Fwww.freedesktop.org*2Fsoftware*2Fsystemd*2Fman*2Fsystemd.mount.html&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=*2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv0tqF9rU$ > > > > > > > > You could then set RequiresMountFor=gpfs1.mount in the bind mount > unit. > > > > > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > > > > I tried a bind mount, but perhaps I'm doing it wrong. The system > fails > > > > to boot because gpfs doesn't start until too late in the boot > process. > > > > In fact, the system boots and the gpfs1 partition isn't > available for a > > > > good 20-30 seconds. > > > > > > > > /gfs1/home??? /home??? none???? bind > > > > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > > > > The noauto lets it boot, but the mount is never mounted > properly. Doing > > > > a manual mount -a mounts it. > > > > > > > > On 2/22/22 12:37, Skylar Thompson wrote: > > > > > Assuming this is on Linux, you ought to be able to use bind > mounts for > > > > > that, something like this in fstab or equivalent: > > > > > > > > > > /home /gpfs1/home bind defaults 0 0 > > > > > > > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > > > > We're trying to mount multiple mounts at boot up via gpfs. 
> > > > > > We can mount the main gpfs mount /gpfs1, but would like to > mount things > > > > > > like: > > > > > > /home /gpfs1/home > > > > > > /other /gpfs1/other > > > > > > /stuff /gpfs1/stuff > > > > > > > > > > > > But adding that to fstab doesn't work, because from what I > understand, > > > > > > that's not how gpfs works with mounts. > > > > > > What's the standard way to accomplish something like this? > > > > > > We've used systemd timers/mounts to accomplish it, but > that's not ideal. > > > > > > Is there a way to do this natively with gpfs or does this > have to be done > > > > > > through symlinks or gpfs over nfs? > > > > _______________________________________________ > > > > gpfsug-discuss mailing list > > > > gpfsug-discuss at spectrumscale.org > > > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=http*3A*2F*2Fgpfsug.org*2Fmailman*2Flistinfo*2Fgpfsug-discuss&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=F4oXAT0zdY*2BS1mR784ZGghUt0G*2F6Ofu36MfJ9WnPsPM*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv5uX7C9S$ > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ > > > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From NSCHULD at de.ibm.com Wed Feb 23 07:01:45 2022 From: NSCHULD at de.ibm.com (Norbert Schuld) Date: Wed, 23 Feb 2022 09:01:45 +0200 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu><20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu><34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu><20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> Message-ID: May I point out some additional systemd targets documented here: https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=gpfs-planning-systemd Depending on the need the gpfs-wait-mount.service could be helpful as an "after" clause for other units. An example is provided in /usr/lpp/mmfs/samples/systemd.service.sample Kind regards Norbert Schuld IBM Spectrum Scale Software Development -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
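If the bind mounts stay in /etc/fstab, that ordering can be expressed directly in the mount options, for example (single fstab line, untested, assuming gpfs-wait-mount.service is enabled as described in the documentation linked above):

    /gpfs1/home  /home  none  bind,nofail,x-systemd.requires=gpfs-wait-mount.service  0 0

Note that the option is spelled x-systemd.requires (with a dot) and expects a unit name or an absolute path, which may be part of why the earlier attempt with x-systemd-requires=gpfs1 did not behave as hoped. The equivalent for a hand-written home.mount unit would be a drop-in such as:

    # /etc/systemd/system/home.mount.d/gpfs-order.conf -- sketch only
    [Unit]
    Requires=gpfs-wait-mount.service
    After=gpfs-wait-mount.service

Depending on how gpfs-wait-mount.service is itself ordered, the implicit Before=local-fs.target dependency of fstab-generated mounts can create an ordering cycle, so nofail (and possibly x-systemd.automount) plus a test on a scratch node are advisable.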
Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: From p.ward at nhm.ac.uk Wed Feb 23 11:03:37 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 11:03:37 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: Its not a fileset, its just a folder, well a subfolder? [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It?s the ?Nick Foster's sample? folder I want to delete, but it says it is immutable and I can?t disable that. I suspect it?s the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thank you for the suggestion? The fileset is in active use and is backed up using spectrum protect. This is therefore advised against. Was this option suggested to ?close open files? ? The issue is a directory not files. ???????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Thank you for the suggestion? The fileset is in active use and is backed up using spectrum protect. This is therefore advised against. Was this option suggested to ?close open files? ? The issue is a directory not files. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From:gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale Sent: 21 February 2022 16:12 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder Hi Paul, Have you tried mmunlinkfileset first? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. 
If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug-discuss at spectrumscale.org" > Date: 02/21/2022 07:31 AM Subject: [EXTERNAL] [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 ??????????????????????????????????????????????????????????????????????????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender ">This message came from outside your organization. ZjQcmQRYFpfptBannerEnd HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So can?t leave it unchanged? Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
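Before retrying the flag change it may be worth confirming, from a cluster node, what the directory and its owning fileset actually report; a minimal check (path shortened, the fileset name bulk-fset is taken from the listing above, <filesystem> is a placeholder):

    # flags, owning fileset and expiration time of the directory itself
    mmlsattr -L "/gpfs/.../Nick Foster's sample"

    # details of the owning fileset; how (and whether) the IAM mode is shown
    # here differs by release -- the mmlsfileset man page has the exact option
    mmlsfileset <filesystem> bulk-fset -L

The IAM mode of bulk-fset is what governs whether mmchattr -i may be changed on a subdirectory, which is where this thread ends up further down.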
Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From juergen.hannappel at desy.de Wed Feb 23 11:49:09 2022 From: juergen.hannappel at desy.de (Hannappel, Juergen) Date: Wed, 23 Feb 2022 12:49:09 +0100 (CET) Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name > From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 > Subject: Re: [gpfsug-discuss] immutable folder > Its not a fileset, its just a folder, well a subfolder? > [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact > experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick > Foster's sample > It?s the ?Nick Foster's sample? folder I want to delete, but it says it is > immutable and I can?t disable that. > I suspect it?s the apostrophe confusing things. > Kindest regards, > Paul > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: [ mailto:p.ward at nhm.ac.uk | p.ward at nhm.ac.uk ] > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale > Sent: 22 February 2022 14:17 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder > Scale disallows deleting fileset junction using rmdir, so I suggested > mmunlinkfileset. > Regards, The Spectrum Scale (GPFS) team > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale (GPFS), > then please post it to the public IBM developerWroks Forum at [ > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=04%7C01%7Cp.ward%40nhm.ac.uk%7Cbd72c8c2ee3d49f619c908d9f60e0732%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637811363409593169%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=XoY%2BAbA5%2FNBwuoJrY12MNurjJrp8KMsV1t63hdItfiM%3D&reserved=0 > | > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > ] . > If your query concerns a potential software error in Spectrum Scale (GPFS) and > you have an IBM software maintenance contract please contact 1-800-237-5511 in > the United States or your local IBM Service Center in other countries. > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > From: "Paul Ward" < [ mailto:p.ward at nhm.ac.uk | p.ward at nhm.ac.uk ] > > To: "gpfsug main discussion list" < [ mailto:gpfsug-discuss at spectrumscale.org | > gpfsug-discuss at spectrumscale.org ] > > Date: 02/22/2022 05:31 AM > Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder > Sent by: [ mailto:gpfsug-discuss-bounces at spectrumscale.org | > gpfsug-discuss-bounces at spectrumscale.org ] > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
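That round trip presumably succeeded because the test directory sits in a fileset without an IAM mode; to mimic the situation from the original post, the fileset would need an IAM mode set first. A rough sketch with made-up names (mode names and option spelling should be checked against the mmchfileset documentation):

    mmcrfileset <filesystem> iamtest --inode-space new
    mmlinkfileset <filesystem> iamtest -J <mountpoint>/iamtest
    mmchfileset <filesystem> iamtest --iam-mode noncompliant
    mkdir "<mountpoint>/iamtest/stu'pid name"
    # according to the explanation given later in this thread, setting the
    # immutable flag on a subdirectory is refused in this mode (matching the
    # errors shown earlier), while clearing it should be allowed
    mmchattr -i yes "<mountpoint>/iamtest/stu'pid name"
    mmchattr -i no  "<mountpoint>/iamtest/stu'pid name"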
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From p.ward at nhm.ac.uk Wed Feb 23 12:17:15 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 12:17:15 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> References: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: Thanks, I couldn't recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory 'it/stu'pid name': No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name ________________________________ From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder... [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It's the "Nick Foster's sample" folder I want to delete, but it says it is immutable and I can't disable that. I suspect it's the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. 
From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From stockf at us.ibm.com Wed Feb 23 12:51:26 2022 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 23 Feb 2022 12:51:26 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: , <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.image001.jpg at 01D828AF.49A09C40.jpg Type: image/jpeg Size: 5356 bytes Desc: not available URL: From p.ward at nhm.ac.uk Wed Feb 23 13:52:20 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 13:52:20 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: , <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: 5.1.1-1 Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Frederick Stock Sent: 23 February 2022 12:51 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] immutable folder Paul, what version of Spectrum Scale are you using? Fred _______________________________________________________ Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 stockf at us.ibm.com ----- Original message ----- From: "Paul Ward" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Date: Wed, Feb 23, 2022 7:17 AM Thanks, I couldn't recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory 'it/stu'pid name': No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawingDescription automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name ________________________________ From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder... 
[filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It's the "Nick Foster's sample" folder I want to delete, but it says it is immutable and I can't disable that. I suspect it's the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawingDescription automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From julian.jakobs at cec.mpg.de Wed Feb 23 13:48:10 2022 From: julian.jakobs at cec.mpg.de (Jakobs, Julian) Date: Wed, 23 Feb 2022 13:48:10 +0000 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> Message-ID: <67f997e15dc040d2900b2e1f9295dec0@cec.mpg.de> I've ran into the same problem some time ago. What worked for me was this shell script I run as a @reboot cronjob: #!/bin/bash while [ ! -d /gpfs1/home ] do sleep 5 done mount --bind /gpfs1/home /home -----Urspr?ngliche Nachricht----- Von: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] Im Auftrag von Justin Cantrell Gesendet: Dienstag, 22. Februar 2022 20:24 An: gpfsug-discuss at spectrumscale.org Betreff: Re: [gpfsug-discuss] How to do multiple mounts via GPFS I tried a bind mount, but perhaps I'm doing it wrong. The system fails to boot because gpfs doesn't start until too late in the boot process. 
In fact, the system boots and the gpfs1 partition isn't available for a good 20-30 seconds. /gfs1/home /home none bind I've tried adding mount options of x-systemd-requires=gpfs1, noauto. The noauto lets it boot, but the mount is never mounted properly. Doing a manual mount -a mounts it. On 2/22/22 12:37, Skylar Thompson wrote: > Assuming this is on Linux, you ought to be able to use bind mounts for > that, something like this in fstab or equivalent: > > /home /gpfs1/home bind defaults 0 0 > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: >> We're trying to mount multiple mounts at boot up via gpfs. >> We can mount the main gpfs mount /gpfs1, but would like to mount >> things >> like: >> /home /gpfs1/home >> /other /gpfs1/other >> /stuff /gpfs1/stuff >> >> But adding that to fstab doesn't work, because from what I >> understand, that's not how gpfs works with mounts. >> What's the standard way to accomplish something like this? >> We've used systemd timers/mounts to accomplish it, but that's not ideal. >> Is there a way to do this natively with gpfs or does this have to be >> done through symlinks or gpfs over nfs? _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 6777 bytes Desc: not available URL: From scale at us.ibm.com Wed Feb 23 14:57:24 2022 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 23 Feb 2022 10:57:24 -0400 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: Your directory is under a fileset with non-compliant iam mode. With fileset in that mode, it follows snapLock protocol - it disallows changing subdir to immutable, but allows changing subdir to mutable. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug main discussion list" Date: 02/23/2022 07:17 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks, I couldn?t recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory ?it/stu'pid name?: No such file or directory [Removing the / ] # mkdir "itstu'pid name" ??????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Thanks, I couldn?t recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory ?it/stu'pid name?: No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. 
Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name From: "Paul Ward" To: "gpfsug main discussion list" Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder? [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It?s the ?Nick Foster's sample? folder I want to delete, but it says it is immutable and I can?t disable that. I suspect it?s the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk From: gpfsug-discuss-bounces at spectrumscale.org < gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug main discussion list" Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: From p.ward at nhm.ac.uk Wed Feb 23 16:35:14 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 16:35:14 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: Its not allowing me! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: 23 February 2022 14:57 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Your directory is under a fileset with non-compliant iam mode. With fileset in that mode, it follows snapLock protocol - it disallows changing subdir to immutable, but allows changing subdir to mutable. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/23/2022 07:17 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thanks, I couldn?t recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory ?it/stu'pid name?: No such file or directory [Removing the / ] # mkdir "itstu'pid name" ??????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Thanks, I couldn?t recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory ?it/stu'pid name?: No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From:gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name ________________________________ From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder? 
[filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It?s the ?Nick Foster's sample? folder I want to delete, but it says it is immutable and I can?t disable that. I suspect it?s the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From:gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From uwe.falke at kit.edu Wed Feb 23 18:26:50 2022 From: uwe.falke at kit.edu (Uwe Falke) Date: Wed, 23 Feb 2022 19:26:50 +0100 Subject: [gpfsug-discuss] IO sizes Message-ID: Dear all, sorry for asking a question which seems not directly GPFS related: In a setup with 4 NSD servers (old-style, with storage controllers in the back end), 12 clients and 10 Seagate storage systems, I do see in benchmark tests that? just one of the NSD servers does send smaller IO requests to the storage? than the other 3 (that is, both reads and writes are smaller). The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes ( one server to the controllers A, the other one to controllers B of the Seagates, resp.). All 4 NSD servers are set up similarly: kernel: 3.10.0-1160.el7.x86_64 #1 SMP HBA:?Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx driver : mpt3sas 31.100.01.00 max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by mpt3sas) for all sd devices and all multipath (dm) devices built on top. 
scheduler: deadline multipath (actually we do have 3 paths to each volume, so there is some asymmetry, but that should not affect the IOs, shouldn't it?, and if it did we would see the same effect in both pairs of NSD servers, but we do not). All 4 storage systems are also configured the same way (2 disk groups / pools / declustered arrays, one managed by? ctrl A, one by ctrl B,? and 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. The first question I have - but that is not my main one: I do see, both in iostat and on the storage systems, that the default IO requests are about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb is really in terms of kiB, not sectors, cf. https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. I have to suspect that this will (in writing to the storage) cause incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as the controller is not able to re-coalesce the data properly; and it seems it cannot do it completely at least) If someone of you has seen that already and/or knows a potential explanation I'd be glad to learn about. And if some of you wonder: yes, I (was) moved away from IBM and am now at KIT. Many thanks in advance Uwe -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5814 bytes Desc: S/MIME Cryptographic Signature URL: From alex at calicolabs.com Wed Feb 23 18:39:07 2022 From: alex at calicolabs.com (Alex Chekholko) Date: Wed, 23 Feb 2022 10:39:07 -0800 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: > Dear all, > > sorry for asking a question which seems not directly GPFS related: > > In a setup with 4 NSD servers (old-style, with storage controllers in > the back end), 12 clients and 10 Seagate storage systems, I do see in > benchmark tests that just one of the NSD servers does send smaller IO > requests to the storage than the other 3 (that is, both reads and > writes are smaller). > > The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes > ( one server to the controllers A, the other one to controllers B of the > Seagates, resp.). > > All 4 NSD servers are set up similarly: > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > driver : mpt3sas 31.100.01.00 > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by > mpt3sas) for all sd devices and all multipath (dm) devices built on top. 
> > scheduler: deadline > > multipath (actually we do have 3 paths to each volume, so there is some > asymmetry, but that should not affect the IOs, shouldn't it?, and if it > did we would see the same effect in both pairs of NSD servers, but we do > not). > > All 4 storage systems are also configured the same way (2 disk groups / > pools / declustered arrays, one managed by ctrl A, one by ctrl B, and > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > The first question I have - but that is not my main one: I do see, both > in iostat and on the storage systems, that the default IO requests are > about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb > is really in terms of kiB, not sectors, cf. > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). > > But what puzzles me even more: one of the server compiles IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and > writes ... I just cannot see why. > > I have to suspect that this will (in writing to the storage) cause > incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as > the controller is not able to re-coalesce the data properly; and it > seems it cannot do it completely at least) > > > If someone of you has seen that already and/or knows a potential > explanation I'd be glad to learn about. > > > And if some of you wonder: yes, I (was) moved away from IBM and am now > at KIT. > > Many thanks in advance > > Uwe > > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > > Uwe Falke > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > > Tel: +49 721 608 28024 > Email: uwe.falke at kit.edu > www.scc.kit.edu > > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > KIT ? The Research University in the Helmholtz Association > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Wed Feb 23 21:20:11 2022 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Wed, 23 Feb 2022 21:20:11 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: Message-ID: Alex, Metadata will be 4Kib Depending on the filesystem version you will also have subblocks to consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata and data block size is the same) My first question would be is ? Are you sure that Linux OS is configured the same on all 4 NSD servers?. My second question would be do you know what your average file size is if most of your files are smaller than your filesystem block size, then you are always going to be performing writes using groups of subblocks rather than a full block writes. Regards, Andrew > On 24 Feb 2022, at 04:39, Alex Chekholko wrote: > > ? > This Message Is From an External Sender > This message came from outside your organization. > Hi, > > Metadata I/Os will always be smaller than the usual data block size, right? > Which version of GPFS? 
> > Regards, > Alex > >> On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: >> Dear all, >> >> sorry for asking a question which seems not directly GPFS related: >> >> In a setup with 4 NSD servers (old-style, with storage controllers in >> the back end), 12 clients and 10 Seagate storage systems, I do see in >> benchmark tests that just one of the NSD servers does send smaller IO >> requests to the storage than the other 3 (that is, both reads and >> writes are smaller). >> >> The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes >> ( one server to the controllers A, the other one to controllers B of the >> Seagates, resp.). >> >> All 4 NSD servers are set up similarly: >> >> kernel: 3.10.0-1160.el7.x86_64 #1 SMP >> >> HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx >> >> driver : mpt3sas 31.100.01.00 >> >> max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by >> mpt3sas) for all sd devices and all multipath (dm) devices built on top. >> >> scheduler: deadline >> >> multipath (actually we do have 3 paths to each volume, so there is some >> asymmetry, but that should not affect the IOs, shouldn't it?, and if it >> did we would see the same effect in both pairs of NSD servers, but we do >> not). >> >> All 4 storage systems are also configured the same way (2 disk groups / >> pools / declustered arrays, one managed by ctrl A, one by ctrl B, and >> 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). >> >> >> GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO >> requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. >> >> The first question I have - but that is not my main one: I do see, both >> in iostat and on the storage systems, that the default IO requests are >> about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb >> is really in terms of kiB, not sectors, cf. >> https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). >> >> But what puzzles me even more: one of the server compiles IOs even >> smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and >> writes ... I just cannot see why. >> >> I have to suspect that this will (in writing to the storage) cause >> incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as >> the controller is not able to re-coalesce the data properly; and it >> seems it cannot do it completely at least) >> >> >> If someone of you has seen that already and/or knows a potential >> explanation I'd be glad to learn about. >> >> >> And if some of you wonder: yes, I (was) moved away from IBM and am now >> at KIT. >> >> Many thanks in advance >> >> Uwe >> >> >> -- >> Karlsruhe Institute of Technology (KIT) >> Steinbuch Centre for Computing (SCC) >> Scientific Data Management (SDM) >> >> Uwe Falke >> >> Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 >> D-76344 Eggenstein-Leopoldshafen >> >> Tel: +49 721 608 28024 >> Email: uwe.falke at kit.edu >> www.scc.kit.edu >> >> Registered office: >> Kaiserstra?e 12, 76131 Karlsruhe, Germany >> >> KIT ? The Research University in the Helmholtz Association >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From uwe.falke at kit.edu Thu Feb 24 01:03:32 2022 From: uwe.falke at kit.edu (Uwe Falke) Date: Thu, 24 Feb 2022 02:03:32 +0100 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi, the test bench is gpfsperf running on up to 12 clients with 1...64 threads doing sequential reads and writes , file size per gpfsperf process is 12TB (with 6TB I saw caching effects in particular for large thread numbers ...) As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data disks, as expected in that case. Interesting thing though: I have rebooted the suspicious node. Now, it does not issue smaller IOs than the others, but -- unbelievable -- larger ones (up to about 4.7MiB). This is still harmful as also that size is incompatible with full stripe writes on the storage ( 8+2 disk groups, i.e. logically RAID6) Currently, I draw this information from the storage boxes; I have not yet checked iostat data for that benchmark test after the reboot (before, when IO sizes were smaller, we saw that both in iostat and in the perf data retrieved from the storage controllers). And: we have a separate data pool , hence dataOnly NSDs, I am just talking about these ... As for "Are you sure that Linux OS is configured the same on all 4 NSD servers?." - of course there are not two boxes identical in the world. I have actually not installed those machines, and, yes, i also considered reinstalling them (or at least the disturbing one). However, I do not have reason to assume or expect a difference, the supplier has just implemented these systems? recently from scratch. In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs smaller than 4MiB again at some point, so that is not a nice solution. Thanks Uwe On 23.02.22 22:20, Andrew Beattie wrote: > Alex, > > Metadata will be 4Kib > > Depending on the filesystem version you will also have subblocks to > consider V4 filesystems have 1/32 subblocks, V5 filesystems have > 1/1024 subblocks (assuming metadata and data block size is the same) > > My first question would be is ? Are you sure that Linux OS is > configured the same on all 4 NSD servers?. > > My second question would be do you know what your average file size is > if most of your files are smaller than your filesystem block size, > then you are always going to be performing writes using groups of > subblocks rather than a full block writes. > > Regards, > > Andrew > > >> On 24 Feb 2022, at 04:39, Alex Chekholko wrote: >> >> ? Hi, Metadata I/Os will always be smaller than the usual data block >> size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, >> 2022 at 10:26 AM Uwe Falke wrote: Dear all, sorry >> for asking a question which seems ZjQcmQRYFpfptBannerStart >> This Message Is From an External Sender >> This message came from outside your organization. >> ZjQcmQRYFpfptBannerEnd >> Hi, >> >> Metadata I/Os will always be smaller than the usual data block size, >> right? >> Which version of GPFS? >> >> Regards, >> Alex >> >> On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: >> >> Dear all, >> >> sorry for asking a question which seems not directly GPFS related: >> >> In a setup with 4 NSD servers (old-style, with storage >> controllers in >> the back end), 12 clients and 10 Seagate storage systems, I do >> see in >> benchmark tests that? just one of the NSD servers does send >> smaller IO >> requests to the storage? 
than the other 3 (that is, both reads and >> writes are smaller). >> >> The NSD servers form 2 pairs, each pair is connected to 5 seagate >> boxes >> ( one server to the controllers A, the other one to controllers B >> of the >> Seagates, resp.). >> >> All 4 NSD servers are set up similarly: >> >> kernel: 3.10.0-1160.el7.x86_64 #1 SMP >> >> HBA:?Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx >> >> driver : mpt3sas 31.100.01.00 >> >> max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as >> limited by >> mpt3sas) for all sd devices and all multipath (dm) devices built >> on top. >> >> scheduler: deadline >> >> multipath (actually we do have 3 paths to each volume, so there >> is some >> asymmetry, but that should not affect the IOs, shouldn't it?, and >> if it >> did we would see the same effect in both pairs of NSD servers, >> but we do >> not). >> >> All 4 storage systems are also configured the same way (2 disk >> groups / >> pools / declustered arrays, one managed by? ctrl A, one by ctrl >> B,? and >> 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). >> >> >> GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO >> requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. >> >> The first question I have - but that is not my main one: I do >> see, both >> in iostat and on the storage systems, that the default IO >> requests are >> about 4MiB, not 8MiB as I'd expect from above settings >> (max_sectors_kb >> is really in terms of kiB, not sectors, cf. >> https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). >> >> But what puzzles me even more: one of the server compiles IOs even >> smaller, varying between 3.2MiB and 3.6MiB mostly - both for >> reads and >> writes ... I just cannot see why. >> >> I have to suspect that this will (in writing to the storage) cause >> incomplete stripe writes on our erasure-coded volumes (8+2p)(as >> long as >> the controller is not able to re-coalesce the data properly; and it >> seems it cannot do it completely at least) >> >> >> If someone of you has seen that already and/or knows a potential >> explanation I'd be glad to learn about. >> >> >> And if some of you wonder: yes, I (was) moved away from IBM and >> am now >> at KIT. >> >> Many thanks in advance >> >> Uwe >> >> >> -- >> Karlsruhe Institute of Technology (KIT) >> Steinbuch Centre for Computing (SCC) >> Scientific Data Management (SDM) >> >> Uwe Falke >> >> Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 >> D-76344 Eggenstein-Leopoldshafen >> >> Tel: +49 721 608 28024 >> Email: uwe.falke at kit.edu >> www.scc.kit.edu >> >> Registered office: >> Kaiserstra?e 12, 76131 Karlsruhe, Germany >> >> KIT ? The Research University in the Helmholtz Association >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email:uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? 
The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5814 bytes Desc: S/MIME Cryptographic Signature URL: From Achim.Rehor at de.ibm.com Thu Feb 24 12:41:11 2022 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Thu, 24 Feb 2022 14:41:11 +0200 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi Uwe, first of all, glad to see you back in the GPFS space ;) agreed, groups of subblocks being written will end up in IO sizes, being smaller than the 8MB filesystem blocksize, also agreed, this cannot be metadata, since their size is MUCH smaller, like 4k or less, mostly. But why would these grouped subblock reads/writes all end up on the same NSD server, while the others do full block writes ? How is your NSD server setup per NSD ? did you 'round-robin' set the preferred NSD server per NSD ? are the client nodes transferring the data in anyway doing specifics ? Sorry for not having a solution for you, jsut sharing a few ideas ;) Mit freundlichen Gr??en / Kind regards Achim Rehor Technical Support Specialist Spectrum Scale and ESS (SME) Advisory Product Services Professional IBM Systems Storage Support - EMEA gpfsug-discuss-bounces at spectrumscale.org wrote on 23/02/2022 22:20:11: > From: "Andrew Beattie" > To: "gpfsug main discussion list" > Date: 23/02/2022 22:20 > Subject: [EXTERNAL] Re: [gpfsug-discuss] IO sizes > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > Alex, Metadata will be 4Kib Depending on the filesystem version you > will also have subblocks to consider V4 filesystems have 1/32 > subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata > and data block size is the same) ???????????ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ZjQcmQRYFpfptBannerEnd > Alex, > > Metadata will be 4Kib > > Depending on the filesystem version you will also have subblocks to > consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/ > 1024 subblocks (assuming metadata and data block size is the same) > > My first question would be is ? Are you sure that Linux OS is > configured the same on all 4 NSD servers?. > > My second question would be do you know what your average file size > is if most of your files are smaller than your filesystem block > size, then you are always going to be performing writes using groups > of subblocks rather than a full block writes. > > Regards, > > Andrew > > On 24 Feb 2022, at 04:39, Alex Chekholko wrote: > ? Hi, Metadata I/Os will always be smaller than the usual data block > size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, > 2022 at 10:26 AM Uwe Falke wrote: Dear all, > sorry for asking a question which seems ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ZjQcmQRYFpfptBannerEnd > Hi, > > Metadata I/Os will always be smaller than the usual data block size, right? > Which version of GPFS? 
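
On Andrew's point about whether the Linux OS really is configured identically on all four NSD servers: one low-effort way to rule that out is to dump the queue settings Uwe listed (max_sectors_kb, max_hw_sectors_kb, scheduler) for every sd and dm device on each server and diff the four outputs. A minimal sketch, assuming the data LUNs appear as sd* devices with dm-* multipath devices on top as described above:

# dump the block-layer settings relevant to IO size on this node,
# one line per device, so the four NSD servers can be compared with diff
for dev in /sys/block/sd* /sys/block/dm-*; do
    [ -e "$dev/queue/max_sectors_kb" ] || continue   # skip anything without a queue dir
    printf '%s max_sectors_kb=%s max_hw_sectors_kb=%s scheduler=%s\n' \
        "$(basename "$dev")" \
        "$(cat "$dev/queue/max_sectors_kb")" \
        "$(cat "$dev/queue/max_hw_sectors_kb")" \
        "$(cat "$dev/queue/scheduler")"
done | sort > /tmp/queue-settings.$(hostname -s)

Running that on all four servers and diffing the resulting files would show at a glance whether the suspicious node deviates anywhere at the block layer.
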
> > Regards, > Alex > > On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: > Dear all, > > sorry for asking a question which seems not directly GPFS related: > > In a setup with 4 NSD servers (old-style, with storage controllers in > the back end), 12 clients and 10 Seagate storage systems, I do see in > benchmark tests that just one of the NSD servers does send smaller IO > requests to the storage than the other 3 (that is, both reads and > writes are smaller). > > The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes > ( one server to the controllers A, the other one to controllers B of the > Seagates, resp.). > > All 4 NSD servers are set up similarly: > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > driver : mpt3sas 31.100.01.00 > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by > mpt3sas) for all sd devices and all multipath (dm) devices built on top. > > scheduler: deadline > > multipath (actually we do have 3 paths to each volume, so there is some > asymmetry, but that should not affect the IOs, shouldn't it?, and if it > did we would see the same effect in both pairs of NSD servers, but we do > not). > > All 4 storage systems are also configured the same way (2 disk groups / > pools / declustered arrays, one managed by ctrl A, one by ctrl B, and > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > The first question I have - but that is not my main one: I do see, both > in iostat and on the storage systems, that the default IO requests are > about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb > is really in terms of kiB, not sectors, cf. > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). > > But what puzzles me even more: one of the server compiles IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and > writes ... I just cannot see why. > > I have to suspect that this will (in writing to the storage) cause > incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as > the controller is not able to re-coalesce the data properly; and it > seems it cannot do it completely at least) > > > If someone of you has seen that already and/or knows a potential > explanation I'd be glad to learn about. > > > And if some of you wonder: yes, I (was) moved away from IBM and am now > at KIT. > > Many thanks in advance > > Uwe > > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > > Uwe Falke > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > > Tel: +49 721 608 28024 > Email: uwe.falke at kit.edu > www.scc.kit.edu > > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > KIT ? 
The Research University in the Helmholtz Association > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > INVALID URI REMOVED > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=RGTETs2tk0Kz_VOpznDVDkqChhnfLapOTkxLvgmR2-M&m=- > FdZvYBvHDPnBTu2FtPkLT09ahlYp2QsMutqNV2jWaY&s=S4C2D3_h4FJLAw0PUYLKhKE242vn_fwn-1_EJmHNpE8&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: From olaf.weiser at de.ibm.com Thu Feb 24 12:47:59 2022 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 24 Feb 2022 12:47:59 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.1__=4EBB0D60DFD775728f9e8a93df938690 at ibm.com.gif Type: image/gif Size: 45 bytes Desc: not available URL: From krajaram at geocomputing.net Thu Feb 24 14:32:35 2022 From: krajaram at geocomputing.net (Kumaran Rajaram) Date: Thu, 24 Feb 2022 14:32:35 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi Uwe, >> But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. IMHO, If GPFS on this particular NSD server was restarted often during the setup, then it is possible that the GPFS pagepool may not be contiguous. As a result, GPFS 8MiB buffer in the pagepool might be a scatter-gather (SG) list with many small entries (in the memory) resulting in smaller I/O when these buffers are issued to the disks. The fix would be to reboot the server and start GPFS so that pagepool is contiguous resulting in 8MiB buffer to be comprised of 1 (or fewer) SG entries. >>In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs >>smaller than 4MiB again at some point, so that is not a nice solution. It will be advised not to restart GPFS often in the NSD servers (in production) to keep the pagepool contiguous. Ensure that there is enough free memory in NSD server and not run any memory intensive jobs so that pagepool is not impacted (e.g. swapped out). Also, enable GPFS numaMemoryInterleave=yes and verify that pagepool is equally distributed across the NUMA domains for good performance. GPFS numaMemoryInterleave=yes requires that numactl packages are installed and then GPFS restarted. # mmfsadm dump config | egrep "numaMemory|pagepool " ! numaMemoryInterleave yes ! 
pagepool 282394099712 # pgrep mmfsd | xargs numastat -p Per-node process memory usage (in MBs) for PID 2120821 (mmfsd) Node 0 Node 1 Total --------------- --------------- --------------- Huge 0.00 0.00 0.00 Heap 1.26 3.26 4.52 Stack 0.01 0.01 0.02 Private 137710.43 137709.96 275420.39 ---------------- --------------- --------------- --------------- Total 137711.70 137713.23 275424.92 My two cents, -Kums Kumaran Rajaram [cid:image001.png at 01D82960.6A9860C0] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Uwe Falke Sent: Wednesday, February 23, 2022 8:04 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] IO sizes Hi, the test bench is gpfsperf running on up to 12 clients with 1...64 threads doing sequential reads and writes , file size per gpfsperf process is 12TB (with 6TB I saw caching effects in particular for large thread numbers ...) As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data disks, as expected in that case. Interesting thing though: I have rebooted the suspicious node. Now, it does not issue smaller IOs than the others, but -- unbelievable -- larger ones (up to about 4.7MiB). This is still harmful as also that size is incompatible with full stripe writes on the storage ( 8+2 disk groups, i.e. logically RAID6) Currently, I draw this information from the storage boxes; I have not yet checked iostat data for that benchmark test after the reboot (before, when IO sizes were smaller, we saw that both in iostat and in the perf data retrieved from the storage controllers). And: we have a separate data pool , hence dataOnly NSDs, I am just talking about these ... As for "Are you sure that Linux OS is configured the same on all 4 NSD servers?." - of course there are not two boxes identical in the world. I have actually not installed those machines, and, yes, i also considered reinstalling them (or at least the disturbing one). However, I do not have reason to assume or expect a difference, the supplier has just implemented these systems recently from scratch. In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs smaller than 4MiB again at some point, so that is not a nice solution. Thanks Uwe On 23.02.22 22:20, Andrew Beattie wrote: Alex, Metadata will be 4Kib Depending on the filesystem version you will also have subblocks to consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata and data block size is the same) My first question would be is ? Are you sure that Linux OS is configured the same on all 4 NSD servers?. My second question would be do you know what your average file size is if most of your files are smaller than your filesystem block size, then you are always going to be performing writes using groups of subblocks rather than a full block writes. Regards, Andrew On 24 Feb 2022, at 04:39, Alex Chekholko wrote: ? Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: Dear all, sorry for asking a question which seems ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? 
Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: Dear all, sorry for asking a question which seems not directly GPFS related: In a setup with 4 NSD servers (old-style, with storage controllers in the back end), 12 clients and 10 Seagate storage systems, I do see in benchmark tests that just one of the NSD servers does send smaller IO requests to the storage than the other 3 (that is, both reads and writes are smaller). The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes ( one server to the controllers A, the other one to controllers B of the Seagates, resp.). All 4 NSD servers are set up similarly: kernel: 3.10.0-1160.el7.x86_64 #1 SMP HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx driver : mpt3sas 31.100.01.00 max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by mpt3sas) for all sd devices and all multipath (dm) devices built on top. scheduler: deadline multipath (actually we do have 3 paths to each volume, so there is some asymmetry, but that should not affect the IOs, shouldn't it?, and if it did we would see the same effect in both pairs of NSD servers, but we do not). All 4 storage systems are also configured the same way (2 disk groups / pools / declustered arrays, one managed by ctrl A, one by ctrl B, and 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. The first question I have - but that is not my main one: I do see, both in iostat and on the storage systems, that the default IO requests are about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb is really in terms of kiB, not sectors, cf. https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. I have to suspect that this will (in writing to the storage) cause incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as the controller is not able to re-coalesce the data properly; and it seems it cannot do it completely at least) If someone of you has seen that already and/or knows a potential explanation I'd be glad to learn about. And if some of you wonder: yes, I (was) moved away from IBM and am now at KIT. Many thanks in advance Uwe -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? 
The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6469 bytes Desc: image001.png URL: From uwe.falke at kit.edu Fri Feb 25 14:29:23 2022 From: uwe.falke at kit.edu (Uwe Falke) Date: Fri, 25 Feb 2022 15:29:23 +0100 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: <3fc68f40-8b3a-be33-3451-09a04fdc83a0@kit.edu> Hi, and thanks, Achim and Olaf, mmdiag --iohist on the NSD servers (on all 4 of them) shows IO sizes in IOs to/from the data NSDs (i.e. to/from storage) of 16384 512-byte-sectors? throughout, i.e. 8MiB, agreeing with the FS block size. (Having that information i do not need to ask the clients ...) iostat on NSD servers as well as the? storage system counters say the IOs crafted by the OS layer are 4MiB except for the one suspicious NSD server where they were somewhat smaller than 4MiB before the reboot, but are now somewhat larger than 4MiB (but by a distinct amount). The data piped through the NSD servers are well balanced between the 4 NSD servers, the IO system of the suspicious NSD server just issued a higher rate of IO requests when running smaller IOs and now, with larger IOs it has a lower IO rate than the other three NSD servers. So I am pretty sure it is not GPFS (see my initial post :-); but still some people using GPFS might have encounterd that as well, or might have an idea ;-) Cheers Uwe On 24.02.22 13:47, Olaf Weiser wrote: > in addition, to Achim, > where do you see those "smaller IO"... > have you checked IO sizes with mmfsadm dump iohist on each > NSDclient/Server ?... If ok on that level.. it's not GPFS > Mit freundlichen Gr??en / Kind regards > > Olaf Weiser > > ----- Urspr?ngliche Nachricht ----- > Von: "Achim Rehor" > Gesendet von: gpfsug-discuss-bounces at spectrumscale.org > An: "gpfsug main discussion list" > CC: > Betreff: [EXTERNAL] Re: [gpfsug-discuss] IO sizes > Datum: Do, 24. Feb 2022 13:41 > > Hi Uwe, > > first of all, glad to see you back in the GPFS space ;) > > agreed, groups of subblocks being written will end up in IO sizes, > being smaller than the 8MB filesystem blocksize, > also agreed, this cannot be metadata, since their size is MUCH > smaller, like 4k or less, mostly. > > But why would these grouped subblock reads/writes all end up on > the same NSD server, while the others do full block writes ? > > How is your NSD server setup per NSD ? did you 'round-robin' set > the preferred NSD server per NSD ? > are the client nodes transferring the data in anyway doing > specifics ?? 
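
For cross-checking the two views Uwe describes (what GPFS issues versus what the block layer actually sends), the pair of commands below might be a useful starting point on each NSD server while a benchmark is running. This is only a sketch: both tools report request sizes in 512-byte sectors, and the device filter is an assumption about how the sd and multipath devices show up in iostat.

# GPFS view: recent IO history per NSD; 16384 sectors = 8 MiB, matching the FS block size
mmdiag --iohist | tail -40

# block-layer view: avgrq-sz is the average request size in sectors,
# so 8192 would correspond to the ~4 MiB requests seen on the storage
iostat -x 5 3 | egrep 'Device|^dm-|^sd'
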
> > Sorry for not having a solution for you, jsut sharing a few ideas ;) > > > Mit freundlichen Gr??en / Kind regards > > *Achim Rehor* > > Technical Support Specialist Spectrum Scale and ESS (SME) > Advisory Product Services Professional > IBM Systems Storage Support - EMEA > > > > > > > gpfsug-discuss-bounces at spectrumscale.org wrote on 23/02/2022 22:20:11: > > > From: "Andrew Beattie" > > To: "gpfsug main discussion list" > > Date: 23/02/2022 22:20 > > Subject: [EXTERNAL] Re: [gpfsug-discuss] IO sizes > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Alex, Metadata will be 4Kib Depending on the filesystem version you > > will also have subblocks to consider V4 filesystems have 1/32 > > subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata > > and data block size is the same) > ???????????ZjQcmQRYFpfptBannerStart > > This Message Is From an External Sender > > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Alex, > > > > Metadata will be 4Kib > > > > Depending on the filesystem version you will also have subblocks to > > consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/ > > 1024 subblocks (assuming metadata and data block size is the same) > > > > My first question would be is ? Are you sure that Linux OS is > > configured the same on all 4 NSD servers?. > > > > My second question would be do you know what your average file size > > is if most of your files are smaller than your filesystem block > > size, then you are always going to be performing writes using groups > > of subblocks rather than a full block writes. > > > > Regards, > > > > Andrew > > > > On 24 Feb 2022, at 04:39, Alex Chekholko > wrote: > > > ? Hi, Metadata I/Os will always be smaller than the usual data block > > size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, > > 2022 at 10:26 AM Uwe Falke wrote: Dear all, > > sorry for asking a question which seems ZjQcmQRYFpfptBannerStart > > This Message Is From an External Sender > > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Hi, > > > > Metadata I/Os will always be smaller than the usual data block > size, right? > > Which version of GPFS? > > > > Regards, > > Alex > > > > On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: > > Dear all, > > > > sorry for asking a question which seems not directly GPFS related: > > > > In a setup with 4 NSD servers (old-style, with storage > controllers in > > the back end), 12 clients and 10 Seagate storage systems, I do > see in > > benchmark tests that ?just one of the NSD servers does send > smaller IO > > requests to the storage ?than the other 3 (that is, both reads and > > writes are smaller). > > > > The NSD servers form 2 pairs, each pair is connected to 5 > seagate boxes > > ( one server to the controllers A, the other one to controllers > B of the > > Seagates, resp.). > > > > All 4 NSD servers are set up similarly: > > > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > > > HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > > > driver : mpt3sas 31.100.01.00 > > > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as > limited by > > mpt3sas) for all sd devices and all multipath (dm) devices built > on top. > > > > scheduler: deadline > > > > multipath (actually we do have 3 paths to each volume, so there > is some > > asymmetry, but that should not affect the IOs, shouldn't it?, > and if it > > did we would see the same effect in both pairs of NSD servers, > but we do > > not). 
> > > > All 4 storage systems are also configured the same way (2 disk > groups / > > pools / declustered arrays, one managed by ?ctrl A, one by ctrl > B, ?and > > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). > > > > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO > > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > > > The first question I have - but that is not my main one: I do > see, both > > in iostat and on the storage systems, that the default IO > requests are > > about 4MiB, not 8MiB as I'd expect from above settings > (max_sectors_kb > > is really in terms of kiB, not sectors, cf. > > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). > > > > But what puzzles me even more: one of the server compiles IOs even > > smaller, varying between 3.2MiB and 3.6MiB mostly - both for > reads and > > writes ... I just cannot see why. > > > > I have to suspect that this will (in writing to the storage) cause > > incomplete stripe writes on our erasure-coded volumes (8+2p)(as > long as > > the controller is not able to re-coalesce the data properly; and it > > seems it cannot do it completely at least) > > > > > > If someone of you has seen that already and/or knows a potential > > explanation I'd be glad to learn about. > > > > > > And if some of you wonder: yes, I (was) moved away from IBM and > am now > > at KIT. > > > > Many thanks in advance > > > > Uwe > > > > > > -- > > Karlsruhe Institute of Technology (KIT) > > Steinbuch Centre for Computing (SCC) > > Scientific Data Management (SDM) > > > > Uwe Falke > > > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > > D-76344 Eggenstein-Leopoldshafen > > > > Tel: +49 721 608 28024 > > Email: uwe.falke at kit.edu > > www.scc.kit.edu > > > > Registered office: > > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > > > KIT ? The Research University in the Helmholtz Association > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > INVALID URI REMOVED > > > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > > siA1ZOg&r=RGTETs2tk0Kz_VOpznDVDkqChhnfLapOTkxLvgmR2-M&m=- > > > FdZvYBvHDPnBTu2FtPkLT09ahlYp2QsMutqNV2jWaY&s=S4C2D3_h4FJLAw0PUYLKhKE242vn_fwn-1_EJmHNpE8&e= > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email:uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Image.1__%3D4EBB0D60DFD775728f9e8a93df938690%40ibm.com.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5814 bytes Desc: S/MIME Cryptographic Signature URL: From uwe.falke at kit.edu Mon Feb 28 09:17:26 2022 From: uwe.falke at kit.edu (Uwe Falke) Date: Mon, 28 Feb 2022 10:17:26 +0100 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: <72c6ea70-6d00-5cc1-7f26-f5cb1aabbd7a@kit.edu> Hi, Kumaran, that would explain the smaller IOs before the reboot, but not the larger-than-4MiB IOs afterwards on that machine. Then, I already saw that the numaMemoryInterleave setting seems to have no effect (on that very installation), I just have not yet requested a PMR for it. I'd checked memory usage of course and saw that regardless of this setting always one socket's memory is almost completely consumed while the other one's is rather empty - looks like a bug to me, but that needs further investigation. Uwe On 24.02.22 15:32, Kumaran Rajaram wrote: > > Hi Uwe, > > >> But what puzzles me even more: one of the server compiles IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and > writes ... I just cannot see why. > > IMHO, If GPFS on this particular NSD server was restarted often during > the setup, then it is possible that the GPFS pagepool may not be > contiguous. As a result, GPFS 8MiB buffer in the pagepool might be a > scatter-gather (SG) list with many small entries (in the memory) > resulting in smaller I/O when these buffers are issued to the disks. > The fix would be to reboot the server and start GPFS so that pagepool > is contiguous resulting in 8MiB buffer to be comprised of 1 (or fewer) > SG entries. > > >>In the current situation (i.e. with IOs bit larger than 4MiB) > setting max_sectors_kB to 4096 might do the trick, but as I do not > know the cause for that behaviour it might well start to issue IOs > >>smaller than 4MiB again at some point, so that is not a nice solution. > > It will be advised not to restart GPFS often in the NSD servers (in > production) to keep the pagepool contiguous. Ensure that there is > enough free memory in NSD server and not run any memory intensive jobs > so that pagepool is not impacted (e.g. swapped out). > > Also, enable GPFS numaMemoryInterleave=yes and verify that pagepool is > equally distributed across the NUMA domains for good performance. GPFS > numaMemoryInterleave=yes requires that numactl packages are installed > and then GPFS restarted. > > # mmfsadm dump config | egrep "numaMemory|pagepool " > > ! numaMemoryInterleave yes > > ! pagepool 282394099712 > > # pgrep mmfsd | xargs numastat -p > > Per-node process memory usage (in MBs) for PID 2120821 (mmfsd) > > ?????????????????????????? Node 0 Node 1?????????? Total > > ????????????????? --------------- --------------- --------------- > > Huge???????????????????????? 0.00 0.00??????????? 0.00 > > Heap???????????????????????? 1.26 3.26???????? ???4.52 > > Stack??????????????????????? 0.01 0.01??????????? 0.02 > > Private???????????????? 137710.43 137709.96?????? 275420.39 > > ----------------? --------------- --------------- --------------- > > Total?????????????????? 
137711.70 137713.23 ??????275424.92 > > My two cents, > > -Kums > > Kumaran Rajaram > > *From:* gpfsug-discuss-bounces at spectrumscale.org > *On Behalf Of *Uwe Falke > *Sent:* Wednesday, February 23, 2022 8:04 PM > *To:* gpfsug-discuss at spectrumscale.org > *Subject:* Re: [gpfsug-discuss] IO sizes > > Hi, > > the test bench is gpfsperf running on up to 12 clients with 1...64 > threads doing sequential reads and writes , file size per gpfsperf > process is 12TB (with 6TB I saw caching effects in particular for > large thread numbers ...) > > As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data > disks, as expected in that case. > > Interesting thing though: > > I have rebooted the suspicious node. Now, it does not issue smaller > IOs than the others, but -- unbelievable -- larger ones (up to about > 4.7MiB). This is still harmful as also that size is incompatible with > full stripe writes on the storage ( 8+2 disk groups, i.e. logically RAID6) > > Currently, I draw this information from the storage boxes; I have not > yet checked iostat data for that benchmark test after the reboot > (before, when IO sizes were smaller, we saw that both in iostat and in > the perf data retrieved from the storage controllers). > > And: we have a separate data pool , hence dataOnly NSDs, I am just > talking about these ... > > As for "Are you sure that Linux OS is configured the same on all 4 NSD > servers?." - of course there are not two boxes identical in the world. > I have actually not installed those machines, and, yes, i also > considered reinstalling them (or at least the disturbing one). > > However, I do not have reason to assume or expect a difference, the > supplier has just implemented these systems recently from scratch. > > In the current situation (i.e. with IOs bit larger than 4MiB) setting > max_sectors_kB to 4096 might do the trick, but as I do not know the > cause for that behaviour it might well start to issue IOs smaller than > 4MiB again at some point, so that is not a nice solution. > > Thanks > > Uwe > > On 23.02.22 22:20, Andrew Beattie wrote: > > Alex, > > Metadata will be 4Kib > > Depending on the filesystem version you will also have subblocks > to consider V4 filesystems have 1/32 subblocks, V5 filesystems > have 1/1024 subblocks (assuming metadata and data block size is > the same) > > > My first question would be is ? Are you sure that Linux OS is > configured the same on all 4 NSD servers?. > > My second question would be do you know what your average file > size is if most of your files are smaller than your filesystem > block size, then you are always going to be performing writes > using groups of subblocks rather than a full block writes. > > Regards, > > Andrew > > > > On 24 Feb 2022, at 04:39, Alex Chekholko > wrote: > > ? Hi, Metadata I/Os will always be smaller than the usual data > block size, right? Which version of GPFS? Regards, Alex On > Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: Dear all, sorry for asking a > question which seems ZjQcmQRYFpfptBannerStart > > This Message Is From an External Sender > > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Hi, > > Metadata I/Os will always be smaller than the usual data block > size, right? > > Which version of GPFS? 
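
Picking up the numaMemoryInterleave point from above: a minimal sequence for enabling it on the NSD servers and verifying that the pagepool really gets spread over both sockets might look like the following sketch. The node class name 'nsdNodes' is a placeholder, and the mmfsd restart of course needs a maintenance window.

# set the option for the NSD servers and restart GPFS there so it takes effect
mmchconfig numaMemoryInterleave=yes -N nsdNodes
mmshutdown -N nsdNodes && mmstartup -N nsdNodes

# on each NSD server, confirm the running daemon picked the setting up ...
mmfsadm dump config | egrep 'numaMemory|pagepool '
# ... and that the pagepool is now split roughly evenly across the NUMA nodes
pgrep mmfsd | xargs numastat -p
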
> > Regards, > > Alex > > On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: > > Dear all, > > sorry for asking a question which seems not directly GPFS > related: > > In a setup with 4 NSD servers (old-style, with storage > controllers in > the back end), 12 clients and 10 Seagate storage systems, > I do see in > benchmark tests that? just one of the NSD servers does > send smaller IO > requests to the storage? than the other 3 (that is, both > reads and > writes are smaller). > > The NSD servers form 2 pairs, each pair is connected to 5 > seagate boxes > ( one server to the controllers A, the other one to > controllers B of the > Seagates, resp.). > > All 4 NSD servers are set up similarly: > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > HBA:?Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > driver : mpt3sas 31.100.01.00 > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, > as limited by > mpt3sas) for all sd devices and all multipath (dm) devices > built on top. > > scheduler: deadline > > multipath (actually we do have 3 paths to each volume, so > there is some > asymmetry, but that should not affect the IOs, shouldn't > it?, and if it > did we would see the same effect in both pairs of NSD > servers, but we do > not). > > All 4 storage systems are also configured the same way (2 > disk groups / > pools / declustered arrays, one managed by? ctrl A, one by > ctrl B,? and > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 > NSDs). > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do > see clean IO > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > The first question I have - but that is not my main one: I > do see, both > in iostat and on the storage systems, that the default IO > requests are > about 4MiB, not 8MiB as I'd expect from above settings > (max_sectors_kb > is really in terms of kiB, not sectors, cf. > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt > ). > > But what puzzles me even more: one of the server compiles > IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both > for reads and > writes ... I just cannot see why. > > I have to suspect that this will (in writing to the > storage) cause > incomplete stripe writes on our erasure-coded volumes > (8+2p)(as long as > the controller is not able to re-coalesce the data > properly; and it > seems it cannot do it completely at least) > > > If someone of you has seen that already and/or knows a > potential > explanation I'd be glad to learn about. > > > And if some of you wonder: yes, I (was) moved away from > IBM and am now > at KIT. > > Many thanks in advance > > Uwe > > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > > Uwe Falke > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > > Tel: +49 721 608 28024 > Email: uwe.falke at kit.edu > www.scc.kit.edu > > > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > KIT ? 
The Research University in the Helmholtz Association > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > Uwe Falke > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > Tel: +49 721 608 28024 > Email:uwe.falke at kit.edu > www.scc.kit.edu > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > KIT ? The Research University in the Helmholtz Association > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email:uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6469 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5814 bytes Desc: S/MIME Cryptographic Signature URL: From Renar.Grunenberg at huk-coburg.de Mon Feb 28 12:23:55 2022 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Mon, 28 Feb 2022 12:23:55 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: <72c6ea70-6d00-5cc1-7f26-f5cb1aabbd7a@kit.edu> References: <72c6ea70-6d00-5cc1-7f26-f5cb1aabbd7a@kit.edu> Message-ID: <7a29b404669942d193ad46c2632d6d30@huk-coburg.de> Hallo Uwe, are numactl already installed on that affected node? If it missed the numa scale stuff is not working. Renar Grunenberg Abteilung Informatik - Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. Helen Reck, Dr. J?rg Rheinl?nder, Thomas Sehn, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. 
If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ Von: gpfsug-discuss-bounces at spectrumscale.org Im Auftrag von Uwe Falke Gesendet: Montag, 28. Februar 2022 10:17 An: gpfsug-discuss at spectrumscale.org Betreff: Re: [gpfsug-discuss] IO sizes Hi, Kumaran, that would explain the smaller IOs before the reboot, but not the larger-than-4MiB IOs afterwards on that machine. Then, I already saw that the numaMemoryInterleave setting seems to have no effect (on that very installation), I just have not yet requested a PMR for it. I'd checked memory usage of course and saw that regardless of this setting always one socket's memory is almost completely consumed while the other one's is rather empty - looks like a bug to me, but that needs further investigation. Uwe On 24.02.22 15:32, Kumaran Rajaram wrote: Hi Uwe, >> But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. IMHO, If GPFS on this particular NSD server was restarted often during the setup, then it is possible that the GPFS pagepool may not be contiguous. As a result, GPFS 8MiB buffer in the pagepool might be a scatter-gather (SG) list with many small entries (in the memory) resulting in smaller I/O when these buffers are issued to the disks. The fix would be to reboot the server and start GPFS so that pagepool is contiguous resulting in 8MiB buffer to be comprised of 1 (or fewer) SG entries. >>In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs >>smaller than 4MiB again at some point, so that is not a nice solution. It will be advised not to restart GPFS often in the NSD servers (in production) to keep the pagepool contiguous. Ensure that there is enough free memory in NSD server and not run any memory intensive jobs so that pagepool is not impacted (e.g. swapped out). Also, enable GPFS numaMemoryInterleave=yes and verify that pagepool is equally distributed across the NUMA domains for good performance. GPFS numaMemoryInterleave=yes requires that numactl packages are installed and then GPFS restarted. # mmfsadm dump config | egrep "numaMemory|pagepool " ! numaMemoryInterleave yes ! pagepool 282394099712 # pgrep mmfsd | xargs numastat -p Per-node process memory usage (in MBs) for PID 2120821 (mmfsd) Node 0 Node 1 Total --------------- --------------- --------------- Huge 0.00 0.00 0.00 Heap 1.26 3.26 4.52 Stack 0.01 0.01 0.02 Private 137710.43 137709.96 275420.39 ---------------- --------------- --------------- --------------- Total 137711.70 137713.23 275424.92 My two cents, -Kums Kumaran Rajaram [cid:image001.png at 01D82CA6.6F82DC70] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Uwe Falke Sent: Wednesday, February 23, 2022 8:04 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] IO sizes Hi, the test bench is gpfsperf running on up to 12 clients with 1...64 threads doing sequential reads and writes , file size per gpfsperf process is 12TB (with 6TB I saw caching effects in particular for large thread numbers ...) 
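
To answer Renar's question quickly on the affected node, something along these lines should do; it is a sketch, and the package names assume a RHEL 7 system like the one described above.

# is numactl there at all? without it, numaMemoryInterleave cannot work
rpm -q numactl numactl-libs || echo 'numactl missing - numaMemoryInterleave will have no effect'

numactl --hardware                  # per-NUMA-node total and free memory
pgrep mmfsd | xargs numastat -p     # how mmfsd (and thus the pagepool) is spread across nodes
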
As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data disks, as expected in that case. Interesting thing though: I have rebooted the suspicious node. Now, it does not issue smaller IOs than the others, but -- unbelievable -- larger ones (up to about 4.7MiB). This is still harmful as also that size is incompatible with full stripe writes on the storage ( 8+2 disk groups, i.e. logically RAID6) Currently, I draw this information from the storage boxes; I have not yet checked iostat data for that benchmark test after the reboot (before, when IO sizes were smaller, we saw that both in iostat and in the perf data retrieved from the storage controllers). And: we have a separate data pool , hence dataOnly NSDs, I am just talking about these ... As for "Are you sure that Linux OS is configured the same on all 4 NSD servers?." - of course there are not two boxes identical in the world. I have actually not installed those machines, and, yes, i also considered reinstalling them (or at least the disturbing one). However, I do not have reason to assume or expect a difference, the supplier has just implemented these systems recently from scratch. In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs smaller than 4MiB again at some point, so that is not a nice solution. Thanks Uwe On 23.02.22 22:20, Andrew Beattie wrote: Alex, Metadata will be 4Kib Depending on the filesystem version you will also have subblocks to consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata and data block size is the same) My first question would be is ? Are you sure that Linux OS is configured the same on all 4 NSD servers?. My second question would be do you know what your average file size is if most of your files are smaller than your filesystem block size, then you are always going to be performing writes using groups of subblocks rather than a full block writes. Regards, Andrew On 24 Feb 2022, at 04:39, Alex Chekholko wrote: ? Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: Dear all, sorry for asking a question which seems ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: Dear all, sorry for asking a question which seems not directly GPFS related: In a setup with 4 NSD servers (old-style, with storage controllers in the back end), 12 clients and 10 Seagate storage systems, I do see in benchmark tests that just one of the NSD servers does send smaller IO requests to the storage than the other 3 (that is, both reads and writes are smaller). The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes ( one server to the controllers A, the other one to controllers B of the Seagates, resp.). All 4 NSD servers are set up similarly: kernel: 3.10.0-1160.el7.x86_64 #1 SMP HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx driver : mpt3sas 31.100.01.00 max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by mpt3sas) for all sd devices and all multipath (dm) devices built on top. 
scheduler: deadline multipath (actually we do have 3 paths to each volume, so there is some asymmetry, but that should not affect the IOs, shouldn't it?, and if it did we would see the same effect in both pairs of NSD servers, but we do not). All 4 storage systems are also configured the same way (2 disk groups / pools / declustered arrays, one managed by ctrl A, one by ctrl B, and 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. The first question I have - but that is not my main one: I do see, both in iostat and on the storage systems, that the default IO requests are about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb is really in terms of kiB, not sectors, cf. https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. I have to suspect that this will (in writing to the storage) cause incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as the controller is not able to re-coalesce the data properly; and it seems it cannot do it completely at least) If someone of you has seen that already and/or knows a potential explanation I'd be glad to learn about. And if some of you wonder: yes, I (was) moved away from IBM and am now at KIT. Many thanks in advance Uwe -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png Type: image/png Size: 6469 bytes Desc: image001.png URL: From p.ward at nhm.ac.uk Mon Feb 28 16:40:08 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Mon, 28 Feb 2022 16:40:08 +0000 Subject: [gpfsug-discuss] Interoperability of Transparent cloud tiering with other IBM Spectrum Scale features Message-ID:

I am used to a SCALE solution with space management to a tape tier. Files cannot be migrated unless they have been backed up. Once migrated to a stub file, they are not backed up again as a stub, and they are not excluded from backup. We used the Spectrum Protect BA client, not mmbackup.

We have a new SCALE solution with COS, set up with TCT. I am expecting it to operate in the same way: files can't be migrated unless backed up, and once migrated they become a stub and don't get backed up again. We are using mmbackup.

I migrated files before backup was set up. When backup was turned on, it pulled the files back. The migration policy was set to migrate files not accessed for 2 days, and all the data met that requirement. Migration is set to run every 15 minutes, so it was pushing the files back out quite quickly. The cluster was a mess of files going back and forth from COS. To stop this I changed the policy to 14 days and set mmbackup to exclude migrated files. Things calmed down. I have now almost run out of space on my hot tier, but anything I migrate will expire from backup.

The statement below is a bit confusing. HSM and TCT are completely different; I thought TCT was for cloud and HSM for tape? Both can exist in a cluster but operate on different areas. This suggests that, to have mmbackup work with data migrated to a cloud tier, we should be using HSM, not TCT? Can mmbackup with TCT do what HSM does?

https://www.ibm.com/docs/en/spectrum-scale/5.0.5?topic=ics-interoperability-transparent-cloud-tiering-other-spectrum-scale-features

"Spectrum Protect (TSM): For the file systems that are managed by an HSM system, ensure that hot data is backed up to TSM by using the mmbackup command, and as the data gets cooler, migrate them to the cloud storage tier. This ensures that the mmbackup command has already backed up the cooler files that are migrated to the cloud."

Has anyone set something up similar?

Kindest regards, Paul

Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk

-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL:
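
On the interplay between mmbackup and the cloud tier: one way to express the "back up first, migrate later" ordering the IBM documentation describes is simply to make the migration rule's age threshold comfortably longer than the backup cycle, so mmbackup always sees a file before TCT moves it. A minimal sketch of such a rule follows; the pool names, age threshold and file system device are placeholders, and it assumes the cloud pool is already defined as an external pool for the cloud gateway elsewhere in the policy.

# write a trial policy file with a single age-based migration rule
cat > /tmp/cos-migrate.pol <<'EOF'
/* move data to the cloud pool only once it is cool AND old enough
   that the regular mmbackup run has already picked it up */
RULE 'cool-to-cos' MIGRATE FROM POOL 'data' THRESHOLD(80,70)
     TO POOL 'cloudpool'
     WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 14
EOF

# evaluate only (-I test) to see what would be migrated before letting it loose
mmapplypolicy gpfs0 -P /tmp/cos-migrate.pol -I test
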
Now we never enabled this via "mmsmb config change --vfs-fruit-enable", and I would expect this to be disabled by default - however, I cannot find an explicit statement like "by default this is disabled" in https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=services-support-vfs-fruit-smb-protocol Am I correct in assuming that it is indeed disabled by default? And how would I verify that? Am I correct in assuming that _if_ it was enabled, then 'fruit' would show up under the 'vfs objects' in 'mmsmb config list'? Regards, -- Michael Meier, HPC Services Friedrich-Alexander-Universitaet Erlangen-Nuernberg Regionales Rechenzentrum Erlangen Martensstrasse 1, 91058 Erlangen, Germany Tel.: +49 9131 85-20994, Fax: +49 9131 302941 michael.meier at fau.de hpc.fau.de From p.ward at nhm.ac.uk Tue Feb 1 12:28:09 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Tue, 1 Feb 2022 12:28:09 +0000 Subject: [gpfsug-discuss] mmbackup file selections In-Reply-To: <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> References: <20220124153631.oxu4ytbq4vqcotr3@utumno.gs.washington.edu> <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> Message-ID: Not currently set. I'll look into them. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Skylar Thompson Sent: 26 January 2022 16:50 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmbackup file selections Awesome, glad that you found them (I missed them the first time too). As for the anomalous changed files, do you have these options set in your client option file? skipacl yes skipaclupdatecheck yes updatectime yes We had similar problems where metadata and ACL updates were interpreted as data changes by mmbackup/dsmc. We also have a case open with IBM where mmbackup will both expire and backup a file in the same run, even in the absence of mtime changes, but it's unclear whether that's program error or something with our include/exclude rules. I'd be curious if you're running into that as well. On Wed, Jan 26, 2022 at 03:55:48PM +0000, Paul Ward wrote: > Good call! > > Yes they are dot files. > > > New issue. > > Mmbackup seems to be backup up the same files over and over without them changing: > areas are being backed up multiple times. > The example below is a co-resident file, the only thing that has changed since it was created 20/10/21, is the file has been accessed for backup. > This file is in the 'changed' list in mmbackup: > > This list has just been created: > -rw-r--r--. 
Encrypted: no > > Check active and inactive backups (it was backed up yesterday) > 11:18:52 [root at scale-sk-pn-1 .mmbackupCfg]# dsmcqbi "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 11:19:02 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. > > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 11:19:02 Last access: 01/26/2022 > 11:07:05 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/25/2022 06:41:17 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > It will be backed up again shortly, why? > > And it was backed up again: > # dsmcqbi > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature- > project/audiowaveform/300_40/unp-grounds-01-1604557538.png > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 15:54:09 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. > > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 15:54:10 Last access: 01/26/2022 > 15:30:03 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/26/2022 12:23:02 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/25/2022 06:41:17 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Skylar > Thompson > Sent: 24 January 2022 15:37 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Hi Paul, > > Did you look for dot files? 
At least for us on 5.0.5 there's a .list.1. file while the backups are running: > > /gpfs/grc6/.mmbackupCfg/updatedFiles/: > -r-------- 1 root nickers 6158526821 Jan 23 18:28 .list.1.gpfs-grc6 > /gpfs/grc6/.mmbackupCfg/expiredFiles/: > -r-------- 1 root nickers 85862211 Jan 23 18:28 .list.1.gpfs-grc6 > > On Mon, Jan 24, 2022 at 02:31:54PM +0000, Paul Ward wrote: > > Those directories are empty > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: gpfsug-discuss-bounces at spectrumscale.org > > On Behalf Of IBM Spectrum > > Scale > > Sent: 22 January 2022 00:35 > > To: gpfsug main discussion list > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > Hi Paul, > > > > Instead of calculating *.ix.* files, please look at a list file in these directories. > > > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked.]"Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked. > > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/21/2022 09:38 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > Thank you Right in the command line seems to have worked. At the end > > of the script I now copy the contents of the .mmbackupCfg folder to > > a date stamped logging folder Checking how many entries in these files compared to the Summary: ???????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you > > > > Right in the command line seems to have worked. > > At the end of the script I now copy the contents of the .mmbackupCfg > > folder to a date stamped logging folder > > > > Checking how many entries in these files compared to the Summary: > > wc -l mmbackup* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 754 total > > From Summary > > Total number of objects inspected: 755 > > I can live with a discrepancy of 1. 
> > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > From Summary > > Total number of objects expired: 2 > > That matches > > > > wc -l mmbackupC* mmbackupS* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 752 total > > Summary: > > Total number of objects backed up: 751 > > > > A difference of 1 I can live with. > > > > What does Statech stand for? > > > > Just this to sort out: > > Total number of objects failed: 1 > > I will add: > > --tsm-errorlog TSMErrorLogFile > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 19 January 2022 15:09 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > This is to set environment for mmbackup. > > If mmbackup is invoked within a script, you can set "export DEBUGmmbackup=2" right above mmbackup command. > > e.g) in your script > > .... > > export DEBUGmmbackup=2 > > mmbackup .... > > > > Or, you can set it in the same command line like > > DEBUGmmbackup=2 mmbackup .... > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > [Inactive hide details for "Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to se]"Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to see if they are the cluster manager. > > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/19/2022 06:04 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > Thank you. We run a script on all our nodes that checks to see if > > they are the cluster manager. If they are, then they take > > responsibility to start the backup script. The script then randomly selects one of the available backup nodes and uses ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you. > > > > We run a script on all our nodes that checks to see if they are the cluster manager. > > If they are, then they take responsibility to start the backup script. > > The script then randomly selects one of the available backup nodes and uses dsmsh mmbackup on it. > > > > Where does this command belong? 
> > I have seen it listed as a export command, again where should that be run ? on all backup nodes, or all nodes? > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 18 January 2022 22:54 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > Hi Paul, > > > > If you run mmbackup with "DEBUGmmbackup=2", it keeps all working files even after successful backup. They are available at MMBACKUP_RECORD_ROOT (default is FSroot or FilesetRoot directory). > > In .mmbackupCfg directory, there are 3 directories: > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > [Inactive hide details for "Paul Ward" ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have been sent to back]"Paul Ward" ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have been sent to backup using mmbackup. > > > > From: "Paul Ward" > > > To: > > "gpfsug-discuss at spectrumscale.org > org>" > > > org>> > > Date: 01/18/2022 11:56 AM > > Subject: [EXTERNAL] [gpfsug-discuss] mmbackup file selections Sent by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > > > Hi, I am trying to work out what files have been sent to backup > > using mmbackup. I have increased the -L value from 3 up to 6 but > > only seem to see the files that are in scope, not the ones that are selected. I can see the three file lists generated ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Hi, > > > > I am trying to work out what files have been sent to backup using mmbackup. > > I have increased the -L value from 3 up to 6 but only seem to see the files that are in scope, not the ones that are selected. > > > > I can see the three file lists generated during a backup, but can?t seem to find a list of what files were backed up. > > > > It should be the diff of the shadow and shadow-old, but the wc -l of the diff doesn?t match the number of files in the backup summary. > > Wrong assumption? > > > > Where should I be looking ? surely it shouldn?t be this hard to see what files are selected? 
> > Kindest regards,
> > Paul
> >
> > Paul Ward
> > TS Infrastructure Architect
> > Natural History Museum
> > T: 02079426450
> > E: p.ward at nhm.ac.uk
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> --
> -- Skylar Thompson (skylar2 at u.washington.edu)
> -- Genome Sciences Department (UW Medicine), System Administrator
> -- Foege Building S046, (206)-685-7354
> -- Pronouns: He/Him/His
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

--
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From dehaan at us.ibm.com Tue Feb 1 16:14:07 2022
From: dehaan at us.ibm.com (David DeHaan)
Date: Tue, 1 Feb 2022 09:14:07 -0700
Subject: [gpfsug-discuss] Spectrum Scale and vfs_fruit
In-Reply-To:
References:
Message-ID:

Yes, it is disabled by default. And yes, you can tell if it has been enabled by looking at the smb config list.
This is what a non-fruit vfs-object line looks like vfs objects = shadow_copy2 syncops gpfs fileid time_audit This is one that has been "fruitified" vfs objects = shadow_copy2 syncops fruit streams_xattr gpfs fileid time_audit *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* David DeHaan Spectrum Scale Test *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From: "Michael Meier" To: gpfsug-discuss at spectrumscale.org Date: 02/01/2022 04:26 AM Subject: [EXTERNAL] [gpfsug-discuss] Spectrum Scale and vfs_fruit Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, A bunch of security updates for Samba were released yesterday, most importantly among them CVE-2021-44142 ( https://www.samba.org/samba/security/CVE-2021-44142.html ) in the vfs_fruit VFS-module that adds extended support for Apple Clients. Spectrum Scale supports that, so Spectrum Scale might be affected, and I'm trying to find out if we're affected or not. Now we never enabled this via "mmsmb config change --vfs-fruit-enable", and I would expect this to be disabled by default - however, I cannot find an explicit statement like "by default this is disabled" in https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=services-support-vfs-fruit-smb-protocol Am I correct in assuming that it is indeed disabled by default? And how would I verify that? Am I correct in assuming that _if_ it was enabled, then 'fruit' would show up under the 'vfs objects' in 'mmsmb config list'? Regards, -- Michael Meier, HPC Services Friedrich-Alexander-Universitaet Erlangen-Nuernberg Regionales Rechenzentrum Erlangen Martensstrasse 1, 91058 Erlangen, Germany Tel.: +49 9131 85-20994, Fax: +49 9131 302941 michael.meier at fau.de hpc.fau.de _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From ivano.talamo at psi.ch Wed Feb 2 09:07:13 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 09:07:13 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce Message-ID: Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano -------------- next part -------------- An HTML attachment was scrubbed... 
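For what it's worth, when one of these stalls is in progress, a quick way to narrow down which nodes are holding up the quiesce is to look for long waiters mentioning quiesce while the mmcrsnapshot or mmdelsnapshot command is hanging. A rough sketch only, assuming working ssh between the nodes; mmdsh is an unsupported helper shipped with Scale, so any parallel-ssh wrapper will do:

  # find the file system manager coordinating the snapshot
  /usr/lpp/mmfs/bin/mmlsmgr
  # then look for quiesce-related waiters across the nodes
  /usr/lpp/mmfs/bin/mmdsh -N all '/usr/lpp/mmfs/bin/mmdiag --waiters' 2>/dev/null | grep -i quiesce

The nodes showing the oldest such waiters are usually the ones support will want to look at first.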
From abeattie at au1.ibm.com Wed Feb 2 09:33:25 2022
From: abeattie at au1.ibm.com (Andrew Beattie)
Date: Wed, 2 Feb 2022 09:33:25 +0000
Subject: [gpfsug-discuss] snapshots causing filesystem quiesce
In-Reply-To:
Message-ID:

Ivano,

How big is the filesystem in terms of number of files?
How big is the filesystem in terms of capacity?
Is the Metadata on Flash or Spinning disk?
Do you see issues when users do an LS of the filesystem, or only when you are doing snapshots?

How much memory do the NSD servers have?
How much is allocated to the OS / Spectrum Scale Pagepool?

Regards

Andrew Beattie
Technical Specialist - Storage for Big Data & AI
IBM Technology Group
IBM Australia & New Zealand
P. +61 421 337 927
E. abeattie at au1.IBM.com

> On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote:
>
> Dear all,
>
> Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes).
>
> The clients and the storage are on two different clusters, using remote cluster mount for the access.
>
> On the log files many lines like the following appear (on both clusters):
> Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec
>
> By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain.
>
> If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0.
>
> Thanks,
> Ivano

From daniel.kidger at hpe.com Wed Feb 2 10:07:25 2022
From: daniel.kidger at hpe.com (Kidger, Daniel)
Date: Wed, 2 Feb 2022 10:07:25 +0000
Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ?
Message-ID:

Hi all,

Since the subject of snapshots has come up, I also have a question ...

Snapshots can be created from the command line with mmcrsnapshot, and hence can be automated via cron jobs etc.
Snapshots can also be created from the Scale GUI. The GUI also provides its own automation for the creation, retention, and deletion of snapshots.

My question is: do most customers use the former or the latter for automation?

(I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do exactly the same as what the GUI does in terms of creating automated snapshots. It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. How many customers also use the commands found in /usr/lpp/mmfs/gui/cli/ ? )

Daniel

Daniel Kidger
HPC Storage Solutions Architect, EMEA
daniel.kidger at hpe.com
+44 (0)7818 522266
hpe.com
Name: Outlook-iity4nk4 Type: application/octet-stream Size: 2541 bytes Desc: Outlook-iity4nk4 URL: From ivano.talamo at psi.ch Wed Feb 2 10:45:26 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 10:45:26 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: , Message-ID: <4326cfae883b4378bcb284b6daecb05e@psi.ch> Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Andrew Beattie Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano -------------- next part -------------- An HTML attachment was scrubbed... URL: From sthompson2 at lenovo.com Wed Feb 2 10:52:27 2022 From: sthompson2 at lenovo.com (Simon Thompson2) Date: Wed, 2 Feb 2022 10:52:27 +0000 Subject: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: References: Message-ID: I always used the GUI for automating snapshots that were tagged with the YYMMDD format so that they were accessible via the previous versions tab from CES access. This requires no locking if you have multiple GUI servers running, so in theory the snapshots creation is "HA". 
BUT if you shutdown the GUI servers (say you are waiting for a log4j patch ...) then you have no snapshot automation. Due to the way we structured independent filesets, this could be 50 or so to automate and we wanted to set a say 4 day retention policy. So clicking in the GUI was pretty simple to do this for. What we did found is it a snapshot failed to delete for some reason (quiesce etc), then the GUI never tried again to clean it up so we have monitoring to look for unexpected snapshots that needed cleaning up. Simon ________________________________ Simon Thompson He/Him/His Senior Storage Performance WW HPC Customer Solutions Lenovo UK [Phone]+44 7788 320635 [Email]sthompson2 at lenovo.com Lenovo.com Twitter | Instagram | Facebook | Linkedin | YouTube | Privacy [cid:image003.png at 01D81822.F63BAB90] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Kidger, Daniel Sent: 02 February 2022 10:07 To: gpfsug-discuss at spectrumscale.org Subject: [External] [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? Hi all, Since the subject of snapshots has come up, I also have a question ... Snapshots can be created from the command line with mmcrsnapshot, and hence can be automated via con jobs etc. Snapshots can also be created from the Scale GUI. The GUI also provides its own automation for the creation, retention, and deletion of snapshots. My question is: do most customers use the former or the latter for automation? (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do exactly the same as what the GUI does it terms of creating automated snapshots. It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. How many customers also use the commands found in /usr/lpp/mmfs/gui/cli/ ? ) Daniel Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:image004.png at 01D81822.F63BAB90] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 92 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 128 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.png Type: image/png Size: 20109 bytes Desc: image003.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.png Type: image/png Size: 2541 bytes Desc: image004.png URL: From jordi.caubet at es.ibm.com Wed Feb 2 11:07:37 2022 From: jordi.caubet at es.ibm.com (Jordi Caubet Serrabou) Date: Wed, 2 Feb 2022 11:07:37 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: <4326cfae883b4378bcb284b6daecb05e@psi.ch> References: <4326cfae883b4378bcb284b6daecb05e@psi.ch>, , Message-ID: An HTML attachment was scrubbed... URL: From janfrode at tanso.net Wed Feb 2 11:53:50 2022 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Wed, 2 Feb 2022 12:53:50 +0100 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> Message-ID: Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. 
do:

snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S)
mmcrsnapshot gpfs0 fileset1:$snapname,fileset2:$snapname,fileset3:$snapname

instead of:

mmcrsnapshot gpfs0 fileset1:$snapname
mmcrsnapshot gpfs0 fileset2:$snapname
mmcrsnapshot gpfs0 fileset3:$snapname


  -jf


On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou < jordi.caubet at es.ibm.com> wrote:

> Ivano,
>
> if it happens frequently, I would recommend to open a support case.
>
> The creation or deletion of a snapshot requires a quiesce of the nodes to
> obtain a consistent point-in-time image of the file system and/or update
> some internal structures afaik. Quiesce is required for nodes at the
> storage cluster but also remote clusters. Quiesce means stop activities
> (incl. I/O) for a short period of time to get such consistent image. Also
> waiting to flush any data in-flight to disk that does not allow a
> consistent point-in-time image.
>
> Nodes receive a quiesce request and acknowledge when ready. When all nodes
> acknowledge, snapshot operation can proceed and immediately I/O can resume.
> It usually takes few seconds at most and the operation performed is short
> but time I/O is stopped depends of how long it takes to quiesce the nodes.
> If some node take longer to agree stop the activities, such node will
> be delay the completion of the quiesce and keep I/O paused on the rest.
> There could many things while some nodes delay quiesce ack.
>
> The larger the cluster, the more difficult it gets. The more network
> congestion or I/O load, the more difficult it gets. I recommend to open a
> ticket for support to try to identify the root cause of which nodes not
> acknowledge the quiesce and maybe find the root cause. If I recall some
> previous thread, default timeout was 60 seconds which match your log
> message. After such timeout, snapshot is considered failed to complete.
>
> Support might help you understand the root cause and provide some
> recommendations if it happens frequently.
>
> Best Regards,
> --
> Jordi Caubet Serrabou
> IBM Storage Client Technical Specialist (IBM Spain)
>
>
> ----- Original message -----
> From: "Talamo Ivano Giuseppe (PSI)"
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> To: "gpfsug main discussion list"
> Cc:
> Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem
> quiesce
> Date: Wed, Feb 2, 2022 11:45 AM
>
> Hello Andrew,
>
> Thanks for your questions.
>
> We're not experiencing any other issue/slowness during normal activity.
>
> The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool
> for metadata only.
>
> The two NSD servers have 750GB of RAM and 618 are configured as pagepool.
> > > > The issue we see is happening on both the two filesystems we have: > > > > - perf filesystem: > > - 1.8 PB size (71% in use) > > - 570 milions of inodes (24% in use) > > > > - tiered filesystem: > > - 400 TB size (34% in use) > > - 230 Milions of files (60% in use) > > > > Cheers, > > Ivano > > > > > > > __________________________________________ > Paul Scherrer Institut > Ivano Talamo > WHGA/038 > Forschungsstrasse 111 > 5232 Villigen PSI > Schweiz > > Telefon: +41 56 310 47 11 > E-Mail: ivano.talamo at psi.ch > > > > > ------------------------------ > *From:* gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> on behalf of Andrew Beattie < > abeattie at au1.ibm.com> > *Sent:* Wednesday, February 2, 2022 10:33 AM > *To:* gpfsug main discussion list > *Subject:* Re: [gpfsug-discuss] snapshots causing filesystem quiesce > > Ivano, > > How big is the filesystem in terms of number of files? > How big is the filesystem in terms of capacity? > Is the Metadata on Flash or Spinning disk? > Do you see issues when users do an LS of the filesystem or only when you > are doing snapshots. > > How much memory do the NSD servers have? > How much is allocated to the OS / Spectrum > Scale Pagepool > > Regards > > Andrew Beattie > Technical Specialist - Storage for Big Data & AI > IBM Technology Group > IBM Australia & New Zealand > P. +61 421 337 927 > E. abeattie at au1.IBM.com > > > > > On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: > > > ? > > > Dear all, > > Since a while we are experiencing an issue when dealing with snapshots. > Basically what happens is that when deleting a fileset snapshot (and maybe > also when creating new ones) the filesystem becomes inaccessible on the > clients for the duration of the operation (can take a few minutes). > > The clients and the storage are on two different clusters, using remote > cluster mount for the access. > > On the log files many lines like the following appear (on both clusters): > Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 > msec > > By looking around I see we're not the first one. I am wondering if that's > considered an unavoidable part of the snapshotting and if there's any > tunable that can improve the situation. Since when this occurs all the > clients are stuck and users are very quick to complain. > > If it can help, the clients are running GPFS 5.1.2-1 while the storage > cluster is on 5.1.1-0. > > Thanks, > Ivano > > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: > > International Business Machines, S.A. > > Santa Hortensia, 26-28, 28002 Madrid > > Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 > > CIF A28-010791 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Feb 2 12:09:24 2022 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 2 Feb 2022 12:09:24 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: , <4326cfae883b4378bcb284b6daecb05e@psi.ch> Message-ID: An HTML attachment was scrubbed... 
URL: From daniel.kidger at hpe.com Wed Feb 2 12:08:54 2022 From: daniel.kidger at hpe.com (Kidger, Daniel) Date: Wed, 2 Feb 2022 12:08:54 +0000 Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: References: Message-ID: Simon, Thanks - that is a good insight. The HA 'feature' of the snapshot automation is perhaps a key feature as Linux still lacks a decent 'cluster cron' Also, If "HA" do we know where the state is centrally kept? On the point of snapshots being left undeleted, do you ever use /usr/lpp/mmfs/gui/cli/lssnapops to see what the queue of outstanding actions is like? (There is also a notification tool: lssnapnotify in that directory that is supposed to alert on failed snapshot actions, although personally I have never used it) Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:fce0ce85-6ae4-44ce-aa94-d7d099e68acb] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Simon Thompson2 Sent: 02 February 2022 10:52 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ? I always used the GUI for automating snapshots that were tagged with the YYMMDD format so that they were accessible via the previous versions tab from CES access. This requires no locking if you have multiple GUI servers running, so in theory the snapshots creation is ?HA?. BUT if you shutdown the GUI servers (say you are waiting for a log4j patch ?) then you have no snapshot automation. Due to the way we structured independent filesets, this could be 50 or so to automate and we wanted to set a say 4 day retention policy. So clicking in the GUI was pretty simple to do this for. What we did found is it a snapshot failed to delete for some reason (quiesce etc), then the GUI never tried again to clean it up so we have monitoring to look for unexpected snapshots that needed cleaning up. Simon ________________________________ Simon Thompson He/Him/His Senior Storage Performance WW HPC Customer Solutions Lenovo UK [Phone]+44 7788 320635 [Email]sthompson2 at lenovo.com Lenovo.com Twitter | Instagram | Facebook | Linkedin | YouTube | Privacy [cid:image003.png at 01D81822.F63BAB90] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Kidger, Daniel Sent: 02 February 2022 10:07 To: gpfsug-discuss at spectrumscale.org Subject: [External] [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? Hi all, Since the subject of snapshots has come up, I also have a question ... Snapshots can be created from the command line with mmcrsnapshot, and hence can be automated via con jobs etc. Snapshots can also be created from the Scale GUI. The GUI also provides its own automation for the creation, retention, and deletion of snapshots. My question is: do most customers use the former or the latter for automation? (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do exactly the same as what the GUI does it terms of creating automated snapshots. It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. How many customers also use the commands found in /usr/lpp/mmfs/gui/cli/ ? ) Daniel Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com [cid:image004.png at 01D81822.F63BAB90] -------------- next part -------------- An HTML attachment was scrubbed... 
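Since the "cluster cron" gap keeps coming up in this thread, here is a minimal sketch of what the cron-based alternative can look like: one grouped mmcrsnapshot call per day (so there is only a single quiesce, as suggested earlier in the thread) plus a simple date-based prune. The filesystem name, fileset names and the 4-day retention are placeholders, and the mmdelsnapshot colon syntax may need the -j form on older releases:

  #!/bin/bash
  # create-and-prune-snapshots.sh -- run daily from cron
  FS=gpfs0
  FILESETS="fileset1 fileset2 fileset3"
  KEEP_DAYS=4
  PATH=$PATH:/usr/lpp/mmfs/bin

  # optional: bail out unless this node is the cluster manager, so the
  # same crontab can be installed on every node (exact parsing of the
  # mmlsmgr output depends on your node naming)
  mmlsmgr -c 2>/dev/null | grep -qw "$(hostname -s)" || exit 0

  # one snapshot name per day, one mmcrsnapshot call for all filesets
  snap=$(date -u +%Y%m%d)
  args=""
  for f in $FILESETS; do
      args="${args:+$args,}${f}:${snap}"
  done
  mmcrsnapshot "$FS" "$args"

  # prune the snapshot taken KEEP_DAYS ago (naming is deterministic)
  old=$(date -u -d "-${KEEP_DAYS} days" +%Y%m%d)
  for f in $FILESETS; do
      mmdelsnapshot "$FS" "${f}:${old}" 2>/dev/null || true
  done

The GUI scheduler obviously gives you the retention bookkeeping and previous-versions-friendly naming for free; the trade-off is the dependency on the GUI service described above.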
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 92 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 128 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.png Type: image/png Size: 20109 bytes Desc: image003.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.png Type: image/png Size: 2541 bytes Desc: image004.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Outlook-axuecxph Type: application/octet-stream Size: 2541 bytes Desc: Outlook-axuecxph URL: From anacreo at gmail.com Wed Feb 2 12:41:07 2022 From: anacreo at gmail.com (Alec) Date: Wed, 2 Feb 2022 04:41:07 -0800 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> Message-ID: Might it be a case of being over built? In the old days you could really mess up an Oracle DW by giving it too much RAM... It would spend all day reading in and out data to the ram that it didn't really need, because it had the SGA available to load the whole table. Perhaps the pagepool is so large that the time it takes to clear that much RAM is the actual time out? My environment has only a million files but has quite a bit more storage and has only an 8gb pagepool. Seems you are saying you have 618gb of RAM for pagepool... Even at 8GB/second that would take 77 seconds to flush it out.. Perhaps drop the pagepool in half and see if your timeout adjusts accordingly? Alec On Wed, Feb 2, 2022, 4:09 AM Olaf Weiser wrote: > keep in mind... creating many snapshots... means ;-) .. you'll have to > delete many snapshots.. > at a certain level, which depends on #files, #directories, ~workload, > #nodes, #networks etc.... we ve seen cases, where generating just full > snapshots (whole file system) is the better approach instead of > maintaining snapshots for each file set individually .. > > sure. this has other side effects , like space consumption etc... > so as always.. it depends.. > > > > > ----- Urspr?ngliche Nachricht ----- > Von: "Jan-Frode Myklebust" > Gesendet von: gpfsug-discuss-bounces at spectrumscale.org > An: "gpfsug main discussion list" > CC: > Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem > quiesce > Datum: Mi, 2. Feb 2022 12:54 > > Also, if snapshotting multiple filesets, it's important to group these > into a single mmcrsnapshot command. Then you get a single quiesce, > instead of one per fileset. > > i.e. do: > > snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) > mmcrsnapshot gpfs0 > fileset1:$snapname,filset2:snapname,fileset3:snapname > > instead of: > > mmcrsnapshot gpfs0 fileset1:$snapname > mmcrsnapshot gpfs0 fileset2:$snapname > mmcrsnapshot gpfs0 fileset3:$snapname > > > -jf > > > On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou < > jordi.caubet at es.ibm.com> wrote: > > Ivano, > > if it happens frequently, I would recommend to open a support case. > > The creation or deletion of a snapshot requires a quiesce of the nodes to > obtain a consistent point-in-time image of the file system and/or update > some internal structures afaik. Quiesce is required for nodes at the > storage cluster but also remote clusters. Quiesce means stop activities > (incl. 
I/O) for a short period of time to get such consistent image. Also > waiting to flush any data in-flight to disk that does not allow a > consistent point-in-time image. > > Nodes receive a quiesce request and acknowledge when ready. When all nodes > acknowledge, snapshot operation can proceed and immediately I/O can resume. > It usually takes few seconds at most and the operation performed is short > but time I/O is stopped depends of how long it takes to quiesce the nodes. > If some node take longer to agree stop the activities, such node will > be delay the completion of the quiesce and keep I/O paused on the rest. > There could many things while some nodes delay quiesce ack. > > The larger the cluster, the more difficult it gets. The more network > congestion or I/O load, the more difficult it gets. I recommend to open a > ticket for support to try to identify the root cause of which nodes not > acknowledge the quiesce and maybe find the root cause. If I recall some > previous thread, default timeout was 60 seconds which match your log > message. After such timeout, snapshot is considered failed to complete. > > Support might help you understand the root cause and provide some > recommendations if it happens frequently. > > Best Regards, > -- > Jordi Caubet Serrabou > IBM Storage Client Technical Specialist (IBM Spain) > > > ----- Original message ----- > From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: "gpfsug main discussion list" > Cc: > Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem > quiesce > Date: Wed, Feb 2, 2022 11:45 AM > > > Hello Andrew, > > > > Thanks for your questions. > > > > We're not experiencing any other issue/slowness during normal activity. > > The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool > for metadata only. > > > > The two NSD servers have 750GB of RAM and 618 are configured as pagepool. > > > > The issue we see is happening on both the two filesystems we have: > > > > - perf filesystem: > > - 1.8 PB size (71% in use) > > - 570 milions of inodes (24% in use) > > > > - tiered filesystem: > > - 400 TB size (34% in use) > > - 230 Milions of files (60% in use) > > > > Cheers, > > Ivano > > > > > > > __________________________________________ > Paul Scherrer Institut > Ivano Talamo > WHGA/038 > Forschungsstrasse 111 > 5232 Villigen PSI > Schweiz > > Telefon: +41 56 310 47 11 > E-Mail: ivano.talamo at psi.ch > > > > > ------------------------------ > *From:* gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> on behalf of Andrew Beattie < > abeattie at au1.ibm.com> > *Sent:* Wednesday, February 2, 2022 10:33 AM > *To:* gpfsug main discussion list > *Subject:* Re: [gpfsug-discuss] snapshots causing filesystem quiesce > > Ivano, > > How big is the filesystem in terms of number of files? > How big is the filesystem in terms of capacity? > Is the Metadata on Flash or Spinning disk? > Do you see issues when users do an LS of the filesystem or only when you > are doing snapshots. > > How much memory do the NSD servers have? > How much is allocated to the OS / Spectrum > Scale Pagepool > > Regards > > Andrew Beattie > Technical Specialist - Storage for Big Data & AI > IBM Technology Group > IBM Australia & New Zealand > P. +61 421 337 927 > E. abeattie at au1.IBM.com > > > > > On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: > > > ? > > > Dear all, > > Since a while we are experiencing an issue when dealing with snapshots. 
> Basically what happens is that when deleting a fileset snapshot (and maybe > also when creating new ones) the filesystem becomes inaccessible on the > clients for the duration of the operation (can take a few minutes). > > The clients and the storage are on two different clusters, using remote > cluster mount for the access. > > On the log files many lines like the following appear (on both clusters): > Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 > msec > > By looking around I see we're not the first one. I am wondering if that's > considered an unavoidable part of the snapshotting and if there's any > tunable that can improve the situation. Since when this occurs all the > clients are stuck and users are very quick to complain. > > If it can help, the clients are running GPFS 5.1.2-1 while the storage > cluster is on 5.1.1-0. > > Thanks, > Ivano > > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: > > International Business Machines, S.A. > > Santa Hortensia, 26-28, 28002 Madrid > > Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 > > CIF A28-010791 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 12:55:52 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 12:55:52 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch>, , , Message-ID: <8d51042ed95b461fb2be3dc33dac030a@psi.ch> Hi Jordi, thanks for the explanation, I can now see better why something like that would happen. Indeed the cluster has a lot of clients, coming via different clusters and even some NFS/SMB via protocol nodes. So I think opening a case makes a lot of sense to track it down. Not sure how we can make the debug transparent to the users, but we'll see. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Jordi Caubet Serrabou Sent: Wednesday, February 2, 2022 12:07 PM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. 
I/O) for a short period of time to get such a consistent image, and waiting for any data in flight to be flushed to disk, since unflushed data would not allow a consistent point-in-time image.

Nodes receive a quiesce request and acknowledge when ready. When all nodes have acknowledged, the snapshot operation proceeds and I/O resumes immediately. It usually takes a few seconds at most and the operation itself is short, but how long I/O stays paused depends on how long it takes to quiesce the nodes. If some node takes longer to agree to stop its activity, that node delays completion of the quiesce and keeps I/O paused on all the others. Many different things can be going on while some nodes delay their quiesce acknowledgement.

The larger the cluster, the harder this gets; the more network congestion or I/O load, the harder it gets. I recommend opening a support ticket to identify which nodes do not acknowledge the quiesce in time and why. If I recall a previous thread correctly, the default timeout is 60 seconds, which matches your log message; after that timeout the snapshot is considered to have failed.

Support may help you understand the root cause and provide some recommendations if it happens frequently.

Best Regards,
--
Jordi Caubet Serrabou
IBM Storage Client Technical Specialist (IBM Spain)

----- Original message -----
From: "Talamo Ivano Giuseppe (PSI)"
Sent by: gpfsug-discuss-bounces at spectrumscale.org
To: "gpfsug main discussion list"
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce
Date: Wed, Feb 2, 2022 11:45 AM

Hello Andrew,

Thanks for your questions.

We're not experiencing any other issue/slowness during normal activity.
The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only.

The two NSD servers have 750 GB of RAM, of which 618 GB are configured as pagepool.

The issue we see happens on both of the filesystems we have:

- perf filesystem:
  - 1.8 PB size (71% in use)
  - 570 million inodes (24% in use)

- tiered filesystem:
  - 400 TB size (34% in use)
  - 230 million files (60% in use)

Cheers,
Ivano

__________________________________________
Paul Scherrer Institut
Ivano Talamo
WHGA/038
Forschungsstrasse 111
5232 Villigen PSI
Schweiz

Telefon: +41 56 310 47 11
E-Mail: ivano.talamo at psi.ch

________________________________
From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Andrew Beattie
Sent: Wednesday, February 2, 2022 10:33 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce

Ivano,

How big is the filesystem in terms of number of files?
How big is the filesystem in terms of capacity?
Is the metadata on flash or spinning disk?
Do you see issues when users do an ls of the filesystem, or only when you are doing snapshots?

How much memory do the NSD servers have?
How much is allocated to the OS / Spectrum Scale pagepool?

Regards

Andrew Beattie
Technical Specialist - Storage for Big Data & AI
IBM Technology Group
IBM Australia & New Zealand
P. +61 421 337 927
E. abeattie at au1.IBM.com

On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote:

Dear all,

For a while now we have been experiencing an issue when dealing with snapshots.
Basically, when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation, which can take a few minutes.

The clients and the storage are on two different clusters, using a remote cluster mount for access.
In the log files, many lines like the following appear (on both clusters):
Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec

By looking around I see we're not the first ones. I am wondering whether that's considered an unavoidable part of snapshotting and whether there's any tunable that can improve the situation, since when this occurs all the clients are stuck and users are very quick to complain.

If it helps, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0.

Thanks,
Ivano

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

Salvo indicado de otro modo más arriba / Unless stated otherwise above:
International Business Machines, S.A.
Santa Hortensia, 26-28, 28002 Madrid
Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146
CIF A28-010791

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ivano.talamo at psi.ch Wed Feb 2 12:57:32 2022
From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI))
Date: Wed, 2 Feb 2022 12:57:32 +0000
Subject: [gpfsug-discuss] snapshots causing filesystem quiesce
In-Reply-To:
References: <4326cfae883b4378bcb284b6daecb05e@psi.ch>
Message-ID:

Sure, that makes a lot of sense and we were already doing it that way.

Cheers,
Ivano

__________________________________________
Paul Scherrer Institut
Ivano Talamo
WHGA/038
Forschungsstrasse 111
5232 Villigen PSI
Schweiz

Telefon: +41 56 310 47 11
E-Mail: ivano.talamo at psi.ch

________________________________
From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Jan-Frode Myklebust
Sent: Wednesday, February 2, 2022 12:53 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce

Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset.

i.e. do:

snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S)
mmcrsnapshot gpfs0 fileset1:$snapname,fileset2:$snapname,fileset3:$snapname

instead of:

mmcrsnapshot gpfs0 fileset1:$snapname
mmcrsnapshot gpfs0 fileset2:$snapname
mmcrsnapshot gpfs0 fileset3:$snapname

-jf

On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou wrote:

Ivano,

if it happens frequently, I would recommend to open a support case.

The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image.

Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. There could many things while some nodes delay quiesce ack.

The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets.
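To make Jan-Frode's grouping advice above concrete, a minimal sketch of a wrapper that snapshots several filesets with a single quiesce; the filesystem and fileset names are placeholders, and the @GMT naming simply follows the convention used in the commands above:

#!/bin/bash
# Sketch: snapshot several filesets with one mmcrsnapshot call (one quiesce).
# Placeholder names -- adapt FS and FILESETS to the local configuration.
FS=gpfs0
FILESETS="fileset1 fileset2 fileset3"

snapname="@GMT-$(date --utc +%Y.%m.%d-%H.%M.%S)"

# Build "fileset1:snap,fileset2:snap,..." so all filesets share one operation
targets=""
for fset in $FILESETS; do
    targets="${targets:+$targets,}${fset}:${snapname}"
done

mmcrsnapshot "$FS" "$targets"

Calling mmcrsnapshot once per fileset instead would pay the quiesce cost once per fileset rather than once in total.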
I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Andrew Beattie > Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. 
Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 12:59:30 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 12:59:30 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: , <4326cfae883b4378bcb284b6daecb05e@psi.ch>, Message-ID: Ok that sounds a good candidate for an improvement. Thanks. We didn't want to do a full filesystem snapshot for the space consumption indeed. But we may consider it, keeping an eye on the space. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Olaf Weiser Sent: Wednesday, February 2, 2022 1:09 PM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce keep in mind... creating many snapshots... means ;-) .. you'll have to delete many snapshots.. at a certain level, which depends on #files, #directories, ~workload, #nodes, #networks etc.... we ve seen cases, where generating just full snapshots (whole file system) is the better approach instead of maintaining snapshots for each file set individually .. sure. this has other side effects , like space consumption etc... so as always.. it depends.. ----- Urspr?ngliche Nachricht ----- Von: "Jan-Frode Myklebust" Gesendet von: gpfsug-discuss-bounces at spectrumscale.org An: "gpfsug main discussion list" CC: Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Datum: Mi, 2. Feb 2022 12:54 Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. do: snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname instead of: mmcrsnapshot gpfs0 fileset1:$snapname mmcrsnapshot gpfs0 fileset2:$snapname mmcrsnapshot gpfs0 fileset3:$snapname -jf On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou > wrote: Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. 
There could many things while some nodes delay quiesce ack. The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Andrew Beattie > Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. 
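On the diagnostic side, Jordi's suggestion earlier in the thread (identify which nodes are slow to acknowledge the quiesce) can be approached by watching waiters while a snapshot is being created or deleted. A rough sketch, assuming mmdsh (or any parallel-ssh equivalent) is available and treating the grep pattern as a placeholder, since the exact waiter text varies between releases:

# While mmcrsnapshot/mmdelsnapshot is running, list the waiters on every node;
# nodes still showing quiesce-related waiters have not yet acknowledged.
mmdsh -N all 'mmdiag --waiters' 2>/dev/null | grep -i quiesce

The nodes that keep showing such waiters the longest are the ones delaying the acknowledgement, and are the place to start looking for local I/O load, network trouble, or a stuck process.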
Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Feb 2 13:03:13 2022 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe (PSI)) Date: Wed, 2 Feb 2022 13:03:13 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: References: <4326cfae883b4378bcb284b6daecb05e@psi.ch> , Message-ID: That's true, although I would not expect the memory to be flushed for just snapshots deletion. But it could well be a problem at snapshot creation time. Anyway for changing the pagepool we should contact the vendor, since this is configured by their installation scripts, so we better have them to agree. Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Alec Sent: Wednesday, February 2, 2022 1:41 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Might it be a case of being over built? In the old days you could really mess up an Oracle DW by giving it too much RAM... It would spend all day reading in and out data to the ram that it didn't really need, because it had the SGA available to load the whole table. Perhaps the pagepool is so large that the time it takes to clear that much RAM is the actual time out? My environment has only a million files but has quite a bit more storage and has only an 8gb pagepool. Seems you are saying you have 618gb of RAM for pagepool... Even at 8GB/second that would take 77 seconds to flush it out.. Perhaps drop the pagepool in half and see if your timeout adjusts accordingly? Alec On Wed, Feb 2, 2022, 4:09 AM Olaf Weiser > wrote: keep in mind... creating many snapshots... means ;-) .. you'll have to delete many snapshots.. at a certain level, which depends on #files, #directories, ~workload, #nodes, #networks etc.... we ve seen cases, where generating just full snapshots (whole file system) is the better approach instead of maintaining snapshots for each file set individually .. sure. this has other side effects , like space consumption etc... so as always.. it depends.. ----- Urspr?ngliche Nachricht ----- Von: "Jan-Frode Myklebust" > Gesendet von: gpfsug-discuss-bounces at spectrumscale.org An: "gpfsug main discussion list" > CC: Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Datum: Mi, 2. Feb 2022 12:54 Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. 
do: snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname instead of: mmcrsnapshot gpfs0 fileset1:$snapname mmcrsnapshot gpfs0 fileset2:$snapname mmcrsnapshot gpfs0 fileset3:$snapname -jf On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou > wrote: Ivano, if it happens frequently, I would recommend to open a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. There could many things while some nodes delay quiesce ack. The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. Support might help you understand the root cause and provide some recommendations if it happens frequently. Best Regards, -- Jordi Caubet Serrabou IBM Storage Client Technical Specialist (IBM Spain) ----- Original message ----- From: "Talamo Ivano Giuseppe (PSI)" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce Date: Wed, Feb 2, 2022 11:45 AM Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The issue we see is happening on both the two filesystems we have: - perf filesystem: - 1.8 PB size (71% in use) - 570 milions of inodes (24% in use) - tiered filesystem: - 400 TB size (34% in use) - 230 Milions of files (60% in use) Cheers, Ivano __________________________________________ Paul Scherrer Institut Ivano Talamo WHGA/038 Forschungsstrasse 111 5232 Villigen PSI Schweiz Telefon: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Andrew Beattie > Sent: Wednesday, February 2, 2022 10:33 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the Metadata on Flash or Spinning disk? Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. How much memory do the NSD servers have? 
How much is allocated to the OS / Spectrum Scale Pagepool Regards Andrew Beattie Technical Specialist - Storage for Big Data & AI IBM Technology Group IBM Australia & New Zealand P. +61 421 337 927 E. abeattie at au1.IBM.com On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) > wrote: ? Dear all, Since a while we are experiencing an issue when dealing with snapshots. Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). The clients and the storage are on two different clusters, using remote cluster mount for the access. On the log files many lines like the following appear (on both clusters): Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec By looking around I see we're not the first one. I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. Thanks, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jordi.caubet at es.ibm.com Wed Feb 2 13:34:20 2022 From: jordi.caubet at es.ibm.com (Jordi Caubet Serrabou) Date: Wed, 2 Feb 2022 13:34:20 +0000 Subject: [gpfsug-discuss] snapshots causing filesystem quiesce In-Reply-To: Message-ID: Maybe some colleagues at IBM devel can correct me, but pagepool size should not make much difference. Afaik, it is mostly read cache data. Another think could be if using HAWC function, I am not sure in such case. Anyhow, looking at your node name, your system seems a DSS from Lenovo so you NSD servers are running GPFS Native RAID and the reason why the pagepool is large there, not for the NSD server role itself, it is for the GNR role that caches disk tracks. Lowering will impact performance. -- Jordi Caubet Serrabou IBM Software Defined Infrastructure (SDI) and Flash Technical Sales Specialist Technical Computing and HPC IT Specialist and Architect > On 2 Feb 2022, at 14:03, Talamo Ivano Giuseppe (PSI) wrote: > > ? > That's true, although I would not expect the memory to be flushed for just snapshots deletion. But it could well be a problem at snapshot creation time. > > Anyway for changing the pagepool we should contact the vendor, since this is configured by their installation scripts, so we better have them to agree. 
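For reference, the pagepool that the vendor scripts configured can be inspected without changing anything; a couple of read-only commands, run on (or pointed at) the NSD servers:

# Cluster-wide configured value, including any per-node overrides
mmlsconfig pagepool

# Effective value on the node where the command is run
mmdiag --config | grep -i pagepool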
> > > > Cheers, > > Ivano > > > > __________________________________________ > Paul Scherrer Institut > Ivano Talamo > WHGA/038 > Forschungsstrasse 111 > 5232 Villigen PSI > Schweiz > > Telefon: +41 56 310 47 11 > E-Mail: ivano.talamo at psi.ch > > > > From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Alec > Sent: Wednesday, February 2, 2022 1:41 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce > > Might it be a case of being over built? In the old days you could really mess up an Oracle DW by giving it too much RAM... It would spend all day reading in and out data to the ram that it didn't really need, because it had the SGA available to load the whole table. > > Perhaps the pagepool is so large that the time it takes to clear that much RAM is the actual time out? > > My environment has only a million files but has quite a bit more storage and has only an 8gb pagepool. Seems you are saying you have 618gb of RAM for pagepool... Even at 8GB/second that would take 77 seconds to flush it out.. > > Perhaps drop the pagepool in half and see if your timeout adjusts accordingly? > > Alec > > >> On Wed, Feb 2, 2022, 4:09 AM Olaf Weiser wrote: >> keep in mind... creating many snapshots... means ;-) .. you'll have to delete many snapshots.. >> at a certain level, which depends on #files, #directories, ~workload, #nodes, #networks etc.... we ve seen cases, where generating just full snapshots (whole file system) is the better approach instead of maintaining snapshots for each file set individually .. >> >> sure. this has other side effects , like space consumption etc... >> so as always.. it depends.. >> >> >> >> ----- Urspr?ngliche Nachricht ----- >> Von: "Jan-Frode Myklebust" >> Gesendet von: gpfsug-discuss-bounces at spectrumscale.org >> An: "gpfsug main discussion list" >> CC: >> Betreff: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce >> Datum: Mi, 2. Feb 2022 12:54 >> >> Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. >> >> i.e. do: >> >> snapname=$(date --utc + at GMT-%Y.%m.%d-%H.%M.%S) >> mmcrsnapshot gpfs0 fileset1:$snapname,filset2:snapname,fileset3:snapname >> >> instead of: >> >> mmcrsnapshot gpfs0 fileset1:$snapname >> mmcrsnapshot gpfs0 fileset2:$snapname >> mmcrsnapshot gpfs0 fileset3:$snapname >> >> >> -jf >> >> >> On Wed, Feb 2, 2022 at 12:07 PM Jordi Caubet Serrabou wrote: >> Ivano, >> >> if it happens frequently, I would recommend to open a support case. >> >> The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the storage cluster but also remote clusters. Quiesce means stop activities (incl. I/O) for a short period of time to get such consistent image. Also waiting to flush any data in-flight to disk that does not allow a consistent point-in-time image. >> >> Nodes receive a quiesce request and acknowledge when ready. When all nodes acknowledge, snapshot operation can proceed and immediately I/O can resume. It usually takes few seconds at most and the operation performed is short but time I/O is stopped depends of how long it takes to quiesce the nodes. If some node take longer to agree stop the activities, such node will be delay the completion of the quiesce and keep I/O paused on the rest. 
>> There could many things while some nodes delay quiesce ack. >> >> The larger the cluster, the more difficult it gets. The more network congestion or I/O load, the more difficult it gets. I recommend to open a ticket for support to try to identify the root cause of which nodes not acknowledge the quiesce and maybe find the root cause. If I recall some previous thread, default timeout was 60 seconds which match your log message. After such timeout, snapshot is considered failed to complete. >> >> Support might help you understand the root cause and provide some recommendations if it happens frequently. >> >> Best Regards, >> -- >> Jordi Caubet Serrabou >> IBM Storage Client Technical Specialist (IBM Spain) >> >> ----- Original message ----- >> From: "Talamo Ivano Giuseppe (PSI)" >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: "gpfsug main discussion list" >> Cc: >> Subject: [EXTERNAL] Re: [gpfsug-discuss] snapshots causing filesystem quiesce >> Date: Wed, Feb 2, 2022 11:45 AM >> >> Hello Andrew, >> >> >> >> Thanks for your questions. >> >> >> >> We're not experiencing any other issue/slowness during normal activity. >> >> The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. >> >> >> >> The two NSD servers have 750GB of RAM and 618 are configured as pagepool. >> >> >> >> The issue we see is happening on both the two filesystems we have: >> >> >> >> - perf filesystem: >> >> - 1.8 PB size (71% in use) >> >> - 570 milions of inodes (24% in use) >> >> >> >> - tiered filesystem: >> >> - 400 TB size (34% in use) >> >> - 230 Milions of files (60% in use) >> >> >> >> Cheers, >> >> Ivano >> >> >> >> >> >> >> >> __________________________________________ >> Paul Scherrer Institut >> Ivano Talamo >> WHGA/038 >> Forschungsstrasse 111 >> 5232 Villigen PSI >> Schweiz >> >> Telefon: +41 56 310 47 11 >> E-Mail: ivano.talamo at psi.ch >> >> >> >> >> From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Andrew Beattie >> Sent: Wednesday, February 2, 2022 10:33 AM >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] snapshots causing filesystem quiesce >> >> Ivano, >> >> How big is the filesystem in terms of number of files? >> How big is the filesystem in terms of capacity? >> Is the Metadata on Flash or Spinning disk? >> Do you see issues when users do an LS of the filesystem or only when you are doing snapshots. >> >> How much memory do the NSD servers have? >> How much is allocated to the OS / Spectrum >> Scale Pagepool >> >> Regards >> >> Andrew Beattie >> Technical Specialist - Storage for Big Data & AI >> IBM Technology Group >> IBM Australia & New Zealand >> P. +61 421 337 927 >> E. abeattie at au1.IBM.com >> >> >> >>> >>> On 2 Feb 2022, at 19:14, Talamo Ivano Giuseppe (PSI) wrote: >>> >>> ? >>> >>> >>> Dear all, >>> >>> Since a while we are experiencing an issue when dealing with snapshots. >>> Basically what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few minutes). >>> >>> The clients and the storage are on two different clusters, using remote cluster mount for the access. >>> >>> On the log files many lines like the following appear (on both clusters): >>> Snapshot whole quiesce of SG perf from xbldssio1 on this node lasted 60166 msec >>> >>> By looking around I see we're not the first one. 
I am wondering if that's considered an unavoidable part of the snapshotting and if there's any tunable that can improve the situation. Since when this occurs all the clients are stuck and users are very quick to complain. >>> >>> If it can help, the clients are running GPFS 5.1.2-1 while the storage cluster is on 5.1.1-0. >>> >>> Thanks, >>> Ivano >>> >>> >>> >>> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: >> >> International Business Machines, S.A. >> >> Santa Hortensia, 26-28, 28002 Madrid >> >> Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 >> >> CIF A28-010791 >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss Salvo indicado de otro modo m?s arriba / Unless stated otherwise above: International Business Machines, S.A. Santa Hortensia, 26-28, 28002 Madrid Registro Mercantil de Madrid; Folio 1; Tomo 1525; Hoja M-28146 CIF A28-010791 -------------- next part -------------- An HTML attachment was scrubbed... URL: From juergen.hannappel at desy.de Wed Feb 2 15:04:24 2022 From: juergen.hannappel at desy.de (Hannappel, Juergen) Date: Wed, 2 Feb 2022 16:04:24 +0100 (CET) Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? In-Reply-To: References: Message-ID: <679823632.5186930.1643814264071.JavaMail.zimbra@desy.de> Hi, I use a python script via cron job, it checks how many snapshots exist and removes those that exceed a configurable limit, then creates a new one. Deployed via puppet it's much less hassle than click around in a GUI/ > From: "Kidger, Daniel" > To: "gpfsug main discussion list" > Sent: Wednesday, 2 February, 2022 11:07:25 > Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? > Hi all, > Since the subject of snapshots has come up, I also have a question ... > Snapshots can be created from the command line with mmcrsnapshot, and hence can > be automated via con jobs etc. > Snapshots can also be created from the Scale GUI. The GUI also provides its own > automation for the creation, retention, and deletion of snapshots. > My question is: do most customers use the former or the latter for automation? > (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do > exactly the same as what the GUI does it terms of creating automated snapshots. > It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. > How many customers also use the commands found in /usr/lpp/mmfs/gui/cli / ? 
)
> Daniel
> Daniel Kidger
> HPC Storage Solutions Architect, EMEA
> [ mailto:daniel.kidger at hpe.com | daniel.kidger at hpe.com ]
> +44 (0)7818 522266
> [ http://www.hpe.com/ | hpe.com ]
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mark.bergman at uphs.upenn.edu Wed Feb 2 16:09:02 2022
From: mark.bergman at uphs.upenn.edu (mark.bergman at uphs.upenn.edu)
Date: Wed, 02 Feb 2022 11:09:02 -0500
Subject: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ?
In-Reply-To: Your message of "Wed, 02 Feb 2022 10:07:25 +0000."
References:
Message-ID: <1971435-1643818142.818836@ATIP.bjhn.uBcv>

Big vote for cron jobs. Our snapshots are created by a script, installed on each GPFS node. The script handles naming, removing old snapshots, and checking that sufficient disk space exists before creating a snapshot.

We do snapshots every 15 minutes, keeping them at lower frequency over longer intervals. For example:

    current hour:     keep 4 snapshots
    hours -2 .. -8:   keep 3 snapshots per hour
    hours -8 .. -24:  keep 2 snapshots per hour
    days -1 .. -5:    keep 1 snapshot per hour
    days -5 .. -15:   keep 4 snapshots per day
    days -15 .. -30:  keep 1 snapshot per day

The duration, frequency, and minimum disk space can be adjusted per-filesystem.

The automation is done through a cronjob that runs on each GPFS (DSS-G) server and creates the snapshot only if the node is currently the cluster manager, as in:

*/15 * * * * root mmlsmgr -Y | grep -q "clusterManager.*:$(hostname --long):" && /path/to/snapshotter

This requires no locking and ensures that only a single instance of snapshots is created at each time interval. We use the same trick to gather GPFS health stats, etc., ensuring that the data collection only runs on a single node (the cluster manager).

--
Mark Bergman                                voice: 215-746-4061
mark.bergman at pennmedicine.upenn.edu         fax: 215-614-0266
http://www.med.upenn.edu/cbica/
IT Technical Director, Center for Biomedical Image Computing and Analytics
Department of Radiology                     University of Pennsylvania

From info at odina.nl Wed Feb 2 16:22:47 2022
From: info at odina.nl (Jaap Jan Ouwehand)
Date: Wed, 02 Feb 2022 17:22:47 +0100
Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ?
In-Reply-To: <679823632.5186930.1643814264071.JavaMail.zimbra@desy.de>
References: <679823632.5186930.1643814264071.JavaMail.zimbra@desy.de>
Message-ID: <9CD60B1D-5BF8-4BBD-9F9D-A872D89EE9C4@odina.nl>

Hi,

I also used a custom script (database driven) via cron which creates many fileset snapshots during the day via the "default helper nodes". Because of the iops, the oldest snapshots are deleted at night.

Perhaps it's a good idea to take one global filesystem snapshot and make it available to the filesets with mmsnapdir.

Kind regards,
Jaap Jan Ouwehand

"Hannappel, Juergen" wrote on 2 February 2022 at 16:04:24 CET:
>Hi,
>I use a python script via cron job, it checks how many snapshots exist and removes those that
>exceed a configurable limit, then creates a new one.
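A minimal sketch of the pruning half of the rotation scripts described above (Juergen's cron-driven Python and Mark's snapshotter), written here as shell rather than Python; the filesystem name, fileset list, retention count, the parsing of mmlssnapshot output, and the fileset:snapshot form for mmdelsnapshot are all assumptions to check against the local release:

#!/bin/bash
# Hypothetical pruning pass: keep only the newest $KEEP @GMT-* snapshots per
# fileset. Pair it with a grouped mmcrsnapshot (one quiesce for all filesets)
# and run it from cron on the cluster manager only, e.g. behind the same
# mmlsmgr guard shown in Mark's cron line above.
FS=gpfs0                        # placeholder filesystem name
FILESETS="fileset1 fileset2"    # placeholder fileset names
KEEP=96                         # e.g. 24 hours of 15-minute snapshots

for fset in $FILESETS; do
    # Parsing mmlssnapshot's plain output is a rough assumption -- check the
    # column layout (or the -Y machine-readable form) on your release first.
    mmlssnapshot "$FS" | awk -v f="$fset" '$1 ~ /^@GMT-/ && $0 ~ f {print $1}' |
        sort | head -n -"$KEEP" |
        while read -r snap; do
            # fileset:snapshot form assumed to mirror mmcrsnapshot
            mmdelsnapshot "$FS" "${fset}:${snap}"
        done
done

Because the @GMT-YYYY.MM.DD-HH.MM.SS names sort chronologically, a plain sort followed by "all but the last $KEEP" is enough to pick the snapshots that have aged out.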
>Deployed via puppet it's much less hassle than click around in a GUI/ > >> From: "Kidger, Daniel" >> To: "gpfsug main discussion list" >> Sent: Wednesday, 2 February, 2022 11:07:25 >> Subject: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ? > >> Hi all, > >> Since the subject of snapshots has come up, I also have a question ... > >> Snapshots can be created from the command line with mmcrsnapshot, and hence can >> be automated via con jobs etc. >> Snapshots can also be created from the Scale GUI. The GUI also provides its own >> automation for the creation, retention, and deletion of snapshots. > >> My question is: do most customers use the former or the latter for automation? > >> (I also note that /usr/lpp/mmfs/gui/cli/mksnaprule exists and appears to do >> exactly the same as what the GUI does it terms of creating automated snapshots. >> It is a relic of V7000 Unified but still works fine in Spectrum Scale 5.1.2.2. >> How many customers also use the commands found in /usr/lpp/mmfs/gui/cli / ? ) > >> Daniel > >> Daniel Kidger >> HPC Storage Solutions Architect, EMEA >> [ mailto:daniel.kidger at hpe.com | daniel.kidger at hpe.com ] > >> +44 (0)7818 522266 > >> [ http://www.hpe.com/ | hpe.com ] > >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.ward at nhm.ac.uk Mon Feb 7 16:39:25 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Mon, 7 Feb 2022 16:39:25 +0000 Subject: [gpfsug-discuss] mmbackup file selections In-Reply-To: References: <20220124153631.oxu4ytbq4vqcotr3@utumno.gs.washington.edu> <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> Message-ID: Backups seem to have settled down. A workshop with our partner and IBM is in the pipeline. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Paul Ward Sent: 01 February 2022 12:28 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmbackup file selections Not currently set. I'll look into them. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Skylar Thompson Sent: 26 January 2022 16:50 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmbackup file selections Awesome, glad that you found them (I missed them the first time too). As for the anomalous changed files, do you have these options set in your client option file? skipacl yes skipaclupdatecheck yes updatectime yes We had similar problems where metadata and ACL updates were interpreted as data changes by mmbackup/dsmc. We also have a case open with IBM where mmbackup will both expire and backup a file in the same run, even in the absence of mtime changes, but it's unclear whether that's program error or something with our include/exclude rules. I'd be curious if you're running into that as well. On Wed, Jan 26, 2022 at 03:55:48PM +0000, Paul Ward wrote: > Good call! > > Yes they are dot files. > > > New issue. > > Mmbackup seems to be backup up the same files over and over without them changing: > areas are being backed up multiple times. 
> The example below is a co-resident file, the only thing that has changed since it was created 20/10/21, is the file has been accessed for backup. > This file is in the 'changed' list in mmbackup: > > This list has just been created: > -rw-r--r--. 1 root root 6591914 Jan 26 11:12 > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > Listing the last few files in the file (selecting the last one) > 11:17:52 [root at scale-sk-pn-1 .mmbackupCfg]# tail > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604556977.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557039.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557102.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557164.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557226.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557288.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557351.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557413.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557476.png" > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > Check the file stats (access time just before last backup) > 11:18:05 [root at scale-sk-pn-1 .mmbackupCfg]# stat "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > File: '/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png' > Size: 545 Blocks: 32 IO Block: 4194304 regular file > Device: 2bh/43d Inode: 212618897 Links: 1 > Access: (0644/-rw-r--r--) Uid: (1399613896/NHM\edwab) Gid: (1399647564/NHM\dg-mbl-urban-nature-project-rw) > Context: unconfined_u:object_r:unlabeled_t:s0 > Access: 2022-01-25 06:40:58.334961446 +0000 > Modify: 2020-12-01 15:20:40.122053000 +0000 > Change: 2021-10-20 17:55:18.265746459 +0100 > Birth: - > > Check if migrated > 11:18:16 [root at scale-sk-pn-1 .mmbackupCfg]# dsmls "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > File name : /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > On-line size : 545 > Used blocks : 16 > Data Version : 1 > Meta Version : 1 > State : Co-resident > Container Index : 1 > Base Name : 34C0B77D20194B0B.EACEB2055F6CAA58.56D56C5F140C8C9D.0000000000000000.2197396D.000000000CAC4E91 > > Check if immutable > 11:18:26 [root at scale-sk-pn-1 .mmbackupCfg]# mstat "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > file name: /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 
metadata replication: 2 max 2 > data replication: 2 max 2 > immutable: no > appendOnly: no > flags: > storage pool name: data > fileset name: hpc-workspaces-fset > snapshot name: > creation time: Wed Oct 20 17:55:18 2021 > Misc attributes: ARCHIVE > Encrypted: no > > Check active and inactive backups (it was backed up yesterday) > 11:18:52 [root at scale-sk-pn-1 .mmbackupCfg]# dsmcqbi "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 11:19:02 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. > > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 11:19:02 Last access: 01/26/2022 > 11:07:05 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/25/2022 06:41:17 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > It will be backed up again shortly, why? > > And it was backed up again: > # dsmcqbi > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature- > project/audiowaveform/300_40/unp-grounds-01-1604557538.png > IBM Spectrum Protect > Command Line Backup-Archive Client Interface > Client Version 8, Release 1, Level 10.0 > Client date/time: 01/26/2022 15:54:09 > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights Reserved. 
> > Node Name: SC-PN-SK-01 > Session established with server TSM-JERSEY: Windows > Server Version 8, Release 1, Level 10.100 > Server date/time: 01/26/2022 15:54:10 Last access: 01/26/2022 > 15:30:03 > > Accessing as node: SCALE > Size Backup Date Mgmt Class A/I File > ---- ----------- ---------- --- ---- > 545 B 01/26/2022 12:23:02 DEFAULT A /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 12/28/2021 21:19:18 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:17:35 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/04/2022 06:18:05 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > 545 B 01/25/2022 06:41:17 DEFAULT I /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Skylar > Thompson > Sent: 24 January 2022 15:37 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Hi Paul, > > Did you look for dot files? At least for us on 5.0.5 there's a .list.1. file while the backups are running: > > /gpfs/grc6/.mmbackupCfg/updatedFiles/: > -r-------- 1 root nickers 6158526821 Jan 23 18:28 .list.1.gpfs-grc6 > /gpfs/grc6/.mmbackupCfg/expiredFiles/: > -r-------- 1 root nickers 85862211 Jan 23 18:28 .list.1.gpfs-grc6 > > On Mon, Jan 24, 2022 at 02:31:54PM +0000, Paul Ward wrote: > > Those directories are empty > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: gpfsug-discuss-bounces at spectrumscale.org > > On Behalf Of IBM Spectrum > > Scale > > Sent: 22 January 2022 00:35 > > To: gpfsug main discussion list > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > Hi Paul, > > > > Instead of calculating *.ix.* files, please look at a list file in these directories. > > > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked.]"Paul Ward" ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to have worked. 
> > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/21/2022 09:38 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > Thank you Right in the command line seems to have worked. At the end > > of the script I now copy the contents of the .mmbackupCfg folder to > > a date stamped logging folder Checking how many entries in these files compared to the Summary: ???????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you > > > > Right in the command line seems to have worked. > > At the end of the script I now copy the contents of the .mmbackupCfg > > folder to a date stamped logging folder > > > > Checking how many entries in these files compared to the Summary: > > wc -l mmbackup* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 754 total > > From Summary > > Total number of objects inspected: 755 > > I can live with a discrepancy of 1. > > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > From Summary > > Total number of objects expired: 2 > > That matches > > > > wc -l mmbackupC* mmbackupS* > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > 752 total > > Summary: > > Total number of objects backed up: 751 > > > > A difference of 1 I can live with. > > > > What does Statech stand for? > > > > Just this to sort out: > > Total number of objects failed: 1 > > I will add: > > --tsm-errorlog TSMErrorLogFile > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 19 January 2022 15:09 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > This is to set environment for mmbackup. > > If mmbackup is invoked within a script, you can set "export DEBUGmmbackup=2" right above mmbackup command. > > e.g) in your script > > .... > > export DEBUGmmbackup=2 > > mmbackup .... > > > > Or, you can set it in the same command line like > > DEBUGmmbackup=2 mmbackup .... 
> > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > > > [Inactive hide details for "Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to se]"Paul Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our nodes that checks to see if they are the cluster manager. > > > > From: "Paul Ward" > > > To: "gpfsug main discussion list" > > > org>> > > Cc: > > "gpfsug-discuss-bounces at spectrumscale.org > ce > > s at spectrumscale.org>" > > > ce > > s at spectrumscale.org>> > > Date: 01/19/2022 06:04 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > Sent > > by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > Thank you. We run a script on all our nodes that checks to see if > > they are the cluster manager. If they are, then they take > > responsibility to start the backup script. The script then randomly selects one of the available backup nodes and uses ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Thank you. > > > > We run a script on all our nodes that checks to see if they are the cluster manager. > > If they are, then they take responsibility to start the backup script. > > The script then randomly selects one of the available backup nodes and uses dsmsh mmbackup on it. > > > > Where does this command belong? > > I have seen it listed as a export command, again where should that be run ? on all backup nodes, or all nodes? > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > > > From: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > Sent: 18 January 2022 22:54 > > To: gpfsug main discussion list > > > org>> > > Cc: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > Hi Paul, > > > > If you run mmbackup with "DEBUGmmbackup=2", it keeps all working files even after successful backup. They are available at MMBACKUP_RECORD_ROOT (default is FSroot or FilesetRoot directory). > > In .mmbackupCfg directory, there are 3 directories: > > updatedFiles : contains a file that lists all candidates for backup > > statechFiles : cantains a file that lists all candidates for meta > > info update expiredFiles : cantains a file that lists all > > candidates for expiration > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > -------------------------------------------------------------------- > > -- > > -------------------------------------------- > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. 
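A minimal sketch of the "am I the cluster manager?" check described above might look like the following; the mmlsmgr output parsing, node names and wrapper path are assumptions and would need adjusting to the local naming convention:

  #!/bin/bash
  # Hypothetical guard for a cron-driven backup launcher: only the current
  # cluster manager kicks off the backup, on a randomly chosen backup node.
  ME=$(hostname -s)   # assumes GPFS node names match short hostnames
  # mmlsmgr -c prints the current cluster manager; the sed below assumes the
  # node name appears in parentheses at the end of that line.
  MGR=$(/usr/lpp/mmfs/bin/mmlsmgr -c | sed -n 's/.*(\(.*\)).*/\1/p')

  if [ "$ME" = "$MGR" ]; then
      BACKUP_NODES="backupnode01 backupnode02"          # placeholder node names
      TARGET=$(echo "$BACKUP_NODES" | tr ' ' '\n' | shuf -n 1)
      ssh "$TARGET" /usr/local/sbin/run-mmbackup.sh     # placeholder wrapper path
  fi
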
> > > > [Inactive hide details for "Paul Ward" ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have been sent to back]"Paul Ward" ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have been sent to backup using mmbackup. > > > > From: "Paul Ward" > > > To: > > "gpfsug-discuss at spectrumscale.org > org>" > > > org>> > > Date: 01/18/2022 11:56 AM > > Subject: [EXTERNAL] [gpfsug-discuss] mmbackup file selections Sent by: > > gpfsug-discuss-bounces at spectrumscale.org > es > > @spectrumscale.org> > > > > ________________________________ > > > > > > > > > > > > Hi, I am trying to work out what files have been sent to backup > > using mmbackup. I have increased the -L value from 3 up to 6 but > > only seem to see the files that are in scope, not the ones that are selected. I can see the three file lists generated ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Hi, > > > > I am trying to work out what files have been sent to backup using mmbackup. > > I have increased the -L value from 3 up to 6 but only seem to see the files that are in scope, not the ones that are selected. > > > > I can see the three file lists generated during a backup, but can?t seem to find a list of what files were backed up. > > > > It should be the diff of the shadow and shadow-old, but the wc -l of the diff doesn?t match the number of files in the backup summary. > > Wrong assumption? > > > > Where should I be looking ? surely it shouldn?t be this hard to see what files are selected? > > > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > [A picture containing drawing Description automatically generated] > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf > > su > > g.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.war > > d% > > 40nhm.ac.uk%7Cd4c22f3c612c4cb6deb908d9df4fd706%7C73a29c014e78437fa0d > > 4c > > 8553e1960c1%7C1%7C0%7C637786356879087616%7CUnknown%7CTWFpbGZsb3d8eyJ > > WI > > joiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000 > > &a > > mp;sdata=72gqmRJEgZ97s3%2BjmFD12PpfcJJKUVJuyvyJf4beXS8%3D&reserv > > ed > > =0 > gp > > fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp. > > wa > > rd%40nhm.ac.uk%7Cd4c22f3c612c4cb6deb908d9df4fd706%7C73a29c014e78437f > > a0 > > d4c8553e1960c1%7C1%7C0%7C637786356879087616%7CUnknown%7CTWFpbGZsb3d8 > > ey > > JWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2 > > 00 > > 0&sdata=72gqmRJEgZ97s3%2BjmFD12PpfcJJKUVJuyvyJf4beXS8%3D&res > > er > > ved=0> > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf > > su > > g.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.war > > d% > > 40nhm.ac.uk%7Cd4c22f3c612c4cb6deb908d9df4fd706%7C73a29c014e78437fa0d > > 4c > > 8553e1960c1%7C1%7C0%7C637786356879087616%7CUnknown%7CTWFpbGZsb3d8eyJ > > WI > > joiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000 > > &a > > mp;sdata=72gqmRJEgZ97s3%2BjmFD12PpfcJJKUVJuyvyJf4beXS8%3D&reserv > > ed > > =0 > gp > > fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp. 
> >
> --
> -- Skylar Thompson (skylar2 at u.washington.edu)
> -- Genome Sciences Department (UW Medicine), System Administrator
> -- Foege Building S046, (206)-685-7354
> -- Pronouns: He/Him/His
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.ward%40nhm.ac.uk%7C6d97e9a0e37c471cae7308d9e57e53d5%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637793154323249334%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=LAVGUD2z%2BD2BcOJkan%2FLiOOlDyH44D5m2YHjIFk62HI%3D&reserved=0 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.ward%40nhm.ac.uk%7C6d97e9a0e37c471cae7308d9e57e53d5%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637793154323249334%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=LAVGUD2z%2BD2BcOJkan%2FLiOOlDyH44D5m2YHjIFk62HI%3D&reserved=0 From anacreo at gmail.com Mon Feb 7 17:42:36 2022 From: anacreo at gmail.com (Alec) Date: Mon, 7 Feb 2022 09:42:36 -0800 Subject: [gpfsug-discuss] mmbackup file selections In-Reply-To: References: <20220124153631.oxu4ytbq4vqcotr3@utumno.gs.washington.edu> <20220126165013.z7vo3m4d666el7wr@utumno.gs.washington.edu> Message-ID: I'll share something we do when working with the GPFS policy engine so we don't blow out our backups... So we use a different backup in solution and have our file system broken down into multiple concurrent streams. In my policy engine when making major changes to the file system such as encrypting or compressing data I use a where clause such as: MOD(INODE, 7)<=dayofweek When we call mmpolicy I add -M dayofweek=NN. In this case I'd use cron and pass day of the week. What this achieves is that on each day I only work on 1/7th of each file system... So that no one backup stream is blown out. It is cumulative so 7+ will work on 100% of the file system. It's a nifty trick so figured I'd share it out. In production we do something more like 40, and set shares to increment by 1 on weekdays and 3 on weekends to distribute workload out over the whole month with more work on the weekends. Alec On Mon, Feb 7, 2022, 8:39 AM Paul Ward wrote: > Backups seem to have settled down. > A workshop with our partner and IBM is in the pipeline. > > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Paul Ward > Sent: 01 February 2022 12:28 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Not currently set. I'll look into them. > > > Kindest regards, > Paul > > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: p.ward at nhm.ac.uk > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Skylar Thompson > Sent: 26 January 2022 16:50 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] mmbackup file selections > > Awesome, glad that you found them (I missed them the first time too). > > As for the anomalous changed files, do you have these options set in your > client option file? 
> > skipacl yes > skipaclupdatecheck yes > updatectime yes > > We had similar problems where metadata and ACL updates were interpreted as > data changes by mmbackup/dsmc. > > We also have a case open with IBM where mmbackup will both expire and > backup a file in the same run, even in the absence of mtime changes, but > it's unclear whether that's program error or something with our > include/exclude rules. I'd be curious if you're running into that as well. > > On Wed, Jan 26, 2022 at 03:55:48PM +0000, Paul Ward wrote: > > Good call! > > > > Yes they are dot files. > > > > > > New issue. > > > > Mmbackup seems to be backup up the same files over and over without them > changing: > > areas are being backed up multiple times. > > The example below is a co-resident file, the only thing that has changed > since it was created 20/10/21, is the file has been accessed for backup. > > This file is in the 'changed' list in mmbackup: > > > > This list has just been created: > > -rw-r--r--. 1 root root 6591914 Jan 26 11:12 > > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > > > Listing the last few files in the file (selecting the last one) > > 11:17:52 [root at scale-sk-pn-1 .mmbackupCfg]# tail > > mmbackupChanged.ix.197984.22A38AA7.39.nhmfsa > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604556977.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557039.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557102.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557164.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557226.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557288.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557351.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557413.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557476.png" > > > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > > > Check the file stats (access time just before last backup) > > 11:18:05 [root at scale-sk-pn-1 .mmbackupCfg]# stat > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > File: > '/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png' > > Size: 545 Blocks: 32 IO Block: 4194304 regular file > > Device: 2bh/43d Inode: 212618897 Links: 1 > > Access: (0644/-rw-r--r--) Uid: (1399613896/NHM\edwab) Gid: > (1399647564/NHM\dg-mbl-urban-nature-project-rw) > > Context: unconfined_u:object_r:unlabeled_t:s0 > > Access: 2022-01-25 06:40:58.334961446 +0000 > > Modify: 2020-12-01 15:20:40.122053000 +0000 > > Change: 2021-10-20 17:55:18.265746459 +0100 > > Birth: - > > > > Check if migrated > > 11:18:16 [root at scale-sk-pn-1 .mmbackupCfg]# dsmls > 
"/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > File name : > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > On-line size : 545 > > Used blocks : 16 > > Data Version : 1 > > Meta Version : 1 > > State : Co-resident > > Container Index : 1 > > Base Name : > 34C0B77D20194B0B.EACEB2055F6CAA58.56D56C5F140C8C9D.0000000000000000.2197396D.000000000CAC4E91 > > > > Check if immutable > > 11:18:26 [root at scale-sk-pn-1 .mmbackupCfg]# mstat > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > file name: > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > metadata replication: 2 max 2 > > data replication: 2 max 2 > > immutable: no > > appendOnly: no > > flags: > > storage pool name: data > > fileset name: hpc-workspaces-fset > > snapshot name: > > creation time: Wed Oct 20 17:55:18 2021 > > Misc attributes: ARCHIVE > > Encrypted: no > > > > Check active and inactive backups (it was backed up yesterday) > > 11:18:52 [root at scale-sk-pn-1 .mmbackupCfg]# dsmcqbi > "/gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png" > > IBM Spectrum Protect > > Command Line Backup-Archive Client Interface > > Client Version 8, Release 1, Level 10.0 > > Client date/time: 01/26/2022 11:19:02 > > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights > Reserved. > > > > Node Name: SC-PN-SK-01 > > Session established with server TSM-JERSEY: Windows > > Server Version 8, Release 1, Level 10.100 > > Server date/time: 01/26/2022 11:19:02 Last access: 01/26/2022 > > 11:07:05 > > > > Accessing as node: SCALE > > Size Backup Date Mgmt Class > A/I File > > ---- ----------- ---------- > --- ---- > > 545 B 01/25/2022 06:41:17 DEFAULT > A > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 12/28/2021 21:19:18 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:17:35 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:18:05 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > > > > It will be backed up again shortly, why? > > > > And it was backed up again: > > # dsmcqbi > > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature- > > project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > IBM Spectrum Protect > > Command Line Backup-Archive Client Interface > > Client Version 8, Release 1, Level 10.0 > > Client date/time: 01/26/2022 15:54:09 > > (c) Copyright by IBM Corporation and other(s) 1990, 2020. All Rights > Reserved. 
> > > > Node Name: SC-PN-SK-01 > > Session established with server TSM-JERSEY: Windows > > Server Version 8, Release 1, Level 10.100 > > Server date/time: 01/26/2022 15:54:10 Last access: 01/26/2022 > > 15:30:03 > > > > Accessing as node: SCALE > > Size Backup Date Mgmt Class > A/I File > > ---- ----------- ---------- > --- ---- > > 545 B 01/26/2022 12:23:02 DEFAULT > A > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 12/28/2021 21:19:18 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:17:35 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/04/2022 06:18:05 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > 545 B 01/25/2022 06:41:17 DEFAULT > I > /gpfs/nhmfsa/bulk/share/data/mbl/share/workspaces/groups/urban-nature-project/audiowaveform/300_40/unp-grounds-01-1604557538.png > > > > Kindest regards, > > Paul > > > > Paul Ward > > TS Infrastructure Architect > > Natural History Museum > > T: 02079426450 > > E: p.ward at nhm.ac.uk > > > > > > -----Original Message----- > > From: gpfsug-discuss-bounces at spectrumscale.org > > On Behalf Of Skylar > > Thompson > > Sent: 24 January 2022 15:37 > > To: gpfsug main discussion list > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > Hi Paul, > > > > Did you look for dot files? At least for us on 5.0.5 there's a > .list.1. file while the backups are running: > > > > /gpfs/grc6/.mmbackupCfg/updatedFiles/: > > -r-------- 1 root nickers 6158526821 Jan 23 18:28 .list.1.gpfs-grc6 > > /gpfs/grc6/.mmbackupCfg/expiredFiles/: > > -r-------- 1 root nickers 85862211 Jan 23 18:28 .list.1.gpfs-grc6 > > > > On Mon, Jan 24, 2022 at 02:31:54PM +0000, Paul Ward wrote: > > > Those directories are empty > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > > > > From: gpfsug-discuss-bounces at spectrumscale.org > > > On Behalf Of IBM Spectrum > > > Scale > > > Sent: 22 January 2022 00:35 > > > To: gpfsug main discussion list > > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > > > > Hi Paul, > > > > > > Instead of calculating *.ix.* files, please look at a list file in > these directories. > > > > > > updatedFiles : contains a file that lists all candidates for backup > > > statechFiles : cantains a file that lists all candidates for meta > > > info update expiredFiles : cantains a file that lists all > > > candidates for expiration > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > -------------------------------------------------------------------- > > > -- > > > -------------------------------------------- > > > > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. 
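For reference, the skipacl / skipaclupdatecheck / updatectime settings Skylar mentions earlier in this thread are Spectrum Protect backup-archive client options; a hypothetical dsm.sys excerpt is below (the stanza name is taken from the session output above, everything else is a placeholder):

  * dsm.sys excerpt (hypothetical) - add within the server stanza used by mmbackup
  SErvername TSM-JERSEY
  * treat ACL- and ctime-only updates as attribute changes rather than data changes
     SKIPACL              YES
     SKIPACLUPDATECHECK   YES
     UPDATECTIME          YES
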
> > > > > > > > > [Inactive hide details for "Paul Ward" ---01/21/2022 09:38:49 > AM---Thank you Right in the command line seems to have worked.]"Paul Ward" > ---01/21/2022 09:38:49 AM---Thank you Right in the command line seems to > have worked. > > > > > > From: "Paul Ward" > > > > To: "gpfsug main discussion list" > > > > > org>> > > > Cc: > > > "gpfsug-discuss-bounces at spectrumscale.org > > ce > > > s at spectrumscale.org>" > > > > > ce > > > s at spectrumscale.org>> > > > Date: 01/21/2022 09:38 AM > > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > > Sent > > > by: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > Thank you Right in the command line seems to have worked. At the end > > > of the script I now copy the contents of the .mmbackupCfg folder to > > > a date stamped logging folder Checking how many entries in these files > compared to the Summary: ???????ZjQcmQRYFpfptBannerStart This Message Is > From an External Sender This message came from outside your organization. > > > ZjQcmQRYFpfptBannerEnd > > > Thank you > > > > > > Right in the command line seems to have worked. > > > At the end of the script I now copy the contents of the .mmbackupCfg > > > folder to a date stamped logging folder > > > > > > Checking how many entries in these files compared to the Summary: > > > wc -l mmbackup* > > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > > 754 total > > > From Summary > > > Total number of objects inspected: 755 > > > I can live with a discrepancy of 1. > > > > > > 2 mmbackupExpired.ix.78683.2DD25239.1.nhmfsa > > > From Summary > > > Total number of objects expired: 2 > > > That matches > > > > > > wc -l mmbackupC* mmbackupS* > > > 188 mmbackupChanged.ix.155513.6E9E8BE2.1.nhmfsa > > > 47 mmbackupChanged.ix.219901.8E89AB35.1.nhmfsa > > > 188 mmbackupChanged.ix.37893.EDFB8FA7.1.nhmfsa > > > 40 mmbackupChanged.ix.81032.78717A00.1.nhmfsa > > > 141 mmbackupStatech.ix.219901.8E89AB35.1.nhmfsa > > > 148 mmbackupStatech.ix.81032.78717A00.1.nhmfsa > > > 752 total > > > Summary: > > > Total number of objects backed up: 751 > > > > > > A difference of 1 I can live with. > > > > > > What does Statech stand for? > > > > > > Just this to sort out: > > > Total number of objects failed: 1 > > > I will add: > > > --tsm-errorlog TSMErrorLogFile > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > > > > From: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > > Sent: 19 January 2022 15:09 > > > To: gpfsug main discussion list > > > > > org>> > > > Cc: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > > > > This is to set environment for mmbackup. 
> > > If mmbackup is invoked within a script, you can set "export > DEBUGmmbackup=2" right above mmbackup command. > > > e.g) in your script > > > .... > > > export DEBUGmmbackup=2 > > > mmbackup .... > > > > > > Or, you can set it in the same command line like > > > DEBUGmmbackup=2 mmbackup .... > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > -------------------------------------------------------------------- > > > -- > > > -------------------------------------------- > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/19/2022 06:04:03 > AM---Thank you. We run a script on all our nodes that checks to se]"Paul > Ward" ---01/19/2022 06:04:03 AM---Thank you. We run a script on all our > nodes that checks to see if they are the cluster manager. > > > > > > From: "Paul Ward" > > > > To: "gpfsug main discussion list" > > > > > org>> > > > Cc: > > > "gpfsug-discuss-bounces at spectrumscale.org > > ce > > > s at spectrumscale.org>" > > > > > ce > > > s at spectrumscale.org>> > > > Date: 01/19/2022 06:04 AM > > > Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup file selections > > > Sent > > > by: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > > > > Thank you. We run a script on all our nodes that checks to see if > > > they are the cluster manager. If they are, then they take > > > responsibility to start the backup script. The script then randomly > selects one of the available backup nodes and uses ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender This message came from outside your > organization. > > > ZjQcmQRYFpfptBannerEnd > > > Thank you. > > > > > > We run a script on all our nodes that checks to see if they are the > cluster manager. > > > If they are, then they take responsibility to start the backup script. > > > The script then randomly selects one of the available backup nodes and > uses dsmsh mmbackup on it. > > > > > > Where does this command belong? > > > I have seen it listed as a export command, again where should that be > run ? on all backup nodes, or all nodes? > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > > > > From: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > ce s at spectrumscale.org>> On Behalf Of IBM Spectrum Scale > > > Sent: 18 January 2022 22:54 > > > To: gpfsug main discussion list > > > > > org>> > > > Cc: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > Subject: Re: [gpfsug-discuss] mmbackup file selections > > > > > > Hi Paul, > > > > > > If you run mmbackup with "DEBUGmmbackup=2", it keeps all working files > even after successful backup. They are available at MMBACKUP_RECORD_ROOT > (default is FSroot or FilesetRoot directory). 
> > > In .mmbackupCfg directory, there are 3 directories: > > > updatedFiles : contains a file that lists all candidates for backup > > > statechFiles : cantains a file that lists all candidates for meta > > > info update expiredFiles : cantains a file that lists all > > > candidates for expiration > > > > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > -------------------------------------------------------------------- > > > -- > > > -------------------------------------------- > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. > > > > > > [Inactive hide details for "Paul Ward" ---01/18/2022 11:56:40 AM---Hi, > I am trying to work out what files have been sent to back]"Paul Ward" > ---01/18/2022 11:56:40 AM---Hi, I am trying to work out what files have > been sent to backup using mmbackup. > > > > > > From: "Paul Ward" > > > > To: > > > "gpfsug-discuss at spectrumscale.org > > org>" > > > > > org>> > > > Date: 01/18/2022 11:56 AM > > > Subject: [EXTERNAL] [gpfsug-discuss] mmbackup file selections Sent by: > > > gpfsug-discuss-bounces at spectrumscale.org > > es > > > @spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > > > > > > > Hi, I am trying to work out what files have been sent to backup > > > using mmbackup. I have increased the -L value from 3 up to 6 but > > > only seem to see the files that are in scope, not the ones that are > selected. I can see the three file lists generated ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender This message came from outside your > organization. > > > ZjQcmQRYFpfptBannerEnd > > > Hi, > > > > > > I am trying to work out what files have been sent to backup using > mmbackup. > > > I have increased the -L value from 3 up to 6 but only seem to see the > files that are in scope, not the ones that are selected. > > > > > > I can see the three file lists generated during a backup, but can?t > seem to find a list of what files were backed up. > > > > > > It should be the diff of the shadow and shadow-old, but the wc -l of > the diff doesn?t match the number of files in the backup summary. > > > Wrong assumption? > > > > > > Where should I be looking ? surely it shouldn?t be this hard to see > what files are selected? > > > > > > > > > Kindest regards, > > > Paul > > > > > > Paul Ward > > > TS Infrastructure Architect > > > Natural History Museum > > > T: 02079426450 > > > E: p.ward at nhm.ac.uk > > > [A picture containing drawing Description automatically generated] > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf > > > su > > > g.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.war > > > d% > > > 40nhm.ac.uk%7Cd4c22f3c612c4cb6deb908d9df4fd706%7C73a29c014e78437fa0d > > > 4c > > > 8553e1960c1%7C1%7C0%7C637786356879087616%7CUnknown%7CTWFpbGZsb3d8eyJ > > > WI > > > joiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000 > > > &a > > > mp;sdata=72gqmRJEgZ97s3%2BjmFD12PpfcJJKUVJuyvyJf4beXS8%3D&reserv > > > ed > > > =0 > > gp > > > fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp. 
> > >
> >
> > --
> > -- Skylar Thompson (skylar2 at u.washington.edu)
> > -- Genome Sciences Department (UW Medicine), System Administrator
> > -- Foege Building S046, (206)-685-7354
> > -- Pronouns: He/Him/His
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
> --
> -- Skylar Thompson (skylar2 at u.washington.edu)
> -- Genome Sciences Department (UW Medicine), System Administrator
> -- Foege Building S046, (206)-685-7354
> -- Pronouns: He/Him/His
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
>
https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cp.ward%40nhm.ac.uk%7C6d97e9a0e37c471cae7308d9e57e53d5%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637793154323249334%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=LAVGUD2z%2BD2BcOJkan%2FLiOOlDyH44D5m2YHjIFk62HI%3D&reserved=0 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.ward at nhm.ac.uk Mon Feb 21 12:30:15 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Mon, 21 Feb 2022 12:30:15 +0000 Subject: [gpfsug-discuss] immutable folder Message-ID: HI, I have a folder that I can't delete. IAM mode - non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So can't leave it unchanged... Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From scale at us.ibm.com Mon Feb 21 16:11:37 2022 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Mon, 21 Feb 2022 12:11:37 -0400 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: Hi Paul, Have you tried mmunlinkfileset first? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. 
The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug-discuss at spectrumscale.org" Date: 02/21/2022 07:31 AM Subject: [EXTERNAL] [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 ??????????????????????????????????????????????????????????????????????????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So can?t leave it unchanged? Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: From p.ward at nhm.ac.uk Tue Feb 22 10:30:36 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Tue, 22 Feb 2022 10:30:36 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: Thank you for the suggestion? The fileset is in active use and is backed up using spectrum protect. This is therefore advised against. Was this option suggested to ?close open files? ? The issue is a directory not files. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: 21 February 2022 16:12 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Hi Paul, Have you tried mmunlinkfileset first? 
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug-discuss at spectrumscale.org" > Date: 02/21/2022 07:31 AM Subject: [EXTERNAL] [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 ??????????????????????????????????????????????????????????????????????????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So can?t leave it unchanged? Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From scale at us.ibm.com Tue Feb 22 14:17:00 2022 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 22 Feb 2022 10:17:00 -0400 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. 
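For reference, one way to check whether that directory is actually a fileset junction before unlinking anything; the device name (nhmfsa) and junction path below are assumptions based on the paths quoted earlier in the thread, not confirmed values:

  # List filesets and their junction paths; only paths listed here are junctions.
  mmlsfileset nhmfsa -L

  # If "Nick Foster's sample" does not appear as a junction path, it is an
  # ordinary directory inside bulk-fset and mmunlinkfileset will not remove it.
  # If it is a junction, unlink the fileset (its contents become inaccessible
  # until relinked), then relink it at the original path:
  mmunlinkfileset nhmfsa bulk-fset
  mmlinkfileset nhmfsa bulk-fset -J /gpfs/nhmfsa/bulk
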
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug main discussion list" Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org Thank you for the suggestion? The fileset is in active use and is backed up using spectrum protect. This is therefore advised against. Was this option suggested to ?close open files? ? The issue is a directory not files. ???????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Thank you for the suggestion? The fileset is in active use and is backed up using spectrum protect. This is therefore advised against. Was this option suggested to ?close open files? ? The issue is a directory not files. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: 21 February 2022 16:12 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Hi Paul, Have you tried mmunlinkfileset first? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug-discuss at spectrumscale.org" < gpfsug-discuss at spectrumscale.org> Date: 02/21/2022 07:31 AM Subject: [EXTERNAL] [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 ??????????????????????????????????????????????????????????????????????????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd HI, I have a folder that I can?t delete. IAM mode ? 
non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So can?t leave it unchanged? Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: From cantrell at astro.gsu.edu Tue Feb 22 17:24:09 2022 From: cantrell at astro.gsu.edu (Justin Cantrell) Date: Tue, 22 Feb 2022 12:24:09 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS Message-ID: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> We're trying to mount multiple mounts at boot up via gpfs. We can mount the main gpfs mount /gpfs1, but would like to mount things like: /home /gpfs1/home /other /gpfs1/other /stuff /gpfs1/stuff But adding that to fstab doesn't work, because from what I understand, that's not how gpfs works with mounts. What's the standard way to accomplish something like this? We've used systemd timers/mounts to accomplish it, but that's not ideal. Is there a way to do this natively with gpfs or does this have to be done through symlinks or gpfs over nfs? From skylar2 at uw.edu Tue Feb 22 17:37:27 2022 From: skylar2 at uw.edu (Skylar Thompson) Date: Tue, 22 Feb 2022 09:37:27 -0800 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> Message-ID: <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> Assuming this is on Linux, you ought to be able to use bind mounts for that, something like this in fstab or equivalent: /home /gpfs1/home bind defaults 0 0 On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > We're trying to mount multiple mounts at boot up via gpfs. 
> We can mount the main gpfs mount /gpfs1, but would like to mount things > like: > /home /gpfs1/home > /other /gpfs1/other > /stuff /gpfs1/stuff > > But adding that to fstab doesn't work, because from what I understand, > that's not how gpfs works with mounts. > What's the standard way to accomplish something like this? > We've used systemd timers/mounts to accomplish it, but that's not ideal. > Is there a way to do this natively with gpfs or does this have to be done > through symlinks or gpfs over nfs? -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His From ulmer at ulmer.org Tue Feb 22 17:50:13 2022 From: ulmer at ulmer.org (Stephen Ulmer) Date: Tue, 22 Feb 2022 12:50:13 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> Message-ID: <3DE42AF3-34F0-4E3D-8813-813ADF85477A@ulmer.org> > On Feb 22, 2022, at 12:24 PM, Justin Cantrell wrote: > > We're trying to mount multiple mounts at boot up via gpfs. > We can mount the main gpfs mount /gpfs1, but would like to mount things like: > /home /gpfs1/home > /other /gpfs1/other > /stuff /gpfs1/stuff > > But adding that to fstab doesn't work, because from what I understand, that's not how gpfs works with mounts. > What's the standard way to accomplish something like this? > We've used systemd timers/mounts to accomplish it, but that's not ideal. > Is there a way to do this natively with gpfs or does this have to be done through symlinks or gpfs over nfs? > What are you really trying to accomplish? Backward compatibility with an older user experience? Making it shorter to type? Matching the path on non-GPFS nodes? -- Stephen From tina.friedrich at it.ox.ac.uk Tue Feb 22 18:12:23 2022 From: tina.friedrich at it.ox.ac.uk (Tina Friedrich) Date: Tue, 22 Feb 2022 18:12:23 +0000 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> Message-ID: <7b8fa26b-bb70-2ba4-0fe4-639ffede6943@it.ox.ac.uk> Bind mounts would definitely work; you can also use the automounter to bind-mount things into place. That's how we do that. E.g. [ ~]$ cat /etc/auto.data /data localhost://mnt/gpfs/bulk/data [ ~]$ cat /etc/auto.master | grep data # data /- /etc/auto.data works very well :) (That automatically bind-mounts it.) You could then also only mount user home directories if they're logged in, instead of showing all of them under /home/. Autofs can do pretty nice wildcarding and such. I would call bind mounting things - regardless of how - a better solution than symlinks, but that might just be my opinion :) Tina On 22/02/2022 17:37, Skylar Thompson wrote: > Assuming this is on Linux, you ought to be able to use bind mounts for > that, something like this in fstab or equivalent: > > /home /gpfs1/home bind defaults 0 0 > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: >> We're trying to mount multiple mounts at boot up via gpfs. 
>> We can mount the main gpfs mount /gpfs1, but would like to mount things >> like: >> /home /gpfs1/home >> /other /gpfs1/other >> /stuff /gpfs1/stuff >> >> But adding that to fstab doesn't work, because from what I understand, >> that's not how gpfs works with mounts. >> What's the standard way to accomplish something like this? >> We've used systemd timers/mounts to accomplish it, but that's not ideal. >> Is there a way to do this natively with gpfs or does this have to be done >> through symlinks or gpfs over nfs? > -- Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator Research Computing and Support Services IT Services, University of Oxford http://www.arc.ox.ac.uk http://www.it.ox.ac.uk From anacreo at gmail.com Tue Feb 22 18:56:44 2022 From: anacreo at gmail.com (Alec) Date: Tue, 22 Feb 2022 10:56:44 -0800 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> Message-ID: There is a sample script I believe it's called mmfsup. It's a hook that's called at startup of GPFS cluster node. We modify that script to do things such as configure backup ignore lists, update pagepool, and mount GPFS filesystem nodes as appropriate. We basically have a case statement based on class of the node, ie master, client, or primary backup node. Advantage of this is if you do an gpfs stop/start on an already running node things work right... Great in a fire situation... Or if you modify mounts or filesystems... You can call mmfsup say with mmdsh, send verify startup would be right. We started on this path because our backup software default policy would backup GPFS mounts from each node.. so simply adding the ignores at startup from the non backup primary was our solution. We also have mounts that should not be mounted on some nodes, and this handles that very elegantly. Alec On Tue, Feb 22, 2022, 9:37 AM Skylar Thompson wrote: > Assuming this is on Linux, you ought to be able to use bind mounts for > that, something like this in fstab or equivalent: > > /home /gpfs1/home bind defaults 0 0 > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > We're trying to mount multiple mounts at boot up via gpfs. > > We can mount the main gpfs mount /gpfs1, but would like to mount things > > like: > > /home /gpfs1/home > > /other /gpfs1/other > > /stuff /gpfs1/stuff > > > > But adding that to fstab doesn't work, because from what I understand, > > that's not how gpfs works with mounts. > > What's the standard way to accomplish something like this? > > We've used systemd timers/mounts to accomplish it, but that's not ideal. > > Is there a way to do this natively with gpfs or does this have to be done > > through symlinks or gpfs over nfs? > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cantrell at astro.gsu.edu Tue Feb 22 19:23:53 2022 From: cantrell at astro.gsu.edu (Justin Cantrell) Date: Tue, 22 Feb 2022 14:23:53 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> Message-ID: <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> I tried a bind mount, but perhaps I'm doing it wrong. The system fails to boot because gpfs doesn't start until too late in the boot process. In fact, the system boots and the gpfs1 partition isn't available for a good 20-30 seconds. /gfs1/home??? /home??? none???? bind I've tried adding mount options of x-systemd-requires=gpfs1, noauto. The noauto lets it boot, but the mount is never mounted properly. Doing a manual mount -a mounts it. On 2/22/22 12:37, Skylar Thompson wrote: > Assuming this is on Linux, you ought to be able to use bind mounts for > that, something like this in fstab or equivalent: > > /home /gpfs1/home bind defaults 0 0 > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: >> We're trying to mount multiple mounts at boot up via gpfs. >> We can mount the main gpfs mount /gpfs1, but would like to mount things >> like: >> /home /gpfs1/home >> /other /gpfs1/other >> /stuff /gpfs1/stuff >> >> But adding that to fstab doesn't work, because from what I understand, >> that's not how gpfs works with mounts. >> What's the standard way to accomplish something like this? >> We've used systemd timers/mounts to accomplish it, but that's not ideal. >> Is there a way to do this natively with gpfs or does this have to be done >> through symlinks or gpfs over nfs? From skylar2 at uw.edu Tue Feb 22 19:42:45 2022 From: skylar2 at uw.edu (Skylar Thompson) Date: Tue, 22 Feb 2022 11:42:45 -0800 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> Message-ID: <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> Like Tina, we're doing bind mounts in autofs. I forgot that there might be a race condition if you're doing it in fstab. If you're on system with systemd, another option might be to do this directly with systemd.mount rather than let the fstab generator make the systemd.mount units: https://www.freedesktop.org/software/systemd/man/systemd.mount.html You could then set RequiresMountFor=gpfs1.mount in the bind mount unit. On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > I tried a bind mount, but perhaps I'm doing it wrong. The system fails > to boot because gpfs doesn't start until too late in the boot process. > In fact, the system boots and the gpfs1 partition isn't available for a > good 20-30 seconds. > > /gfs1/home??? /home??? none???? bind > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > The noauto lets it boot, but the mount is never mounted properly. Doing > a manual mount -a mounts it. 
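[Editor's note: a minimal sketch of the fstab-based bind mount discussed above. The paths and the gpfs.service unit name come from this thread; the option names are standard systemd fstab options, and the exact combination is an illustration, not a tested recipe. The first field is the existing GPFS path and the second is the new mount point; x-systemd.automount defers the bind until first access, which sidesteps the race with the GPFS mount at boot, and nofail keeps a slow GPFS start from blocking the boot.]

    # /etc/fstab (sketch): bind the GPFS directory onto /home on first access
    /gpfs1/home   /home   none   bind,nofail,x-systemd.automount,x-systemd.requires=gpfs.service   0 0

[The same idea can be written as an explicit home.mount unit (systemd requires the unit file name to match the mount point) ordered after gpfs.service, or after the gpfs-wait-mount.service unit mentioned later in this thread if your release provides it.]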
> > On 2/22/22 12:37, Skylar Thompson wrote: > > Assuming this is on Linux, you ought to be able to use bind mounts for > > that, something like this in fstab or equivalent: > > > > /home /gpfs1/home bind defaults 0 0 > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > We're trying to mount multiple mounts at boot up via gpfs. > > > We can mount the main gpfs mount /gpfs1, but would like to mount things > > > like: > > > /home /gpfs1/home > > > /other /gpfs1/other > > > /stuff /gpfs1/stuff > > > > > > But adding that to fstab doesn't work, because from what I understand, > > > that's not how gpfs works with mounts. > > > What's the standard way to accomplish something like this? > > > We've used systemd timers/mounts to accomplish it, but that's not ideal. > > > Is there a way to do this natively with gpfs or does this have to be done > > > through symlinks or gpfs over nfs? > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His From cantrell at astro.gsu.edu Tue Feb 22 20:05:58 2022 From: cantrell at astro.gsu.edu (Justin Cantrell) Date: Tue, 22 Feb 2022 15:05:58 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> Message-ID: This is how we're currently solving this problem, with systemd timer and mount. None of the requires seem to work with gpfs since it starts so late. I would like a better solution. Is it normal for gpfs to start so late?? I think it doesn't mount until after the gpfs.service starts, and even then it's 20-30 seconds. On 2/22/22 14:42, Skylar Thompson wrote: > Like Tina, we're doing bind mounts in autofs. I forgot that there might be > a race condition if you're doing it in fstab. If you're on system with systemd, > another option might be to do this directly with systemd.mount rather than > let the fstab generator make the systemd.mount units: > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.freedesktop.org%2Fsoftware%2Fsystemd%2Fman%2Fsystemd.mount.html&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=%2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4%3D&reserved=0 > > You could then set RequiresMountFor=gpfs1.mount in the bind mount unit. > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: >> I tried a bind mount, but perhaps I'm doing it wrong. The system fails >> to boot because gpfs doesn't start until too late in the boot process. >> In fact, the system boots and the gpfs1 partition isn't available for a >> good 20-30 seconds. >> >> /gfs1/home??? /home??? none???? bind >> I've tried adding mount options of x-systemd-requires=gpfs1, noauto. >> The noauto lets it boot, but the mount is never mounted properly. Doing >> a manual mount -a mounts it. 
>> >> On 2/22/22 12:37, Skylar Thompson wrote: >>> Assuming this is on Linux, you ought to be able to use bind mounts for >>> that, something like this in fstab or equivalent: >>> >>> /home /gpfs1/home bind defaults 0 0 >>> >>> On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: >>>> We're trying to mount multiple mounts at boot up via gpfs. >>>> We can mount the main gpfs mount /gpfs1, but would like to mount things >>>> like: >>>> /home /gpfs1/home >>>> /other /gpfs1/other >>>> /stuff /gpfs1/stuff >>>> >>>> But adding that to fstab doesn't work, because from what I understand, >>>> that's not how gpfs works with mounts. >>>> What's the standard way to accomplish something like this? >>>> We've used systemd timers/mounts to accomplish it, but that's not ideal. >>>> Is there a way to do this natively with gpfs or does this have to be done >>>> through symlinks or gpfs over nfs? >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=F4oXAT0zdY%2BS1mR784ZGghUt0G%2F6Ofu36MfJ9WnPsPM%3D&reserved=0 From skylar2 at uw.edu Tue Feb 22 20:12:03 2022 From: skylar2 at uw.edu (Skylar Thompson) Date: Tue, 22 Feb 2022 12:12:03 -0800 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> Message-ID: <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> The problem might be that the service indicates success when mmstartup returns rather than when the mount is actually active (requires quorum checking, arbitration, etc.). A couple tricks I can think of would be using ConditionPathIsMountPoint from systemd.unit[1], or maybe adding a callback[2] that triggers on the mount condition for your filesystem that makes the bind mount rather than systemd. [1] https://www.freedesktop.org/software/systemd/man/systemd.unit.html#ConditionPathIsMountPoint= [2] https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=reference-mmaddcallback-command These are both on our todo list for improving our own GPFS mounting as we have problems with our job scheduler not starting reliably on reboot, but for us we can have Puppet start it on the next run so it just means nodes might not return to service for 30 minutes or so. On Tue, Feb 22, 2022 at 03:05:58PM -0500, Justin Cantrell wrote: > This is how we're currently solving this problem, with systemd timer and > mount. None of the requires seem to work with gpfs since it starts so late. > I would like a better solution. > > Is it normal for gpfs to start so late?? I think it doesn't mount until > after the gpfs.service starts, and even then it's 20-30 seconds. > > > On 2/22/22 14:42, Skylar Thompson wrote: > > Like Tina, we're doing bind mounts in autofs. I forgot that there might be > > a race condition if you're doing it in fstab. 
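[Editor's note: to make the two GPFS-native hooks mentioned in this thread concrete, a hedged sketch follows. The script contents, the callback name and the file system name gpfs1 are illustrative; the bind targets come from the original question. Option 1 is the /var/mmfs/etc/mmfsup user exit Alec describes (GPFS runs it when the daemon on the node comes up); option 2 registers a script with mmaddcallback so it fires on the mount event, as suggested above.]

    # Option 1: sketch of /var/mmfs/etc/mmfsup (must be executable); GPFS calls it when the daemon is up
    #!/bin/bash
    /usr/lpp/mmfs/bin/mmmount gpfs1 2>/dev/null   # make sure the file system is mounted locally
    for i in $(seq 1 60); do                      # wait up to ~2 minutes for /gpfs1 to appear
        mountpoint -q /gpfs1 && break
        sleep 2
    done
    for d in home other stuff; do                 # lay the bind mounts on top
        mountpoint -q /$d || mount --bind /gpfs1/$d /$d
    done

    # Option 2: run a script on the mount event instead (callback name and script path are made up):
    #   mmaddcallback bindmounts --command /usr/local/sbin/gpfs-bind-mounts.sh \
    #       --event mount --parms "%fsName"
    # where the script checks that "$1" is gpfs1 and then performs the same bind mounts as above.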
If you're on system with systemd, > > another option might be to do this directly with systemd.mount rather than > > let the fstab generator make the systemd.mount units: > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.freedesktop.org%2Fsoftware%2Fsystemd%2Fman%2Fsystemd.mount.html&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=%2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4%3D&reserved=0 > > > > You could then set RequiresMountFor=gpfs1.mount in the bind mount unit. > > > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > > > I tried a bind mount, but perhaps I'm doing it wrong. The system fails > > > to boot because gpfs doesn't start until too late in the boot process. > > > In fact, the system boots and the gpfs1 partition isn't available for a > > > good 20-30 seconds. > > > > > > /gfs1/home??? /home??? none???? bind > > > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > > > The noauto lets it boot, but the mount is never mounted properly. Doing > > > a manual mount -a mounts it. > > > > > > On 2/22/22 12:37, Skylar Thompson wrote: > > > > Assuming this is on Linux, you ought to be able to use bind mounts for > > > > that, something like this in fstab or equivalent: > > > > > > > > /home /gpfs1/home bind defaults 0 0 > > > > > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > > > We're trying to mount multiple mounts at boot up via gpfs. > > > > > We can mount the main gpfs mount /gpfs1, but would like to mount things > > > > > like: > > > > > /home /gpfs1/home > > > > > /other /gpfs1/other > > > > > /stuff /gpfs1/stuff > > > > > > > > > > But adding that to fstab doesn't work, because from what I understand, > > > > > that's not how gpfs works with mounts. > > > > > What's the standard way to accomplish something like this? > > > > > We've used systemd timers/mounts to accomplish it, but that's not ideal. > > > > > Is there a way to do this natively with gpfs or does this have to be done > > > > > through symlinks or gpfs over nfs? 
> > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=F4oXAT0zdY%2BS1mR784ZGghUt0G%2F6Ofu36MfJ9WnPsPM%3D&reserved=0 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His From anacreo at gmail.com Tue Feb 22 20:29:29 2022 From: anacreo at gmail.com (Alec) Date: Tue, 22 Feb 2022 12:29:29 -0800 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> Message-ID: The trick for us on AIX in the inittab I have a script fswait.ksh and monitors for the cluster mount point to be available before allowing the cluster dependent startup item (lower in the inittab) I'm pretty sure Linux has a way to define a dependent service.. define a cluster ready service and mark everything else as dependent on that or one of it's descendents. You could simply put the wait on FS in your dependent services start script as an option as well. Lookup systemd and then After= or Part of= if memory serves me right on Linux. For the mmfsup script it goes into /var/mmfs/etc/mmfsup The cluster will call it if present when the node is ready. On Tue, Feb 22, 2022, 12:13 PM Skylar Thompson wrote: > The problem might be that the service indicates success when mmstartup > returns rather than when the mount is actually active (requires quorum > checking, arbitration, etc.). A couple tricks I can think of would be using > ConditionPathIsMountPoint from systemd.unit[1], or maybe adding a > callback[2] that triggers on the mount condition for your filesystem that > makes the bind mount rather than systemd. > > [1] > https://www.freedesktop.org/software/systemd/man/systemd.unit.html#ConditionPathIsMountPoint= > [2] > https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=reference-mmaddcallback-command > > These are both on our todo list for improving our own GPFS mounting as we > have problems with our job scheduler not starting reliably on reboot, but > for us we can have Puppet start it on the next run so it just means nodes > might not return to service for 30 minutes or so. > > On Tue, Feb 22, 2022 at 03:05:58PM -0500, Justin Cantrell wrote: > > This is how we're currently solving this problem, with systemd timer and > > mount. None of the requires seem to work with gpfs since it starts so > late. > > I would like a better solution. > > > > Is it normal for gpfs to start so late?? I think it doesn't mount until > > after the gpfs.service starts, and even then it's 20-30 seconds. 
> > > > > > On 2/22/22 14:42, Skylar Thompson wrote: > > > Like Tina, we're doing bind mounts in autofs. I forgot that there > might be > > > a race condition if you're doing it in fstab. If you're on system with > systemd, > > > another option might be to do this directly with systemd.mount rather > than > > > let the fstab generator make the systemd.mount units: > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.freedesktop.org%2Fsoftware%2Fsystemd%2Fman%2Fsystemd.mount.html&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=%2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4%3D&reserved=0 > > > > > > You could then set RequiresMountFor=gpfs1.mount in the bind mount unit. > > > > > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > > > > I tried a bind mount, but perhaps I'm doing it wrong. The system > fails > > > > to boot because gpfs doesn't start until too late in the boot > process. > > > > In fact, the system boots and the gpfs1 partition isn't available > for a > > > > good 20-30 seconds. > > > > > > > > /gfs1/home??? /home??? none???? bind > > > > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > > > > The noauto lets it boot, but the mount is never mounted properly. > Doing > > > > a manual mount -a mounts it. > > > > > > > > On 2/22/22 12:37, Skylar Thompson wrote: > > > > > Assuming this is on Linux, you ought to be able to use bind mounts > for > > > > > that, something like this in fstab or equivalent: > > > > > > > > > > /home /gpfs1/home bind defaults 0 0 > > > > > > > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > > > > We're trying to mount multiple mounts at boot up via gpfs. > > > > > > We can mount the main gpfs mount /gpfs1, but would like to mount > things > > > > > > like: > > > > > > /home /gpfs1/home > > > > > > /other /gpfs1/other > > > > > > /stuff /gpfs1/stuff > > > > > > > > > > > > But adding that to fstab doesn't work, because from what I > understand, > > > > > > that's not how gpfs works with mounts. > > > > > > What's the standard way to accomplish something like this? > > > > > > We've used systemd timers/mounts to accomplish it, but that's > not ideal. > > > > > > Is there a way to do this natively with gpfs or does this have > to be done > > > > > > through symlinks or gpfs over nfs? 
> > > > _______________________________________________ > > > > gpfsug-discuss mailing list > > > > gpfsug-discuss at spectrumscale.org > > > > > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=04%7C01%7Cjcantrell1%40gsu.edu%7C2a65cd0ddefd48cb81a308d9f63bb840%7C515ad73d8d5e4169895c9789dc742a70%7C0%7C0%7C637811559082622923%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=F4oXAT0zdY%2BS1mR784ZGghUt0G%2F6Ofu36MfJ9WnPsPM%3D&reserved=0 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From malone12 at illinois.edu Tue Feb 22 20:21:43 2022 From: malone12 at illinois.edu (Maloney, J.D.) Date: Tue, 22 Feb 2022 20:21:43 +0000 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> Message-ID: Our Puppet/Ansible GPFS modules/playbooks handle this sequencing for us (we use bind mounts for things like u, projects, and scratch also). Like Skylar mentioned page pool allocation, quorum checking, and cluster arbitration have to come before a mount of the FS so that time you mentioned doesn?t seem totally off to me. We just make the creation of the bind mounts dependent on the actual GPFS mount occurring in the configuration management tooling which has worked out well for us in that regard. Best, J.D. Maloney Sr. HPC Storage Engineer | Storage Enabling Technologies Group National Center for Supercomputing Applications (NCSA) From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Skylar Thompson Date: Tuesday, February 22, 2022 at 2:13 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] How to do multiple mounts via GPFS The problem might be that the service indicates success when mmstartup returns rather than when the mount is actually active (requires quorum checking, arbitration, etc.). A couple tricks I can think of would be using ConditionPathIsMountPoint from systemd.unit[1], or maybe adding a callback[2] that triggers on the mount condition for your filesystem that makes the bind mount rather than systemd. 
[1] https://urldefense.com/v3/__https://www.freedesktop.org/software/systemd/man/systemd.unit.html*ConditionPathIsMountPoint=__;Iw!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv4xJQwzZ$ [2] https://urldefense.com/v3/__https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=reference-mmaddcallback-command__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv3f90Gia$ These are both on our todo list for improving our own GPFS mounting as we have problems with our job scheduler not starting reliably on reboot, but for us we can have Puppet start it on the next run so it just means nodes might not return to service for 30 minutes or so. On Tue, Feb 22, 2022 at 03:05:58PM -0500, Justin Cantrell wrote: > This is how we're currently solving this problem, with systemd timer and > mount. None of the requires seem to work with gpfs since it starts so late. > I would like a better solution. > > Is it normal for gpfs to start so late?? I think it doesn't mount until > after the gpfs.service starts, and even then it's 20-30 seconds. > > > On 2/22/22 14:42, Skylar Thompson wrote: > > Like Tina, we're doing bind mounts in autofs. I forgot that there might be > > a race condition if you're doing it in fstab. If you're on system with systemd, > > another option might be to do this directly with systemd.mount rather than > > let the fstab generator make the systemd.mount units: > > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=https*3A*2F*2Fwww.freedesktop.org*2Fsoftware*2Fsystemd*2Fman*2Fsystemd.mount.html&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=*2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv0tqF9rU$ > > > > You could then set RequiresMountFor=gpfs1.mount in the bind mount unit. > > > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > > > I tried a bind mount, but perhaps I'm doing it wrong. The system fails > > > to boot because gpfs doesn't start until too late in the boot process. > > > In fact, the system boots and the gpfs1 partition isn't available for a > > > good 20-30 seconds. > > > > > > /gfs1/home??? /home??? none???? bind > > > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > > > The noauto lets it boot, but the mount is never mounted properly. Doing > > > a manual mount -a mounts it. > > > > > > On 2/22/22 12:37, Skylar Thompson wrote: > > > > Assuming this is on Linux, you ought to be able to use bind mounts for > > > > that, something like this in fstab or equivalent: > > > > > > > > /home /gpfs1/home bind defaults 0 0 > > > > > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > > > We're trying to mount multiple mounts at boot up via gpfs. > > > > > We can mount the main gpfs mount /gpfs1, but would like to mount things > > > > > like: > > > > > /home /gpfs1/home > > > > > /other /gpfs1/other > > > > > /stuff /gpfs1/stuff > > > > > > > > > > But adding that to fstab doesn't work, because from what I understand, > > > > > that's not how gpfs works with mounts. > > > > > What's the standard way to accomplish something like this? > > > > > We've used systemd timers/mounts to accomplish it, but that's not ideal. 
> > > > > Is there a way to do this natively with gpfs or does this have to be done > > > > > through symlinks or gpfs over nfs? > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=http*3A*2F*2Fgpfsug.org*2Fmailman*2Flistinfo*2Fgpfsug-discuss&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=F4oXAT0zdY*2BS1mR784ZGghUt0G*2F6Ofu36MfJ9WnPsPM*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv5uX7C9S$ > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From cantrell at astro.gsu.edu Tue Feb 22 22:07:47 2022 From: cantrell at astro.gsu.edu (Justin Cantrell) Date: Tue, 22 Feb 2022 17:07:47 -0500 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu> <20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> Message-ID: I'd love to see your fstab to see how you're doing that bind mount. Do you use systemd? What cluster manager are you using? On 2/22/22 15:21, Maloney, J.D. wrote: > > Our Puppet/Ansible GPFS modules/playbooks handle this sequencing for > us (we use bind mounts for things like u, projects, and scratch > also).? Like Skylar mentioned page pool allocation, quorum checking, > and cluster arbitration have to come before a mount of the FS so that > time you mentioned doesn?t seem totally off to me. ?We just make the > creation of the bind mounts dependent on the actual GPFS mount > occurring in the configuration management tooling which has worked out > well for us in that regard. > > Best, > > J.D. Maloney > > Sr. HPC Storage Engineer | Storage Enabling Technologies Group > > National Center for Supercomputing Applications (NCSA) > > *From: *gpfsug-discuss-bounces at spectrumscale.org > on behalf of Skylar > Thompson > *Date: *Tuesday, February 22, 2022 at 2:13 PM > *To: *gpfsug-discuss at spectrumscale.org > *Subject: *Re: [gpfsug-discuss] How to do multiple mounts via GPFS > > The problem might be that the service indicates success when mmstartup > returns rather than when the mount is actually active (requires quorum > checking, arbitration, etc.). 
A couple tricks I can think of would be > using > ConditionPathIsMountPoint from systemd.unit[1], or maybe adding a > callback[2] that triggers on the mount condition for your filesystem that > makes the bind mount rather than systemd. > > [1] > https://urldefense.com/v3/__https://www.freedesktop.org/software/systemd/man/systemd.unit.html*ConditionPathIsMountPoint=__;Iw!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv4xJQwzZ$ > > > [2] > https://urldefense.com/v3/__https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=reference-mmaddcallback-command__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv3f90Gia$ > > > > These are both on our todo list for improving our own GPFS mounting as we > have problems with our job scheduler not starting reliably on reboot, but > for us we can have Puppet start it on the next run so it just means nodes > might not return to service for 30 minutes or so. > > On Tue, Feb 22, 2022 at 03:05:58PM -0500, Justin Cantrell wrote: > > This is how we're currently solving this problem, with systemd timer and > > mount. None of the requires seem to work with gpfs since it starts > so late. > > I would like a better solution. > > > > Is it normal for gpfs to start so late?? I think it doesn't mount until > > after the gpfs.service starts, and even then it's 20-30 seconds. > > > > > > On 2/22/22 14:42, Skylar Thompson wrote: > > > Like Tina, we're doing bind mounts in autofs. I forgot that there > might be > > > a race condition if you're doing it in fstab. If you're on system > with systemd, > > > another option might be to do this directly with systemd.mount > rather than > > > let the fstab generator make the systemd.mount units: > > > > > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=https*3A*2F*2Fwww.freedesktop.org*2Fsoftware*2Fsystemd*2Fman*2Fsystemd.mount.html&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=*2BWWD7cCNSMeJEYwELldYT3pLdXVX3AxJj7gqZQCqUv4*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv0tqF9rU$ > > > > > > > > You could then set RequiresMountFor=gpfs1.mount in the bind mount > unit. > > > > > > On Tue, Feb 22, 2022 at 02:23:53PM -0500, Justin Cantrell wrote: > > > > I tried a bind mount, but perhaps I'm doing it wrong. The system > fails > > > > to boot because gpfs doesn't start until too late in the boot > process. > > > > In fact, the system boots and the gpfs1 partition isn't > available for a > > > > good 20-30 seconds. > > > > > > > > /gfs1/home??? /home??? none???? bind > > > > I've tried adding mount options of x-systemd-requires=gpfs1, noauto. > > > > The noauto lets it boot, but the mount is never mounted > properly. Doing > > > > a manual mount -a mounts it. > > > > > > > > On 2/22/22 12:37, Skylar Thompson wrote: > > > > > Assuming this is on Linux, you ought to be able to use bind > mounts for > > > > > that, something like this in fstab or equivalent: > > > > > > > > > > /home /gpfs1/home bind defaults 0 0 > > > > > > > > > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: > > > > > > We're trying to mount multiple mounts at boot up via gpfs. 
> > > > > > We can mount the main gpfs mount /gpfs1, but would like to > mount things > > > > > > like: > > > > > > /home /gpfs1/home > > > > > > /other /gpfs1/other > > > > > > /stuff /gpfs1/stuff > > > > > > > > > > > > But adding that to fstab doesn't work, because from what I > understand, > > > > > > that's not how gpfs works with mounts. > > > > > > What's the standard way to accomplish something like this? > > > > > > We've used systemd timers/mounts to accomplish it, but > that's not ideal. > > > > > > Is there a way to do this natively with gpfs or does this > have to be done > > > > > > through symlinks or gpfs over nfs? > > > > _______________________________________________ > > > > gpfsug-discuss mailing list > > > > gpfsug-discuss at spectrumscale.org > > > > > https://urldefense.com/v3/__https://nam11.safelinks.protection.outlook.com/?url=http*3A*2F*2Fgpfsug.org*2Fmailman*2Flistinfo*2Fgpfsug-discuss&data=04*7C01*7Cjcantrell1*40gsu.edu*7C2a65cd0ddefd48cb81a308d9f63bb840*7C515ad73d8d5e4169895c9789dc742a70*7C0*7C0*7C637811559082622923*7CUnknown*7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0*3D*7C3000&sdata=F4oXAT0zdY*2BS1mR784ZGghUt0G*2F6Ofu36MfJ9WnPsPM*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUl!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv5uX7C9S$ > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ > > > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!DZ3fjg!vTT86FG0CVqF6KsdQdq6n66YOYiOPr6K2MrdTnqc2vnVduE1uhiO8VJcWTqzv34vkiw2$ > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From NSCHULD at de.ibm.com Wed Feb 23 07:01:45 2022 From: NSCHULD at de.ibm.com (Norbert Schuld) Date: Wed, 23 Feb 2022 09:01:45 +0200 Subject: [gpfsug-discuss] How to do multiple mounts via GPFS In-Reply-To: <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu><20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu><34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu><20220222194245.ebv5a7vzyouez4sg@utumno.gs.washington.edu> <20220222201203.oflttzewmzhvqwty@utumno.gs.washington.edu> Message-ID: May I point out some additional systemd targets documented here: https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=gpfs-planning-systemd Depending on the need the gpfs-wait-mount.service could be helpful as an "after" clause for other units. An example is provided in /usr/lpp/mmfs/samples/systemd.service.sample Kind regards Norbert Schuld IBM Spectrum Scale Software Development -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: From p.ward at nhm.ac.uk Wed Feb 23 11:03:37 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 11:03:37 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: Its not a fileset, its just a folder, well a subfolder? [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It?s the ?Nick Foster's sample? folder I want to delete, but it says it is immutable and I can?t disable that. I suspect it?s the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thank you for the suggestion? The fileset is in active use and is backed up using spectrum protect. This is therefore advised against. Was this option suggested to ?close open files? ? The issue is a directory not files. ???????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Thank you for the suggestion? The fileset is in active use and is backed up using spectrum protect. This is therefore advised against. Was this option suggested to ?close open files? ? The issue is a directory not files. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From:gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale Sent: 21 February 2022 16:12 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder Hi Paul, Have you tried mmunlinkfileset first? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. 
If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug-discuss at spectrumscale.org" > Date: 02/21/2022 07:31 AM Subject: [EXTERNAL] [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 ??????????????????????????????????????????????????????????????????????????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender ">This message came from outside your organization. ZjQcmQRYFpfptBannerEnd HI, I have a folder that I can?t delete. IAM mode ? non-compliant It is empty: file name: Nick Foster's sample/ metadata replication: 2 max 2 immutable: yes appendOnly: no indefiniteRetention: no expiration Time: Thu Jan 9 23:10:25 2020 flags: storage pool name: system fileset name: bulk-fset snapshot name: creation time: Sat Jan 9 04:44:16 2016 Misc attributes: DIRECTORY READONLY Encrypted: no Try and turn off immutability: mmchattr -i no "Nick Foster's sample" Nick Foster's sample: Change immutable flag failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Unchanged, Permission denied! So can?t leave it unchanged? Tried setting indefiniteRetention no and yes: mmchattr -i no --indefinite-retention no "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to No, Permission denied! mmchattr -i no --indefinite-retention yes "Nick Foster's sample" Nick Foster's sample: Change immutable, enforceRetention flags failed: Operation not permitted. Can not set immutable or appendOnly flag to No and indefiniteRetention flag to Yes, Permission denied! Any ideas? Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From juergen.hannappel at desy.de Wed Feb 23 11:49:09 2022 From: juergen.hannappel at desy.de (Hannappel, Juergen) Date: Wed, 23 Feb 2022 12:49:09 +0100 (CET) Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: Message-ID: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name > From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 > Subject: Re: [gpfsug-discuss] immutable folder > Its not a fileset, its just a folder, well a subfolder? > [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact > experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick > Foster's sample > It?s the ?Nick Foster's sample? folder I want to delete, but it says it is > immutable and I can?t disable that. > I suspect it?s the apostrophe confusing things. > Kindest regards, > Paul > Paul Ward > TS Infrastructure Architect > Natural History Museum > T: 02079426450 > E: [ mailto:p.ward at nhm.ac.uk | p.ward at nhm.ac.uk ] > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale > Sent: 22 February 2022 14:17 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder > Scale disallows deleting fileset junction using rmdir, so I suggested > mmunlinkfileset. > Regards, The Spectrum Scale (GPFS) team > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale (GPFS), > then please post it to the public IBM developerWroks Forum at [ > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=04%7C01%7Cp.ward%40nhm.ac.uk%7Cbd72c8c2ee3d49f619c908d9f60e0732%7C73a29c014e78437fa0d4c8553e1960c1%7C1%7C0%7C637811363409593169%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=XoY%2BAbA5%2FNBwuoJrY12MNurjJrp8KMsV1t63hdItfiM%3D&reserved=0 > | > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > ] . > If your query concerns a potential software error in Spectrum Scale (GPFS) and > you have an IBM software maintenance contract please contact 1-800-237-5511 in > the United States or your local IBM Service Center in other countries. > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > From: "Paul Ward" < [ mailto:p.ward at nhm.ac.uk | p.ward at nhm.ac.uk ] > > To: "gpfsug main discussion list" < [ mailto:gpfsug-discuss at spectrumscale.org | > gpfsug-discuss at spectrumscale.org ] > > Date: 02/22/2022 05:31 AM > Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder > Sent by: [ mailto:gpfsug-discuss-bounces at spectrumscale.org | > gpfsug-discuss-bounces at spectrumscale.org ] > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From p.ward at nhm.ac.uk Wed Feb 23 12:17:15 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 12:17:15 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> References: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: Thanks, I couldn't recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory 'it/stu'pid name': No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name ________________________________ From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder... [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It's the "Nick Foster's sample" folder I want to delete, but it says it is immutable and I can't disable that. I suspect it's the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. 
From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From stockf at us.ibm.com Wed Feb 23 12:51:26 2022 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 23 Feb 2022 12:51:26 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: , <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.image001.jpg at 01D828AF.49A09C40.jpg Type: image/jpeg Size: 5356 bytes Desc: not available URL: From p.ward at nhm.ac.uk Wed Feb 23 13:52:20 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 13:52:20 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: , <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: 5.1.1-1 Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Frederick Stock Sent: 23 February 2022 12:51 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] immutable folder Paul, what version of Spectrum Scale are you using? Fred _______________________________________________________ Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 stockf at us.ibm.com ----- Original message ----- From: "Paul Ward" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug main discussion list" > Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Date: Wed, Feb 23, 2022 7:17 AM Thanks, I couldn't recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory 'it/stu'pid name': No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawingDescription automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name ________________________________ From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder... 
[filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample

It's the "Nick Foster's sample" folder I want to delete, but it says it is immutable and I can't disable that. I suspect it's the apostrophe confusing things.

Kindest regards,
Paul

Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.ward at nhm.ac.uk
[A picture containing drawing Description automatically generated]

From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale
Sent: 22 February 2022 14:17
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] immutable folder

Scale disallows deleting a fileset junction using rmdir, so I suggested mmunlinkfileset.

Regards, The Spectrum Scale (GPFS) team
------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From: "Paul Ward"
To: "gpfsug main discussion list"
Date: 02/22/2022 05:31 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder
Sent by: gpfsug-discuss-bounces at spectrumscale.org

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 5356 bytes
Desc: image001.jpg
URL:

From julian.jakobs at cec.mpg.de Wed Feb 23 13:48:10 2022
From: julian.jakobs at cec.mpg.de (Jakobs, Julian)
Date: Wed, 23 Feb 2022 13:48:10 +0000
Subject: [gpfsug-discuss] How to do multiple mounts via GPFS
In-Reply-To: <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu>
References: <2a8ec3a5-a370-2c7f-e7ca-1f61478ce9e5@astro.gsu.edu> <20220222173727.bv6efgsvnpbbeytm@utumno.gs.washington.edu> <34c3237b-3d89-2cf2-ff14-bbe5e276efda@astro.gsu.edu>
Message-ID: <67f997e15dc040d2900b2e1f9295dec0@cec.mpg.de>

I've run into the same problem some time ago. What worked for me was this shell script I run as a @reboot cronjob:

#!/bin/bash
while [ ! -d /gpfs1/home ]
do
    sleep 5
done
mount --bind /gpfs1/home /home

-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Justin Cantrell
Sent: Tuesday, 22 February 2022 20:24
To: gpfsug-discuss at spectrumscale.org
Subject: Re: [gpfsug-discuss] How to do multiple mounts via GPFS

I tried a bind mount, but perhaps I'm doing it wrong. The system fails to boot because gpfs doesn't start until too late in the boot process.
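As an alternative to the wait-loop above, GPFS can run the bind mount from a callback at the moment the file system is actually mounted. A sketch, assuming the standard mmaddcallback "mount" event and its %fsName variable are available at your code level; /usr/local/sbin/bind-home.sh is a hypothetical helper that must exist on the node:

    mmaddcallback bindHome --command /usr/local/sbin/bind-home.sh \
        --event mount --parms "%fsName"

    # /usr/local/sbin/bind-home.sh
    #!/bin/bash
    # invoked by GPFS on every local file system mount; only act on gpfs1
    [ "$1" = "gpfs1" ] && mount --bind /gpfs1/home /home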
In fact, the system boots and the gpfs1 partition isn't available for a good 20-30 seconds. /gfs1/home /home none bind I've tried adding mount options of x-systemd-requires=gpfs1, noauto. The noauto lets it boot, but the mount is never mounted properly. Doing a manual mount -a mounts it. On 2/22/22 12:37, Skylar Thompson wrote: > Assuming this is on Linux, you ought to be able to use bind mounts for > that, something like this in fstab or equivalent: > > /home /gpfs1/home bind defaults 0 0 > > On Tue, Feb 22, 2022 at 12:24:09PM -0500, Justin Cantrell wrote: >> We're trying to mount multiple mounts at boot up via gpfs. >> We can mount the main gpfs mount /gpfs1, but would like to mount >> things >> like: >> /home /gpfs1/home >> /other /gpfs1/other >> /stuff /gpfs1/stuff >> >> But adding that to fstab doesn't work, because from what I >> understand, that's not how gpfs works with mounts. >> What's the standard way to accomplish something like this? >> We've used systemd timers/mounts to accomplish it, but that's not ideal. >> Is there a way to do this natively with gpfs or does this have to be >> done through symlinks or gpfs over nfs? _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 6777 bytes Desc: not available URL: From scale at us.ibm.com Wed Feb 23 14:57:24 2022 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 23 Feb 2022 10:57:24 -0400 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: Your directory is under a fileset with non-compliant iam mode. With fileset in that mode, it follows snapLock protocol - it disallows changing subdir to immutable, but allows changing subdir to mutable. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug main discussion list" Date: 02/23/2022 07:17 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks, I couldn?t recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory ?it/stu'pid name?: No such file or directory [Removing the / ] # mkdir "itstu'pid name" ??????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Thanks, I couldn?t recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory ?it/stu'pid name?: No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. 
Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name From: "Paul Ward" To: "gpfsug main discussion list" Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder? [filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It?s the ?Nick Foster's sample? folder I want to delete, but it says it is immutable and I can?t disable that. I suspect it?s the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk From: gpfsug-discuss-bounces at spectrumscale.org < gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" To: "gpfsug main discussion list" Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/jpeg Size: 5356 bytes Desc: not available URL: From p.ward at nhm.ac.uk Wed Feb 23 16:35:14 2022 From: p.ward at nhm.ac.uk (Paul Ward) Date: Wed, 23 Feb 2022 16:35:14 +0000 Subject: [gpfsug-discuss] immutable folder In-Reply-To: References: <1989346846.8388142.1645616949278.JavaMail.zimbra@desy.de> Message-ID: Its not allowing me! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: 23 February 2022 14:57 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] immutable folder Your directory is under a fileset with non-compliant iam mode. With fileset in that mode, it follows snapLock protocol - it disallows changing subdir to immutable, but allows changing subdir to mutable. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/23/2022 07:17 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thanks, I couldn?t recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory ?it/stu'pid name?: No such file or directory [Removing the / ] # mkdir "itstu'pid name" ??????????????????ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Thanks, I couldn?t recreate that test: # mkdir "it/stu'pid name" mkdir: cannot create directory ?it/stu'pid name?: No such file or directory [Removing the / ] # mkdir "itstu'pid name" # mmchattr -i yes itstu\'pid\ name/ itstu'pid name/: Change immutable flag failed: Invalid argument. Can not set directory to be immutable or appendOnly under current fileset mode! Which begs the question, how do I have an immutable folder! Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From:gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Hannappel, Juergen Sent: 23 February 2022 11:49 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder While the apostrophe is evil it's not the problem: root at it-gti-02 test1]# mkdir "it/stu'pid name" [root at it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name [root at it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name ________________________________ From: "Paul Ward" > To: "gpfsug main discussion list" > Sent: Wednesday, 23 February, 2022 12:03:37 Subject: Re: [gpfsug-discuss] immutable folder Its not a fileset, its just a folder, well a subfolder? 
[filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample It?s the ?Nick Foster's sample? folder I want to delete, but it says it is immutable and I can?t disable that. I suspect it?s the apostrophe confusing things. Kindest regards, Paul Paul Ward TS Infrastructure Architect Natural History Museum T: 02079426450 E: p.ward at nhm.ac.uk [A picture containing drawing Description automatically generated] From:gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale Sent: 22 February 2022 14:17 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] immutable folder Scale disallows deleting fileset junction using rmdir, so I suggested mmunlinkfileset. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Paul Ward" > To: "gpfsug main discussion list" > Date: 02/22/2022 05:31 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder Sent by: gpfsug-discuss-bounces at spectrumscale.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 5356 bytes Desc: image001.jpg URL: From uwe.falke at kit.edu Wed Feb 23 18:26:50 2022 From: uwe.falke at kit.edu (Uwe Falke) Date: Wed, 23 Feb 2022 19:26:50 +0100 Subject: [gpfsug-discuss] IO sizes Message-ID: Dear all, sorry for asking a question which seems not directly GPFS related: In a setup with 4 NSD servers (old-style, with storage controllers in the back end), 12 clients and 10 Seagate storage systems, I do see in benchmark tests that? just one of the NSD servers does send smaller IO requests to the storage? than the other 3 (that is, both reads and writes are smaller). The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes ( one server to the controllers A, the other one to controllers B of the Seagates, resp.). All 4 NSD servers are set up similarly: kernel: 3.10.0-1160.el7.x86_64 #1 SMP HBA:?Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx driver : mpt3sas 31.100.01.00 max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by mpt3sas) for all sd devices and all multipath (dm) devices built on top. 
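A quick way to compare these block-layer settings across all sd and dm devices on each NSD server; a sketch using plain sysfs, nothing GPFS-specific:

    for q in /sys/block/sd*/queue /sys/block/dm-*/queue; do
        echo "$(dirname "$q"): max_sectors_kb=$(cat "$q"/max_sectors_kb) scheduler=$(cat "$q"/scheduler)"
    done

Running the same loop on all four servers (for instance via mmdsh) makes any odd one out easy to spot.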
scheduler: deadline multipath (actually we do have 3 paths to each volume, so there is some asymmetry, but that should not affect the IOs, shouldn't it?, and if it did we would see the same effect in both pairs of NSD servers, but we do not). All 4 storage systems are also configured the same way (2 disk groups / pools / declustered arrays, one managed by? ctrl A, one by ctrl B,? and 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. The first question I have - but that is not my main one: I do see, both in iostat and on the storage systems, that the default IO requests are about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb is really in terms of kiB, not sectors, cf. https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. I have to suspect that this will (in writing to the storage) cause incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as the controller is not able to re-coalesce the data properly; and it seems it cannot do it completely at least) If someone of you has seen that already and/or knows a potential explanation I'd be glad to learn about. And if some of you wonder: yes, I (was) moved away from IBM and am now at KIT. Many thanks in advance Uwe -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5814 bytes Desc: S/MIME Cryptographic Signature URL: From alex at calicolabs.com Wed Feb 23 18:39:07 2022 From: alex at calicolabs.com (Alex Chekholko) Date: Wed, 23 Feb 2022 10:39:07 -0800 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: > Dear all, > > sorry for asking a question which seems not directly GPFS related: > > In a setup with 4 NSD servers (old-style, with storage controllers in > the back end), 12 clients and 10 Seagate storage systems, I do see in > benchmark tests that just one of the NSD servers does send smaller IO > requests to the storage than the other 3 (that is, both reads and > writes are smaller). > > The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes > ( one server to the controllers A, the other one to controllers B of the > Seagates, resp.). > > All 4 NSD servers are set up similarly: > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > driver : mpt3sas 31.100.01.00 > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by > mpt3sas) for all sd devices and all multipath (dm) devices built on top. 
> > scheduler: deadline > > multipath (actually we do have 3 paths to each volume, so there is some > asymmetry, but that should not affect the IOs, shouldn't it?, and if it > did we would see the same effect in both pairs of NSD servers, but we do > not). > > All 4 storage systems are also configured the same way (2 disk groups / > pools / declustered arrays, one managed by ctrl A, one by ctrl B, and > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > The first question I have - but that is not my main one: I do see, both > in iostat and on the storage systems, that the default IO requests are > about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb > is really in terms of kiB, not sectors, cf. > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). > > But what puzzles me even more: one of the server compiles IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and > writes ... I just cannot see why. > > I have to suspect that this will (in writing to the storage) cause > incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as > the controller is not able to re-coalesce the data properly; and it > seems it cannot do it completely at least) > > > If someone of you has seen that already and/or knows a potential > explanation I'd be glad to learn about. > > > And if some of you wonder: yes, I (was) moved away from IBM and am now > at KIT. > > Many thanks in advance > > Uwe > > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > > Uwe Falke > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > > Tel: +49 721 608 28024 > Email: uwe.falke at kit.edu > www.scc.kit.edu > > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > KIT ? The Research University in the Helmholtz Association > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Wed Feb 23 21:20:11 2022 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Wed, 23 Feb 2022 21:20:11 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: Message-ID: Alex, Metadata will be 4Kib Depending on the filesystem version you will also have subblocks to consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata and data block size is the same) My first question would be is ? Are you sure that Linux OS is configured the same on all 4 NSD servers?. My second question would be do you know what your average file size is if most of your files are smaller than your filesystem block size, then you are always going to be performing writes using groups of subblocks rather than a full block writes. Regards, Andrew > On 24 Feb 2022, at 04:39, Alex Chekholko wrote: > > ? > This Message Is From an External Sender > This message came from outside your organization. > Hi, > > Metadata I/Os will always be smaller than the usual data block size, right? > Which version of GPFS? 
> > Regards, > Alex > >> On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: >> Dear all, >> >> sorry for asking a question which seems not directly GPFS related: >> >> In a setup with 4 NSD servers (old-style, with storage controllers in >> the back end), 12 clients and 10 Seagate storage systems, I do see in >> benchmark tests that just one of the NSD servers does send smaller IO >> requests to the storage than the other 3 (that is, both reads and >> writes are smaller). >> >> The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes >> ( one server to the controllers A, the other one to controllers B of the >> Seagates, resp.). >> >> All 4 NSD servers are set up similarly: >> >> kernel: 3.10.0-1160.el7.x86_64 #1 SMP >> >> HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx >> >> driver : mpt3sas 31.100.01.00 >> >> max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by >> mpt3sas) for all sd devices and all multipath (dm) devices built on top. >> >> scheduler: deadline >> >> multipath (actually we do have 3 paths to each volume, so there is some >> asymmetry, but that should not affect the IOs, shouldn't it?, and if it >> did we would see the same effect in both pairs of NSD servers, but we do >> not). >> >> All 4 storage systems are also configured the same way (2 disk groups / >> pools / declustered arrays, one managed by ctrl A, one by ctrl B, and >> 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). >> >> >> GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO >> requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. >> >> The first question I have - but that is not my main one: I do see, both >> in iostat and on the storage systems, that the default IO requests are >> about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb >> is really in terms of kiB, not sectors, cf. >> https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). >> >> But what puzzles me even more: one of the server compiles IOs even >> smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and >> writes ... I just cannot see why. >> >> I have to suspect that this will (in writing to the storage) cause >> incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as >> the controller is not able to re-coalesce the data properly; and it >> seems it cannot do it completely at least) >> >> >> If someone of you has seen that already and/or knows a potential >> explanation I'd be glad to learn about. >> >> >> And if some of you wonder: yes, I (was) moved away from IBM and am now >> at KIT. >> >> Many thanks in advance >> >> Uwe >> >> >> -- >> Karlsruhe Institute of Technology (KIT) >> Steinbuch Centre for Computing (SCC) >> Scientific Data Management (SDM) >> >> Uwe Falke >> >> Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 >> D-76344 Eggenstein-Leopoldshafen >> >> Tel: +49 721 608 28024 >> Email: uwe.falke at kit.edu >> www.scc.kit.edu >> >> Registered office: >> Kaiserstra?e 12, 76131 Karlsruhe, Germany >> >> KIT ? The Research University in the Helmholtz Association >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From uwe.falke at kit.edu Thu Feb 24 01:03:32 2022 From: uwe.falke at kit.edu (Uwe Falke) Date: Thu, 24 Feb 2022 02:03:32 +0100 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi, the test bench is gpfsperf running on up to 12 clients with 1...64 threads doing sequential reads and writes , file size per gpfsperf process is 12TB (with 6TB I saw caching effects in particular for large thread numbers ...) As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data disks, as expected in that case. Interesting thing though: I have rebooted the suspicious node. Now, it does not issue smaller IOs than the others, but -- unbelievable -- larger ones (up to about 4.7MiB). This is still harmful as also that size is incompatible with full stripe writes on the storage ( 8+2 disk groups, i.e. logically RAID6) Currently, I draw this information from the storage boxes; I have not yet checked iostat data for that benchmark test after the reboot (before, when IO sizes were smaller, we saw that both in iostat and in the perf data retrieved from the storage controllers). And: we have a separate data pool , hence dataOnly NSDs, I am just talking about these ... As for "Are you sure that Linux OS is configured the same on all 4 NSD servers?." - of course there are not two boxes identical in the world. I have actually not installed those machines, and, yes, i also considered reinstalling them (or at least the disturbing one). However, I do not have reason to assume or expect a difference, the supplier has just implemented these systems? recently from scratch. In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs smaller than 4MiB again at some point, so that is not a nice solution. Thanks Uwe On 23.02.22 22:20, Andrew Beattie wrote: > Alex, > > Metadata will be 4Kib > > Depending on the filesystem version you will also have subblocks to > consider V4 filesystems have 1/32 subblocks, V5 filesystems have > 1/1024 subblocks (assuming metadata and data block size is the same) > > My first question would be is ? Are you sure that Linux OS is > configured the same on all 4 NSD servers?. > > My second question would be do you know what your average file size is > if most of your files are smaller than your filesystem block size, > then you are always going to be performing writes using groups of > subblocks rather than a full block writes. > > Regards, > > Andrew > > >> On 24 Feb 2022, at 04:39, Alex Chekholko wrote: >> >> ? Hi, Metadata I/Os will always be smaller than the usual data block >> size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, >> 2022 at 10:26 AM Uwe Falke wrote: Dear all, sorry >> for asking a question which seems ZjQcmQRYFpfptBannerStart >> This Message Is From an External Sender >> This message came from outside your organization. >> ZjQcmQRYFpfptBannerEnd >> Hi, >> >> Metadata I/Os will always be smaller than the usual data block size, >> right? >> Which version of GPFS? >> >> Regards, >> Alex >> >> On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: >> >> Dear all, >> >> sorry for asking a question which seems not directly GPFS related: >> >> In a setup with 4 NSD servers (old-style, with storage >> controllers in >> the back end), 12 clients and 10 Seagate storage systems, I do >> see in >> benchmark tests that? just one of the NSD servers does send >> smaller IO >> requests to the storage? 
than the other 3 (that is, both reads and >> writes are smaller). >> >> The NSD servers form 2 pairs, each pair is connected to 5 seagate >> boxes >> ( one server to the controllers A, the other one to controllers B >> of the >> Seagates, resp.). >> >> All 4 NSD servers are set up similarly: >> >> kernel: 3.10.0-1160.el7.x86_64 #1 SMP >> >> HBA:?Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx >> >> driver : mpt3sas 31.100.01.00 >> >> max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as >> limited by >> mpt3sas) for all sd devices and all multipath (dm) devices built >> on top. >> >> scheduler: deadline >> >> multipath (actually we do have 3 paths to each volume, so there >> is some >> asymmetry, but that should not affect the IOs, shouldn't it?, and >> if it >> did we would see the same effect in both pairs of NSD servers, >> but we do >> not). >> >> All 4 storage systems are also configured the same way (2 disk >> groups / >> pools / declustered arrays, one managed by? ctrl A, one by ctrl >> B,? and >> 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). >> >> >> GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO >> requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. >> >> The first question I have - but that is not my main one: I do >> see, both >> in iostat and on the storage systems, that the default IO >> requests are >> about 4MiB, not 8MiB as I'd expect from above settings >> (max_sectors_kb >> is really in terms of kiB, not sectors, cf. >> https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). >> >> But what puzzles me even more: one of the server compiles IOs even >> smaller, varying between 3.2MiB and 3.6MiB mostly - both for >> reads and >> writes ... I just cannot see why. >> >> I have to suspect that this will (in writing to the storage) cause >> incomplete stripe writes on our erasure-coded volumes (8+2p)(as >> long as >> the controller is not able to re-coalesce the data properly; and it >> seems it cannot do it completely at least) >> >> >> If someone of you has seen that already and/or knows a potential >> explanation I'd be glad to learn about. >> >> >> And if some of you wonder: yes, I (was) moved away from IBM and >> am now >> at KIT. >> >> Many thanks in advance >> >> Uwe >> >> >> -- >> Karlsruhe Institute of Technology (KIT) >> Steinbuch Centre for Computing (SCC) >> Scientific Data Management (SDM) >> >> Uwe Falke >> >> Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 >> D-76344 Eggenstein-Leopoldshafen >> >> Tel: +49 721 608 28024 >> Email: uwe.falke at kit.edu >> www.scc.kit.edu >> >> Registered office: >> Kaiserstra?e 12, 76131 Karlsruhe, Germany >> >> KIT ? The Research University in the Helmholtz Association >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email:uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? 
The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5814 bytes Desc: S/MIME Cryptographic Signature URL: From Achim.Rehor at de.ibm.com Thu Feb 24 12:41:11 2022 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Thu, 24 Feb 2022 14:41:11 +0200 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi Uwe, first of all, glad to see you back in the GPFS space ;) agreed, groups of subblocks being written will end up in IO sizes, being smaller than the 8MB filesystem blocksize, also agreed, this cannot be metadata, since their size is MUCH smaller, like 4k or less, mostly. But why would these grouped subblock reads/writes all end up on the same NSD server, while the others do full block writes ? How is your NSD server setup per NSD ? did you 'round-robin' set the preferred NSD server per NSD ? are the client nodes transferring the data in anyway doing specifics ? Sorry for not having a solution for you, jsut sharing a few ideas ;) Mit freundlichen Gr??en / Kind regards Achim Rehor Technical Support Specialist Spectrum Scale and ESS (SME) Advisory Product Services Professional IBM Systems Storage Support - EMEA gpfsug-discuss-bounces at spectrumscale.org wrote on 23/02/2022 22:20:11: > From: "Andrew Beattie" > To: "gpfsug main discussion list" > Date: 23/02/2022 22:20 > Subject: [EXTERNAL] Re: [gpfsug-discuss] IO sizes > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > Alex, Metadata will be 4Kib Depending on the filesystem version you > will also have subblocks to consider V4 filesystems have 1/32 > subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata > and data block size is the same) ???????????ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ZjQcmQRYFpfptBannerEnd > Alex, > > Metadata will be 4Kib > > Depending on the filesystem version you will also have subblocks to > consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/ > 1024 subblocks (assuming metadata and data block size is the same) > > My first question would be is ? Are you sure that Linux OS is > configured the same on all 4 NSD servers?. > > My second question would be do you know what your average file size > is if most of your files are smaller than your filesystem block > size, then you are always going to be performing writes using groups > of subblocks rather than a full block writes. > > Regards, > > Andrew > > On 24 Feb 2022, at 04:39, Alex Chekholko wrote: > ? Hi, Metadata I/Os will always be smaller than the usual data block > size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, > 2022 at 10:26 AM Uwe Falke wrote: Dear all, > sorry for asking a question which seems ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ZjQcmQRYFpfptBannerEnd > Hi, > > Metadata I/Os will always be smaller than the usual data block size, right? > Which version of GPFS? 
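For Achim's question about the preferred NSD server per NSD, the server order can be read straight from the configuration; a sketch, "gpfsdevice" again being a placeholder:

    mmlsnsd -f gpfsdevice     # NSD to server list; the first server listed is the preferred one

If the first-listed server is not rotated evenly across the NSDs, one server can end up carrying a skewed share of the traffic.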
> > Regards, > Alex > > On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: > Dear all, > > sorry for asking a question which seems not directly GPFS related: > > In a setup with 4 NSD servers (old-style, with storage controllers in > the back end), 12 clients and 10 Seagate storage systems, I do see in > benchmark tests that just one of the NSD servers does send smaller IO > requests to the storage than the other 3 (that is, both reads and > writes are smaller). > > The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes > ( one server to the controllers A, the other one to controllers B of the > Seagates, resp.). > > All 4 NSD servers are set up similarly: > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > driver : mpt3sas 31.100.01.00 > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by > mpt3sas) for all sd devices and all multipath (dm) devices built on top. > > scheduler: deadline > > multipath (actually we do have 3 paths to each volume, so there is some > asymmetry, but that should not affect the IOs, shouldn't it?, and if it > did we would see the same effect in both pairs of NSD servers, but we do > not). > > All 4 storage systems are also configured the same way (2 disk groups / > pools / declustered arrays, one managed by ctrl A, one by ctrl B, and > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > The first question I have - but that is not my main one: I do see, both > in iostat and on the storage systems, that the default IO requests are > about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb > is really in terms of kiB, not sectors, cf. > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). > > But what puzzles me even more: one of the server compiles IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and > writes ... I just cannot see why. > > I have to suspect that this will (in writing to the storage) cause > incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as > the controller is not able to re-coalesce the data properly; and it > seems it cannot do it completely at least) > > > If someone of you has seen that already and/or knows a potential > explanation I'd be glad to learn about. > > > And if some of you wonder: yes, I (was) moved away from IBM and am now > at KIT. > > Many thanks in advance > > Uwe > > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > > Uwe Falke > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > > Tel: +49 721 608 28024 > Email: uwe.falke at kit.edu > www.scc.kit.edu > > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > KIT ? 
The Research University in the Helmholtz Association > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > INVALID URI REMOVED > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=RGTETs2tk0Kz_VOpznDVDkqChhnfLapOTkxLvgmR2-M&m=- > FdZvYBvHDPnBTu2FtPkLT09ahlYp2QsMutqNV2jWaY&s=S4C2D3_h4FJLAw0PUYLKhKE242vn_fwn-1_EJmHNpE8&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: From olaf.weiser at de.ibm.com Thu Feb 24 12:47:59 2022 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 24 Feb 2022 12:47:59 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.1__=4EBB0D60DFD775728f9e8a93df938690 at ibm.com.gif Type: image/gif Size: 45 bytes Desc: not available URL: From krajaram at geocomputing.net Thu Feb 24 14:32:35 2022 From: krajaram at geocomputing.net (Kumaran Rajaram) Date: Thu, 24 Feb 2022 14:32:35 +0000 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: Hi Uwe, >> But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. IMHO, If GPFS on this particular NSD server was restarted often during the setup, then it is possible that the GPFS pagepool may not be contiguous. As a result, GPFS 8MiB buffer in the pagepool might be a scatter-gather (SG) list with many small entries (in the memory) resulting in smaller I/O when these buffers are issued to the disks. The fix would be to reboot the server and start GPFS so that pagepool is contiguous resulting in 8MiB buffer to be comprised of 1 (or fewer) SG entries. >>In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs >>smaller than 4MiB again at some point, so that is not a nice solution. It will be advised not to restart GPFS often in the NSD servers (in production) to keep the pagepool contiguous. Ensure that there is enough free memory in NSD server and not run any memory intensive jobs so that pagepool is not impacted (e.g. swapped out). Also, enable GPFS numaMemoryInterleave=yes and verify that pagepool is equally distributed across the NUMA domains for good performance. GPFS numaMemoryInterleave=yes requires that numactl packages are installed and then GPFS restarted. # mmfsadm dump config | egrep "numaMemory|pagepool " ! numaMemoryInterleave yes ! 
pagepool 282394099712 # pgrep mmfsd | xargs numastat -p Per-node process memory usage (in MBs) for PID 2120821 (mmfsd) Node 0 Node 1 Total --------------- --------------- --------------- Huge 0.00 0.00 0.00 Heap 1.26 3.26 4.52 Stack 0.01 0.01 0.02 Private 137710.43 137709.96 275420.39 ---------------- --------------- --------------- --------------- Total 137711.70 137713.23 275424.92 My two cents, -Kums Kumaran Rajaram [cid:image001.png at 01D82960.6A9860C0] From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Uwe Falke Sent: Wednesday, February 23, 2022 8:04 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] IO sizes Hi, the test bench is gpfsperf running on up to 12 clients with 1...64 threads doing sequential reads and writes , file size per gpfsperf process is 12TB (with 6TB I saw caching effects in particular for large thread numbers ...) As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data disks, as expected in that case. Interesting thing though: I have rebooted the suspicious node. Now, it does not issue smaller IOs than the others, but -- unbelievable -- larger ones (up to about 4.7MiB). This is still harmful as also that size is incompatible with full stripe writes on the storage ( 8+2 disk groups, i.e. logically RAID6) Currently, I draw this information from the storage boxes; I have not yet checked iostat data for that benchmark test after the reboot (before, when IO sizes were smaller, we saw that both in iostat and in the perf data retrieved from the storage controllers). And: we have a separate data pool , hence dataOnly NSDs, I am just talking about these ... As for "Are you sure that Linux OS is configured the same on all 4 NSD servers?." - of course there are not two boxes identical in the world. I have actually not installed those machines, and, yes, i also considered reinstalling them (or at least the disturbing one). However, I do not have reason to assume or expect a difference, the supplier has just implemented these systems recently from scratch. In the current situation (i.e. with IOs bit larger than 4MiB) setting max_sectors_kB to 4096 might do the trick, but as I do not know the cause for that behaviour it might well start to issue IOs smaller than 4MiB again at some point, so that is not a nice solution. Thanks Uwe On 23.02.22 22:20, Andrew Beattie wrote: Alex, Metadata will be 4Kib Depending on the filesystem version you will also have subblocks to consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata and data block size is the same) My first question would be is ? Are you sure that Linux OS is configured the same on all 4 NSD servers?. My second question would be do you know what your average file size is if most of your files are smaller than your filesystem block size, then you are always going to be performing writes using groups of subblocks rather than a full block writes. Regards, Andrew On 24 Feb 2022, at 04:39, Alex Chekholko wrote: ? Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke wrote: Dear all, sorry for asking a question which seems ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi, Metadata I/Os will always be smaller than the usual data block size, right? Which version of GPFS? 
Regards, Alex On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: Dear all, sorry for asking a question which seems not directly GPFS related: In a setup with 4 NSD servers (old-style, with storage controllers in the back end), 12 clients and 10 Seagate storage systems, I do see in benchmark tests that just one of the NSD servers does send smaller IO requests to the storage than the other 3 (that is, both reads and writes are smaller). The NSD servers form 2 pairs, each pair is connected to 5 seagate boxes ( one server to the controllers A, the other one to controllers B of the Seagates, resp.). All 4 NSD servers are set up similarly: kernel: 3.10.0-1160.el7.x86_64 #1 SMP HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx driver : mpt3sas 31.100.01.00 max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as limited by mpt3sas) for all sd devices and all multipath (dm) devices built on top. scheduler: deadline multipath (actually we do have 3 paths to each volume, so there is some asymmetry, but that should not affect the IOs, shouldn't it?, and if it did we would see the same effect in both pairs of NSD servers, but we do not). All 4 storage systems are also configured the same way (2 disk groups / pools / declustered arrays, one managed by ctrl A, one by ctrl B, and 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. The first question I have - but that is not my main one: I do see, both in iostat and on the storage systems, that the default IO requests are about 4MiB, not 8MiB as I'd expect from above settings (max_sectors_kb is really in terms of kiB, not sectors, cf. https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). But what puzzles me even more: one of the server compiles IOs even smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and writes ... I just cannot see why. I have to suspect that this will (in writing to the storage) cause incomplete stripe writes on our erasure-coded volumes (8+2p)(as long as the controller is not able to re-coalesce the data properly; and it seems it cannot do it completely at least) If someone of you has seen that already and/or knows a potential explanation I'd be glad to learn about. And if some of you wonder: yes, I (was) moved away from IBM and am now at KIT. Many thanks in advance Uwe -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email: uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? 
The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6469 bytes Desc: image001.png URL: From uwe.falke at kit.edu Fri Feb 25 14:29:23 2022 From: uwe.falke at kit.edu (Uwe Falke) Date: Fri, 25 Feb 2022 15:29:23 +0100 Subject: [gpfsug-discuss] IO sizes In-Reply-To: References: Message-ID: <3fc68f40-8b3a-be33-3451-09a04fdc83a0@kit.edu> Hi, and thanks, Achim and Olaf, mmdiag --iohist on the NSD servers (on all 4 of them) shows IO sizes in IOs to/from the data NSDs (i.e. to/from storage) of 16384 512-byte-sectors? throughout, i.e. 8MiB, agreeing with the FS block size. (Having that information i do not need to ask the clients ...) iostat on NSD servers as well as the? storage system counters say the IOs crafted by the OS layer are 4MiB except for the one suspicious NSD server where they were somewhat smaller than 4MiB before the reboot, but are now somewhat larger than 4MiB (but by a distinct amount). The data piped through the NSD servers are well balanced between the 4 NSD servers, the IO system of the suspicious NSD server just issued a higher rate of IO requests when running smaller IOs and now, with larger IOs it has a lower IO rate than the other three NSD servers. So I am pretty sure it is not GPFS (see my initial post :-); but still some people using GPFS might have encounterd that as well, or might have an idea ;-) Cheers Uwe On 24.02.22 13:47, Olaf Weiser wrote: > in addition, to Achim, > where do you see those "smaller IO"... > have you checked IO sizes with mmfsadm dump iohist on each > NSDclient/Server ?... If ok on that level.. it's not GPFS > Mit freundlichen Gr??en / Kind regards > > Olaf Weiser > > ----- Urspr?ngliche Nachricht ----- > Von: "Achim Rehor" > Gesendet von: gpfsug-discuss-bounces at spectrumscale.org > An: "gpfsug main discussion list" > CC: > Betreff: [EXTERNAL] Re: [gpfsug-discuss] IO sizes > Datum: Do, 24. Feb 2022 13:41 > > Hi Uwe, > > first of all, glad to see you back in the GPFS space ;) > > agreed, groups of subblocks being written will end up in IO sizes, > being smaller than the 8MB filesystem blocksize, > also agreed, this cannot be metadata, since their size is MUCH > smaller, like 4k or less, mostly. > > But why would these grouped subblock reads/writes all end up on > the same NSD server, while the others do full block writes ? > > How is your NSD server setup per NSD ? did you 'round-robin' set > the preferred NSD server per NSD ? > are the client nodes transferring the data in anyway doing > specifics ?? 
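To compare the two layers Olaf and Uwe mention, the GPFS I/O history and the block layer can be sampled side by side on an NSD server; a sketch:

    mmdiag --iohist | tail -20    # recent GPFS I/Os; the size column is in 512-byte sectors (16384 = 8 MiB)
    iostat -x 5 3                 # request sizes the OS actually issues to the sd/dm devices

If mmdiag shows clean 16384-sector I/Os while iostat shows roughly half that, the splitting happens below GPFS, in the block or multipath layer.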
> > Sorry for not having a solution for you, jsut sharing a few ideas ;) > > > Mit freundlichen Gr??en / Kind regards > > *Achim Rehor* > > Technical Support Specialist Spectrum Scale and ESS (SME) > Advisory Product Services Professional > IBM Systems Storage Support - EMEA > > > > > > > gpfsug-discuss-bounces at spectrumscale.org wrote on 23/02/2022 22:20:11: > > > From: "Andrew Beattie" > > To: "gpfsug main discussion list" > > Date: 23/02/2022 22:20 > > Subject: [EXTERNAL] Re: [gpfsug-discuss] IO sizes > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Alex, Metadata will be 4Kib Depending on the filesystem version you > > will also have subblocks to consider V4 filesystems have 1/32 > > subblocks, V5 filesystems have 1/1024 subblocks (assuming metadata > > and data block size is the same) > ???????????ZjQcmQRYFpfptBannerStart > > This Message Is From an External Sender > > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Alex, > > > > Metadata will be 4Kib > > > > Depending on the filesystem version you will also have subblocks to > > consider V4 filesystems have 1/32 subblocks, V5 filesystems have 1/ > > 1024 subblocks (assuming metadata and data block size is the same) > > > > My first question would be is ? Are you sure that Linux OS is > > configured the same on all 4 NSD servers?. > > > > My second question would be do you know what your average file size > > is if most of your files are smaller than your filesystem block > > size, then you are always going to be performing writes using groups > > of subblocks rather than a full block writes. > > > > Regards, > > > > Andrew > > > > On 24 Feb 2022, at 04:39, Alex Chekholko > wrote: > > > ? Hi, Metadata I/Os will always be smaller than the usual data block > > size, right? Which version of GPFS? Regards, Alex On Wed, Feb 23, > > 2022 at 10:26 AM Uwe Falke wrote: Dear all, > > sorry for asking a question which seems ZjQcmQRYFpfptBannerStart > > This Message Is From an External Sender > > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Hi, > > > > Metadata I/Os will always be smaller than the usual data block > size, right? > > Which version of GPFS? > > > > Regards, > > Alex > > > > On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: > > Dear all, > > > > sorry for asking a question which seems not directly GPFS related: > > > > In a setup with 4 NSD servers (old-style, with storage > controllers in > > the back end), 12 clients and 10 Seagate storage systems, I do > see in > > benchmark tests that ?just one of the NSD servers does send > smaller IO > > requests to the storage ?than the other 3 (that is, both reads and > > writes are smaller). > > > > The NSD servers form 2 pairs, each pair is connected to 5 > seagate boxes > > ( one server to the controllers A, the other one to controllers > B of the > > Seagates, resp.). > > > > All 4 NSD servers are set up similarly: > > > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > > > HBA: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > > > driver : mpt3sas 31.100.01.00 > > > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, as > limited by > > mpt3sas) for all sd devices and all multipath (dm) devices built > on top. > > > > scheduler: deadline > > > > multipath (actually we do have 3 paths to each volume, so there > is some > > asymmetry, but that should not affect the IOs, shouldn't it?, > and if it > > did we would see the same effect in both pairs of NSD servers, > but we do > > not). 
> > > > All 4 storage systems are also configured the same way (2 disk > groups / > > pools / declustered arrays, one managed by ?ctrl A, one by ctrl > B, ?and > > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 NSDs). > > > > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do see clean IO > > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > > > The first question I have - but that is not my main one: I do > see, both > > in iostat and on the storage systems, that the default IO > requests are > > about 4MiB, not 8MiB as I'd expect from above settings > (max_sectors_kb > > is really in terms of kiB, not sectors, cf. > > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt). > > > > But what puzzles me even more: one of the server compiles IOs even > > smaller, varying between 3.2MiB and 3.6MiB mostly - both for > reads and > > writes ... I just cannot see why. > > > > I have to suspect that this will (in writing to the storage) cause > > incomplete stripe writes on our erasure-coded volumes (8+2p)(as > long as > > the controller is not able to re-coalesce the data properly; and it > > seems it cannot do it completely at least) > > > > > > If someone of you has seen that already and/or knows a potential > > explanation I'd be glad to learn about. > > > > > > And if some of you wonder: yes, I (was) moved away from IBM and > am now > > at KIT. > > > > Many thanks in advance > > > > Uwe > > > > > > -- > > Karlsruhe Institute of Technology (KIT) > > Steinbuch Centre for Computing (SCC) > > Scientific Data Management (SDM) > > > > Uwe Falke > > > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > > D-76344 Eggenstein-Leopoldshafen > > > > Tel: +49 721 608 28024 > > Email: uwe.falke at kit.edu > > www.scc.kit.edu > > > > Registered office: > > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > > > KIT ? The Research University in the Helmholtz Association > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > INVALID URI REMOVED > > > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > > siA1ZOg&r=RGTETs2tk0Kz_VOpznDVDkqChhnfLapOTkxLvgmR2-M&m=- > > > FdZvYBvHDPnBTu2FtPkLT09ahlYp2QsMutqNV2jWaY&s=S4C2D3_h4FJLAw0PUYLKhKE242vn_fwn-1_EJmHNpE8&e= > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Karlsruhe Institute of Technology (KIT) Steinbuch Centre for Computing (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email:uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
From uwe.falke at kit.edu Mon Feb 28 09:17:26 2022
From: uwe.falke at kit.edu (Uwe Falke)
Date: Mon, 28 Feb 2022 10:17:26 +0100
Subject: [gpfsug-discuss] IO sizes
In-Reply-To:
References:
Message-ID: <72c6ea70-6d00-5cc1-7f26-f5cb1aabbd7a@kit.edu>

Hi, Kumaran,

that would explain the smaller IOs before the reboot, but not the
larger-than-4MiB IOs afterwards on that machine.

Then, I already saw that the numaMemoryInterleave setting seems to have
no effect (on that very installation); I just have not yet requested a PMR
for it. I'd checked memory usage of course and saw that, regardless of this
setting, one socket's memory is always almost completely consumed while the
other one's is rather empty - looks like a bug to me, but that needs
further investigation.

Uwe

On 24.02.22 15:32, Kumaran Rajaram wrote:
> Hi Uwe,
>
> >> But what puzzles me even more: one of the servers compiles IOs even
> >> smaller, varying between 3.2MiB and 3.6MiB mostly - both for reads and
> >> writes ... I just cannot see why.
>
> IMHO, if GPFS on this particular NSD server was restarted often during
> the setup, then it is possible that the GPFS pagepool may not be
> contiguous. As a result, a GPFS 8MiB buffer in the pagepool might be a
> scatter-gather (SG) list with many small entries (in memory),
> resulting in smaller I/O when these buffers are issued to the disks.
> The fix would be to reboot the server and start GPFS so that the pagepool
> is contiguous, resulting in the 8MiB buffer being comprised of 1 (or few)
> SG entries.
>
> >> In the current situation (i.e. with IOs a bit larger than 4MiB) setting
> >> max_sectors_kB to 4096 might do the trick, but as I do not know the
> >> cause for that behaviour it might well start to issue IOs smaller than
> >> 4MiB again at some point, so that is not a nice solution.
>
> It is advised not to restart GPFS often on the NSD servers (in
> production) to keep the pagepool contiguous. Ensure that there is
> enough free memory in the NSD server and do not run any memory-intensive
> jobs, so that the pagepool is not impacted (e.g. swapped out).
>
> Also, enable GPFS numaMemoryInterleave=yes and verify that the pagepool is
> equally distributed across the NUMA domains for good performance. GPFS
> numaMemoryInterleave=yes requires that the numactl packages are installed
> and then GPFS restarted.
>
> # mmfsadm dump config | egrep "numaMemory|pagepool "
> ! numaMemoryInterleave yes
> ! pagepool 282394099712
>
> # pgrep mmfsd | xargs numastat -p
> Per-node process memory usage (in MBs) for PID 2120821 (mmfsd)
>                            Node 0          Node 1           Total
>                   --------------- --------------- ---------------
> Huge                         0.00            0.00            0.00
> Heap                         1.26            3.26            4.52
> Stack                        0.01            0.01            0.02
> Private                 137710.43       137709.96       275420.39
>                   --------------- --------------- ---------------
> Total                   137711.70       137713.23       275424.92
>
> My two cents,
> -Kums
>
> Kumaran Rajaram
>
> From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Uwe Falke
> Sent: Wednesday, February 23, 2022 8:04 PM
> To: gpfsug-discuss at spectrumscale.org
> Subject: Re: [gpfsug-discuss] IO sizes
>
> Hi,
>
> the test bench is gpfsperf running on up to 12 clients with 1...64
> threads doing sequential reads and writes; file size per gpfsperf
> process is 12TB (with 6TB I saw caching effects, in particular for
> large thread numbers ...).
>
> As I wrote initially: GPFS is issuing nothing but 8MiB IOs to the data
> disks, as expected in that case.
>
> Interesting thing though:
>
> I have rebooted the suspicious node. Now, it does not issue smaller
> IOs than the others, but -- unbelievable -- larger ones (up to about
> 4.7MiB). This is still harmful, as that size is also incompatible with
> full stripe writes on the storage (8+2 disk groups, i.e. logically RAID6).
>
> Currently, I draw this information from the storage boxes; I have not
> yet checked iostat data for that benchmark test after the reboot
> (before, when IO sizes were smaller, we saw that both in iostat and in
> the perf data retrieved from the storage controllers).
>
> And: we have a separate data pool, hence dataOnly NSDs; I am just
> talking about these ...
>
> As for "Are you sure that Linux OS is configured the same on all 4 NSD
> servers?" - of course there are no two boxes identical in the world.
> I have actually not installed those machines, and, yes, I also
> considered reinstalling them (or at least the disturbing one).
>
> However, I do not have reason to assume or expect a difference; the
> supplier has just implemented these systems recently from scratch.
>
> In the current situation (i.e. with IOs a bit larger than 4MiB) setting
> max_sectors_kB to 4096 might do the trick, but as I do not know the
> cause for that behaviour it might well start to issue IOs smaller than
> 4MiB again at some point, so that is not a nice solution.
>
> Thanks
>
> Uwe
>
> On 23.02.22 22:20, Andrew Beattie wrote:
>
> > Alex,
> >
> > Metadata will be 4KiB.
> >
> > Depending on the filesystem version you will also have subblocks
> > to consider: V4 filesystems have 1/32 subblocks, V5 filesystems
> > have 1/1024 subblocks (assuming metadata and data block size is
> > the same).
> >
> > My first question would be: are you sure that the Linux OS is
> > configured the same on all 4 NSD servers?
> >
> > My second question would be: do you know what your average file
> > size is? If most of your files are smaller than your filesystem
> > block size, then you are always going to be performing writes
> > using groups of subblocks rather than full block writes.
> >
> > Regards,
> >
> > Andrew
> >
> > On 24 Feb 2022, at 04:39, Alex Chekholko wrote:
> >
> > Hi,
> >
> > Metadata I/Os will always be smaller than the usual data block
> > size, right?
> >
> > Which version of GPFS?
> > Regards, > > Alex > > On Wed, Feb 23, 2022 at 10:26 AM Uwe Falke > wrote: > > Dear all, > > sorry for asking a question which seems not directly GPFS > related: > > In a setup with 4 NSD servers (old-style, with storage > controllers in > the back end), 12 clients and 10 Seagate storage systems, > I do see in > benchmark tests that? just one of the NSD servers does > send smaller IO > requests to the storage? than the other 3 (that is, both > reads and > writes are smaller). > > The NSD servers form 2 pairs, each pair is connected to 5 > seagate boxes > ( one server to the controllers A, the other one to > controllers B of the > Seagates, resp.). > > All 4 NSD servers are set up similarly: > > kernel: 3.10.0-1160.el7.x86_64 #1 SMP > > HBA:?Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx > > driver : mpt3sas 31.100.01.00 > > max_sectors_kb=8192 (max_hw_sectors_kb=16383 , not 16384, > as limited by > mpt3sas) for all sd devices and all multipath (dm) devices > built on top. > > scheduler: deadline > > multipath (actually we do have 3 paths to each volume, so > there is some > asymmetry, but that should not affect the IOs, shouldn't > it?, and if it > did we would see the same effect in both pairs of NSD > servers, but we do > not). > > All 4 storage systems are also configured the same way (2 > disk groups / > pools / declustered arrays, one managed by? ctrl A, one by > ctrl B,? and > 8 volumes out of each; makes altogether 2 x 8 x 10 = 160 > NSDs). > > > GPFS BS is 8MiB , according to iohistory (mmdiag) we do > see clean IO > requests of 16384 disk blocks (i.e. 8192kiB) from GPFS. > > The first question I have - but that is not my main one: I > do see, both > in iostat and on the storage systems, that the default IO > requests are > about 4MiB, not 8MiB as I'd expect from above settings > (max_sectors_kb > is really in terms of kiB, not sectors, cf. > https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt > ). > > But what puzzles me even more: one of the server compiles > IOs even > smaller, varying between 3.2MiB and 3.6MiB mostly - both > for reads and > writes ... I just cannot see why. > > I have to suspect that this will (in writing to the > storage) cause > incomplete stripe writes on our erasure-coded volumes > (8+2p)(as long as > the controller is not able to re-coalesce the data > properly; and it > seems it cannot do it completely at least) > > > If someone of you has seen that already and/or knows a > potential > explanation I'd be glad to learn about. > > > And if some of you wonder: yes, I (was) moved away from > IBM and am now > at KIT. > > Many thanks in advance > > Uwe > > > -- > Karlsruhe Institute of Technology (KIT) > Steinbuch Centre for Computing (SCC) > Scientific Data Management (SDM) > > Uwe Falke > > Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 > D-76344 Eggenstein-Leopoldshafen > > Tel: +49 721 608 28024 > Email: uwe.falke at kit.edu > www.scc.kit.edu > > > Registered office: > Kaiserstra?e 12, 76131 Karlsruhe, Germany > > KIT ? 
The Research University in the Helmholtz Association
> >
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
> --
> Karlsruhe Institute of Technology (KIT)
> Steinbuch Centre for Computing (SCC)
> Scientific Data Management (SDM)
> Uwe Falke
> Hermann-von-Helmholtz-Platz 1, Building 442, Room 187
> D-76344 Eggenstein-Leopoldshafen
> Tel: +49 721 608 28024
> Email: uwe.falke at kit.edu
> www.scc.kit.edu
> Registered office:
> Kaiserstraße 12, 76131 Karlsruhe, Germany
> KIT - The Research University in the Helmholtz Association
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

--
Karlsruhe Institute of Technology (KIT)
Steinbuch Centre for Computing (SCC)
Scientific Data Management (SDM)
Uwe Falke
Hermann-von-Helmholtz-Platz 1, Building 442, Room 187
D-76344 Eggenstein-Leopoldshafen
Tel: +49 721 608 28024
Email: uwe.falke at kit.edu
www.scc.kit.edu
Registered office:
Kaiserstraße 12, 76131 Karlsruhe, Germany
KIT - The Research University in the Helmholtz Association

From Renar.Grunenberg at huk-coburg.de Mon Feb 28 12:23:55 2022
From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar)
Date: Mon, 28 Feb 2022 12:23:55 +0000
Subject: [gpfsug-discuss] IO sizes
In-Reply-To: <72c6ea70-6d00-5cc1-7f26-f5cb1aabbd7a@kit.edu>
References: <72c6ea70-6d00-5cc1-7f26-f5cb1aabbd7a@kit.edu>
Message-ID: <7a29b404669942d193ad46c2632d6d30@huk-coburg.de>

Hallo Uwe,

is numactl already installed on that affected node? If it is missing, the
NUMA handling in Scale will not work.

Renar Grunenberg
Abteilung Informatik - Betrieb

HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44110
Telefax: 09561 96-44104
E-Mail: Renar.Grunenberg at huk-coburg.de
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Helen Reck, Dr. Jörg Rheinländer, Thomas Sehn, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.
This information may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________

Von: gpfsug-discuss-bounces at spectrumscale.org Im Auftrag von Uwe Falke
Gesendet: Montag, 28. Februar 2022 10:17
An: gpfsug-discuss at spectrumscale.org
Betreff: Re: [gpfsug-discuss] IO sizes

Hi, Kumaran,

that would explain the smaller IOs before the reboot, but not the
larger-than-4MiB IOs afterwards on that machine.

Then, I already saw that the numaMemoryInterleave setting seems to have
no effect (on that very installation); I just have not yet requested a PMR
for it. I'd checked memory usage of course and saw that, regardless of this
setting, one socket's memory is always almost completely consumed while the
other one's is rather empty - looks like a bug to me, but that needs
further investigation.

Uwe
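Putting Renar's and Kumaran's suggestions together, a minimal read-only check on the affected NSD server could look like the commands below. The rpm query assumes an RPM-based distribution such as the RHEL 7 kernel mentioned in the thread; adjust the package query for other distributions.

# 1. Is numactl installed at all? (RPM-based system assumed)
rpm -q numactl

# 2. What does the NUMA topology of the box look like?
numactl --hardware

# 3. Is GPFS configured to interleave the pagepool across NUMA nodes?
/usr/lpp/mmfs/bin/mmfsadm dump config | egrep "numaMemory|pagepool "

# 4. Is mmfsd's memory actually spread evenly over the NUMA nodes?
#    Compare the per-node "Private" values, as in Kumaran's example output;
#    a heavily lopsided split points at the problem Uwe describes.
pgrep mmfsd | xargs numastat -p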
From p.ward at nhm.ac.uk Mon Feb 28 16:40:08 2022
From: p.ward at nhm.ac.uk (Paul Ward)
Date: Mon, 28 Feb 2022 16:40:08 +0000
Subject: [gpfsug-discuss] Interoperability of Transparent cloud tiering with other IBM Spectrum Scale features
Message-ID:

I am used to a Scale solution with space management to a tape tier. Files cannot be migrated unless they are backed up. Once a file is migrated and is a stub, it is not backed up again as a stub, and it is not excluded from backup. We used the Spectrum Protect BA client, not mmbackup.

We have a new Scale solution with COS, set up with TCT. I am expecting it to operate in the same way: files can't be migrated unless backed up, and once migrated they are a stub and don't get backed up again. We are using mmbackup.

I migrated files before backup was set up. When backup was turned on, it pulled the files back. The migration policy was set to migrate files not accessed for 2 days, and all data met this requirement. Migration is set to run every 15 minutes, so it was pushing them back out quite quickly. The cluster was a mess of files going back and forth from COS.

To stop this I changed the policy to 14 days and set mmbackup to exclude migrated files. Things calmed down. I have now almost run out of space on my hot tier, but anything I migrate will expire from backup.

The statement below is a bit confusing. HSM and TCT are completely different: I thought TCT was for cloud and HSM for tape? Both can exist in a cluster but operate on different areas. This suggests that, to have mmbackup work with data migrated to a cloud tier, we should be using HSM, not TCT? Can mmbackup with TCT do what HSM does?

https://www.ibm.com/docs/en/spectrum-scale/5.0.5?topic=ics-interoperability-transparent-cloud-tiering-other-spectrum-scale-features
Spectrum Protect (TSM)
For the file systems that are managed by an HSM system, ensure that hot data is backed up to TSM by using the mmbackup command, and as the data gets cooler, migrate them to the cloud storage tier. This ensures that the mmbackup command has already backed up the cooler files that are migrated to the cloud.

Has anyone set up something similar?

Kindest regards,
Paul

Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.ward at nhm.ac.uk
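Before re-enabling a shorter migration window, it can help to see exactly which files a given threshold would select, without migrating anything. The following is only a sketch: the filesystem device name gpfs0, the output prefix under /tmp, and the 14-day threshold are placeholders, and the policy contains a LIST rule only, so nothing is moved.

# Write a list-only policy file; no MIGRATE rule, so nothing changes on disk.
cat > /tmp/cool-candidates.pol <<'EOF'
RULE 'ext'  EXTERNAL LIST 'cool' EXEC ''
RULE 'cand' LIST 'cool'
     WHERE (CURRENT_TIMESTAMP - ACCESS_TIME) > INTERVAL '14' DAYS
EOF

# Evaluate the rule and write the candidate file lists under /tmp/cool.*
# (-I defer builds the lists without executing any external script).
/usr/lpp/mmfs/bin/mmapplypolicy gpfs0 -P /tmp/cool-candidates.pol -I defer -f /tmp/cool

Comparing that candidate list against what mmbackup still considers unprotected (or against the mmbackup exclude rules) shows, before the next migration run, how much data would turn into stubs and whether those files have already been backed up.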