[gpfsug-discuss] waiters and files causing waiters

Ryan Novosielski novosirj at rutgers.edu
Fri Oct 18 02:18:04 BST 2019


Found my notes on this; very similar to what Behrooz was saying. 

This is from “mmfsadm dump waiters,selected_files”. In the waiters output we’re looking at thread 29168, and in the selected_files output further down, “inodeFlushHolder” apparently corresponds to that same thread, at least in the case I was looking at.

You can then look up the inode with “tsfindinode -i <inode> <fsname>”; for the output below, that would be “tsfindinode -i 41538053 /gpfs/cache” on our system.

===== dump waiters ====
Current time 2019-05-01_13:48:26-0400
Waiting 0.1669 sec since 13:48:25, monitored, thread 29168 FileBlockWriteFetchHandlerThread: on ThCond 0x7F55E40014C8 (MsgRecordCondvar), reason 'RPC wait' for quotaMsgRequestShare on node 192.168.33.7 <c1n1>

===== dump selected_files =====
Current time 2019-05-01_13:48:36-0400

...

OpenFile:  4E044E5B0601A8C0:000000000279D205:0000000000000000 @ 0x1806AC5EAC8
 cach 1 ref 1 hc 2 tc 6 mtx 0x1806AC5EAF8
 Inode: valid eff token xw @ 0x1806AC5EC70, ctMode xw seq 170823
   lock state [ wf: 1 ] x [] flags [ ]
 Mnode: valid eff token xw @ 0x1806AC5ECC0, ctMode xw seq 170823
 DMAPI: invalid eff token nl @ 0x1806AC5EC20, ctMode nl seq 170821
 SMBOpen: valid eff token (A:RMA D:   ) @ 0x1806AC5EB50, ctMode (A:RMA D:   ) seq 170823
   lock state [ M(2) D: ] x [] flags [ ]
 SMBOpLk: valid eff token wf @ 0x1806AC5EBC0, ctMode wf Flags 0x30 (pfro+pfxw) seq 170822
 BR: @ 0x1806AC5ED20, ctMode nl Flags 0x10 (pfro) seq 170823
   treeP 0x18016189C08 C btFastTrack 0 1 ranges mode RO/XW:
   BLK [0,INF] mode XW node <403>
 Fcntl: @ 0x1806AC5ED48, ctMode nl Flags 0x30 (pfro+pfxw) seq 170823
   treeP 0x18031A5E3F8 C btFastTrack 0 1 ranges mode RO/XW:
   BLK [0,INF] mode XW node <403>
 inode 41538053 snap 0 USERFILE nlink 1 genNum 0x3CC2743F mode 0200100600: -rw-------
 tmmgr node <c1n1> (other)
 metanode <c1n403> (me) fail+panic count -1 flags 0x0, remoteStart 0 remoteCnt 0 localCnt 177 lastFrom 65535 switchCnt 0
 locks held in mode xw:
   0x1806AC5F238: 0x0-0xFFF tid 15954 gbl 0 mode xw rel 0
 BRL nXLocksOrRelinquishes 285
 vfsReference 1
 dioCount 0 dioFlushNeeded 1 dioSkipCounter 0 dioReentryThreshold 0.000000
 hasWriterInstance 1
 inodeFlushFlag 1 inodeFlushHolder 29168 openInstCount 1
 metadataFlushCount 2, metadataFlushWaiters 0/0, metadataCommitVersion 1
 bufferListCount 1 bufferListChangeCount 3
 dirty status: flushed dirtiedSyncNum 1477623
 SMB oplock state: nWriters 1
 indBlockDeallocLock:
   sharedLockWord 1 exclLockWord 0 upgradeWaitingS_W 0 upgradeWaitingW_X 0
 inodeValid 1
 objectVersion 240
 flushVersion 8086700 mnodeChangeCount 1
 block size code 5 (32 subblocksPerFileBlock)
 dataBytesPerFileBlock 4194304
 fileSize 0 synchedFileSize 0 indirectionLevel 1
 atime 1556732911.496160000
 mtime 1556732911.496479000
 ctime 1556732911.496479000
 crtime 1556732911.496160000
 owner uid 169589 gid 169589
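
If you want to do that matching across a whole dump rather than by eye, something like the following could work. This is only a rough sketch: the regular expressions are guessed from the one sample above and may not fit other releases, the function names are made up for illustration, and it relies on the same assumption as above, namely that “inodeFlushHolder” names the waiting thread.

```python
import re

# Patterns inferred from the single sample dump above (an assumption,
# not a documented format).
WAITER_RE = re.compile(r"Waiting (?P<secs>[\d.]+) sec .*?thread (?P<tid>\d+)")
INODE_RE = re.compile(r"^\s*inode (?P<inode>\d+) snap \d+")
HOLDER_RE = re.compile(r"inodeFlushHolder (?P<tid>\d+)")


def waiting_threads(dump_text):
    """Return {thread_id: seconds_waiting} for each waiter line."""
    return {int(m["tid"]): float(m["secs"])
            for m in WAITER_RE.finditer(dump_text)}


def inodes_by_flush_holder(dump_text):
    """Return {thread_id: [inode, ...]} from the OpenFile stanzas."""
    result = {}
    current_inode = None
    for line in dump_text.splitlines():
        if line.startswith("OpenFile:"):
            # A new stanza starts; forget the previous file's inode.
            current_inode = None
        m = INODE_RE.match(line)
        if m:
            current_inode = int(m["inode"])
            continue
        m = HOLDER_RE.search(line)
        if m and current_inode is not None:
            result.setdefault(int(m["tid"]), []).append(current_inode)
    return result


def files_behind_waiters(dump_text):
    """Return {thread_id: [inode, ...]} for threads that are both
    waiting and listed as an inodeFlushHolder."""
    waiters = waiting_threads(dump_text)
    holders = inodes_by_flush_holder(dump_text)
    return {tid: holders[tid] for tid in waiters if tid in holders}
```

Feed it the saved output of “mmfsadm dump waiters,selected_files” as one string, then pass each inode it returns to tsfindinode as described above.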

> On Oct 10, 2019, at 4:43 PM, Damir Krstic <damir.krstic at gmail.com> wrote:
> 
> is it possible via some set of mmdiag --waiters or mmfsadm dump ? to figure out which files or directories access (whether it's read or write) is causing long-er waiters?
> 
> in all my looking i have not been able to get that information out of various diagnostic commands.
> 
> thanks,
> damir


