[gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

Walter Sklenka Walter.Sklenka at EDV-Design.at
Tue Feb 14 15:44:30 GMT 2023


Hi!
I started with 5.1.6.0 and am now at 5.1.6.1:

[root@ogpfs1 ~]# mmfsadm dump version
Dump level: verbose
Build branch "5.1.6.1 ".

The messages have appeared from the beginning.



From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> On Behalf Of Christian Vieser
Sent: Tuesday, 14 February 2023 15:34
To: gpfsug-discuss at gpfsug.org
Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded


What version of Spectrum Scale is running there? Have these errors appeared since your last version update?
On 14.02.23 at 14:09, Walter Sklenka wrote:
Dear Colleagues!
May I ask if anyone has a hint as to what could cause Critical Thread Watchdog warnings for Disk Lease threads?
Is this a “local node” problem or a network problem?
I see these messages from time to time on NSD servers that also serve as NFS servers, when they come under heavy NFS load.



The following is an excerpt from mmfs.log.latest:

2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------
2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds
2023-02-14_12:06:53.600+0100: [W]  counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8)
2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294):
2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0
2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0
2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive
2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local <c0n1>:[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0
2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y
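
To see how long each lease outage in an excerpt like this lasted, the "expired" and "reacquired" lines can be paired up by timestamp. The following is only a minimal sketch, not an IBM tool: it assumes the line layout and timestamp format seen in the excerpt above, and the names lease_outages, TS_FORMAT, LINE_RE and EXPIRED_RE are hypothetical helpers introduced for illustration.

```python
import re
from datetime import datetime

# Assumption: timestamps look like 2023-02-14_12:06:53.235+0100, as in the excerpt.
TS_FORMAT = "%Y-%m-%d_%H:%M:%S.%f%z"
# Assumption: each line is "<timestamp>: [<severity letter>] <message>".
LINE_RE = re.compile(r"^(\S+): \[(\w)\] (.*)$")
EXPIRED_RE = re.compile(r"Disk lease period expired ([\d.]+) seconds ago")


def lease_outages(lines):
    """Pair each 'lease expired' line with the next 'lease reacquired' line.

    Returns a list of (expired_at, seconds_until_reacquired) tuples.
    """
    outages = []
    expired_at = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that does not look like a log line
        timestamp = datetime.strptime(match.group(1), TS_FORMAT)
        message = match.group(3)
        if EXPIRED_RE.search(message):
            expired_at = timestamp
        elif message.startswith("Disk lease reacquired") and expired_at is not None:
            outages.append((expired_at, (timestamp - expired_at).total_seconds()))
            expired_at = None
    return outages


if __name__ == "__main__":
    # Lines copied from the excerpt above.
    sample = [
        "2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.",
        "2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster.",
        "2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.",
        "2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster.",
    ]
    for expired_at, seconds in lease_outages(sample):
        print(f"{expired_at.isoformat()}  reacquired after {seconds:.3f} s")
```

Run against the whole excerpt, this pairing gives reacquire delays of roughly 16.3 s, 2.1 s and 8.1 s; the first and longest one spans the watchdog warning.
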
Kind regards
Walter Sklenka
Technical Consultant


