[gpfsug-discuss] node lockups in gpfs > 4.1.1.14

Aaron Knister aaron.s.knister at nasa.gov
Fri Aug 4 16:02:04 BST 2017


I've narrowed the problem down to 4.1.1.16. We'll most likely be 
downgrading to 4.1.1.15.

-Aaron

On 8/4/17 4:00 AM, Aaron Knister wrote:
> Hey All,
> 
> Anyone seen any strange behavior running either 4.1.1.15 or 4.1.1.16?
> 
> We are mid-upgrade to 4.1.1.16 from 4.1.1.14 and have seen some rather 
> disconcerting behavior. Specifically, on some of the upgraded nodes GPFS 
> will seemingly deadlock the entire node, rendering it unusable. I 
> can't even get a session on the node (but I can trigger a crash dump via 
> a sysrq trigger).
> 
> Most blocked tasks have cxiWaitEventWait at the top of 
> their call trace. That's probably not very helpful in and of itself, but I'm 
> curious if anyone else out there has run into this issue or if this is a 
> known bug.
> 
> (I'll open a PMR later today once I've gathered more diagnostic 
> information).
> 
> -Aaron
> 
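
For anyone triaging a similar hang, the observation above (most blocked
tasks share one function at the top of their call trace) can be checked
mechanically. What follows is only a minimal sketch, not the method used
in this thread. It assumes the blocked-task traces have been saved as
text in the usual Linux kernel layout, where each trace starts with a
"Call Trace:" line followed by frames written as symbol+0xOFF/0xLEN.
The function name top_frames is hypothetical.

```python
import re
from collections import Counter

# Assumed frame layout (typical of Linux hung-task or sysrq-w output):
#   Call Trace:
#    [<ffffffffa0123456>] cxiWaitEventWait+0x1a/0x30
# Only the symbol name before "+0x.../0x..." is captured.
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def top_frames(log_text):
    """Count the first frame that follows each 'Call Trace:' line."""
    counts = Counter()
    expecting_frame = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            # A new trace begins; the next frame line is its top frame.
            expecting_frame = True
            continue
        if expecting_frame:
            match = FRAME_RE.search(line)
            if match:
                counts[match.group(1)] += 1
                expecting_frame = False
    return counts
```

Calling top_frames(text).most_common(5) on the saved console or dump
text lists the functions that blocked tasks are most often sitting in.
If the traces use a different layout, the regular expression would need
adjusting.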

-- 
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776


