[gpfsug-discuss] mmfsd segfault/signal 6 on dirop.C:4548 in GPFS 5.0.2.x

IBM Spectrum Scale scale at us.ibm.com
Wed Aug 21 18:10:47 BST 2019


As was noted, this problem is fixed in the Spectrum Scale 5.0.3 release 
stream. Regarding the version number format 5.0.2.0/1, I assume it is 
meant to convey version 5.0.2 efix 1.
   
Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale 
(GPFS), then please post it to the public IBM developerWorks Forum at 
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479

If your query concerns a potential software error in Spectrum Scale (GPFS) 
and you have an IBM software maintenance contract please contact 
1-800-237-5511 in the United States or your local IBM Service Center in 
other countries. 

The forum is informally monitored as time permits and should not be used 
for priority messages to the Spectrum Scale (GPFS) team.



From:   Ryan Novosielski <novosirj at rutgers.edu>
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   08/21/2019 12:04 PM
Subject:        [EXTERNAL] [gpfsug-discuss] mmfsd segfault/signal 6 on dirop.C:4548 in GPFS 5.0.2.x
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



I posted this on Slack, but it’s serious enough that I want to make sure 
everyone sees it. Does anyone, from IBM or otherwise, have any more 
information about this/whether it was even announced anyplace? Thanks!

A little late, but we ran into a relatively serious problem at our site 
with 5.0.2.3. The symptom is an mmfsd crash/segfault related to 
fs/dirop.C:4548. We ran into this sporadically, but it was repeatable on 
the problem workload. From IBM Support:

2. This is a known defect.
The problem has been fixed through
D.1073563: CTM_A_XW_FOR_DATA_IN_INODE related assert in DirLTE::lock
A companion fix is
D.1073753: Assert that the lock mode in DirLTE::lock is strong enough


The rep further said "It's not an APAR since it's found in internal 
testing. It's an internal function at a place it should not assert but a 
part of the condition as the code path is specific to the 
DIR_UPDATE_LOCKMODE optimization code... The assert was meant for certain 
file creation code path, but the condition wasn't set strictly for that 
code path that some other code path could also run into the assert. So we 
cannot predict on which node it would happen."

The fix was setting disableAssert="dirop.C:4548", which can be done live. 
Has anyone seen anything else about this anyplace? The bug is fixed in 5.0.3.x 
and was introduced in 5.0.2.0/1 (not sure what this version number means; 
I’ve seen them listed as X.X.X.X.X.X, X.X.X-X.X, and others).
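
For anyone auditing a cluster inventory, here is a minimal sketch of how the
affected range described above might be checked. It assumes the whole 5.0.2.x
stream is potentially affected (following the subject line; the exact efix
where the defect appeared is not confirmed in this thread). The names
AFFECTED_PREFIX, parse_version and is_affected are hypothetical helpers, not
part of any Spectrum Scale tooling, and only the dotted and dashed version
formats are handled.

```python
import re

# Assumption: every 5.0.2.x level is treated as potentially affected.
# The thread only says "introduced in 5.0.2.0/1, fixed in 5.0.3.x".
AFFECTED_PREFIX = (5, 0, 2)


def parse_version(text):
    """Split a version such as '5.0.2.3' or '5.0.2-3' into a tuple of ints."""
    parts = re.split(r"[.\-]", text.strip())
    return tuple(int(p) for p in parts)


def is_affected(text):
    """Return True when the first three components match the affected stream."""
    return parse_version(text)[:3] == AFFECTED_PREFIX
```

Version strings would have to be collected separately (for example from each
node's package inventory); a form such as "5.0.2.0/1" is not handled and
raises ValueError.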

--
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State           |         Ryan Novosielski - 
novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS 
Campus
||  \\    of NJ           | Office of Advanced Research Computing - MSB 
C630, Newark
     `'

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=2DWKJiKyweSkGrSB_31bZQerI4xIgc6Z_Pw7iTGZpH4&s=oLoaU67CVtDLGyv_LZO8AqZRU739wj1q-PysELBsBow&e= 





