[gpfsug-discuss] Potential problems - leaving trace enabled in over-write mode?
Aaron Knister
aaron.s.knister at nasa.gov
Tue Mar 7 21:51:23 GMT 2017
Hi Bob,
I have the impression the biggest impact is to metadata-type operations
rather than throughput but don't quote me on that because I have very
little data to back it up. In the process of testing upgrading from GPFS
3.5 to 4.1 we ran fio on some 1000 nodes against an FS in our test
environment which sustained about 60-80k iops on the filesystem's
metadata LUNs. At one point I couldn't understand why I was struggling
to get about 13k iops, then realized tracing was turned on for a subset
of the NSD servers (which are also manager nodes). After turning it off the
throughput immediately shot back up to where I was expecting it to be.
Also during testing we were tracking down a bug for which I needed to
run tracing *everywhere* and then turn it off when one of the manager
nodes saw a particular error. To help with this I used a script IBM had
sent me a while back, with some tweaks of my own. I've attached it in
case it's helpful. In a nutshell the process looks like:
- start tracing everywhere (/usr/lpp/mmfs/bin/mmdsh -N all
/usr/lpp/mmfs/bin/mmtrace start). Doing it this way avoids the need to
change the sdrfs file, which, depending on your cluster size, may or may
not have some benefits.
- run a command that watches for the event in question and, when
triggered, runs /usr/lpp/mmfs/bin/mmdsh -N all
/usr/lpp/mmfs/bin/mmtrace stop
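A minimal sketch of the watch step, assuming the event shows up as a new matching line in a log file. The helper name wait_for_new_match and its arguments are hypothetical, made up for illustration; the stop command from the list above appears only in a comment.

```shell
#!/bin/sh
# Hypothetical helper: block until the number of lines matching a
# pattern in a log file grows beyond the count seen at start-up.
wait_for_new_match() {
    pattern=$1
    logfile=$2
    interval=${3:-2}    # seconds between checks (assumed default)
    # grep -c prints the match count; "|| true" hides its non-zero
    # exit status when the count is 0 or the file is missing.
    base=$(grep -c -- "$pattern" "$logfile" 2>/dev/null || true)
    base=${base:-0}
    while :; do
        current=$(grep -c -- "$pattern" "$logfile" 2>/dev/null || true)
        current=${current:-0}
        # -gt compares the counts as numbers, not as strings.
        if [ "$current" -gt "$base" ]; then
            return 0
        fi
        sleep "$interval"
    done
}

# Usage (not run here):
# wait_for_new_match "return code 301" /var/log/messages 2 &&
#     /usr/lpp/mmfs/bin/mmdsh -N all /usr/lpp/mmfs/bin/mmtrace stop
```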
If the condition could present itself on multiple nodes in quick
succession (as was the case for me), you could wrap the mmdsh that
stops tracing in an flock, using an arbitrary node that stores the
lock locally:
ssh $stopHost flock -xn /tmp/mmfsTraceStopLock -c
"'/usr/lpp/mmfs/bin/mmdsh -N all /usr/lpp/mmfs/bin/mmtrace stop'"
Wrapping it in an flock avoids multiple trace format attempts.
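To illustrate the effect of the non-blocking exclusive lock: with flock -xn, a caller that arrives while the lock is held fails immediately instead of queueing, so only the first one runs the command. The following is a local sketch only, assuming flock from util-linux is installed; the function name run_once_demo, the lock and output paths, and the echo/sleep stand-in for the real mmdsh call are all hypothetical.

```shell
#!/bin/sh
# Hypothetical demo: start three overlapping callers against one lock
# file. The stand-in command records that it ran, then holds the lock
# briefly so the other callers arrive while it is still held.
run_once_demo() {
    lock=$1
    out=$2
    for i in 1 2 3; do
        # -x: exclusive lock, -n: fail at once if it is already held
        flock -xn "$lock" -c "echo stopped >> '$out'; sleep 2" &
    done
    # Wait for all three background callers to finish.
    wait
}

# Usage:
# lock=$(mktemp); out=$(mktemp)
# run_once_demo "$lock" "$out"
# cat "$out"
```

Without -n the later callers would block and then run the command again once the lock was released, which is the repeated work the lock is meant to prevent.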
-Aaron
On 3/7/17 3:32 PM, Oesterlin, Robert wrote:
> I’m considering enabling trace on all nodes all the time, doing
> something like this:
>
> mmtracectl --set --trace=def --trace-recycle=global
> --tracedev-write-mode=overwrite --tracedev-overwrite-buffer-size=256M
> mmtracectl --start
>
> My questions are:
>
> - What is the performance penalty of leaving this on 100% of the time on
> a node?
>
> - Does anyone have any suggestions on automation on stopping trace when
> a particular event occurs?
>
> - What other issues, if any?
>
> Bob Oesterlin
> Sr Principal Storage Engineer, Nuance
> 507-269-0413
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
-------------- next part --------------
#!/usr/bin/ksh
stopHost=loremds20
mmtrace=/usr/lpp/mmfs/bin/mmtrace
mmtracectl=/usr/lpp/mmfs/bin/mmtracectl
# No automatic start of mmtrace.
# Seconds to sleep between checks.
secondsToSleep=2
# Flag to know when tripped or stopped
tripped=0
# mmfs log file to monitor
logToGrep=/var/log/messages
# Path to mmfs bin directory
MMFSbin=/usr/lpp/mmfs/bin
# Trip file. Will exist if trap is sprung
trapHasSprung=/tmp/mmfsTrapHasSprung
rm $trapHasSprung 2>/dev/null
# Start tracing on this node
#${mmtrace} start
# Initial count of the watched message in the log
baseCount=$(grep "unmounted by the system with return code 301 reason code" $logToGrep | wc -l)
# Loop until the trip file exists
while [[ ! -f $trapHasSprung ]]
do
    sleep $secondsToSleep
    # Get the current count to check against the initial one.
    currentCount=$(grep "unmounted by the system with return code 301 reason code" $logToGrep | wc -l)
    # Numeric comparison (inside [[ ]], > compares strings, so 10 > 9 is false)
    if (( currentCount > baseCount ))
    then
        tripped=1
        /usr/lpp/mmfs/bin/mmdsh -N managernodes,quorumnodes touch $trapHasSprung
        # cluster manager?
        #stopHost=$(/usr/lpp/mmfs/bin/tslsmgr | grep '^Cluster manager' | awk '{ print $NF }' | sed -e 's/[()]//g')
        ssh $stopHost flock -xn /tmp/mmfsTraceStopLock -c "'/usr/lpp/mmfs/bin/mmdsh -N all -f128 /usr/lpp/mmfs/bin/mmtrace stop noformat'"
    fi
done