[gpfsug-discuss] Potential problems - leaving trace enabled in over-write mode?

Sven Oehme oehmes at gmail.com
Wed Mar 8 14:15:55 GMT 2017


Yes, but I would do this in stages given how large your system is.
Pick one set of nodes that do similar things (let's say 100 out of 200) and
turn tracing on there.
This will give you data you can compare between the two sets of nodes.
Let it run for a week, and if the data from your real workload doesn't show
any significant degradation (which is what I expect), turn it on everywhere.
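
The staged rollout described here can be sketched in a few lines of Python. This is only an illustration: the node names, the throughput samples and the helper names `split_pilot` and `relative_slowdown` are all made up, and which metric you compare depends on your workload.

```python
from statistics import mean


def split_pilot(nodes, traced_count):
    """Return (traced, control): the first traced_count nodes get tracing."""
    ordered = sorted(nodes)
    return ordered[:traced_count], ordered[traced_count:]


def relative_slowdown(control_samples, traced_samples):
    """Fractional throughput loss of traced vs. control nodes.

    A positive value means the traced nodes were slower.
    """
    control = mean(control_samples)
    return (control - mean(traced_samples)) / control


# Hypothetical node names and throughput numbers, for illustration only.
nodes = [f"node{i:03d}" for i in range(1, 201)]
traced, control = split_pilot(nodes, 100)

slowdown = relative_slowdown(
    control_samples=[1000.0, 1020.0, 980.0],
    traced_samples=[990.0, 1010.0, 970.0],
)
print(len(traced), len(control), f"{slowdown:.1%}")
```

Collecting the same metric on both halves over the week gives a like-for-like comparison before tracing is turned on everywhere.
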
The one thing I am not 100% sure about is the size of the trace buffer, as
well as the global cut config. What this means is: if you apply the settings
as mentioned in the first post and one node in your cluster asserts, you will
cut a trace on all nodes, each of which will write a 256M buffer into your
dump file location.
If you have a node that's in an assert loop (asserts, restarts, asserts),
this can cause significant load on all nodes. Therefore I would probably
start without cutting a global trace and reduce the trace size to 64M.
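
To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes every node that cuts a trace writes its full buffer once per assert, and it reuses the 200-node figure from the example above; real dump sizes may differ, and the helper name `dump_bytes_per_assert` is made up for illustration.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def dump_bytes_per_assert(node_count, buffer_mib, global_cut):
    """Bytes written cluster-wide when a single node asserts.

    Assumes each node that cuts a trace writes one full buffer.
    """
    nodes_cutting = node_count if global_cut else 1
    return nodes_cutting * buffer_mib * MIB


# Global cut with a 256M buffer: every node writes on each assert.
global_256 = dump_bytes_per_assert(200, 256, global_cut=True)

# No global cut and a 64M buffer: only the asserting node writes.
local_64 = dump_bytes_per_assert(200, 64, global_cut=False)

print(f"global cut, 256M: {global_256 / GIB:.0f} GiB per assert")
print(f"local cut, 64M:   {local_64 / MIB:.0f} MiB per assert")
```

Under these assumptions a node stuck in an assert loop multiplies the first figure by the number of restarts, which is why starting without the global cut is the safer choice.
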
I (and I am sure other dev folks) would be very interested in the outcome,
as we have this debate on a yearly basis about whether we shouldn't just turn
tracing on by default. In the past, performance was the biggest hurdle; this
is solved now (my claim). The next big question is how well this works on
larger-scale systems with production workloads. The more feedback we get in
this area, the better we can make an informed decision on how and whether it
could be turned on all the time, and work harder on handling cases like the
one I mentioned above to mitigate the risks.

sven



On Wed, Mar 8, 2017 at 2:56 PM Oesterlin, Robert <
Robert.Oesterlin at nuance.com> wrote:

> As always, Sven comes in to back this up with real data :)
>
>
>
> To net this out, Sven – I should be able to enable trace on my NSD servers
> running 4.2.2 without much impact, correct?
>
>
>
> Bob Oesterlin
> Sr Principal Storage Engineer, Nuance
>
> From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Sven
> Oehme <oehmes at gmail.com>
> Reply-To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date: Wednesday, March 8, 2017 at 7:37 AM
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Potential problems - leaving
> trace enabled in over-write mode?
>
> Starting in version 3.4 we enhanced the trace code of Scale significantly.
> This went on from release to release all the way up to 4.2.1. Since 4.2.1 we
> have made further improvements, but these were much smaller changes, mostly
> optimization, e.g. reducing the verbosity of trace levels, etc.
>
> With 4.2.2 we switched from blocking traces to in-memory traces as the
> default trace infrastructure. This infrastructure was designed to be turned
> on all the time with minimal impact on performance.