[gpfsug-discuss] bizarre performance behavior

Aaron Knister aaron.s.knister at nasa.gov
Fri Feb 17 16:53:00 GMT 2017


Well, disabling the C1E state seems to have done the trick. I removed 
the kernel parameters I mentioned and set the cpu governor back to 
ondemand with a minimum of 1.2GHz. I'm now getting 6.2GB/s of reads, 
which I believe is pretty darned close to theoretical peak performance.
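
For anyone wanting to flip the same knobs, this is roughly what that 
amounts to (a sketch using the cpupower utility; the equivalent sysfs 
files live under /sys/devices/system/cpu/cpu*/cpufreq/):

  # select the ondemand governor and give it a 1.2GHz floor (needs root)
  cpupower frequency-set -g ondemand
  cpupower frequency-set -d 1200MHz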

-Aaron

On 2/17/17 10:52 AM, Aaron Knister wrote:
> This is a good one. I've got an NSD server with 4x 16Gb fibre
> connections coming in and 1x FDR10 and 1x QDR connection going out to
> the clients. I was having a really hard time getting anything resembling
> sensible performance out of it (4-5GB/s writes but maybe 1.2GB/s for
> reads). The back-end is a DDN SFA12K and I *know* it can do better than
> that.
>
> I don't quite remember how I figured this out, but simply by running
> "openssl speed -multi 16" on the NSD server to drive up the load I saw
> an almost 4x performance jump, which pretty much goes against every
> sysadmin fiber in me (i.e. "drive up the cpu load with unrelated crap to
> quadruple your i/o performance").
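>
> If anyone wants to reproduce the effect, the recipe is basically this
> (a sketch; the dd target and sizes are illustrative, not the exact
> test I ran):
>
>   # peg all 16 cores in the background
>   openssl speed -multi 16 >/dev/null 2>&1 &
>   # re-run the read test while the load is up, e.g. (path is made up):
>   dd if=/gpfs/fs0/somebigfile of=/dev/null bs=16M count=4096
>   # then kill the background load
>   kill %1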
>
> This feels like some type of C-state/frequency-scaling shenanigans that
> I haven't quite ironed out yet. I booted the box with the kernel
> parameters "intel_idle.max_cstate=0 processor.max_cstate=0", which
> didn't seem to make much of a difference. I also tried setting the
> frequency governor to userspace and setting the minimum frequency to
> 2.6GHz (it's a 2.6GHz CPU). None of that really matters; I still have
> to run something to drive up the CPU load and then performance improves.
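>
> For the record, the governor experiment was along these lines (from
> memory, so treat it as a sketch rather than the exact commands):
>
>   # pin the userspace governor at the rated 2.6GHz
>   cpupower frequency-set -g userspace
>   cpupower frequency-set -f 2600MHz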
>
> I'm wondering if this could be an issue with the C1E state? I'm curious
> if anyone has seen anything like this. The node is a dx360 M4
> (Sandy Bridge) with 16 2.6GHz cores and 32GB of RAM.
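>
> One way to see whether the cores are actually dropping into C1E is to
> look at the kernel's cpuidle residency counters (a sketch; state names
> and numbering vary by kernel):
>
>   # list the idle states the kernel exposes and time spent in each
>   cpupower idle-info
>   grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name
>   grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/time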
>
> -Aaron
>

-- 
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776


