[gpfsug-discuss] iowait?

Scott Fadden sfadden at us.ibm.com
Mon Aug 29 20:33:14 BST 2016


There is a known performance issue that can cause longer-than-expected 
network time-outs if you run iohist too often. So be careful: it is best 
to collect it as a periodic sample rather than all of the time. 
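
A minimal sketch of that sampling approach (the interval, log path, and 
Python wrapper here are placeholders to adapt, not a supported tool):

    # Sample "mmdiag --iohist" once per interval instead of polling it
    # continuously. mmdiag normally lives in /usr/lpp/mmfs/bin.
    import datetime
    import subprocess
    import time

    INTERVAL = 60                               # seconds between samples
    LOG = "/var/log/iohist-sample.log"          # hypothetical log location

    while True:
        stamp = datetime.datetime.now().isoformat()
        try:
            out = subprocess.run(["/usr/lpp/mmfs/bin/mmdiag", "--iohist"],
                                 capture_output=True, text=True,
                                 check=True).stdout
            with open(LOG, "a") as f:
                f.write("=== %s ===\n%s\n" % (stamp, out))
        except (OSError, subprocess.CalledProcessError):
            pass                                # skip a failed sample
        time.sleep(INTERVAL)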


Scott Fadden
Spectrum Scale - Technical Marketing 
Phone: (503) 880-5833 
sfadden at us.ibm.com
http://www.ibm.com/systems/storage/spectrum/scale



From:   Aaron Knister <aaron.s.knister at nasa.gov>
To:     <gpfsug-discuss at spectrumscale.org>
Date:   08/29/2016 11:09 AM
Subject:        Re: [gpfsug-discuss] iowait?
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Nice! Thanks Bryan. I wonder what the implications are of setting it 
high enough that we could capture data every 10s. I figure if 512 events 
only covers about 1 second, I would need to log in the realm of 10k 
events to capture every 10 seconds and account for spikes in I/O.
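
Back-of-the-envelope (the 2x headroom factor here is a guess, not a 
measurement):

    512 events ~ 1 s of history   =>  ~512 events/s
    10 s x 512 events/s = 5,120 events
    5,120 x 2 (spike headroom) ~ 10,240  =>  ioHistorySize=10240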

-Aaron

On 8/29/16 2:06 PM, Bryan Banister wrote:
> Try this:
>
> mmchconfig ioHistorySize=1024 # Or however big you want!
>
> Cheers,
> -Bryan
>
> -----Original Message-----
> From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister
> Sent: Monday, August 29, 2016 1:05 PM
> To: gpfsug main discussion list
> Subject: Re: [gpfsug-discuss] iowait?
>
> That's an interesting idea. I took a look at mmdiag --iohist on a busy
> node; it doesn't seem to capture more than literally 1 second of history.
> Is there a better way to grab the data, or to have GPFS capture more of it?
>
> Just to give some more context: as part of our monthly reporting
> requirements we calculate job efficiency by comparing the number of cpu
> cores requested by a given job with the cpu % utilization during that
> job's time window. Currently a job that's doing a "sleep 9000" would show
> up the same as a job blocked on I/O. Having GPFS wait time included in
> iowait would allow us to easily make this distinction.
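>
> As a rough sketch of the distinction (simplified: this samples the whole
> node rather than one job, and ignores irq/steal time):
>
>     # Sample /proc/stat twice and report busy vs. idle vs. iowait shares.
>     # A "sleep 9000" accumulates idle time; an I/O-blocked job would
>     # accumulate iowait instead -- if GPFS waits were counted as iowait.
>     import time
>
>     def counters():
>         with open("/proc/stat") as f:
>             parts = f.readline().split()  # cpu user nice system idle iowait ...
>         user, nice, system, idle, iowait = (int(x) for x in parts[1:6])
>         return user + nice + system, idle, iowait
>
>     b0, i0, w0 = counters()
>     time.sleep(10)
>     b1, i1, w1 = counters()
>     total = (b1 - b0) + (i1 - i0) + (w1 - w0)
>     print("busy %.1f%%  idle %.1f%%  iowait %.1f%%" % (
>         100.0 * (b1 - b0) / total,
>         100.0 * (i1 - i0) / total,
>         100.0 * (w1 - w0) / total))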
>
> -Aaron
>
> On 8/29/16 1:56 PM, Bryan Banister wrote:
>> There is the iohist data that may have what you're looking for, -Bryan
>>
>> -----Original Message-----
>> From: gpfsug-discuss-bounces at spectrumscale.org
>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron
>> Knister
>> Sent: Monday, August 29, 2016 12:54 PM
>> To: gpfsug-discuss at spectrumscale.org
>> Subject: Re: [gpfsug-discuss] iowait?
>>
>> Sure, we can and we do use both iostat/sar and collectl to collect disk
>> utilization on our nsd servers. That doesn't give us insight, though,
>> into any individual client node, of which we've got 3500. We do log
>> mmpmon data from each node, but that doesn't give us any insight into
>> how much time is being spent waiting on I/O. Having GPFS report iowait
>> on client nodes would give us this insight.
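>>
>> For reference, the mmpmon collection is roughly this kind of thing (a
>> sketch; fs_io_s is the standard per-filesystem request, and the binary
>> path is the usual GPFS install location):
>>
>>     # Ask mmpmon for per-filesystem I/O counters in parseable (-p) mode.
>>     # fs_io_s reports byte and operation counts -- no wait-time
>>     # accounting, which is exactly the gap described above.
>>     import subprocess
>>
>>     MMPMON = "/usr/lpp/mmfs/bin/mmpmon"
>>     result = subprocess.run([MMPMON, "-p"], input="fs_io_s\n",
>>                             capture_output=True, text=True, check=True)
>>     print(result.stdout)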
>>
>> On 8/29/16 1:50 PM, Alex Chekholko wrote:
>>> Any reason you can't just use iostat or collectl or any of a number
>>> of other standard tools to look at disk utilization?
>>>
>>> On 08/29/2016 10:33 AM, Aaron Knister wrote:
>>>> Hi Everyone,
>>>>
>>>> Would it be easy to have GPFS report iowait values in Linux? This
>>>> would be a huge help for us in determining whether a node's low
>>>> utilization is due to some issue with the code running on it or if
>>>> it's blocked on I/O, especially in a historical context.
>>>>
>>>> I naively tried on a test system changing schedule() in
>>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this:
>>>>
>>>> again:
>>>>   /* call the scheduler */
>>>>   if ( waitFlags & INTERRUPTIBLE )
>>>>     schedule();      /* interruptible sleep: time here counts as idle */
>>>>   else
>>>>     io_schedule();   /* flags the task as blocked on I/O, so the
>>>>                         sleep time is accounted as iowait */
>>>>
>>>> Seems to actually do what I'm after but generally bad things happen
>>>> when I start pretending I'm a kernel developer.
>>>>
>>>> Any thoughts? If I open an RFE, would this be something that's
>>>> relatively easy to implement? (Not asking for a commitment *to*
>>>> implement it, just checking that I'm not asking for something
>>>> seemingly simple that's actually fairly hard to implement.)
>>>>
>>>> -Aaron
>>>>
>>>
>>
>> --
>> Aaron Knister
>> NASA Center for Climate Simulation (Code 606.2)
>> Goddard Space Flight Center
>> (301) 286-2776
>
> --
> Aaron Knister
> NASA Center for Climate Simulation (Code 606.2)
> Goddard Space Flight Center
> (301) 286-2776

-- 
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss




