[gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale

Achim Rehor Achim.Rehor at de.ibm.com
Fri Jun 11 16:26:43 BST 2021


One additional noticeable change, which arrived in Spectrum Scale 5.0.4.2
and later, is an enhancement to what Jim just touched on below.

Databases using Direct IO often issue small IOs into huge files. Even with
a very fast backend, a workload made up of 4k or 64k IOs is limited in
bandwidth by the sheer number of IOs.
Having seen this issue, we added a feature to Spectrum Scale that batches
small IOs per timeslot in order to reduce the number of IOs against the
backend, and thus improve write performance.
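
To put rough numbers on the "sheer number of IOs" point, here is a minimal sketch. The function name and the 50,000 IOPS backend limit are assumptions chosen purely for illustration, not measured values; the model only says that when a backend is limited by IO count, achievable bandwidth scales with IO size.

```python
def max_bandwidth_mib_s(iops_limit: int, io_size_bytes: int) -> float:
    """Upper bound on throughput when the backend is limited by IO count."""
    return iops_limit * io_size_bytes / (1024 * 1024)


# Hypothetical backend limit, for illustration only.
ASSUMED_IOPS_LIMIT = 50_000

for size in (4 * 1024, 64 * 1024):
    bandwidth = max_bandwidth_mib_s(ASSUMED_IOPS_LIMIT, size)
    print(f"{size // 1024:>3} KiB IOs: at most {bandwidth:.1f} MiB/s")
```

Under this assumption the same backend moves 16 times more data with 64k IOs than with 4k IOs, which is why reducing the IO count matters more than raw backend speed here.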

The new feature is controlled by
  dioSmallSeqWriteBatching = yes[no]
and will batch all small IOs that are
  dioSmallSeqWriteThreshold = [65536]
or smaller in size, and dump them to disk every
  aioSyncDelay = 10 (usec).

That is, if the system recognizes 3 or more small Direct IOs and 
dioSmallSeqWriteThreshold is set, it will gather all these IOs within 
aioSyncDelay and do just one IO (per FS Blocksize) instead of hundreds of 
small IOs. 
For certain use cases this can dramatically improve performance. 
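
The batching rule described above can be illustrated with a toy model. This is only a sketch and not how Spectrum Scale implements the feature: the Write type, the count_backend_ios function, the use of fixed time windows, and the 4 MiB file system block size are all assumptions made for the example. Only the default values 65536, 10 usec, and the "3 or more small IOs" rule come from the text.

```python
from dataclasses import dataclass


@dataclass
class Write:
    time_usec: float  # submission time of the write
    offset: int       # byte offset in the file
    length: int       # size of the write in bytes


def count_backend_ios(writes, threshold=65536, delay_usec=10.0,
                      block_size=4 * 1024 * 1024, min_small=3):
    """Toy model: count the IOs sent to the backend with batching enabled."""
    total = 0
    windows = {}
    for w in writes:
        if w.length > threshold:
            total += 1  # large writes are not batched in this model
            continue
        # Group small writes by the time window they fall into.
        windows.setdefault(int(w.time_usec // delay_usec), []).append(w)
    for batch in windows.values():
        if len(batch) < min_small:
            total += len(batch)  # too few small writes: issue them as-is
            continue
        # Batched case: one backend IO per file system block touched.
        blocks = set()
        for w in batch:
            first = w.offset // block_size
            last = (w.offset + w.length - 1) // block_size
            blocks.update(range(first, last + 1))
        total += len(blocks)
    return total


# 100 sequential 4k writes submitted within a single 10 usec window.
writes = [Write(i * 0.05, i * 4096, 4096) for i in range(100)]
print(len(writes), "application writes ->", count_backend_ios(writes), "backend IOs")
```

In this model the 100 small writes all land in one time window and one file system block, so they collapse into a single backend IO, while two isolated small writes would still be issued individually.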

See the presentation by Olaf Weiser:
https://www.spectrumscaleug.org/wp-content/uploads/2020/04/SSSD20DE-Spectrum-Scale-Performance-Enhancements-for-Direct-IO.pdf

 
Mit freundlichen Grüßen / Kind regards

Achim Rehor
 
Remote Technical Support Engineer Storage 
IBM Systems Storage Support - EMEA Storage Competence Center (ESCC)
Spectrum Scale / Elastic Storage Server
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
Am Weiher 24
65451 Kelsterbach
Phone: +49-170-4521194
E-Mail: Achim.Rehor at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland GmbH / Chairman of the Supervisory Board: Sebastian Krause
Management: Gregor Pillen (Chairman), Agnes Heftberger, Norbert
Janzen, Markus Koerner, Christian Noll, Nicole Reimer
Registered office: Ehningen / Court of registration: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940


gpfsug-discuss-bounces at spectrumscale.org wrote on 10/06/2021 15:42:18:

> From: Jim Doherty <jjdoherty at yahoo.com>
> To: "gpfsug-discuss at spectrumscale.org"
> <gpfsug-discuss at spectrumscale.org>
> Date: 10/06/2021 15:42
> Subject: [EXTERNAL] Re: [gpfsug-discuss] DB2 (not DB2 PureScale) and
> Spectrum Scale
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> 
> I think I found the document you are talking about. In general I 
> believe most of it still applies. I can make the following comments 
> on it about Spectrum Scale: 
> 1 - There was an effort to simplify Spectrum Scale tuning, and
> tuning of worker1Threads should be replaced by tuning workerThreads
> instead. Setting workerThreads will auto-tune about 20 different
> Spectrum Scale configuration parameters (including worker1Threads)
> behind the scenes.
> 2 - The Spectrum Scale pagepool parameter now defaults to 1 GB, but
> the most important thing is to make sure that you can fit all the IO
> into the pagepool. So if you have 512 threads * 1 MB you will need
> 1/2 GB just to do disk IO, but if you use 4 MB that becomes
> 512 * 4 MB = 2 GB just for disk IO. I would recommend setting the
> pagepool to 2x this size if you are using direct IO, so 1 GB or 4 GB
> for the example sizes I just mentioned.
> 3 - One important consideration is sizing the initial DB2 database
> correctly and, when the tablespace needs to grow, making sure it
> grows by enough to avoid constantly extending it. The act of growing
> a database throws GPFS into buffered IO, which can be slower than
> direct IO. If you need the database to grow all the time, I would
> avoid using direct IO and use a larger GPFS pagepool to allow it to
> cache data. Otherwise, using direct IO is the better solution.
> 
> Jim Doherty
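
The pagepool arithmetic in point 2 above can be written out as a small sketch. The function names are hypothetical, and treating MB and GB as binary units is an assumption; the 2x factor for direct IO and the 512-thread examples are taken from the message.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def min_pagepool_bytes(threads: int, io_size_bytes: int) -> int:
    """Space needed just to hold one IO per thread."""
    return threads * io_size_bytes


def suggested_pagepool_bytes(threads: int, io_size_bytes: int) -> int:
    """Twice the minimum, as suggested for direct IO in the message."""
    return 2 * min_pagepool_bytes(threads, io_size_bytes)


for io_size in (1 * MIB, 4 * MIB):
    minimum = min_pagepool_bytes(512, io_size) / GIB
    suggested = suggested_pagepool_bytes(512, io_size) / GIB
    print(f"512 threads x {io_size // MIB} MiB: "
          f"minimum {minimum:g} GiB, suggested {suggested:g} GiB")
```

The suggested value is a starting point for sizing, not a tuning rule; workloads that rely on caching rather than direct IO may want a larger pagepool.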
> 
> On Monday, June 7, 2021, 11:03:26 AM EDT, Wally Dietrich 
> <wallyd at us.ibm.com> wrote: 
> 
> Hi. Is there documentation about tuning DB2 to perform well when 
> using Spectrum Scale file systems? I'm interested in tuning both DB2
> and Spectrum Scale for high performance. I'm using a stretch cluster
> for Disaster Recovery (DR). I've found a document, but the last
> update was in 2013 and GPFS has changed considerably since then. 
> 
> Wally Dietrich
> wallyd at us.ibm.com
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss