[gpfsug-discuss] IO500 SC’21 Call for Submission

Wed Sep 22 16:34:06 BST 2021

Stabilization period: Friday, 17th September - Friday, 1st October

Submission deadline: Monday, 1st November 2021 AoE

The IO500 [1] is now accepting and encouraging submissions for the 
upcoming 9th semi-annual IO500 list, in conjunction with SC'21. Once 
again, we are also accepting submissions to the 10 Node Challenge to 
encourage the submission of small-scale results. The new ranked lists 
will be announced via live-stream at a virtual session during "The IO500 
and the Virtual Institute of I/O" BoF [3]. We hope to see many new 
results.

What's New
Since ISC21, the IO500 has followed a two-stage approach. First, there
will be a two-week stabilization period during which we encourage the
community to verify that the benchmark runs properly on a variety of
storage systems. During this period the benchmark may be updated based
on feedback from the community. The final benchmark will then be
released. We expect that rules-compliant submissions made during the
stabilization period will be valid as final submissions unless a
significant defect is found.

We are now creating a more detailed schema to describe the hardware and
software of the system under test, and we are providing a first set of
tools to make it easier to capture this information for inclusion with
the submission. Further details will be released on the submission
page [2].

We are evaluating the inclusion of optional test phases for additional
key workloads: split easy/hard find phases, 4KB and 1MB random
read/write phases, and concurrent metadata operations. A run that
includes these phases is called an extended run. At the moment, we are
collecting this information to verify that the additional phases do not
significantly impact the results of the standard IO500 run. We
encourage every participant to submit results
from both a standard run and an extended run to facilitate comparisons 
between the existing and new benchmark phases. In a future release, we 
may include some or all of these results as part of the standard 
benchmark. The extended results are not currently included in the 
scoring of any ranked list.

Background
The benchmark suite is designed to be easy to run and the community has 
multiple active support channels to help with any questions. Please note 
that submissions of all sizes are welcome; the site has customizable 
sorting, so it is possible to submit on a small system and still get a 
very good per-client score, for example. Additionally, the list is about 
much more than just the raw rank; all submissions help the community by 
collecting and publishing a wider corpus of data. More details below.

Following the success of the Top500 in collecting and analyzing 
historical trends in supercomputer technology and evolution, the IO500 
was created in 2017, published its first list at SC17, and has grown 
continually since then. The need for such an initiative has long been 
known within High-Performance Computing; however, defining appropriate 
benchmarks has long been challenging. Despite this challenge, the 
community, after long and spirited discussion, finally reached a 
consensus on a suite of benchmarks and a metric for resolving the scores 
into a single ranking.

The multi-fold goals of the benchmark suite are as follows:
- Maximizing simplicity in running the benchmark suite
- Encouraging optimization and documentation of tuning parameters for
  performance
- Allowing submitters to highlight their "hero run" performance numbers
- Forcing submitters to simultaneously report performance for
  challenging IO patterns

Specifically, the benchmark suite includes a hero run of both IOR and
MDTest, configured however the submitter sees fit, to maximize
performance and establish an upper bound. It also includes an IOR and
MDTest run with highly constrained parameters that force a difficult
usage pattern, in an attempt to determine a lower bound. Finally, it
includes a
namespace search as this has been determined to be a highly sought-after 
feature in HPC storage systems that has historically not been 
well-measured. Submitters are encouraged to share their tuning insights 
for publication.
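
To illustrate how per-phase results can be resolved into a single
number, here is a minimal sketch. It assumes, based on our reading of
the published IO500 rules rather than on anything stated in this
announcement, that bandwidth phases (in GiB/s) and metadata phases (in
kIOP/s) are each combined with a geometric mean, and that the overall
score is the geometric mean of those two sub-scores. The function names
and the sample numbers are hypothetical; please check the rules on the
IO500 site [1] before relying on this.

```python
from math import prod


def geometric_mean(values):
    """Return the geometric mean of a non-empty sequence of positive numbers."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, and all must be positive")
    return prod(values) ** (1.0 / len(values))


def combined_score(bandwidth_gib_s, metadata_kiops):
    """Combine per-phase results into (bandwidth, metadata, overall) scores.

    Assumption: each sub-score is the geometric mean of its phases and the
    overall score is the geometric mean of the two sub-scores.
    """
    bw = geometric_mean(bandwidth_gib_s)
    md = geometric_mean(metadata_kiops)
    return bw, md, (bw * md) ** 0.5


# Hypothetical phase results, not taken from any real submission.
bw_phases = [4.0, 1.0, 8.0, 2.0]        # e.g. easy/hard write and read, GiB/s
md_phases = [100.0, 25.0, 50.0, 200.0]  # e.g. metadata and find phases, kIOP/s

bw, md, total = combined_score(bw_phases, md_phases)
print(f"bandwidth={bw:.2f} GiB/s  metadata={md:.2f} kIOP/s  score={total:.2f}")
```

A geometric mean means that one very fast phase cannot fully compensate
for a very slow one, which fits the goal of reporting hero-run and
challenging patterns together.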

The goals of the community are also multi-fold:
- Gather historical data for the sake of analysis and to aid
  predictions of storage futures
- Collect tuning data to share valuable performance optimizations
  across the community
- Encourage vendors and designers to optimize for workloads beyond
  "hero runs"
- Establish bounded expectations for users, procurers, and
  administrators

10 Node I/O Challenge
The 10 Node Challenge is conducted using the regular IO500 benchmark,
with the additional rule that exactly 10 client nodes must be used to
run the benchmark. You may use any shared storage with any number of
servers. When submitting for the IO500 list, you can opt in to
"Participate in the 10 compute node challenge only", in which case we
will not include the results in the ranked list. Other 10-node
submissions will be included in the full list and in the ranked list.
We will announce the 10 Node Challenge results in a separate derived
list and in the full list, but not on the ranked IO500 list [2].
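
As a rough illustration of the rule above, the following sketch shows
one reading of which lists a submission would appear on. The function
name, the list labels, and the decision logic are our own assumptions
for illustration; the submission page [2] remains the authority.

```python
def lists_for_submission(client_nodes, ten_node_only=False):
    """Return the set of lists a submission would appear on.

    Assumed reading of the announcement:
      - every valid submission appears on the full list;
      - a submission with exactly 10 client nodes also appears on the
        derived 10-node list;
      - a submission is ranked unless it opted in to
        "Participate in the 10 compute node challenge only".
    """
    if client_nodes < 1:
        raise ValueError("a run needs at least one client node")
    if ten_node_only and client_nodes != 10:
        raise ValueError("the 10 Node Challenge requires exactly 10 client nodes")

    lists = {"full"}
    if client_nodes == 10:
        lists.add("10-node")
    if not ten_node_only:
        lists.add("ranked")
    return lists


# Hypothetical submissions.
print(sorted(lists_for_submission(10, ten_node_only=True)))
print(sorted(lists_for_submission(10)))
print(sorted(lists_for_submission(64)))
```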

------------------------------------

Birds-of-a-feather
Once again, we encourage you to submit [2], to join our community, and 
to attend our BoF "The IO500 and the Virtual Institute of I/O" [3], 
where we will announce the new IO500 and 10 Node Challenge lists. The
current list includes results from 20 different storage system types
and 70 institutions. We hope that the upcoming list grows even more.

We look forward to answering any questions or concerns you might have.

[1] https://io500.org/
[2] https://io500.org/submission
[3] https://io500.org/pages/bof-sc21
-- 
The IO500 Committee