Writing a runscript

A runscript is a YAML file that defines the benchmark. It is validated with pykwalify against the schema in src/benchmarktool/runscript/schema.yml; example runscripts can be found in benchmarks/examples/*/runscript.yml. Here's an overview of the keys (a skeletal runscript follows the list):

  • base_dir The base directory of the benchmark. Example: benchmarks/examples/clasp.

  • output_dir The directory where the output (scripts and results) is generated. The path is relative to base_dir. Example: output.

  • machines Description of the machines used for the benchmarks, such as their CPU and memory capabilities. Currently not used.

  • configs Description of the configurations used to run the systems, i.e. templates for the generated run scripts (e.g. shell scripts).

  • systems Description of the systems (tools/solvers) against which the benchmark instances will be run.

  • system.measures Module path, relative to base_dir, to the Python function of the result parser used to measure this system's results. For example, a value of resultparser.sudokuresultparser will use the function sudokuresultparser defined in [base_dir]/resultparser.py.

  • system.settings Description of the system's various settings, such as command-line options, tags, etc.

  • jobs Description of the various jobs, including their type and resource limits.

  • benchmarks Description of the benchmark instances, given as specifications. Currently the specification type can be either folder or doi.

  • projects Description of the projects, each of which can consist of many benchmark runs selected by tags (runtags) or by manual specifications (runspecs).
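
As an illustration, a skeletal runscript using the keys above might look like the following. This is only a sketch: the exact nesting and required fields for each key are defined by src/benchmarktool/runscript/schema.yml, so consult the runscripts under benchmarks/examples/ for complete, validated examples.

# Skeleton only -- nesting and required fields are defined by schema.yml;
# see benchmarks/examples/*/runscript.yml for complete runscripts.
base_dir: benchmarks/examples/clasp   # base directory of the benchmark
output_dir: output                    # scripts and results go here, relative to base_dir
machines:          # machine descriptions (CPU, memory); currently unused
configs:           # templates for the generated run scripts (e.g. shell scripts)
systems:           # tools/solvers, each with its measures (result parser) and settings
jobs:              # job types and resource limits
benchmarks:        # instance specifications of type folder or doi
projects:          # benchmark runs selected by runtags or by manual runspecs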


Generating scripts

# pipenv shell
./bgen benchmarks/.../runscript.yml

Shortly after running the command, the scripts will be generated in $(base_dir)/$(output_dir)/$(project)/$(machine) according to the runscript.

The structure of the generated scripts depends on the job, but it will generally look like this:

$(base_dir)/$(output_dir)/$(project)/$(machine)/results/
├── start.py
└── $(benchmark)
    └── $(system)-$(system.version)-$(system.setting)-n$(system.setting.proc)
        └── $(instance_file_name)
            └── run$(run_number)
                └── start.sh

Running the benchmark

Depending on the job, there should be a single entry point for running the benchmark under $(base_dir)/$(output_dir). For example, a Sequential Job will generate a start.py. Just run this script to run the whole benchmark.

./benchmarks/.../output/.../start.py

Evaluating the results

# pipenv shell
./beval benchmarks/.../runscript.yml > evaluation.xml

This script collects statistics from the runs, such as time and memory usage, errors, etc., according to the result parser defined in the system's measures.


Summarizing evaluations

./bconv -c < evaluation.xml > result.csv
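
This converts the XML evaluation produced by beval into a CSV summary (result.csv here) that can be opened in a spreadsheet.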