This paper reports on an empirical performance analysis of
four modal theorem provers on benchmark suites of randomly generated
formulae.
The theorem provers tested are the Davis-Putnam-based procedure
KSAT, the tableaux-based system KRIS, the sequent-based Logics
Workbench, and a translation approach combined with the first-order
theorem prover SPASS.
Our benchmark suites are sets of multi-modal
formulae in conjunctive normal form, randomly
generated according to the scheme of Giunchiglia and
Sebastiani [GS96a, GS96b].
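The generation scheme can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: the parameter names (number of clauses L, propositional variables N, modalities M, clause length K, maximal modal depth D, and the probability P of an atom being propositional rather than modal) are assumptions about the scheme, and the concrete syntax is hypothetical.

```python
import random

# Hypothetical sketch of a random modal CNF generator in the style of
# Giunchiglia and Sebastiani. Parameter names and syntax are assumptions.

def random_atom(n_vars, n_mod, depth, k, p):
    # At modal depth 0, or with probability p, emit a propositional
    # variable; otherwise emit a boxed random clause one level deeper.
    if depth == 0 or random.random() < p:
        return f"p{random.randint(1, n_vars)}"
    i = random.randint(1, n_mod)  # choose one of M modalities
    return f"[{i}]{random_clause(n_vars, n_mod, depth - 1, k, p)}"

def random_clause(n_vars, n_mod, depth, k, p):
    # A clause is a disjunction of k literals, each negated with
    # probability 1/2.
    lits = []
    for _ in range(k):
        sign = "" if random.random() < 0.5 else "~"
        lits.append(sign + random_atom(n_vars, n_mod, depth, k, p))
    return "(" + " | ".join(lits) + ")"

def random_cnf(n_clauses, n_vars, n_mod, depth, k, p):
    # A benchmark formula is a conjunction of L random clauses.
    return " & ".join(random_clause(n_vars, n_mod, depth, k, p)
                      for _ in range(n_clauses))
```

Under this sketch, a benchmark suite is obtained by fixing all parameters except one (typically the number of clauses) and sampling many formulae per parameter setting, so that satisfiability status varies across the suite.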
We investigate the quality of the random modal formulae and show
that the scheme has some shortcomings, which may lead to mistaken
conclusions.
We propose improvements to the evaluation method and show that the
translation approach provides a viable alternative to the other
approaches.