The speed-up clearly depends on the amount of C-ification and on the statistical importance of C-ified code in the execution profile of a program (see figure 1). We have observed speed increases of 10-20% for programs that take only moderate advantage of C-ified code. As these programs spend only 20-30% of their time in C-ified sequences, performance is expected to scale correspondingly when we extend this approach to the full BinProlog instruction set and implement low-level gcc direct jumps instead of function calls and anti-calls.
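One way to read these figures is through Amdahl's law. This is a back-of-the-envelope model that the text does not itself invoke, and the function names `overall_speedup` and `local_speedup` are hypothetical. The sketch assumes that the quoted 20-30% is the fraction of the original run time spent in sequences that later get C-ified, and it pairs the low ends and the high ends of the two quoted ranges purely for illustration.

```python
def overall_speedup(f, s):
    """Amdahl's law: whole-program speed-up when a fraction f of the
    original run time is made s times faster."""
    return 1.0 / ((1.0 - f) + f / s)


def local_speedup(f, overall):
    """Invert Amdahl's law: speed-up s of the accelerated fraction f
    needed to observe the given whole-program speed-up."""
    remaining = 1.0 / overall - (1.0 - f)
    if remaining <= 0:
        raise ValueError("overall speed-up is unreachable for this fraction")
    return f / remaining


# Assumed pairings of the quoted ranges: (fraction of time, overall speed-up).
for f, overall in [(0.2, 1.1), (0.3, 1.2)]:
    print(f, overall, round(local_speedup(f, overall), 2))
```

Under these assumptions the C-ified sequences themselves would be running roughly 1.8 to 2.25 times faster than their emulated counterparts. In this model that is also the ceiling the whole-program speed-up approaches as the C-ified fraction grows toward 1.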
Figure 1: Performance of emulated (emBP) and partially C-ified BinProlog 3.22 (C-BP) compared to emulated (emSP) and native (natSP) SICStus 2.1_9 on a Sparc 10/20.
Code sizes for C-ified BinProlog executables (dynamically linked on Sparcs with Solaris 2.3) are usually even smaller than `compact' SICStus code, which uses classical instruction folding (a few hundred opcodes) to speed up the emulator.
The following table shows how code size and execution speed vary with the threshold for the semi-ring (SEMI3) benchmark. Clearly, excessively small chunks can adversely affect not only size but also speed. A threshold of about 20 looks like a practical optimum for this program.
threshold:    0      4      8      20     30     1000   emBP   emSP   natSP
size (K):     34.5   32.2   29.9   16.3   13.1   12.9   4.8    22.0   31.9
speed (ms):   1480   1430   1440   1450   1810   1790   1800   1810   1310
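The trade-off in the table can be made explicit with a small selection rule: among the thresholds whose run time is close to the best one, take the one with the smallest code. The sketch below applies it to the six C-BP columns. The 2% tolerance and the function name `pick_threshold` are assumptions introduced here for illustration, not part of the measurements.

```python
# Measurements copied from the SEMI3 table: threshold -> (size in K, time in ms).
SEMI3 = {
    0: (34.5, 1480),
    4: (32.2, 1430),
    8: (29.9, 1440),
    20: (16.3, 1450),
    30: (13.1, 1810),
    1000: (12.9, 1790),
}


def pick_threshold(rows, tolerance=0.02):
    """Return the threshold with the smallest code size among those whose
    run time is within `tolerance` (relative) of the fastest run."""
    best_time = min(time for _, time in rows.values())
    near_best = {
        threshold: size
        for threshold, (size, time) in rows.items()
        if time <= best_time * (1 + tolerance)
    }
    return min(near_best, key=near_best.get)


print(pick_threshold(SEMI3))  # prints 20
```

With the assumed 2% tolerance the rule selects threshold 20, which runs about 1.4% slower than the fastest setting (threshold 4) at roughly half the code size. With a tolerance of zero it would select threshold 4 instead.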