Recently I implemented in Vult a simpler way to measure the performance of the generated code. Running the command `make perf` generates code for all languages and runs it for most of the examples.
The results are displayed in ms/s (milliseconds taken to render one second of audio). For most of the examples, C++ with fixed-point is the fastest. In a few instances it is slower than C++ with floating-point. I’m still investigating these cases. LuaJIT is slightly slower than C++, and then comes JavaScript.
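As a rough illustration of what the ms/s figure means, here is a minimal C++ sketch of how such a number could be computed. This is not the code behind `make perf`; the function names and the `process` callable are hypothetical and only serve to show the arithmetic.

```cpp
#include <chrono>
#include <cstddef>

// Converts a measured wall-clock time into milliseconds per second of audio.
double ms_per_audio_second(double elapsed_ms, std::size_t samples, double sample_rate) {
    double audio_seconds = static_cast<double>(samples) / sample_rate;
    return elapsed_ms / audio_seconds;
}

// Times `samples` calls of `process` (a hypothetical callable that returns
// one audio sample) and reports the result in ms/s.
template <typename F>
double measure_ms_per_second(F process, std::size_t samples, double sample_rate) {
    volatile float sink = 0.0f; // keeps the compiler from removing the loop
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < samples; ++i) {
        sink = sink + process();
    }
    auto stop = std::chrono::steady_clock::now();
    double elapsed_ms =
        std::chrono::duration<double, std::milli>(stop - start).count();
    return ms_per_audio_second(elapsed_ms, samples, sample_rate);
}
```

For example, rendering 441000 samples at 44100 Hz is 10 seconds of audio, so a run that takes 10 ms scores 1 ms/s.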
When rendering the different implementations of saw wave oscillators we get the following result (MacBook Air with a 1.4 GHz Intel Core i5):
Algorithm | C++ float | C++ fixed | LuaJIT | JavaScript |
---|---|---|---|---|
Saw EPTR | 0.368 ms/s | 0.264 ms/s | 0.418 ms/s | 0.539 ms/s |
Saw PTR W=1 | 0.385 ms/s | 0.255 ms/s | 0.396 ms/s | 0.583 ms/s |
Saw PTR W=2 | 0.384 ms/s | 0.286 ms/s | 0.434 ms/s | 1.057 ms/s |
Saw R? | 0.821 ms/s | 0.284 ms/s | 0.446 ms/s | 1.523 ms/s |
Saw BLIT | 1.593 ms/s | 1.729 ms/s | 1.866 ms/s | 3.229 ms/s |
A few strange things can be spotted by looking at the table. The Saw PTR gets almost twice as slow in JavaScript when going from W=1 to W=2. I’m not going to investigate that one here. The more interesting one is the Saw R?: going from fixed-point to floating-point, it gets almost 3 times slower (0.284 ms/s vs. 0.821 ms/s).
The implementation (part of it) looks as follows:
After playing a bit I found the problem. The line `mem phase = (2.0 * inc + phase) % 2.0;` uses the `mod` operation on a floating-point number in order to wrap the phase. When this is converted to fixed-point (integers), the `mod` operation is performed on integers, which is much faster than on floating-point.
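To make the difference concrete, here is a rough C++ sketch of what the two versions of that line could look like. It is not the actual generated code: the Q16.16 fixed-point format, the `fix16` alias and the function names are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Assumption: Q16.16 fixed-point, i.e. the real value is raw / 65536.
using fix16 = int32_t;
constexpr fix16 FIX_TWO = 2 << 16; // 2.0 in Q16.16

// Floating-point version: `%` on floats becomes a call to fmod.
float wrap_float(float phase, float inc) {
    return std::fmod(2.0f * inc + phase, 2.0f);
}

// Fixed-point version: the same expression is a plain integer remainder.
fix16 wrap_fixed(fix16 phase, fix16 inc) {
    return (2 * inc + phase) % FIX_TWO;
}
```

With a phase of 1.5 and an increment of 0.5, both functions wrap to 0.5 (32768 in Q16.16), but only the first one has to go through `fmod`.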
By changing that line to the following lines:
we avoid calling `fmod`, which seems to be more expensive. This makes the fixed-point version a little bit slower, but the floating-point version becomes faster. The updated table after performing the change is as follows:
Algorithm | C++ float | C++ fixed | LuaJIT | JavaScript |
---|---|---|---|---|
Saw R? | 0.426 ms/s | 0.322 ms/s | 0.446 ms/s | 1.149 ms/s |
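For reference, one common way to wrap a phase without `fmod` is a comparison followed by a subtraction. The C++ sketch below shows that general technique under stated assumptions; it is not necessarily the exact change made to the oscillator, and the function name is hypothetical.

```cpp
// Wraps the phase into [0, 2) with a comparison instead of fmod.
// Assumes phase is already in [0, 2) and 0 <= inc < 1, so the sum stays
// below 4 and a single subtraction is enough.
float wrap_no_fmod(float phase, float inc) {
    float next = 2.0f * inc + phase;
    return next >= 2.0f ? next - 2.0f : next;
}
```

Unlike `fmod`, this only works when the increment is bounded as stated; a larger increment would need a loop or a different approach.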
Avoid using `%` (mod on floats) unless it is strictly necessary.
EPTR : Efficient Polynomial Transition Regions
PTR : Polynomial Transition Regions
BLIT : Band-Limited Impulse Train
R? : I don’t know the name of this algorithm
Among these algorithms, the only one that produces real band-limited waveforms is the BLIT.