Performance Tips: the cost of allocation

Looking at the performance results of algorithms using arrays I noticed something very drastic. LuaJIT was terribly slow compared to the other languages.

Algorithm C++ float C++ fixed LuaJIT JavaScript
Ladder Euler 4.160 ms/s 5.600 ms/s 306.951 ms/s 20.63 ms/s
Ladder Heun 6.480 ms/s 10.210 ms/s 855.072 ms/s 64.650 ms/s
Rescomb 2.840 ms/s 1.870 ms/s 2.286 ms/s 9.01 ms/s
Delay 2.210 ms/s 9.340 ms/s 3.130 ms/s 6.370 ms/s

All these examples use arrays. The main difference is that the Ladder examples use small local arrays to functions, while the other examples use large arrays as mem variables. Here’s a snippet of the code:

fun euler(input, cut, res) { mem fh; mem p[4]; val dpt[4]; /// local array if(Util.change(cut)) { fh = tune(cut); } _ = ladder_step(input, fh, res, p, dpt); p[0] = p[0] + dpt[0]; p[1] = p[1] + dpt[1]; p[2] = p[2] + dpt[2]; p[3] = p[3] + dpt[3]; return p[3]; }

The Lua code uses the LuaJIT FFI to create C arrays. The motivation of using it was because, as show here, you can get better performance with C arrays compares to Lua arrays. This is because Lua does not have native arrays, they are tables indexed with integers.

For the line val dpt[4]; the generated Lua code looks as follows:

dpt = ffi.new("double[?]",4)

It looks like every time a local array is used, even for the simplest thing, a new object is allocated. When the object is not used, the memory needs to be collected. In comparison to C++, the code will look as follows:

double dpt[4];

The first thing I tried was changing the Lua code to use regular Lua arrays instead of the FFI.

Algorithm LuaJIT FFI arrays LuaJIT regular arrays
Ladder Euler 306.951 ms/s 3.541 ms/s
Ladder Heun 855.072 ms/s 7.193 ms/s
Rescomb 2.286 ms/s 3.794 ms/s
Delay 3.130 ms/s 3.715 ms/s

Even though for large arrays the performance we pay a little bit in performance, the overall gain is bigger.

Conclusion

When using LuaJIT, allocating small and short lived objects using FFI has very large impact in the performance of the code.