About the settings

Use the following selectors to enable or disable benchmark options. The benchmark table is updated each time an option is toggled.

The search input accepts regular expressions, so you can use it to match specific patterns. For example, searching for cos matches every benchmark whose name contains cos (such as acos or cosh), whereas ^cos$ matches only the benchmark named exactly cos.
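The filtering behaviour described above can be sketched as follows. This is only an illustration, assuming the search input tests the pattern anywhere in the name (as the cos example implies); the benchmark names and the filter_benchmarks helper are hypothetical and not part of bSIMD or of this page's actual implementation.

```python
import re

# Hypothetical benchmark names, for illustration only.
names = ["cos", "acos", "cosh", "sin", "dec"]

def filter_benchmarks(pattern, names):
    """Return the names in which the pattern matches anywhere."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]

print(filter_benchmarks("cos", names))    # ['cos', 'acos', 'cosh']
print(filter_benchmarks("^cos$", names))  # ['cos']
```

The anchors ^ and $ pin the match to the start and end of the name, which is why the second pattern excludes acos and cosh.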

Concerning benchmark results

The data.cpe value is the number of CPU cycles spent processing one element (cycles per element, or CPE). Lower is better.

The benchmarks below are not meant to measure the performance of a specific processor, but rather to show how the bSIMD functions behave under the different SIMD extensions provided by hardware vendors. For this reason, processor specifications are not given below.

For example, take the dec function for floats on PowerPC. Since a SIMD register on PowerPC holds 4 floats, a speedup of about 4 is expected between the version using PowerPC's SIMD extension and its standard scalar counterpart (the one obtained by writing fp -= 1.0f). As the table below shows, the SIMD dec function takes 0.535153 CPE while the standard one takes 2.13446 CPE, a speedup of about 3.99.
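As a quick check of the figures quoted above, the following sketch compares the measured speedup with the ideal one. The constants are taken from the text; the variable and function names are chosen here for illustration only.

```python
# Figures quoted in the text for the float dec function on PowerPC.
SCALAR_CPE = 2.13446      # standard version: fp -= 1.0f
SIMD_CPE = 0.535153       # bSIMD dec using the SIMD extension
FLOATS_PER_REGISTER = 4   # ideal speedup, as stated in the text

def speedup(scalar_cpe, simd_cpe):
    """Ratio of scalar cost to SIMD cost, both in cycles per element."""
    return scalar_cpe / simd_cpe

measured = speedup(SCALAR_CPE, SIMD_CPE)
efficiency = measured / FLOATS_PER_REGISTER

print(f"measured speedup: {measured:.2f}")       # 3.99
print(f"fraction of ideal: {efficiency:.1%}")    # 99.7%
```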

The speedups given in the table below are expected to be the same on all hardware providing the same SIMD extensions, but NUMSCALE does not guarantee this. For example, if you have a PowerPC chip on which the code fp -= 1.0f takes 1.5 CPE, you can expect the bSIMD version to take roughly 1.5 / (2.13446 / 0.535153) ≈ 0.376 CPE.
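The estimate above can be reproduced for any scalar measurement with a small helper. This is a rough sketch, assuming the speedup carries over unchanged to your chip, which, as noted, is not guaranteed; the projected_simd_cpe function is hypothetical and named here only for illustration.

```python
def projected_simd_cpe(scalar_cpe_on_target, ref_scalar_cpe, ref_simd_cpe):
    """Estimate the SIMD cost on a target chip from a reference speedup.

    Assumes the target chip achieves the same scalar-to-SIMD speedup
    as the reference measurements.
    """
    reference_speedup = ref_scalar_cpe / ref_simd_cpe
    return scalar_cpe_on_target / reference_speedup

# Scalar code takes 1.5 CPE on the target chip; reference figures from the text.
estimate = projected_simd_cpe(1.5, 2.13446, 0.535153)
print(f"{estimate:.3f}")  # 0.376
```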

Note that obtaining reliable benchmark results can be tricky, as they depend heavily on the chip and compiler used. If you have any questions, feel free to contact us.