The following selectors can be used to enable or disable benchmark options. The benchmark table is updated every time an option is toggled.
The search input accepts regular expressions, so you can use it to match specific patterns. For example, cos will match every benchmark that contains cos in its name, while a pattern anchored at both ends will strictly match the benchmark with that exact name.
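The difference between a substring pattern and an anchored pattern can be sketched as follows. This is only an illustration: the benchmark names, the ^cos$ pattern and the helper filter_names are hypothetical, and the search box may use a different regex engine than std::regex (which defaults to the ECMAScript grammar).

```cpp
#include <regex>
#include <string>
#include <vector>

// Keep the names in which the pattern matches anywhere (search semantics,
// not a full-string match), similar to how a filter box usually behaves.
std::vector<std::string> filter_names(const std::vector<std::string>& names,
                                      const std::string& pattern) {
    const std::regex re(pattern);  // ECMAScript grammar by default
    std::vector<std::string> kept;
    for (const auto& name : names) {
        if (std::regex_search(name, re)) {
            kept.push_back(name);
        }
    }
    return kept;
}

// Hypothetical benchmark names, for illustration only.
std::vector<std::string> sample_names() {
    return {"cos", "acos", "cosh", "sin"};
}
```

With these sample names, the pattern cos keeps cos, acos and cosh, whereas ^cos$ keeps only cos.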
data.cpe refers to the number of CPU cycles burnt to process one element (cycles per element, CPE). Lower is better.
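As a rough sketch of what this metric means, a CPE value is the total cycle count divided by the number of elements processed. How the cycles are actually measured is not described here, so the function below (a hypothetical helper, not part of bSIMD) simply takes both counts as inputs.

```cpp
#include <cassert>
#include <cstdint>

// Cycles per element: total cycles spent divided by the number of
// elements processed during the measured run.
double cycles_per_element(std::uint64_t total_cycles,
                          std::uint64_t element_count) {
    assert(element_count > 0 && "at least one element must be processed");
    return static_cast<double>(total_cycles) /
           static_cast<double>(element_count);
}
```

For instance, a made-up run that burns 2000 cycles over 1000 elements corresponds to 2.0 CPE.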
The benchmarks below are not meant to measure the performance of a specific processor, but rather how the bSIMD functions behave under the different SIMD extensions provided by hardware vendors. Therefore, processor specifications are not given below.
For example, take the dec function for floats on PowerPC. As SIMD registers on PowerPC can hold 4 floats, a speedup of 4 is expected between the version using PowerPC's SIMD extension and its standard counterpart (the one obtained by writing fp -= 1.0f). As can be seen in the table below, the bSIMD dec function takes 0.535153 CPEs while the usual standard one takes 2.13446 CPEs, a speedup of about 3.99.
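The comparison above can be sketched in C++ as follows. The scalar loop is the "standard" version built from fp -= 1.0f; the bSIMD version is deliberately not shown, since its API is not described here. The 128-bit register width is an assumption consistent with "4 floats per register", and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Scalar reference: decrement one float at a time.
void dec_scalar(float* data, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        data[i] -= 1.0f;
    }
}

// Ideal speedup: number of elements that fit in one SIMD register.
// Passing register_bits = 128 is an assumption (4 floats of 32 bits).
constexpr double ideal_speedup(std::size_t register_bits,
                               std::size_t element_bits) {
    return static_cast<double>(register_bits) /
           static_cast<double>(element_bits);
}

// Observed speedup: scalar CPE divided by SIMD CPE.
double observed_speedup(double scalar_cpe, double simd_cpe) {
    assert(simd_cpe > 0.0);
    return scalar_cpe / simd_cpe;
}
```

Here ideal_speedup(128, 32) gives 4, while observed_speedup(2.13446, 0.535153) gives a value just under 4 for the figures quoted above.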
The speedups given by the table below are expected to be the same on all hardware providing the same SIMD extensions, but this is not guaranteed by NUMSCALE. For example, if you have a PowerPC chip on which the code fp -= 1.0f takes 1.5 CPEs, then you can expect that the bSIMD version will take roughly 1.5 / (2.13446 / 0.535153) = 0.376 CPEs.
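The estimate above amounts to dividing the locally measured scalar CPE by the speedup taken from the table. A minimal sketch, assuming the reference speedup carries over to your chip (which, as noted, is not guaranteed) and using a hypothetical helper name:

```cpp
#include <cassert>

// Project the SIMD CPE on another chip from its measured scalar CPE,
// assuming the speedup observed on the reference chip carries over.
double projected_simd_cpe(double local_scalar_cpe,
                          double reference_scalar_cpe,
                          double reference_simd_cpe) {
    assert(reference_scalar_cpe > 0.0 && reference_simd_cpe > 0.0);
    const double speedup = reference_scalar_cpe / reference_simd_cpe;
    return local_scalar_cpe / speedup;
}
```

Calling projected_simd_cpe(1.5, 2.13446, 0.535153) reproduces the figure of roughly 0.376 CPEs given above.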
Note that obtaining reliable benchmark results can be tricky, as they depend heavily on the chip and compiler used. If you have any questions, feel free to contact us.