Performance of CORE-MATH functions (revision 81d5ea0) on an Intel Xeon Silver
4214, with GCC 13.2.0, compared to GNU libc 2.37, the Intel Math Library
from icx 2023.2.0 (with `-fp-model=strict -fno-fast-math`), and
LLVM libc (revision 099dbb):

- binary32 functions: reciprocal throughput, latency (with revision 81d5ea0 of CORE-MATH)
- binary64 functions: reciprocal throughput, latency (with revision 81d5ea0 of CORE-MATH)
- new C23 functions: reciprocal throughput, latency (with revision 65a4a9d of CORE-MATH)

Available CORE-MATH functions (under MIT license for the stand-alone functions):

| function | binary32 | binary64 | binary80 | binary128 |
|---|---|---|---|---|
| acos | code | code | glibc | glibc (1) |
| acosh | code | code | | |
| acospi | code | | | |
| asin | code | code | | |
| asinh | code | code | | |
| asinpi | code | | | |
| atan | code | code | | |
| atan2 | code | | | |
| atan2pi | code | | | |
| atanh | code | code | | |
| atanpi | code | | | |
| cbrt | code | code (proof) | glibc | glibc |
| cos | code | code | | |
| cosh | code | code | | |
| cospi | code | code | | |
| erf | code | code | | |
| erfc | code | code | | |
| exp | code | code | | |
| exp10 | code | code | | |
| exp10m1 | code | code | | |
| exp2 | code | code | | |
| exp2m1 | code | code | | |
| expm1 | code | code | | |
| hypot | code | code | | |
| log | code | code (with Gappa proof) | | |
| log10 | code | code | | |
| log10p1 | code | | | |
| log1p | code | code | | |
| log2 | code | code | | |
| log2p1 | code | code | | |
| pow | code | code | | |
| rsqrt | code | code | | |
| sin | code | code | | |
| sinh | code | code | | |
| sinpi | code | code | | |
| tan | code | code | | |
| tanh | code | code | | |
| tanpi | code | code | | |

Caption:

- **glibc**: patch for GNU libc (details here on how to use it)
- **llvm**: pointer to a correctly-rounded implementation in llvm-libc
- **proof**: a paper proof of the correctness of the algorithm
- **reserved**: this implementation is reserved for a CORE-MATH contributor
- (1) tested on hard-to-round cases |x| < 2^{-34}
- (2) only for rounding to nearest

Notes:

- correct rounding is claimed for all IEEE-754 rounding modes.
For univariate binary32 functions it is checked by exhaustive search.
For bivariate functions or
larger formats it is checked only with respect to known
hard-to-round cases. If you find an input which is not correctly
rounded, please send it to
`paul dot zimmermann @ inria dot fr`.
- these implementations only guarantee correct rounding for regular numbers; they do not necessarily handle NaN, infinities or zero correctly, nor do they necessarily set the underflow and overflow exceptions or the inexact flag.
- these implementations were tested on x86_64-linux, with and without the use of fma (fused multiply add).
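
The exhaustive check mentioned in the notes can be illustrated with a small sketch. This is not the CORE-MATH test harness: it is a minimal Python illustration, limited to rounding to nearest and to `cbrt`, chosen because correct rounding of a cube root can be decided exactly with rational arithmetic. All helper names (`f32_to_bits`, `bits_to_f32`, `to_f32`, `is_correctly_rounded_cbrt`, `count_failures`, `candidate_cbrt`) are hypothetical, and the candidate under test is simply a double-precision cube root rounded to binary32.

```python
import struct
from fractions import Fraction


def f32_to_bits(x):
    """Bit pattern of the binary32 value nearest to the Python float x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(u):
    """binary32 value with bit pattern u, returned as a Python float."""
    return struct.unpack("<f", struct.pack("<I", u))[0]


def to_f32(x):
    """Round a double to the nearest binary32 value."""
    return bits_to_f32(f32_to_bits(x))


def is_correctly_rounded_cbrt(x, y):
    """Exact check that y is the binary32 value nearest to cbrt(x).

    Assumes x and y are positive, normal binary32 values. y is accepted
    when x lies strictly between the cubes of the two midpoints around y.
    The inequalities are strict, so an exact tie would be reported as a
    failure rather than resolved by a tie-breaking rule.
    """
    lo = bits_to_f32(f32_to_bits(y) - 1)  # binary32 predecessor of y
    hi = bits_to_f32(f32_to_bits(y) + 1)  # binary32 successor of y
    m_lo = (Fraction(lo) + Fraction(y)) / 2
    m_hi = (Fraction(y) + Fraction(hi)) / 2
    return m_lo ** 3 < Fraction(x) < m_hi ** 3


def count_failures(candidate, first_bits, last_bits, stride=1):
    """Count inputs in a range of bit patterns that are not correctly rounded."""
    failures = 0
    for u in range(first_bits, last_bits + 1, stride):
        x = bits_to_f32(u)
        y = to_f32(candidate(x))
        if not is_correctly_rounded_cbrt(x, y):
            failures += 1
    return failures


def candidate_cbrt(x):
    """Candidate under test: cube root computed in double precision."""
    return x ** (1.0 / 3.0)


if __name__ == "__main__":
    # A real exhaustive run would cover every positive normal input,
    # i.e. bit patterns 0x00800000 to 0x7f7fffff; only 1000 are tried here.
    start = f32_to_bits(8.0)
    print(count_failures(candidate_cbrt, start, start + 999))
```

Enumerating inputs by bit pattern is what makes the search exhaustive: every binary32 value corresponds to exactly one 32-bit integer, so a plain integer loop visits each input once. For functions whose exact value is not rational, the reference would instead have to come from a multiple-precision evaluation, which this sketch does not attempt.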

Other correctly-rounded implementations:

- CRlibm: this is a mirror of the original CRlibm project (which is no longer available). CRlibm is no longer maintained, but contains very useful resources. It provides binary64 exp, log, cos, sin, tan, cospi, sinpi, tanpi, atan, atanpi, cosh, sinh, log2, log10, asin, acos, asinpi, acospi, expm1, log1p, exp2, pow (only for rounding to nearest for pow).
- RLIBM provides binary32 log, log2, log10, exp, exp2, exp10, sinh, cosh, sinpi, cospi, sin, cos, tan, atan, asin, acos for all rounding modes.
- LLVM-libc provides some correctly rounded functions (see the links in the table below); see also the claimed accuracy of LLVM functions.
- JuliaIntervals provides elementary functions with directed rounding wrapping CRlibm.

| function | binary32 | binary64 | binary80 | binary128 |
|---|---|---|---|---|
| acos | rlibm, llvm | crlibm | | |
| acosh | llvm | | | |
| acospi | | crlibm | | |
| asin | rlibm, llvm | crlibm | | |
| asinh | llvm | | | |
| asinpi | | crlibm | | |
| atan | rlibm, llvm | crlibm | | |
| atan2 | | | | |
| atanh | llvm | crlibm | | |
| atanpi | | crlibm | | |
| cbrt | | | | |
| cos | rlibm, llvm, llvm | crlibm | | |
| cosh | rlibm, llvm, llvm | crlibm | | |
| cospi | rlibm | crlibm | | |
| erf | llvm | | | |
| erfc | | | | |
| exp | rlibm, llvm, llvm | crlibm, llvm | | |
| exp10 | rlibm, llvm | llvm | | |
| exp2 | rlibm, llvm, llvm, llvm | crlibm, llvm | | |
| expm1 | llvm, llvm | crlibm, llvm | | |
| hypot | llvm, llvm | llvm | | |
| log | rlibm, llvm | crlibm, llvm | | |
| log10 | rlibm, llvm | crlibm, llvm | | |
| log1p | llvm | crlibm, llvm | | |
| log2 | rlibm, llvm | crlibm, llvm | | |
| pow | | crlibm (2) | | |
| sin | rlibm, llvm, llvm | crlibm | | |
| sincos | llvm | | | |
| sinh | rlibm, llvm, llvm | crlibm | | |
| sinpi | rlibm | crlibm | | |
| tan | rlibm, llvm, llvm | crlibm | | |
| tanh | llvm, llvm | | | |
| tanpi | | crlibm | | |