Run `bash make.sh` to build; modify the script for non-gcc or non-AVX2 platforms. The interface to this library is not stable yet. Potential future changes include the naming of the bitwise operators.
This library aims to provide functions for operating on vectors of numbers. The function signatures are based on BLAS Level 1 functions and take the form:
BMAS_{ITYPE}{name}(long n, {type* ptr, long inc_ptr}*)
- ITYPE can be one of `s d i8 i16 i32 i64 u8 u16 u32 u64`
- n - number of elements to operate upon
- ptr, inc_ptr - one or more pairs of pointer to vector, and stride
For example, the function `BMAS_ssin(n, float* in, long inc_in, float* out, long inc_out)` calculates the sine of the single-precision floating-point numbers in the vector defined by (in, inc_in) and stores the result in the vector defined by (out, inc_out).
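To make the (pointer, stride) convention concrete, here is a minimal scalar sketch of a strided element-wise add. The name `ref_sadd` is hypothetical and is not part of BMAS, and the sketch assumes that strides count elements rather than bytes, as they do in BLAS Level 1; check bmas.h for the actual signatures.

```c
#include <assert.h>

/* Hypothetical scalar reference, NOT part of BMAS.
 * Adds n single-precision floats read from (x, incx) and (y, incy)
 * and writes the sums to (out, inc_out).
 * Assumption: strides are measured in elements, as in BLAS Level 1. */
void ref_sadd(long n,
              const float *x, long incx,
              const float *y, long incy,
              float *out, long inc_out)
{
    assert(n >= 0);
    for (long i = 0; i < n; i++) {
        /* The i-th logical element lives i * stride slots from the base pointer. */
        out[i * inc_out] = x[i * incx] + y[i * incy];
    }
}
```

With a stride of 2 on `x`, for instance, the sketch reads every other element of the underlying array, which is how one would walk a column of a row-major two-column matrix.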
Exceptions to this pattern include the `BMAS_cast_{ITYPE}{OTYPE}` functions, which convert from ITYPE to OTYPE.
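As a rough illustration of what such a conversion does, the following scalar sketch converts 32-bit integers to doubles over strided vectors. The function `ref_cast_i32d` is hypothetical (it only mimics the naming pattern), and its argument order is an assumption modelled on the general signature above, not taken from bmas.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference, NOT part of BMAS.
 * Converts n int32 values read from (in, inc_in) into doubles
 * written to (out, inc_out). Strides are assumed to count elements. */
void ref_cast_i32d(long n,
                   const int32_t *in, long inc_in,
                   double *out, long inc_out)
{
    assert(n >= 0);
    for (long i = 0; i < n; i++) {
        /* Every int32 value is exactly representable as a double. */
        out[i * inc_out] = (double)in[i * inc_in];
    }
}
```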
See bmas.h for the list of currently provided functions.
Correctness is checked by the tests in numericals. The bitwise operators are untested at the moment.
The actual computation is done by SIMD functions built on hardware SIMD instructions, SLEEF, and/or libmvec*. These are defined inside the simd directory.
*Currently, libmvec is used for single-float sine and cosine on AVX2, since its versions were found to be faster than their SLEEF counterparts.
- gcc uses arithmetic shift on signed values and logical shift on unsigned values.
- `long` has been used as equivalent to 8 bytes; perhaps this will be fixed in the future to account for machine- and OS-specific sizes.
- SSE and AVX512 support exists only to a limited extent, due to limited developer time.
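The first two notes can be checked with a few lines of C. This is only a sketch with hypothetical helper names; it assumes gcc (or a compiler with the same behaviour), since right-shifting a negative signed value is implementation-defined in C.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stops the build on platforms where long is not 64 bits wide
 * (for example 64-bit Windows, where long is 4 bytes). */
_Static_assert(sizeof(long) * CHAR_BIT == 64,
               "this sketch assumes a 64-bit long");

/* Signed right shift: gcc shifts in copies of the sign bit (arithmetic shift). */
int32_t shift_right_signed(int32_t x, int k)
{
    assert(k >= 0 && k < 32);
    return x >> k;
}

/* Unsigned right shift: always shifts in zeros (logical shift). */
uint32_t shift_right_unsigned(uint32_t x, int k)
{
    assert(k >= 0 && k < 32);
    return x >> k;
}
```

Using a fixed-width type such as `int64_t` in place of `long` is one possible way to remove the platform dependence mentioned above.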
BMAS_cast_{ITYPE}{OTYPE}
BMAS_{TYPE}copy
From \ To | float32 | float64 | int64 | int32 | int16 | int8 | uint64 | uint32 | uint16 | uint8 |
---|---|---|---|---|---|---|---|---|---|---|
float32 | + | + | - | - | - | - | - | - | - | - |
float64 | + | + | - | - | - | - | - | - | - | - |
int64 | + | + | + | - | - | - | - | - | - | - |
int32 | + | + | - | + | - | - | - | - | - | - |
int16 | + | + | - | - | + | - | - | - | - | - |
int8 | + | + | - | - | - | + | - | - | - | - |
uint64 | + | + | - | - | - | - | + | - | - | - |
uint32 | + | + | - | - | - | - | - | + | - | - |
uint16 | + | + | - | - | - | - | - | - | + | - |
uint8 | + | + | - | - | - | - | - | - | - | + |
Function \ Data type | float32 | float64 | int64 | int32 | int16 | int8 | uint64 | uint32 | uint16 | uint8 |
---|---|---|---|---|---|---|---|---|---|---|
add | + | + | + | + | + | + | - | - | - | - |
sub | + | + | + | + | + | + | - | - | - | - |
mul | + | + | + | + | + | + | + | + | + | + |
div | + | + | - | - | - | - | - | - | - | - |
abs (also fabs below) | - | - | + | + | + | + | - | - | - | - |
dot | + | + | + | + | + | + | - | - | - | - |
min | + | + | + | + | + | + | + | + | + | + |
max | + | + | + | + | + | + | + | + | + | + |
sum (horizontal) | + | + | + | + | + | + | - | - | - | - |
hmin (horizontal) | + | + | + | + | + | + | + | + | + | + |
hmax (horizontal) | + | + | + | + | + | + | + | + | + | + |
himin (index of min) | + | + | + | + | + | + | + | + | + | + |
himax (index of max) | + | + | + | + | + | + | + | + | + | + |
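For readers unfamiliar with the horizontal operations, the sketch below shows the intended meaning of a strided dot product and an index-of-minimum in plain scalar C. Both function names are hypothetical, and several details are assumptions rather than facts about BMAS: that results are returned by value, that the index is a logical element index (not a memory offset), that ties resolve to the first occurrence, and that an empty vector yields -1.

```c
#include <assert.h>

/* Hypothetical scalar reference, NOT part of BMAS: strided dot product. */
float ref_sdot(long n, const float *x, long incx,
               const float *y, long incy)
{
    assert(n >= 0);
    float acc = 0.0f;
    for (long i = 0; i < n; i++) {
        acc += x[i * incx] * y[i * incy];
    }
    return acc;
}

/* Hypothetical scalar reference, NOT part of BMAS: logical index of the
 * first minimum element of (x, incx), or -1 if the vector is empty. */
long ref_shimin(long n, const float *x, long incx)
{
    assert(n >= 0);
    if (n == 0) {
        return -1;
    }
    long best = 0;
    for (long i = 1; i < n; i++) {
        if (x[i * incx] < x[best * incx]) {
            best = i;
        }
    }
    return best;
}
```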
Function \ Data type | float32 | float64 | int64 | int32 | int16 | int8 | uint64 | uint32 | uint16 | uint8 |
---|---|---|---|---|---|---|---|---|---|---|
lt | + | + | + | + | + | + | + | + | + | + |
le | + | + | + | + | + | + | + | + | + | + |
eq | + | + | + | + | + | + | + | + | + | + |
neq | + | + | + | + | + | + | + | + | + | + |
gt | + | + | + | + | + | + | + | + | + | + |
ge | + | + | + | + | + | + | + | + | + | + |
Function \ Data type (Bitwise) | float32 | float64 | int64 | int32 | int16 | int8 | uint64 | uint32 | uint16 | uint8 |
---|---|---|---|---|---|---|---|---|---|---|
not | - | - | + | + | + | + | + | + | + | + |
and | - | - | + | + | + | + | + | + | + | + |
or | - | - | + | + | + | + | + | + | + | + |
xor | - | - | + | + | + | + | + | + | + | + |
andnot | - | - | + | + | + | + | + | + | + | + |
sll | - | - | - | - | - | - | + | + | + | + |
srl | - | - | - | - | - | - | + | + | + | + |
sra | - | - | + | + | + | + | - | - | - | - |
Function \ Data type | float32 | float64 |
---|---|---|
fabs | + | + |
trunc | + | + |
floor | + | + |
ceil | + | + |
round | + | + |
---------------------- | :-------: | :-------: |
sin | + | + |
cos | + | + |
tan | + | + |
asin | + | + |
acos | + | + |
atan | + | + |
sinh | + | + |
cosh | + | + |
tanh | + | + |
asinh | + | + |
acosh | + | + |
atanh | + | + |
---------------------- | :-------: | :-------: |
pow | + | + |
atan2 | + | + |
log | + | + |
log2 | + | + |
log10 | + | + |
log1p | + | + |
exp | + | + |
exp2 | + | + |
exp10 | + | + |
expm1 | + | + |
---------------------- | :-------: | :-------: |
I am not primarily a C developer, and thus might not abide by C conventions. A simple example is that I'm using a `make.sh` instead of a `Makefile`. This may change once (re)learning these things becomes a higher priority for me.
This repository includes code from (or generated by) SLEEF and code from Stack Overflow.
SLEEF code included in ./simd/sleef/ and ./sleefinline_purec_scalar.h is licensed under the Boost Software License, Version 1.0:
Boost Software License - Version 1.0 - August 17th, 2003
Permission is hereby granted, free of charge, to any person or organization obtaining a copy of the software and accompanying documentation covered by this license (the "Software") to use, reproduce, display, distribute, execute, and transmit the Software, and to prepare derivative works of the Software, and to permit third-parties to whom the Software is furnished to do so, all subject to the following:
The copyright notices in the Software and this entire statement, including the above license grant, this restriction and the following disclaimer, must be included in all copies of the Software, in whole or in part, and all derivative works of the Software, unless such copies or derivative works are solely in the form of machine-executable object code generated by a source language processor.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Code included from stackoverflow includes (but might not be limited to):
Stack Overflow material is usually under CC BY-SA 3.0 or 4.0. However, also see "The MIT License – Clarity on Using Code on Stack Overflow and Stack Exchange".
Your best bet might be to contact wim and Z boson. But, thank you wim and Z boson!
Other projects face the same potential issues.
In case you discover something I have included but not attributed, I'd be glad to have it pointed out in an issue!