Releases: CNugteren/CLTune
Version 2.7.0
- CLTune now automatically ensures global size is a multiple of the local workgroup size
- Added GetBestResult() to the tuner's API to retrieve the best parameters programmatically
- Changed std::initializer_list in the AddParameters API to std::vector
- Fixed a bug in the simulated annealing search method
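The global-size adjustment in version 2.7.0 can be pictured with a small rounding helper. This is only a sketch: the name RoundUpToMultiple is hypothetical, and it assumes the size is rounded up (not down) to the next multiple, which the notes above do not state.

```cpp
#include <cstddef>

// Hypothetical helper, not part of the CLTune API.
// Returns the smallest multiple of local_size that is >= global_size.
// Assumes local_size > 0.
std::size_t RoundUpToMultiple(std::size_t global_size, std::size_t local_size) {
  // Integer division truncates, so adding (local_size - 1) first rounds up.
  return ((global_size + local_size - 1) / local_size) * local_size;
}
```

For example, a global size of 1000 with a workgroup size of 64 becomes 1024, while a size that is already a multiple is left unchanged.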
Version 2.6.0
- Timing measurements now also include the (varying) kernel launch overhead
- OpenCL compiler options can now be set through the environment variable CLTUNE_BUILD_OPTIONS
- Added support for compilation under Visual Studio 2013 (MSVC++ 12.0)
- Added an option to build a static version of the library
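As an illustration of the CLTUNE_BUILD_OPTIONS entry for version 2.6.0, the sketch below reads an environment variable with std::getenv and appends its value to an existing option string. The function names are hypothetical, and how CLTune itself merges the variable with its own options is an assumption here, not something the notes specify.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: appends extra options (possibly null or empty)
// to a base option string, separated by a single space.
std::string AppendExtraOptions(const std::string& base, const char* extra) {
  if (extra == nullptr || *extra == '\0') return base;  // nothing to add
  if (base.empty()) return extra;
  return base + " " + extra;
}

// Hypothetical wrapper: std::getenv returns nullptr when the variable is unset.
std::string BuildOptionsWithEnv(const std::string& base) {
  return AppendExtraOptions(base, std::getenv("CLTUNE_BUILD_OPTIONS"));
}
```

With this sketch, starting a tuner binary with CLTUNE_BUILD_OPTIONS set to -cl-fast-relaxed-math would add that flag after the base options.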
Version 2.5.0
- Updated to version 8.0 of the CLCudaAPI header
- Made it possible to configure the number of times each kernel is run (to average results)
- Minor bugfixes
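The idea behind running each kernel several times, as added in version 2.5.0, can be sketched with host-side timing. AverageRuntimeMs is a hypothetical name, and the use of std::chrono wall-clock timing around a callable is an assumption made for illustration; it is not taken from the tuner's implementation.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: invokes run_kernel num_runs times and returns the
// mean wall-clock time per invocation in milliseconds (0.0 if num_runs is 0).
template <typename Kernel>
double AverageRuntimeMs(Kernel&& run_kernel, std::size_t num_runs) {
  double total_ms = 0.0;
  for (std::size_t i = 0; i < num_runs; ++i) {
    const auto start = std::chrono::steady_clock::now();
    run_kernel();
    const auto stop = std::chrono::steady_clock::now();
    total_ms += std::chrono::duration<double, std::milli>(stop - start).count();
  }
  return (num_runs > 0) ? total_ms / static_cast<double>(num_runs) : 0.0;
}
```

Averaging over more runs smooths out run-to-run noise at the cost of a proportionally longer tuning session.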
Version 2.4.0
- Made it possible to run the unit-tests independently of the provided OpenCL kernel samples
- Added an option to compile in verbose mode for additional diagnostic messages (-DVERBOSE=ON)
- Now using version 6.0 of the CLCudaAPI header
- Fixed the RPATH settings on OSX
- Added Appveyor continuous integration and increased coverage of the Travis builds
Version 2.3.1 (bug-fix release)
- Fixed a bug where an output buffer could not be used as input at the same time
- Fixed computing the validation error for half-precision fp16 data-types
Version 2.3.0
- Added support for 'short' and 'cl_half' data-types as kernel buffer and scalar arguments
- Fixed a bug where failed results would still show up in the tuning results
- Made MSVC link the run-time libraries statically
Version 2.2.0
- Added two new simpler samples of using the tuner (vector-add and convolution)
- Updated the general documentation
- Added API documentation
- Now using version 5.0 of the CLCudaAPI header
Version 2.1.0
- Added exports to be able to create a DLL on Windows (thanks to Marco Hutter)
- Added command-line OpenCL platform selection in the examples (thanks to William J Shipman)
Version 2.0.0
- Added support for machine learning models. These models can be trained on a small fraction of the tuning configurations and can be used to predict the remainder. Two models are supported:
  - Linear regression
  - A 3-layer neural network
- Now using version 4.0 of the CLCudaAPI header (previously known as Claduc)
- Added experimental support for CUDA kernels
- Added support for MSVC (Visual Studio) 2015
- Using Catch instead of GTest for unit-testing
- Various minor fixes
Version 1.7.0
- Now using the Claduc C++11 interface to OpenCL (see https://github.com/CNugteren/Claduc)
- Added a method to print all tuning results in JSON-format to file