FLAME GPU 2.0.0-alpha.1
FLAME GPU 2.0.0-alpha.1 is the second alpha pre-release of FLAME GPU 2.0.0.
As an alpha release, the API cannot be considered stable: there will likely be breaking changes before the first stable 2.0.0 release, although we aim to minimise the number of such changes.
FLAME GPU 2 is a complete rewrite of FLAME GPU 1, using modern templated CUDA C++, with both CUDA C++ and Python interfaces available (the Python interface requires NVRTC).
This alpha release requires:

- CUDA >= 10.0, including `nvrtc`, and a CUDA-capable NVIDIA GPU (Compute Capability >= 35)
- A C++14 host compiler compatible with your CUDA installation, i.e. GCC or MSVC depending on platform
- CMake >= 3.18
- git
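The version minima above can be checked mechanically. As a small illustrative sketch (not part of FLAME GPU's tooling), dotted version strings compare naturally once parsed into integer tuples:

```python
def meets_minimum(version, minimum):
    """Compare dotted version strings numerically, e.g. '3.18.4' >= '3.18'."""
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(version) >= parse(minimum)

# e.g. the CMake >= 3.18 requirement above
print(meets_minimum("3.18.4", "3.18"))  # True
print(meets_minimum("3.16.3", "3.18"))  # False
```

Tuple comparison avoids the classic pitfall of comparing version strings lexically, where "3.9" would sort after "3.18".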
For full version requirements, please see the Requirements section of the README.
Documentation and Support
Installing Pre-compiled Python Binary Wheels
To simplify use of the python binaries, Python wheels are offered with this release.
These are not yet available through a distribution channel.
To install pyflamegpu 2.0.0a1, download the appropriate `.whl` file for your platform and install it into your python environment using pip, i.e.:

python3 -m pip install --user pyflamegpu-2.0.0a1+cuda112-cp39-cp39-linux_x86_64.whl

CUDA 11.0 or CUDA 11.2 (or newer), including `nvrtc`, must be installed on your system, which must contain a Compute Capability 3.5 or newer NVIDIA GPU.
Python binary wheels are available for x86_64 systems with:

- Linux with glibc >= 2.17 (i.e. Ubuntu >= 13.04, CentOS/RHEL >= 7, etc.)
- Windows 10
- Python 3.6, 3.7, 3.8 or 3.9
- CUDA 11.0 or 11.2+
- Built with support for Compute Capabilities 35 52 60 70 80
- Visualisation enabled or disabled

Note that Linux wheels do not package shared object dependencies at this time.
Wheel filenames are of the format `pyflamegpu[_vis]-2.0.0a1+cuda<CUDA>-cp<PYTHON>-cp<PYTHON>-<platform>.whl`, where:

- `_vis` indicates visualisation support is included
- `+cuda<CUDA>` encodes the CUDA version used
- `cp<PYTHON>` identifies the python version
- `<platform>` identifies the OS/CPU architecture

For example:

- `pyflamegpu-2.0.0a1+cuda110-cp38-cp38-linux_x86_64.whl` is a non-visualisation build, built with CUDA 11.0, for python 3.8 on Linux x86_64.
- `pyflamegpu_vis-2.0.0a1+cuda112-cp39-cp39-win_amd64.whl` is a visualisation-enabled build, built with CUDA 11.2, for python 3.9 on Windows 64-bit.
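To make the naming scheme concrete, the components above can be decomposed with a short script. The regular expression below is purely illustrative (it is not part of pyflamegpu or pip, which use the standard wheel tag rules):

```python
import re

# Illustrative pattern for the filename scheme described above:
# pyflamegpu[_vis]-2.0.0a1+cuda<CUDA>-cp<PYTHON>-cp<PYTHON>-<platform>.whl
WHEEL_RE = re.compile(
    r"pyflamegpu(?P<vis>_vis)?-2\.0\.0a1\+cuda(?P<cuda>\d+)"
    r"-cp(?P<py>\d+)-cp\d+-(?P<platform>[\w.]+)\.whl"
)

def describe_wheel(filename):
    """Break a pyflamegpu wheel filename into its labelled components."""
    m = WHEEL_RE.fullmatch(filename)
    if m is None:
        raise ValueError(f"Unrecognised wheel filename: {filename}")
    return {
        "visualisation": m.group("vis") is not None,
        "cuda": m.group("cuda"),          # e.g. "110" -> CUDA 11.0
        "python": m.group("py"),          # e.g. "38"  -> CPython 3.8
        "platform": m.group("platform"),  # e.g. "linux_x86_64"
    }

print(describe_wheel("pyflamegpu-2.0.0a1+cuda110-cp38-cp38-linux_x86_64.whl"))
```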
Building FLAME GPU from Source
For instructions on building FLAME GPU from source, please see the Building FLAME GPU section of the README.
Deprecated Requirement Versions
Several CUDA and C++ compiler versions currently work but are marked as deprecated, with support to be removed in a future pre-release version of FLAME GPU.
This is due to planned adoption of C++17 features.
C++14 support is deprecated and will be removed in a future release.
This means that the following compiler/software versions are deprecated:

- CUDA >= 10.0 && < 11.0
- C++14 host compilers:
  - GCC >= 6 && < 7
  - Visual Studio 2017 (may or may not work)
Known Issues
There are known issues with this release of FLAME GPU 2, which will be fixed where possible in future releases.
For a full list of issues please see the Issue Tracker.
- Performance regressions in CUDA 11.3+, due to changes in compiler register usage (#560).
- Segfault when using `flamegpu::DependencyGraph` via the default constructor (#555). This will require an API break to resolve.
- Warnings and a loss of performance due to hash collisions in device code (#356)
- Multiple known areas where performance can be improved (e.g. #449, #402)
- Windows/MSVC builds using CUDA < 11.0 may encounter intermittent compiler failures. Please use CUDA 11.0+.
  - This will be resolved by dropping CUDA 10 support in a future release.
- Windows/MSVC builds using CUDA 11.0 may encounter errors when performing incremental builds if the static library has been recompiled. If this presents itself, re-save any `.cu` file in your executable-producing project and re-trigger the build.
- Debug builds under Linux with CUDA 11.0 may encounter CUDA errors during `validateIDCollisions`. Consider using an alternate CUDA version if this is required (#569).
- CUDA 11.0 with GCC 9 may encounter a segmentation fault during compilation of the test suite. Consider using GCC 8 with CUDA 11.0.
- Original releases of GCC 10.3.0 and 11.1.0 were incompatible with CUDA when using `<chrono>`, with a fix backported into these versions in some cases (#575)
  - CMake will attempt to detect `<chrono>` support, and prevent compilation if an incompatible version is detected.
- `CUDAEnsemble::EnsembleConfig::devices` is not programmatically accessible from python (#682)
- `RunPlan` and `RunPlanVector` `operator+`/`operator+=` overloads are not available in python (#679)
  - It is likely that a breaking change will be required to resolve this.