Commit

WIP
neon60 committed May 25, 2024
1 parent 6599c6d commit 5b4a466
Showing 2 changed files with 23 additions and 19 deletions.
4 changes: 4 additions & 0 deletions .wordlist.txt
@@ -14,6 +14,7 @@ denormal
dll
DirectX
EIGEN
EIGEN's
enqueue
enqueues
embeded
@@ -30,15 +31,18 @@ icc
Interoperation
interoperate
IPC
Lapack
latencies
libstdc
LOC
LUID
Malloc
malloc
multicore
NDRange
Numa
Nsight
preprocessor
PTX
rocTX
RTC
38 changes: 19 additions & 19 deletions docs/how-to/hip_porting_guide.md
@@ -139,7 +139,7 @@ Often, it's useful to know whether the underlying compiler is HIP-Clang or nvcc.
 // Compiled with nvcc (CUDA language extensions enabled)
 ```
-Compiler directly generates the host code (using the Clang x86 target) and passes the code to another host compiler. Thus, they have no equivalent of the \__CUDA_ACC define.
+Compiler directly generates the host code (using the Clang x86 target) and passes the code to another host compiler. Thus, they have no equivalent of the `__CUDACC__` define.
 ### Identifying Current Compilation Pass: Host or Device
@@ -216,28 +216,28 @@ The table below shows the full set of architectural properties that HIP supports
 |Define (use only in device code) | Device Property (run-time query) | Comment |
 |------- | --------- | ----- |
 |32-bit atomics:||
-|`__HIP_ARCH_HAS_GLOBAL_INT32_ATOMICS__` | hasGlobalInt32Atomics |32-bit integer atomics for global memory
-|`__HIP_ARCH_HAS_GLOBAL_FLOAT_ATOMIC_EXCH__` | hasGlobalFloatAtomicExch |32-bit float atomic exchange for global memory
-|`__HIP_ARCH_HAS_SHARED_INT32_ATOMICS__` | hasSharedInt32Atomics |32-bit integer atomics for shared memory
-|`__HIP_ARCH_HAS_SHARED_FLOAT_ATOMIC_EXCH__` | hasSharedFloatAtomicExch |32-bit float atomic exchange for shared memory
-|`__HIP_ARCH_HAS_FLOAT_ATOMIC_ADD__` | hasFloatAtomicAdd |32-bit float atomic add in global and shared memory
+|`__HIP_ARCH_HAS_GLOBAL_INT32_ATOMICS__` | `hasGlobalInt32Atomics` |32-bit integer atomics for global memory
+|`__HIP_ARCH_HAS_GLOBAL_FLOAT_ATOMIC_EXCH__` | `hasGlobalFloatAtomicExch` |32-bit float atomic exchange for global memory
+|`__HIP_ARCH_HAS_SHARED_INT32_ATOMICS__` | `hasSharedInt32Atomics` |32-bit integer atomics for shared memory
+|`__HIP_ARCH_HAS_SHARED_FLOAT_ATOMIC_EXCH__` | `hasSharedFloatAtomicExch` |32-bit float atomic exchange for shared memory
+|`__HIP_ARCH_HAS_FLOAT_ATOMIC_ADD__` | `hasFloatAtomicAdd` |32-bit float atomic add in global and shared memory
 |64-bit atomics: | |
-|`__HIP_ARCH_HAS_GLOBAL_INT64_ATOMICS__` | hasGlobalInt64Atomics |64-bit integer atomics for global memory
-|`__HIP_ARCH_HAS_SHARED_INT64_ATOMICS__` | hasSharedInt64Atomics |64-bit integer atomics for shared memory
+|`__HIP_ARCH_HAS_GLOBAL_INT64_ATOMICS__` | `hasGlobalInt64Atomics` |64-bit integer atomics for global memory
+|`__HIP_ARCH_HAS_SHARED_INT64_ATOMICS__` | `hasSharedInt64Atomics` |64-bit integer atomics for shared memory
 |Doubles: | |
-|`__HIP_ARCH_HAS_DOUBLES__` | hasDoubles |Double-precision floating point
+|`__HIP_ARCH_HAS_DOUBLES__` | `hasDoubles` |Double-precision floating point
 |Warp cross-lane operations: | |
-|`__HIP_ARCH_HAS_WARP_VOTE__` | hasWarpVote |Warp vote instructions (any, all)
-|`__HIP_ARCH_HAS_WARP_BALLOT__` | hasWarpBallot |Warp ballot instructions
-|`__HIP_ARCH_HAS_WARP_SHUFFLE__` | hasWarpShuffle |Warp shuffle operations (shfl\_\*)
-|`__HIP_ARCH_HAS_WARP_FUNNEL_SHIFT__` | hasFunnelShift |Funnel shift two input words into one
+|`__HIP_ARCH_HAS_WARP_VOTE__` | `hasWarpVote` |Warp vote instructions (any, all)
+|`__HIP_ARCH_HAS_WARP_BALLOT__` | `hasWarpBallot` |Warp ballot instructions
+|`__HIP_ARCH_HAS_WARP_SHUFFLE__` | `hasWarpShuffle` |Warp shuffle operations (shfl\_\*)
+|`__HIP_ARCH_HAS_WARP_FUNNEL_SHIFT__` | `hasFunnelShift` |Funnel shift two input words into one
 |Sync: | |
-|`__HIP_ARCH_HAS_THREAD_FENCE_SYSTEM__` | hasThreadFenceSystem |threadfence\_system
-|`__HIP_ARCH_HAS_SYNC_THREAD_EXT__` | hasSyncThreadsExt |syncthreads\_count, syncthreads\_and, syncthreads\_or
+|`__HIP_ARCH_HAS_THREAD_FENCE_SYSTEM__` | `hasThreadFenceSystem` |threadfence\_system
+|`__HIP_ARCH_HAS_SYNC_THREAD_EXT__` | `hasSyncThreadsExt` |syncthreads\_count, syncthreads\_and, syncthreads\_or
 |Miscellaneous: | |
-|`__HIP_ARCH_HAS_SURFACE_FUNCS__` | hasSurfaceFuncs |
-|`__HIP_ARCH_HAS_3DGRID__` | has3dGrid | Grids and groups are 3D
-|`__HIP_ARCH_HAS_DYNAMIC_PARALLEL__` | hasDynamicParallelism |
+|`__HIP_ARCH_HAS_SURFACE_FUNCS__` | `hasSurfaceFuncs` |
+|`__HIP_ARCH_HAS_3DGRID__` | `has3dGrid` | Grids and groups are 3D
+|`__HIP_ARCH_HAS_DYNAMIC_PARALLEL__` | `hasDynamicParallelism` |
 ## Finding HIP
@@ -312,7 +312,7 @@ If you pass "--stdlib=libc++" to hipcc, hipcc will use the libc++ library. Gene
 When cross-linking C++ code, any C++ functions that use types from the C++ standard library (including std::string, std::vector and other containers) must use the same standard-library implementation. They include the following:
 * Functions or kernels defined in HIP-Clang that are called from a standard compiler
-* Functions defined in a standard compiler that are called from HIP-Clanng.
+* Functions defined in a standard compiler that are called from HIP-Clang.
 Applications with these interfaces should use the default libstdc++ linking.