Releases: openucx/ucx
v1.8.1-rc4
1.8.1-rc4 (July 7, 2020)
Features:
- Added binary release pipeline in Azure CI
Bugfixes:
- Multiple fixes in testing environment
- Fixes in InfiniBand DEVX transport
- Fixes in memory management for CUDA IPC transport
- Fixes for binutils 2.34+
- Fixes in RPM SPEC file and package generation
- Fixes for AMD ROCM build environment
v1.8.1-rc2
1.8.1-rc2 (July 4, 2020)
Features:
- Added binary release pipeline in Azure CI
Bugfixes:
- Multiple fixes in testing environment
- Fixes in InfiniBand DEVX transport
- Fixes in memory management for CUDA IPC transport
- Fixes for binutils 2.34+
- Fixes in RPM SPEC file and package generation
- Fixes for AMD ROCM build environment
v1.8.1-rc1
1.8.1-rc1 (June 23, 2020)
Features:
- Added binary release pipeline in Azure CI
Bugfixes:
- Multiple fixes in testing environment
- Fixes in InfiniBand DEVX transport
- Fixes in memory management for CUDA IPC transport
- Fixes for binutils 2.34+
- Fixes in RPM SPEC file and package generation
- Fixes for AMD ROCM build environment
v1.8.0
1.8.0 (April 3, 2020)
Features:
UCX Core
- Improved detection for DEVX support
- Improved TCP scalability
- Added support for ROCM to perftest
- Added support for different source and target memory types to perftest
- Added optimized memcpy for ROCM devices
- Added hardware tag-matching for CUDA buffers
- Added support for CUDA and ROCM managed memories
- Added support for client/server disconnect protocol over RDMA connection manager
- Added support for striding receive queue for hardware tag-matching
- Added XPMEM-based rendezvous protocol for shared memory
- Added support for shared memory communication between containers on the same machine
- Added support for multi-threaded RDMA memory registration for large regions
- Added new test cases to Azure CI
UCX Java (API Preview)
- Added APIs for stream send/recv, tag probe, and connect request handle
- Added Java package (automatically published) to Maven central
Bugfixes:
- Multiple fixes in JUCX
- Fixes in UCP thread safety
- Fixes for the most recent versions of GCC, PGI, and ICC
- Fixes for CPU affinity on Azure instances
- Fixes in XPMEM support on PPC64
- Performance fixes in CUDA IPC
- Fixes in RDMA CM flows
- Multiple fixes in TCP transport
- Multiple fixes in documentation
- Fixes in transport lane selection logic
- Fixes in Java jar build
- Fixes in socket connection manager for NVIDIA DGX-2 platform
v1.8.0-rc2
1.8.0-rc2 (TBD)
Features:
UCX Core
- Improved detection for DEVX support
- Improved TCP scalability
- Added support for ROCM to perftest
- Added support for different source and target memory types to perftest
- Added optimized memcpy for ROCM devices
- Added hardware tag-matching for CUDA buffers
- Added support for CUDA and ROCM managed memories
- Added support for client/server disconnect protocol over RDMA connection manager
- Added support for striding receive queue for hardware tag-matching
- Added XPMEM-based rendezvous protocol for shared memory
- Added support for shared memory communication between containers on the same machine
- Added support for multi-threaded RDMA memory registration for large regions
- Added new test cases to Azure CI
UCX Java (API Preview)
- Added APIs for stream send/recv, tag probe, and connect request handle
- Added Java package (automatically published) to Maven central
Bugfixes:
- Multiple fixes in JUCX
- Fixes in UCP thread safety
- Fixes for the most recent versions of GCC, PGI, and ICC
- Fixes for CPU affinity on Azure instances
- Fixes in XPMEM support on PPC64
- Performance fixes in CUDA IPC
- Fixes in RDMA CM flows
- Multiple fixes in TCP transport
- Multiple fixes in documentation
- Fixes in transport lane selection logic
- Fixes in Java jar build
- Fixes in socket connection manager for NVIDIA DGX-2 platform
v1.8.0-rc1
1.8.0-rc1
Features:
UCX Core
- Improved detection for DEVX support
- Improved TCP scalability
- Added support for ROCM to perftest
- Added support for different source and target memory types to perftest
- Added optimized memcpy for ROCM devices
- Added hardware tag-matching for CUDA buffers
- Added support for CUDA and ROCM managed memories
- Added support for client/server disconnect protocol over RDMA connection manager
- Added support for striding receive queue for hardware tag-matching
- Added XPMEM-based rendezvous protocol for shared memory
- Added support for shared memory communication between containers on the same machine
- Added support for multi-threaded RDMA memory registration for large regions
UCX Java (API Preview)
- Added APIs for stream send/recv, tag probe, and connect request handle
- Added Java package (automatically published) to Maven central
Bugfixes:
- Multiple fixes in JUCX
- Fixes in UCP thread safety
- Fixes for the most recent versions of GCC, PGI, and ICC
- Fixes for CPU affinity on Azure instances
- Fixes in XPMEM support on PPC64
- Performance fixes in CUDA IPC
- Fixes in RDMA CM flows
- Multiple fixes in TCP transport
- Multiple fixes in documentation
v1.7.0
Features:
- Added support for multiple listening transports
- Added UCT socket-based connection manager transport
- Updated API for UCT component management
- Added API to retrieve the listening port
- Added UCP active message API
- Removed deprecated API for querying UCT memory domains
- Refactored server/client examples
- Added support for dlopen interception in UCM
- Added support for PCIe atomics
- Updated Java API: added support for most of UCP layer operations
- Updated support for Mellanox DevX API
- Added multiple UCT/TCP transport performance optimizations
- Optimized memcpy() for Intel platforms
- Added protection from non-UCX socket-based application connections
- Improved search time for PKEY object
- Enabled gtest over IPv6 interfaces
- Updated Mellanox and Bull device IDs
- Added support for CUDA_VISIBLE_DEVICES
- Increased limits for CUDA IPC registration
Bugfixes:
- Multiple fixes in UCP, UCT, UCM libraries
- Multiple fixes for BSD and Mac OS systems
- Fixes for Clang compiler
- Fix CPU optimization configuration options
- Fix JUCX build on GPU nodes
- Fix in Azure release pipeline flow
- Fix in CUDA memory hooks management
- Fix in GPU memory peer direct gtest
- Fix in TCP connection establishment flow
- Fix in GPU IPC check
- Fix in CUDA Jenkins test flow
- Multiple fixes in CUDA IPC flow
- Fix for missing header files
- Fix to prevent failures in the presence of VPN-enabled Ethernet interfaces
v1.7.0-rc2
Features:
- Added support for multiple listening transports
- Added UCT socket-based connection manager transport
- Updated API for UCT component management
- Added API to retrieve the listening port
- Added UCP active message API
- Removed deprecated API for querying UCT memory domains
- Refactored server/client examples
- Added support for dlopen interception in UCM
- Added support for PCIe atomics
- Updated Java API: added support for most of UCP layer operations
- Updated support for Mellanox DevX API
- Added multiple UCT/TCP transport performance optimizations
- Optimized memcpy() for Intel platforms
- Added protection from non-UCX socket-based application connections
- Improved search time for PKEY object
- Enabled gtest over IPv6 interfaces
- Updated Mellanox and Bull device IDs
- Added support for CUDA_VISIBLE_DEVICES
- Increased limits for CUDA IPC registration
Bugfixes:
- Multiple fixes in UCP, UCT, UCM libraries
- Multiple fixes for BSD and Mac OS systems
- Fixes for Clang compiler
- Fixes for CUDA IPC
- Fix CPU optimization configuration options
- Fix JUCX build on GPU nodes
- Fix in Azure release pipeline flow
- Fix in CUDA memory hooks management
- Fix in GPU memory peer direct gtest
- Fix in TCP connection establishment flow
- Fix in GPU IPC check
- Fix in CUDA Jenkins test flow
- Multiple fixes in CUDA IPC flow
- Fix for missing header files
- Fix to prevent failures in the presence of VPN-enabled Ethernet interfaces
v1.7.0-rc1
Features:
- Added support for multiple listening transports
- Added UCT socket-based connection manager transport
- Updated API for UCT component management
- Added API to retrieve the listening port
- Added UCP active message API
- Removed deprecated API for querying UCT memory domains
- Refactored server/client examples
- Added support for dlopen interception in UCM
- Added support for PCIe atomics
- Updated Java API: added support for most of UCP layer operations
- Updated support for Mellanox DevX API
- Added multiple UCT/TCP transport performance optimizations
- Optimized memcpy() for Intel platforms
Bugfixes:
- Multiple bugfixes in UCP, UCT, UCM libraries
- Multiple fixes for BSD and Mac OS systems
- Fixes for Clang compiler
- Fixes for CUDA IPC
- Fix CPU optimization configuration options