-
-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. #3446
Conversation
canonizer
commented
Jul 5, 2018
- added distributions to HostDeviceVector
- using HostDeviceVector for labels, weights and base margings in MetaInfo
- using HostDeviceVector for offset and data in SparsePage
- other necessary refactoring
- added distributions to HostDeviceVector - using HostDeviceVector for labels, weights and base margings in MetaInfo - using HostDeviceVector for offset and data in SparsePage - other necessary refactoring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Can you confirm for me if the vectors in info get copied back and forth between host and device?
For example the label vector never changes, but when we alternately call label_.HostVector() and label_.DevicePointer() is this being copied?
Yes, the vectors are currently copied between the host and device even if not modified. If you want to avoid this, I can add const-only versions of methods that get host or GPU data. E.g., |
This seems like a good idea, otherwise they will be constantly syncing. |
- const versions added to calls that can trigger data transfers, e.g. DevicePointer() - updated the code that uses HostDeviceVector - objective functions now accept const HostDeviceVector<bst_float>& for predictions
- this means no copies are performed if both host and devices access the HostDeviceVector read-only
Done. |
- updated the lz4 plugin - added ConstDeviceSpan to HostDeviceVector - using device % dh::NVisibleDevices() for the physical device number, e.g. in calls to cudaSetDevice()
Is it worth to store some extra information in one of the GPU related class like number of SMs? Currently all the grid stride functions just use one grid. |
- replaced HostDeviceVector<unsigned int> with HostDeviceVector<int>
Codecov Report
@@ Coverage Diff @@
## master #3446 +/- ##
============================================
+ Coverage 50.39% 50.52% +0.13%
Complexity 188 188
============================================
Files 172 172
Lines 13986 14096 +110
Branches 457 457
============================================
+ Hits 7048 7122 +74
- Misses 6713 6749 +36
Partials 225 225
Continue to review full report at Codecov.
|
11f0bc6
to
7ec438e
Compare
- added distributions to HostDeviceVector - using HostDeviceVector for labels, weights and base margings in MetaInfo - using HostDeviceVector for offset and data in SparsePage - other necessary refactoring
- const versions added to calls that can trigger data transfers, e.g. DevicePointer() - updated the code that uses HostDeviceVector - objective functions now accept const HostDeviceVector<bst_float>& for predictions
- this means no copies are performed if both host and devices access the HostDeviceVector read-only
- updated the lz4 plugin - added ConstDeviceSpan to HostDeviceVector - using device % dh::NVisibleDevices() for the physical device number, e.g. in calls to cudaSetDevice()
- replaced HostDeviceVector<unsigned int> with HostDeviceVector<int>
0fb1ef1
to
32cb6f4
Compare
32cb6f4
to
2a04551
Compare
@canonizer I have rebased this. Can you please review before I merge? It was a complicated process so there are probably mistakes. I have a backup of your PR before the rebase if necessary. |
A couple of GPU tests are currently failing because I removed the behaviour of "wrapping around" when a number of devices is specified that is greater than the number of available devices. This was to be consistent with what @trivialfis did in his PR. We need to decide if we are going to "wrap" the device ordinal or throw some error. I'm thinking we should throw an error because if the user selected multiple GPUs they probably did not intend for it to simply run and multiple blocks on the same GPU. We could check for a sufficient available devices inside of the GPUDistribution or GPUSet. |
- added a mock set device handler; when set, it is called instead of cudaSetDevice()
Codecov Report
@@ Coverage Diff @@
## master #3446 +/- ##
===========================================
+ Coverage 50.97% 51.1% +0.12%
Complexity 188 188
===========================================
Files 176 176
Lines 14090 14200 +110
Branches 457 457
===========================================
+ Hits 7183 7257 +74
- Misses 6682 6718 +36
Partials 225 225
Continue to review full report at Codecov.
|
dmlc#3446) * Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. - added distributions to HostDeviceVector - using HostDeviceVector for labels, weights and base margings in MetaInfo - using HostDeviceVector for offset and data in SparsePage - other necessary refactoring * Added const version of HostDeviceVector API calls. - const versions added to calls that can trigger data transfers, e.g. DevicePointer() - updated the code that uses HostDeviceVector - objective functions now accept const HostDeviceVector<bst_float>& for predictions * Updated src/linear/updater_gpu_coordinate.cu. * Added read-only state for HostDeviceVector sync. - this means no copies are performed if both host and devices access the HostDeviceVector read-only * Fixed linter and test errors. - updated the lz4 plugin - added ConstDeviceSpan to HostDeviceVector - using device % dh::NVisibleDevices() for the physical device number, e.g. in calls to cudaSetDevice() * Fixed explicit template instantiation errors for HostDeviceVector. - replaced HostDeviceVector<unsigned int> with HostDeviceVector<int> * Fixed HostDeviceVector tests that require multiple GPUs. - added a mock set device handler; when set, it is called instead of cudaSetDevice()