Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Span class. #3548

Merged
merged 3 commits into from
Aug 14, 2018
Merged

Span class. #3548

merged 3 commits into from
Aug 14, 2018

Conversation

trivialfis
Copy link
Member

@trivialfis trivialfis commented Aug 2, 2018

Implement span class from ISO++20 for safer pointer indexing.

TODOs:

  • Fix convert and container constructor.
  • Fix as_bytes.
  • Structure tests for CUDA tests.
  • Fix errors in test farm.
  • Make an example use in xgboost.
  • Make it work with MSVC.

Copy link
Member

@RAMitchell RAMitchell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good in general. I'm wondering if we need all of the functionality here. We could possibly make this more light weight by removing the "as_bytes" functions and span vs span comparison functions. It is hard to see the use case for these in xgboost.

Although, maybe there is an argument for having a more complete implementation true to the standard.


XGBOOST_DEVICE reference operator[](index_type idx) const {
SPAN_CHECK(idx >= 0 && idx < size());
SPAN_CHECK(idx >= 0 && idx < size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate check here.


TEST(Span, Constructors) {
// Dynamic extent
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two tests are identical.

Span<float> s {nullptr, static_cast<Span<float>::index_type>(0)};
ASSERT_EQ(s.size(), 0);
ASSERT_EQ(s.data(), nullptr);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More duplicated tests.

@trivialfis
Copy link
Member Author

@RAMitchell I will re-visit all test cases, see what's been missing.
For the as_bytes, it did came across my mind. Maybe one needs to serialize a segment of memory, like a trained model? I don't know, so obliging to the standard is just me making a safe bet.

@trivialfis trivialfis force-pushed the span branch 2 times, most recently from 5699264 to 9ac5392 Compare August 5, 2018 16:03
@codecov-io
Copy link

codecov-io commented Aug 5, 2018

Codecov Report

Merging #3548 into master will increase coverage by 1.94%.
The diff coverage is 88.8%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #3548      +/-   ##
============================================
+ Coverage     47.85%   49.79%   +1.94%     
  Complexity      188      188              
============================================
  Files           169      172       +3     
  Lines         13330    13868     +538     
  Branches        457      457              
============================================
+ Hits           6379     6906     +527     
- Misses         6726     6737      +11     
  Partials        225      225
Impacted Files Coverage Δ Complexity Δ
src/common/host_device_vector.h 100% <ø> (ø) 0 <0> (ø) ⬇️
src/gbm/gblinear.cc 13.79% <0%> (-0.08%) 0 <0> (ø)
src/common/host_device_vector.cc 54.16% <0%> (-2.36%) 0 <0> (ø)
src/tree/updater_histmaker.cc 2.82% <0%> (ø) 0 <0> (ø) ⬇️
src/tree/updater_colmaker.cc 1.5% <0%> (-0.01%) 0 <0> (ø)
src/c_api/c_api.cc 23.63% <0%> (-0.05%) 0 <0> (ø)
src/tree/updater_basemaker-inl.h 0% <0%> (ø) 0 <0> (ø) ⬇️
src/tree/updater_skmaker.cc 2.16% <0%> (ø) 0 <0> (ø) ⬇️
tests/cpp/data/test_simple_csr_source.cc 100% <100%> (ø) 0 <0> (ø) ⬇️
tests/cpp/data/test_sparse_page_dmatrix.cc 96.66% <100%> (ø) 0 <0> (ø) ⬇️
... and 22 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2b7a1c5...dbd61c2. Read the comment docs.

@trivialfis
Copy link
Member Author

Not ready yet, seems tests is still not thorough enough.

@trivialfis trivialfis force-pushed the span branch 2 times, most recently from 3f3e0b5 to 0aa4f4e Compare August 6, 2018 15:34
@RAMitchell
Copy link
Member

@hcho3 as noted by @trivialfis there is actually an array_view class in dmlc-core. After a brief look it seems it is used by the nvvm/tv projects. We could replace array_view with our span which is more powerful, better tested and compliant with the upcoming standard. What do you think?

@hcho3
Copy link
Collaborator

hcho3 commented Aug 7, 2018

@RAMitchell You should file a pull request at dmlc-core and make a case for upgrading array_view with the span class. There seems to be a good amount of code using array_view already and it would take some work to replace all occurrences.

For the sake of lesser friction, you could simply add the span class to dmlc-core and have DMLC projects gracefully adopt it.

@trivialfis
Copy link
Member Author

trivialfis commented Aug 8, 2018

@RAMitchell Ready for another review. The Travis failed fetching packages in Python test, it should be fine on a good day.
EDIT: Jenkins been passed. The scary red "x" is caused by "ERROR: Queue task was cancelled" on a re-run.

The last commit is just a toy show case for using Span in kernel code. It could be reverted for a more formal integration if the rest of the code is suitable for merge.

EDIT: Later dealing with the DVec will take some time. I need to understand the whole algorithm for better integration.

@RAMitchell
Copy link
Member

RAMitchell commented Aug 9, 2018

Looks good. One more thing to consider, span will incur a performance overhead when using the operator[], these changes will have worsened performance in places like this:

for (bst_uint j = 0; j < inst.length; ++j) {

This can be fixed by changing this to a range based for loop, I believe this does not need to do any index checking.

Edit: Looks like you have actually already done this :)

@RAMitchell
Copy link
Member

DVec can be dealt with later. My idea is that we can gradually enforce the usage of span across the code base using the clang-tidy linter. The most important algorithm for us is gpu_hist as it is used a lot in production, so maybe we can start there when you are ready.

I think this should be good to merge when the CI tests pass.

@trivialfis
Copy link
Member Author

@RAMitchell So it passed. \0/

@trivialfis trivialfis force-pushed the span branch 3 times, most recently from e277399 to cdf3ccf Compare August 13, 2018 19:48
@trivialfis trivialfis changed the title [WIP] Span class. Span class. Aug 13, 2018
@trivialfis trivialfis force-pushed the span branch 2 times, most recently from b5ea229 to a474c07 Compare August 13, 2018 22:23
@RAMitchell RAMitchell merged commit 2c50278 into dmlc:master Aug 14, 2018
@trivialfis trivialfis deleted the span branch August 19, 2018 14:33
@lock lock bot locked as resolved and limited conversation to collaborators Nov 12, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants