Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[type] [refactor] Promote DataType to a class #1906

Merged
merged 13 commits into from
Oct 4, 2020

Conversation

yuanming-hu
Copy link
Member

@yuanming-hu yuanming-hu commented Sep 29, 2020

Related issue = #1905

This PR aims to promote DataType into an LLVM-style DataTypeNode pointer. Different types have different pointer addresses.

The changes are big, but they are necessary to move the system to another consistent state that passes all tests.

Currently, all tests pass, except those that are parameterized over data types and those that alter default_fp/ip:

FAILED tests/python/test_ad_basics.py::test_poly - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_ad_basics.py::test_minmax - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_ad_basics.py::test_pow - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_ad_basics.py::test_unary - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_ad_basics.py::test_atan2_f64 - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_ad_basics.py::test_trigonometric - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_ad_basics.py::test_frac - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_ad_basics.py::test_atan2 - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_ad_basics.py::test_pow_f64 - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_bit_operations.py::test_bit_shr[opengl] - RuntimeError: [opengl_data_types.h:opengl_data_type_name@20] Not supported.
FAILED tests/python/test_cast.py::test_cast_default_fp[dtype0] - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_cast.py::test_cast_default_fp[dtype1] - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_cast.py::test_cast_default_ip[dtype0] - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_cast.py::test_cast_default_ip[dtype1] - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_field.py::test_default_fp[dtype0] - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_field.py::test_default_fp[dtype1] - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_field.py::test_default_ip[dtype0] - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_field.py::test_default_ip[dtype1] - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_linalg.py::test_polar_decomp - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_svd.py::test_svd - TypeError: can't pickle taichi_core.DataTypeNode objects
FAILED tests/python/test_unary_ops.py::test_f64_trig - TypeError: can't pickle taichi_core.DataTypeNode objects
ERROR tests/python/test_ad_if.py - TypeError: can't pickle taichi_core.DataTypeNode objects
ERROR tests/python/test_ad_if.py - TypeError: can't pickle taichi_core.DataTypeNode objects
ERROR tests/python/test_ad_if.py - TypeError: can't pickle taichi_core.DataTypeNode objects
ERROR tests/python/test_ad_if.py - TypeError: can't pickle taichi_core.DataTypeNode objects

This is likely because DataType used to be an enum that can be easily pickled, but now they are a pointer to DataTypeNode. @TH3CHARLie could you investigate this a bit and maybe push to this PR so that these tests are fixed? I guess one solution is to pass a string 'f32' instead of using ti.i32 which can no longer be pickled... No rush on this, and thank you for your help in advance! :-)
(tests/python/test_bit_operations.py::test_bit_shr[opengl] was already there before this change.)

Code formatting should be done right before merging to simplify review given the large change here.

[Click here for the format server]


Copy link
Collaborator

@archibate archibate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, so we'll have real vector types by DataTypeNode (instead of a combination of n scalar types)?

Copy link
Collaborator

@archibate archibate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you not renaming DataType -> DataTypeNode? The changeset is too big for a clam review.

I guess one solution is to pass a string 'f32' instead of using ti.i32 which can no longer be pickled...

Stop ad-hoc. There must be a better solution for this.

(tests/python/test_bit_operations.py::test_bit_shr[opengl] was already there before this change.)

That's because opengl doesn't support u32 type. I'll fix it.

@TH3CHARLie
Copy link
Collaborator

TH3CHARLie commented Sep 29, 2020

I partially agree with @archibate on fixing the failed tests. Passing a string would be a quick fix but I assume parameterization over data types would be one frequent code practice as taichi's type system's reform continues, so maybe we should figure out a better way(pass a type string gives people a feeling that the type system is half-baked, that's what I learned from mypy since they tried this approach) to make it consistent, both in these tests and future user code.

@yuanming-hu
Copy link
Member Author

I partially agree with @archibate on fixing the failed tests. Passing a string would be a quick fix but I assume parameterization over data types would be one frequent code practice as taichi's type system's reform continues, so maybe we should figure out a better way(pass a type string gives people a feeling that the type system is half-baked, that's what I learned from mypy since they tried this approach) to make it consistent, both in these tests and future user code.

In the long run, we should definitely use a more systematic solution. I'm not if a simple and systematic solution exists at this point, so it's fine to hack a bit here to move things forward. Ultimately we need to implement serialization/deserialization of types, but that's clearly a lot of work.

Of course, if you do find a simple and systematic solution, please go ahead :-) That's why we need you here.

@TH3CHARLie
Copy link
Collaborator

I partially agree with @archibate on fixing the failed tests. Passing a string would be a quick fix but I assume parameterization over data types would be one frequent code practice as taichi's type system's reform continues, so maybe we should figure out a better way(pass a type string gives people a feeling that the type system is half-baked, that's what I learned from mypy since they tried this approach) to make it consistent, both in these tests and future user code.

In the long run, we should definitely use a more systematic solution. I'm not if a simple and systematic solution exists at this point, so it's fine to hack a bit here to move things forward. Ultimately we need to implement serialization/deserialization of types, but that's clearly a lot of work.

Of course, if you do find a simple and systematic solution, please go ahead :-) That's why we need you here.

You are right. Then give me two days to think about some better workaround, if I failed to do so then I think it makes much sense to move things forward. It's hard to have an omnipotent design at first when the requirements haven't been 100% clear.

@yuanming-hu
Copy link
Member Author

yuanming-hu commented Sep 29, 2020

I partially agree with @archibate on fixing the failed tests. Passing a string would be a quick fix but I assume parameterization over data types would be one frequent code practice as taichi's type system's reform continues, so maybe we should figure out a better way(pass a type string gives people a feeling that the type system is half-baked, that's what I learned from mypy since they tried this approach) to make it consistent, both in these tests and future user code.

In the long run, we should definitely use a more systematic solution. I'm not if a simple and systematic solution exists at this point, so it's fine to hack a bit here to move things forward. Ultimately we need to implement serialization/deserialization of types, but that's clearly a lot of work.
Of course, if you do find a simple and systematic solution, please go ahead :-) That's why we need you here.

You are right. Then give me two days to think about some better workaround, if I failed to do so then I think it makes much sense to move things forward. It's hard to have an omnipotent design at first when the requirements haven't been 100% clear.

Thanks! No rush on this. It may still take a few days for me to consolidate the new type system design - this PR is just an attempt to prove that we can indeed migrate smoothly, and to test what issues we will run into, such as the pickling one above. If you need some Zoom discussions, please let me know :-)

@codecov
Copy link

codecov bot commented Oct 1, 2020

Codecov Report

Merging #1906 into master will increase coverage by 0.00%.
The diff coverage is 0.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1906   +/-   ##
=======================================
  Coverage   43.86%   43.86%           
=======================================
  Files          45       45           
  Lines        6190     6199    +9     
  Branches     1099     1101    +2     
=======================================
+ Hits         2715     2719    +4     
- Misses       3306     3311    +5     
  Partials      169      169           
Impacted Files Coverage Δ
python/taichi/lang/util.py 32.93% <0.00%> (ø)
python/taichi/lang/__init__.py 42.11% <0.00%> (-0.36%) ⬇️
python/taichi/lang/impl.py 66.84% <0.00%> (+0.17%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 18bb5ed...2a0e9ce. Read the comment docs.

@yuanming-hu
Copy link
Member Author

A lot of thanks to @TH3CHARLie who solved the pickling issue. Now this PR is ready for review. The next step would be rename DataType etc using @taichi-gardener :-)

@yuanming-hu yuanming-hu marked this pull request as ready for review October 1, 2020 19:51
Copy link
Collaborator

@TH3CHARLie TH3CHARLie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@k-ye k-ye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I didn't follow too much on this! Left a few comments, hopefully they make sense :)

py::class_<DataType>(m, "DataType")
.def(py::self == py::self)
.def(py::pickle(
[](const DataType &dt) { return py::make_tuple((std::size_t)dt); },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Since dt represents a type, when I first saw this, I got confused by thinking this is taking the size of the type. I'd suggest that we have an explicit function here, rather than doing operator size_t(). That way it is also easier to code search... :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice idea. It would be better to have a function which does the exact same thing(or just call operator size_t())

Copy link
Member

@k-ye k-ye Oct 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to have a function

Yeah +1. I'd suggest to be explicit and not to have operator size_t() at all. This kind of implicitness could very likely go in an unintended manner. (Similarly, we try to add explicit at constructors that take in a single parameter). Maybe https://stackoverflow.com/a/22164767/12003165 and http://ptgmedia.pearsoncmg.com/imprint_downloads/informit/aw/meyerscddemo/DEMO/MEC/MC2_FR.HTM could be some good arguments on this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, sorry about the confusion here. The operator size_t() was introduced simply to allow the new DataType to still behave like an enum (which it used to be) in some cases. One example is:

std::size_t operator()(taichi::lang::JITEvaluatorId const &id) const
noexcept {
return ((std::size_t)id.op | ((std::size_t)id.ret << 8) |
((std::size_t)id.lhs << 16) | ((std::size_t)id.rhs << 24) |
((std::size_t)id.is_binary << 31)) ^
(std::hash<std::thread::id>{}(id.thread_id) << 32);
}

Since it leads to confusion, I'll use hash() instead. This will slightly increase the changeset of this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. Thanks!

taichi/lang_util.h Outdated Show resolved Hide resolved
taichi/lang_util.cpp Show resolved Hide resolved
Copy link
Member

@k-ye k-ye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@yuanming-hu
Copy link
Member Author

Interesting, so we'll have real vector types by DataTypeNode (instead of a combination of n scalar types)?

@archibate Exactly! That needs work though, so I don't expect that to happen immediately :-)

else if (type == DataType::f64 || type == DataType::i64) {
return 3;
} else {
TI_NOT_IMPLEMENTED
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't include these OFT nits in an already-error-prone PR, they increased review difficulty.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree off-the-topic changes should be mostly avoided, note that this change is not really off the topic. The old switch-case implementation no longer works after refactoring, and the modifications are necessary for the build to pass.

The same for other places.

REGISTER_DATA_TYPE(u64, uint64);
REGISTER_DATA_TYPE(gen, generic);
REGISTER_DATA_TYPE(unknown, unknown);
#define REGISTER_DATA_TYPE(i, j) else if (t == DataType::i) return #j
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, switch-or-if nits can be put iapr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants