Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Lang] [refactor] Deprecate "as_vector=True" in Matrix.to_numpy/to_torch #1046

Merged
merged 36 commits into from
May 28, 2020

Conversation

archibate
Copy link
Collaborator

@archibate archibate commented May 24, 2020

Related issue = #940 #833 #1016

This PR also makes the stage 3 of #923 easier to implement.

[Click here for the format server]

I strongly argue about the ultra-massive-centralized-multiplexer in ti.Matrix.__init__:

def __init__(self, n=1, m=1, dt=None, shape=None,
             empty=False, layout=None, needs_grad=False,
             keep_raw=False, rows=None, cols=None):
   ... # 100+ lines of if-else-if's, within just **one function**

Functions should be short and sweet, and do just one thing. They should fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, as we all know), and do one thing and do that well.

... said the Linux Kernel Coding Style, and I believe the 80x24 rule also works in python too.

Another measure of the function is the number of local variables. They shouldn’t exceed 5-10, or you’re doing something wrong. Re-think the function, and split it into smaller pieces. A human brain can generally easily keep track of about 7 different things, anything more and it gets confused. You know you’re brilliant, but maybe you’d like to understand what you did 2 weeks from now.

Not to say lobal variables, just the arguments alone, there are 11.


Also note that the API initializing a ti.Matrix is highly centralized, GAMES201 students can easily get confused by error messages when they're not using the correct argument specification:

ti.Matrix(n, m, dt, shape)  # matrix tensor
ti.Matrix(n, dt, shape)  # vector tensor
ti.Matrix([[a, b], [c, d]])  # taichi-scope matrix
ti.Matrix([a, b])  # taichi-scope vector
ti.Matrix(cols=[u, v])  # taichi-scope matrix, with col vectors
ti.Matrix(rows=[u, v])  # taichi-scope matrix, with row vectors
ti.Matrix(n, m, empty=True)  # empty matrix

Don't you think a better design could be:

ti.Matrix.var(n, m, dt, shape)  # matrix tensor
ti.Vector.var(n, dt, shape)  # vector tensor
ti.Matrix([[a, b], [c, d]])  # taichi-scope matrix
ti.Vector([a, b])  # taichi-scope vector
ti.Matrix.cols([u, v])  # taichi-scope matrix, with col vectors
ti.Matrix.rows([u, v])  # taichi-scope matrix, with row vectors
ti.Matrix.empty(n, m)  # empty matrix

to give ti.Matrix.__init__ less duty just like currently ti.Matrix.ones and ti.Matrix.zeros does?

Or at least we can make ti.Matrix.__init__ serve as a router just redirecting to the actual functions like ti.Matrix.init_as_local and ti.Matrix.init_as_var, instead of putting those implementations inside of ti.Matrix.__init__ if-else-if's?

Note that we should always try our best to make a well-polished matrix/vector API before Taichi v1.0.0 comes.


I'm putting a lot of lines of text just to convey you trust my refactor decision, so please take a good look and tl;dr's are unwanted.

@archibate archibate changed the title [Lang] Separate ti.Vector from ti.Matrix (stage 1) [lang][refactor] Separate ti.Vector from ti.Matrix (stage 1) May 24, 2020
@archibate archibate added the GAMES201 GAMES 201 students' wishlist label May 24, 2020
@yuanming-hu yuanming-hu mentioned this pull request May 24, 2020
18 tasks
@archibate
Copy link
Collaborator Author

Hello? Please feel free to say No if you don't like this change.

@yuanming-hu
Copy link
Member

I actually like your idea of having multiple initializers!

Whether we should separate vectors from matrices still needs discussions. I personally prefer not to do it:

  • We want to hold 1x1 matrices with the class Matrix only, instead of either Matrix or Vector. The latter will lead to a lot of confusion.
  • Using a single Matrix class can hold both row vectors and column vectors. If we create an extra Vector class, we have to further distinguish row vectors from column vectors
  • If we separate Vector from Matrix, can Matrix hold nx1 or 1xn matrices (vectors)?
    • If it can, then users might misuse Matrix to hold vectors
    • If it cannot, then a lot of branching will be needed for functions returning nxm matrices where n or m might be 1

Generally speaking, for regularity, I prefer not to do the separation.

@archibate
Copy link
Collaborator Author

archibate commented May 24, 2020

If it can, then users might misuse Matrix to hold vectors

They don't want to use Matrix as vector, since Matrix.dot: Name not defined.

My worry is, will Matrix.dot and Matrix.outer_product be confusing to people? They are actually only for Vectors.

Another issue is, as_vector=True sounds so strange, why I defined as ti.Vector while still needs to specify this? And what if we add more taichi_classes like 四元数.to_numpy()?

@archibate archibate self-assigned this May 24, 2020
@yuanming-hu
Copy link
Member

yuanming-hu commented May 24, 2020

Good points.

They don't want to use Matrix as vector, since Matrix.dot: Name not defined.
My worry is, will Matrix.dot and Matrix.outer_product be confusing to people? They are actually only for Vectors.

Maybe we can solve this by moving dot and outer_product to ti?

Another issue is, as_vector=True sounds so strange, why I defined as ti.Vector while still needs to specify this? And what if we add more taichi_classes like 四元数.to_numpy()?

as_vector actually brings you consistency. We don't want dimensionality (shape) of the returned numpy array to depend on how many rows/columns you have in Matrix.

@archibate
Copy link
Collaborator Author

archibate commented May 24, 2020

as_vector actually brings you consistency. We don't want dimensionality (shape) of the returned numpy array to depend on how many rows/columns you have in Matrix.

Not this point. My point is ti.Vector is a complete different class from ti.Matrix, just like ti.Quaternion.to_numpy doesn't have to be the same shape with ti.Matrix.to_numpy.

@archibate
Copy link
Collaborator Author

Also note that we will have to use [[1, 2]] or [[1], [2]] for printing row/col vectors in #923.
So we want:

print('u =', u, as_vector=True)

for that? See? as_vector is really a bad design.

@yuanming-hu
Copy link
Member

Thanks for the inputs.

If you really hate as_vector, how about this: let's make to_numpy sets as_vector as True by default when the Matrix with only one column.

as_vector actually brings you consistency. We don't want dimensionality (shape) of the returned numpy array to depend on how many rows/columns you have in Matrix.

By changing the default as_vector we give up consistency here for convenience, which is fine. There's no perfect solution here.

Not this point. My point is ti.Vector is a complete different class from ti.Matrix, just like ti.Quaternion.to_numpy doesn't have to be the same shape with ti.Matrix.to_numpy.

I really like your idea of having more constructors, but I still don't think Vector is completely different from Matrix. Quaternion should clearly be a different class since it has different arithmetic rules from Vector and Matrix.

So, does removing as_vector but not separating Vector from Matrix sound reasonable to you? We can, of course, deprecate ti.Matrix.dot and use ti.dot instead.

@yuanming-hu
Copy link
Member

@FantasyVR please feel free to share your inputs given you are interested in this topic :-)

@archibate
Copy link
Collaborator Author

@FantasyVR please feel free to share your inputs given you are interested in this topic :-)

Yes, we need more discussion on this, to determine this ultra-critical design by just you-and-me can be biased.

If you really hate as_vector, how about this: let's make to_numpy sets as_vector as True by default when the Matrix with only one column.

So we also print as [0, 1] in #923 if self.m == 1? What if self.n == self.m == 1? Is that printed as a scalar or 1D-col-vector or 1D-row-vector?

@archibate
Copy link
Collaborator Author

I just come up with an perfect idea to keep both consistency and simplicity:

def Vector(n):
  return Matrix(n, 1, is_vector=True)

...
def dot(u, v):
  assert u.is_vector

@archibate archibate changed the title [lang][refactor] Separate ti.Vector from ti.Matrix (stage 1) [Lang][refactor] deprecate "as_vector=True" May 24, 2020
Copy link
Member

@yuanming-hu yuanming-hu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much! Most changes look great. Just two minor places to fix.

else:
dim_ext = (self.n, self.m)
as_vector = self.m == 1 and not keep_dims
dim_ext = (self.n, ) if as_vector else (self.n, self.m)
Copy link
Member

@yuanming-hu yuanming-hu May 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's skip self.m as well if it value is 1. len(dim_ext)=

  • 0 for 1x1 matrices
  • 1 for nx1 or 1xn matrices (n != 1)
  • 2 for nxm matrices (n, m != 1)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I should change matrix_to_ext_arr to treat the useless 1x1 matrices?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it seems that we need a keep_dims parameter in matrix_to_ext_arr as well.

Thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Special treating 1x1 mats sounds unreasonable to me..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few people would use 1x1 matrices anyway. Let's just make the behavior consistent: dimensionality of size 1 should be skipped.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wdym? Do we consider 1x1 mat as 1D vector?

Copy link
Member

@yuanming-hu yuanming-hu May 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, if the matrix is 1x1, then the numpy array, when keep_dims=False, should have the same shape as the matrix shape, instead of the matrix shape extended by (1, 1).

python/taichi/lang/matrix.py Outdated Show resolved Hide resolved
@archibate archibate requested a review from yuanming-hu May 27, 2020 07:25
Copy link
Member

@yuanming-hu yuanming-hu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Let's make keep_dims behave as what most people would expect.

tests/python/test_numpy_io.py Show resolved Hide resolved
@@ -61,7 +61,7 @@ def substep():
w = [
0.5 * (1.5 - fx)**2, 0.75 - (fx - 1.0)**2, 0.5 * (fx - 0.5)**2
]
new_v = ti.Vector.zero(ti.f32, 2)
new_v = ti.Matrix.zero(ti.f32, 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
new_v = ti.Matrix.zero(ti.f32, 2)
new_v = ti.Vector.zero(ti.f32, 2)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>           new_v = ti.Vector.zero(ti.f32, 2)
E           AttributeError: 'function' object has no attribute 'zero'

Thanks to the fact that Vector is not a subclass of Matrix, but a initializer wrapper function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, I use Vector.zero = Matrix.zero solved the problem for now before we finally come to ti.linalg.zero.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. We should not use the subclass approach. Let's use add different initializers later.

@archibate archibate requested a review from yuanming-hu May 28, 2020 03:36
@archibate
Copy link
Collaborator Author

Hello? We should mt asap, 2pvcw/ #1051, & mk GAMES201 happy.

Copy link
Member

@yuanming-hu yuanming-hu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can merge this for now, given the changeset is getting large, but we should fully implement keep_dims later. Thank you!

@yuanming-hu yuanming-hu changed the title [Lang][refactor] deprecate "as_vector=True" [Lang] [refactor] Deprecate "as_vector=True" May 28, 2020
@yuanming-hu yuanming-hu changed the title [Lang] [refactor] Deprecate "as_vector=True" [Lang] [refactor] Deprecate "as_vector=True" in to_numpy/to_torch May 28, 2020
@yuanming-hu yuanming-hu changed the title [Lang] [refactor] Deprecate "as_vector=True" in to_numpy/to_torch [Lang] [refactor] Deprecate "as_vector=True" in Matrix.to_numpy/to_torch May 28, 2020
@archibate archibate merged commit c193e4b into taichi-dev:master May 28, 2020
@yuanming-hu yuanming-hu mentioned this pull request May 31, 2020
@xumingkuan
Copy link
Contributor

img = img.to_numpy(as_vector=True)

@archibate This line raises an error now...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GAMES201 GAMES 201 students' wishlist
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants