Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize ClassCallMetaClass using Cython #12808

Closed
hivert opened this issue Apr 4, 2012 · 66 comments
Closed

Optimize ClassCallMetaClass using Cython #12808

hivert opened this issue Apr 4, 2012 · 66 comments

Comments

@hivert
Copy link

hivert commented Apr 4, 2012

When a class C is an instance of ClasscallMetaclass, then for any constructor call, the system currently looks for __classcall__ and __classcall_private__ in C. This adds quite an overhead and this could be cached, assuming no one modifies C (which seems reasonable). Improving this speeds up, in particular, all call to the constructor of a subclass of UniqueRepresentation, eg. many parents, all categories...

Apply

CC: @nthiery @simon-king-jena

Component: misc

Keywords: classcall UniqueRepresentation

Author: Florent Hivert, Simon King

Reviewer: Simon King, Florent Hivert

Merged: sage-5.1.beta0

Issue created by migration from https://trac.sagemath.org/ticket/12808

@hivert hivert added this to the sage-5.0 milestone Apr 4, 2012
@hivert
Copy link
Author

hivert commented Apr 4, 2012

comment:1

Before

sage: %timeit ZZ in Rings()
625 loops, best of 3: 1.13 µs per loop

After the preliminary patch

sage: %timeit ZZ in Rings()
625 loops, best of 3: 448 ns per loops

@hivert
Copy link
Author

hivert commented Apr 5, 2012

comment:2

I'm finishing the patch, I've added a large optimization for the other branch
where there is no __classcall__

Here are the results in the basis case (no __classcall__)::

sage: class Rien(object):
...       pass
sage: from sage.misc.classcall_metaclass import ClasscallMetaclass
sage: class NOCALL(object):
...      __metaclass__ = ClasscallMetaclass
...      pass

Standard Python class:

sage: %timeit [Rien() for i in range(10000)]
125 loops, best of 3: 1.7 ms per loop

Currently:

sage: %timeit [NOCALL() for i in range(10000)]
25 loops, best of 3: 15.9 ms per loop

Cython version:

sage: %timeit [NOCALL() for i in range(10000)]
125 loops, best of 3: 3.59 ms per loop

Cython optimized version:

sage: %timeit [NOCALL() for i in range(10000)]
125 loops, best of 3: 1.76 ms per loop

So I'm 5% only slower than normal class.

Here are the results in the basis case (no __classcall__)::

sage: class CALL(object):
...       __metaclass__ = ClasscallMetaclass
...       @staticmethod
...       def __classcall_private__(cls, arg):
...           arg = arg + arg
...           return arg

Currently:

sage: %timeit [CALL(i) for i in range(10000)]
25 loops, best of 3: 8.7 ms per loop

Cython version:

sage: %timeit [CALL(i) for i in range(10000)]
125 loops, best of 3: 4.33 ms per loop

Cython optimized version (no mesurable difference from the preceding):

sage: %timeit [CALL(i) for i in range(10000)]
125 loops, best of 3: 4.34 ms per loop

Here I'm twice as fast as before.

I'm cleanup the patch (doctests) an will post it shortly.

@simon-king-jena
Copy link
Member

comment:3

Sorry, I can not do any serious work (reviewing), as I am in the middle of holidays. But it looks promising!

@simon-king-jena
Copy link
Member

comment:4

Hi Florent!

I know the patch is preliminary, but here are two comments:

In the new file classcall_metaclass_cy.pyx, it is not explained what __classcall__ and __classcall_private__ do, and when to implement which. Actually, I have never used __classcall_private__ myself.

In the old python version of classcall_metaclass, you move __call__ to __call__disable. Why do you not completely remove the slow Python call method?

@simon-king-jena
Copy link
Member

comment:5

Also, why not going all the way and remove the python version completely? Or at least: Why not cythonizing (and caching?) the __contains__ method?

@simon-king-jena
Copy link
Member

comment:6

Or ideally, one could try to provide a generalised metaclass framework in Sage. Namely, one could think of having different small metaclasses, each providing a particular feature. For example:

  • We have NestedMetaclass.
  • We already have __classcall__ (perhaps resulting in a unique parent structure or in a dynamic class).
  • At Optional cythonised cached hash for Python classes #11794, I suggest a metaclass that provides a fast cached hash.
  • Just "brainstorming": Perhaps it would be a useful feature for debugging to print which Python methods are called of a particular instance. I could imagine a metaclass that could be used to temporarily switch that feature on.

And if we have different "small" metaclasses then it would be useful to combine them. But that's a problem for Python: If A, B and C have different metaclasses, you can not simply have a class definition like

    class D(A,B,C): pass

This is (I guess) why ClasscallMetaclass inherits from NestedMetaclass. But Ticket #11794 provides examples for a different and more skalable solution: Make metaclasses dynamic!

So, what I mean by a generalised metaclass framework in Sage is:

  • Implement a base class SageMetaclass, from which all metaclasses (such as NestedMetaclass and ClasscallMetaclass and FastHashMetaclass) are derived.
  • Make it so that if A, B, C are classes with different metaclasses (all derived from SageMetaclass) can be used as base classes for a new class. In that way, the features provided by metaclasses could be freely combined.

But I guess that's a topic for a different ticket...

@simon-king-jena
Copy link
Member

comment:7

PS: If that works, one could also get rid of the DynamicClasscallMetaclass, which seems like a hack that became necessary since one can not freely combine base classes with different metaclasses.

@hivert
Copy link
Author

hivert commented Apr 20, 2012

comment:8

Hi Simon,

Thanks for all those remarks and sorry for not having finished this one as fast as I wanted.

Replying to @simon-king-jena:

Also, why not going all the way and remove the python version completely? Or at least: Why not cythonizing (and caching?) the __contains__ method?

Obviously. Please see my patch on sage combinat's queue which already adresses all those remarks. I'm currently doing intensive timing to be sure that we have the fastest way.

Florent

@hivert
Copy link
Author

hivert commented Apr 20, 2012

comment:9

Replying to @simon-king-jena:

This is (I guess) why ClasscallMetaclass inherits from NestedMetaclass. But Ticket #11794 provides examples for a different and more skalable solution: Make metaclasses dynamic!

Yes it is. We discussed last week with Nicolas of switching the inheritance
order here. That is using only NestedMetaclass for Categories. I still have
to check that this doesn't break pickling of Parent with a nested Element.

So, what I mean by a generalised metaclass framework in Sage is:

  • Implement a base class SageMetaclass, from which all metaclasses (such as NestedMetaclass and ClasscallMetaclass and FastHashMetaclass) are derived.
  • Make it so that if A, B, C are classes with different metaclasses (all derived from SageMetaclass) can be used as base classes for a new class. In that way, the features provided by metaclasses could be freely combined.

But I guess that's a topic for a different ticket...

+1 for this and for keeping it for a different ticket.

@hivert
Copy link
Author

hivert commented Apr 20, 2012

comment:10

Hi Simon,

I just uploaded an updated patch. I still have to lauch the tests (but I've to run for a train now) and to check the documentation. The code together with the timings should hopefully be the final version here.
I'll put the needs review after the tests and doc rereading.

Thanks for your remarks.

Florent

@simon-king-jena
Copy link
Member

comment:12

Could it be that the patch was created by something like "hg remove sage/misc/classcall_metaclass.py" followed by "hg add sage/misc/classcall_metaclass.pyx"?

I am no mercurial expert, but I was told that one should better use "hg rename sage/misc/classcall_metaclass.py sage/misc/classcall_metaclass.pyx".

@simon-king-jena
Copy link
Member

comment:13

While we are at it: Shouldn't we also try whether changing unique_representation.py into unique_representation.pyx yields a speedup (but not changing "class" into "cdef class", I guess that won't work)?

@simon-king-jena
Copy link
Member

comment:14

I have created an alternative patch (using hg rename). It is smaller than the original patch, but should result in identical code. So, either of the patches could be used, but I guess the smaller patch is easier to read (because one sees more easily the additional differences between classcall_metaclass.py and classcall_metaclass.pyx)

Apply trac_12808-classcall_speedup2.patch

@simon-king-jena

This comment has been minimized.

@simon-king-jena
Copy link
Member

comment:15

Good news: All doctests pass.

I am astonished that the C-level way of calling type.__call__ is so much faster. That suggests to cythonise sage.structure.dynamic_class as well, so that dynamic classes are created more quickly! Shall this be done on a different ticket?

@simon-king-jena
Copy link
Member

comment:16

Probably the dynamic class cythonisation should be on a different ticket. However, it seems that the change to cython will not be so difficult: I simply renamed dynamic_class.py into dynamic_class.pyx (and added two lines to module_list.py), and the test suite seems to pass (most of it is already finished). And then, one can apply your trick with type.__call__.

@simon-king-jena
Copy link
Member

comment:17

I see that you added _included_private_doc_. Could you elaborate on it? I just searched the sources for that name, but it does not occur anywhere except in your patch. So, what does it do?

@nthiery
Copy link
Contributor

nthiery commented Apr 22, 2012

comment:18

Replying to @simon-king-jena:

Probably the dynamic class cythonisation should be on a different ticket. However, it seems that the change to cython will not be so difficult: I simply renamed dynamic_class.py into dynamic_class.pyx (and added two lines to module_list.py), and the test suite seems to pass (most of it is already finished). And then, one can apply your trick with type.__call__.

Thanks for trying this out! It's cool that it works.

Now, do you mind postponing to a later ticket after #11935? I am doing a couple small changes to dynamic_class in my upcoming reviewer's patch (hopefuly tomorrow).

@simon-king-jena
Copy link
Member

comment:19

Note that there is a conflict with #12215, so, either of the patches needs to be rebased. But I guess we first see what patch will be reviewed first...

@simon-king-jena
Copy link
Member

comment:20

Some further questions (in addition to my question about the rôle of _included_private_doc_):

Why is __all__ = ['ClasscallType', 'ClasscallMetaclass', 'typecall', 'timeCall'] added? Isn't importing also possible without that? Or is it needed to make from sage.misc.classcall_metaclass import * work?

Why is there a two-step cythonisation? I mean, why is there a cdef class ClasscallType implementing the special methods, and then ClasscallMetaclass defined by double inheritance from ClasscallType and NestedClassMetaclass? Wouldn't it be better/easier/faster to cdef NestedClassMetaclass as well, and then cdef ClasscallMetaclass directly, without having the ClasscallType?

I guess the typecall function will be useful in other modules (e.g., when cythonising dynamic_class). But it only is def. Shouldn't it rather be cdef inline?

Concerning the timing tools provided in the module: Is there still no "central" location for all aspects of timing? I thought there were occasional discussions on sage-devel about a benchmark/timing framework.

@hivert
Copy link
Author

hivert commented Apr 23, 2012

comment:21

Hi Simon,

Thanks for your in depth review.

Replying to @simon-king-jena:

Some further questions (in addition to my question about the rôle of _included_private_doc_):

This is a tentative inclusion of the doc of the special methods (see
this thread on sage devel.

Why is __all__ = ['ClasscallType', 'ClasscallMetaclass', 'typecall', 'timeCall'] added? Isn't importing also possible without that? Or is it needed to make from sage.misc.classcall_metaclass import * work?

I just want to hide the few class I wrote here for timing (CRef, C2, C3,
C2C). With this line, they are neither imported with import *, nor documented.

Why is there a two-step cythonisation? I mean, why is there a cdef class
ClasscallType implementing the special methods, and then
ClasscallMetaclass defined by double inheritance from ClasscallType and
NestedClassMetaclass? Wouldn't it be better/easier/faster to cdef
NestedClassMetaclass as well, and then cdef ClasscallMetaclass directly,
without having the ClasscallType?

There is this discussion about cleaning up the way metaclass are defined and
used in Sage. I just wanted to keep features separate. Should we decide to
merge or rewrite the metaclass architecture, I'd rather to keep this for a
different ticket.

I guess the typecall function will be useful in other modules (e.g., when
cythonising dynamic_class). But it only is def. Shouldn't it rather be cdef inline?

You mean cpdef inline (we may want to call it from Python) ? I had the
impression from C that this inline doesn't do anything if the function is
defined in a different module. More precisely, in C/C++ inline only work if
the function is defined in .h instead of .c. Does Cython do
anything for that ?

Concerning the timing tools provided in the module: Is there still no
"central" location for all aspects of timing? I thought there were
occasional discussions on sage-devel about a benchmark/timing framework.

Not that I know of.

Florent

@hivert
Copy link
Author

hivert commented Apr 23, 2012

comment:22

I forgot: thanks for creating a new patch. I didn't use rename because at some point of my experiment, I was having the two files *.py and *_c.pyx with the *.py importing the other. Then I copied the documentation from *.py to *_c.pyx until the *.py becomes nearly empty. At the end, I renamed the *_c.pyx into the *.pyx and removed the *.py.

How did you manage to get a patch using rename at the end ?

Florent

@simon-king-jena
Copy link
Member

comment:23

Replying to @hivert:

Why is __all__ = ['ClasscallType', 'ClasscallMetaclass', 'typecall', 'timeCall'] added? Isn't importing also possible without that? Or is it needed to make from sage.misc.classcall_metaclass import * work?

I just want to hide the few class I wrote here for timing (CRef, C2, C3,
C2C). With this line, they are neither imported with import *, nor documented.

So, it is not for making something importable, but for excluding something from automatic import? Cool!

Why is there a two-step cythonisation?

There is this discussion about cleaning up the way metaclass are defined and
used in Sage.

You mean the sage-combinat-devel thread I started?

I just wanted to keep features separate.

Well, I actually think cythoning NestedClassMetaclass is less intrusive than your patch.

Compare: You have to add a new cdef class ClasscallType and change the inheritance of ClasscallMetaclass. I suggest to change NestedClassMetaclass into a cdef class, keep the inheritance of ClasscallMetaclass, and avoid the addition of ClasscallType.

I think I will test it...

You mean cpdef inline (we may want to call it from Python) ?

Yes, I forgot the letter "p".

I had the
impression from C that this inline doesn't do anything if the function is
defined in a different module. More precisely, in C/C++ inline only work if
the function is defined in .h instead of .c. Does Cython do
anything for that ?

I expected it to be inlined, if it is defined cpdef inline in the pxd file, and then cimported into another cython file. But I don't know if that is really done, actually.

@simon-king-jena
Copy link
Member

comment:24

Replying to @hivert:

How did you manage to get a patch using rename at the end ?

Manually. Namely: Apply your patch, copy the touched/new files to a temporary directory, remove your patch, hg qnew, then hg rename, then move the copy of your file over to the file that has just been created by hg rename, and then hg qrefresh.

@hivert
Copy link
Author

hivert commented Apr 23, 2012

comment:25

Replying to @simon-king-jena:

[...]

So, it is not for making something importable, but for excluding something from automatic import? Cool!

Yep ! see
importing * from a package

Why is there a two-step cythonisation?

There is this discussion about cleaning up the way metaclass are defined and
used in Sage.

You mean the sage-combinat-devel thread I started?

Yes ! An some face to face discussion we had with Nicolas.

I just wanted to keep features separate.

Well, I actually think cythoning NestedClassMetaclass is less intrusive than your patch.

Compare: You have to add a new cdef class ClasscallType and change the inheritance of ClasscallMetaclass. I suggest to change NestedClassMetaclass into a cdef class, keep the inheritance of ClasscallMetaclass, and avoid the addition of ClasscallType.

I think I will test it...

Except that at some point Nicolas suggested to keep NestedClassMetaclass for
Categories... The truth is that I've currently no idea on the good way
metaclass should be organized in Sage. So I tried to avoid any interface
changes. I was even surprised that the Cythonizing work so well without
breaking anything in Sage. Now If you think there is a better/faster
way. Please give it a try.

I had the
impression from C that this inline doesn't do anything if the function is
defined in a different module. More precisely, in C/C++ inline only work if
the function is defined in .h instead of .c. Does Cython do
anything for that ?

I expected it to be inlined, if it is defined cpdef inline in the pxd file, and then cimported into another cython file. But I don't know if that is really done, actually.

I'll look at the compiled result.

@simon-king-jena
Copy link
Member

Reviewer: Simon King

@simon-king-jena
Copy link
Member

comment:48

I forgot: Since your patch looks fine to me (independent on whether we eventually decide to cdef ClasscallMetaclass directly) and since all tests pass and the documentation looks fine, I give your part of the work a positive review.

@hivert
Copy link
Author

hivert commented Apr 26, 2012

comment:49

Replying to @simon-king-jena:

I forgot: Since your patch looks fine to me (independent on whether we eventually decide to cdef ClasscallMetaclass directly) and since all tests pass and the documentation looks fine, I give your part of the work a positive review.

Ok ! For your part I've a question: is there any reason why do you call
sys_module in nested_pickle whereas you call sys.module in
NestedClassMetaclass.__init__ ?

Otherwise things looks good !

Florent

@simon-king-jena
Copy link
Member

Attachment: trac_12808_nested_class_cython.patch.gz

Cythonise nested classes

@simon-king-jena
Copy link
Member

comment:50

Replying to @hivert:

Ok ! For your part I've a question: is there any reason why do you call
sys_module in nested_pickle whereas you call sys.module in
NestedClassMetaclass.__init__ ?

Thanks for spotting it! I have updated the nested_class patch.

Apply trac_12808-classcall_speedup-fh.patch trac_12808_nested_class_cython.patch trac_12808-classcall_cdef.patch

@hivert
Copy link
Author

hivert commented Apr 26, 2012

Changed reviewer from Simon King to Simon King, Florent Hivert

@hivert
Copy link
Author

hivert commented Apr 26, 2012

comment:51

Then it's a positive review for me !

I also created #12886 as a followup.

Thanks Simon.

@jdemeyer
Copy link

comment:52

Please decide which patches have to be applied.

@hivert

This comment has been minimized.

@hivert
Copy link
Author

hivert commented Apr 26, 2012

comment:53

Replying to @jdemeyer:

Please decide which patches have to be applied.

Sorry ! I forgot to remove Simon's question. It's decided.

Florent

@nthiery
Copy link
Contributor

nthiery commented Apr 30, 2012

comment:55

There is a trivial conflict between this ticket and #12215 (in unique_representation.py). Which one should go first?

@simon-king-jena
Copy link
Member

comment:56

I guess this one should go in first,: After all, #12215 has not been reviewed, yet.

@nthiery
Copy link
Contributor

nthiery commented Apr 30, 2012

comment:57

Replying to @simon-king-jena:

I guess this one should go in first,: After all, #12215 has not been reviewed, yet.

Ok. May I let you handle the rebase?

@simon-king-jena
Copy link
Member

comment:58

Replying to @nthiery:

Replying to @simon-king-jena:

I guess this one should go in first,: After all, #12215 has not been reviewed, yet.

Ok. May I let you handle the rebase?

OK.

@jdemeyer
Copy link

jdemeyer commented May 6, 2012

Merged: sage-5.1.beta0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants