Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use weak references to cache homsets #11521

Closed
jpflori opened this issue Jun 18, 2011 · 208 comments
Closed

Use weak references to cache homsets #11521

jpflori opened this issue Jun 18, 2011 · 208 comments

Comments

@jpflori
Copy link

jpflori commented Jun 18, 2011

Originally, this ticket was about the following memory leak when computing with elliptic curves:

sage: K = GF(1<<55,'t')
sage: a = K.random_element()
sage: while 1:
....:     E = EllipticCurve(j=a); P = E.random_point(); 2*P; 

This example is in fact solved by #715. However, while working on that ticket, another leak has been found, namely

sage: for p in prime_range(10^5):
....:     K = GF(p)
....:     a = K(0)
....:     
sage: import gc
sage: gc.collect()
0

So, I suggest to start with #715 and solve the second memory leak on top of it. It seems that a strong cache for homsets is to blame. I suggest to use the weak TripleDict instead, which were introduced in #715.

To be merged with #715. Apply

Depends on #12969
Depends on #715

Dependencies: #12969; to be merged with #715

CC: @jpflori @nthiery

Component: coercion

Keywords: sd35

Author: Simon King, Nils Bruin

Reviewer: Jean-Pierre Flori, Nils Bruin, Simon King

Merged: sage-5.5.beta0

Issue created by migration from https://trac.sagemath.org/ticket/11521

@jpflori jpflori added this to the sage-4.8 milestone Jun 18, 2011
@jpflori
Copy link
Author

jpflori commented Jul 1, 2011

comment:1

After looking at #10548, I might have a better idea of the culprit:

sage: import gc
sage: from sage.schemes.elliptic_curves.ell_finite_field import EllipticCurve_finite_field
sage: K = GF(1<<60,'t')
sage: j = K.random_element()
sage: for i in xrange(100):
....:     E = EllipticCurve(j=j); P = E.random_point(); 2*P;
....:     
sage: gc.collect()
440
sage: len([x for x in gc.get_objects() if  isinstance(x,EllipticCurve_finite_field)])
100

@jpflori
Copy link
Author

jpflori commented Jul 1, 2011

comment:2

So this could be just #715 .

@jpflori jpflori assigned robertwb and unassigned rlmill Jul 1, 2011
@jpflori
Copy link
Author

jpflori commented Jul 1, 2011

comment:3

This is definitely #715.

Resetting the coercion cache and calling gc.collect() deletes the cached elements.

I guess weak refs should be used in the different TripleDict objects of CoercionModel_cache_maps.

So this should be closed as duplicate/won't fix.

@zimmermann6
Copy link

comment:5

Jean-Pierre, why did you change the status to "needs review"? There is no patch to review.

Also, how to you reset the coercion cache? I would be interested if you have a workaround for the
memory leak in:

for p in prime_range(10^8):
   k = GF(p)

Paul

@jpflori
Copy link
Author

jpflori commented Oct 13, 2011

comment:6

As far as I'm concerned, this is nothing but a concrete example of the vague #715.
So I guess I put it to "needs review" so that it could be closed as "duplicate/won't fix".
Not sure it was the right way to do it.

I seem to remember that I had some workarounds to delete some parts of the cache (so that I could perform my computations), but not all of them.
In fact there are several dictionnaries hidden in different places.
It was quite a while ago, but I'll have another look at it at some point.
Anyway, using weak references for all these dictionnaries seems to be a quite non trivial task.
Moreover it should also not slow things down too much to be viable...

@zimmermann6
Copy link

comment:7

for the computations I need to perform, which need to create about 200000 prime fields, this
memory leak makes it impossible to perform it with Sage, which eats all the available memory.

I would be satisfied if I had a magic command to type to explicitly free the memory used by
every k=GF(p).

Paul

@jpflori
Copy link
Author

jpflori commented Oct 14, 2011

comment:8

I'm having a look at your problem right now. Here are some hints to study the problem, mostly stolen from #10548.

I put it here for the record, and so that i can go faster next time.

First, if I only create the finite fields and do nothing with them, I do not seem to get a memleak. Some fields are not garbage collected immediately, but calling gc.collect() does the trick.

sage: import gc
sage: for p in prime_range(10**5):
....:    k = GF(p)
....:
sage: del p, k
sage: gc.collect()
1265
sage: from sage.rings.finite_rings.finite_field_prime_modn import FiniteField_prime_modn as FF
sage: L = [x for x in gc.get_objects() if isinstance(x, FF)]
sage: len(L)
1
sage: L
[Finite Field of size 2]

Of course I guess you actually do something with your finite fields.

So here is a small example causing the fields to stay cached.

sage: import gc
sage: for p in prime_range(10**5):
....:    k = GF(p)
....:
sage: del p, k
sage: gc.collect()
0

The zero here is bad and indeed

sage: from sage.rings.finite_rings.finite_field_prime_modn import FiniteField_prime_modn as FF
sage: L = [x for x in gc.get_objects() if isinstance(x, FF)]
sage: len(L)
9592 

If you want to know where it comes from you can use the objgraph python module (on my debian I just installed the python-objgraph package, updated sys.path in Sage to include '/usr/lib/python2.6/dist-packages')

sage: sys.path.append('/usr/lib/python2.6/dist-packages')
sage: import inspect, objgraph
sage: objgraph.show_backrefs(L[-1],filename='/home/jp/br.png',extra_ignore=[id(L)])

And look at the png or use

sage: brc = objgraph.find_backref_chain(L[-1],inspect.ismodule,max_depth=15,extra_ignore=[id(L)])
sage: map(type,brc)
[<type 'module'>, <type 'dict'>, <type 'dict'>,...
sage: brc[0]
<module 'sage.categories.homset'...

So the culprit is "_cache" in sage.categories.homset which has a direct reference to the finite fields in its keys.

The clean solution woul be to used weakref in the keys (let's say something as WeakKeyDirectory in python).

However, resetting cache should be a (potentially partial) workaround (and could kill your Sage?).

sage: sage.categories.homset.cache = {}
sage: gc.collect()
376595

It also seems to be enough if I do "a = k.random_element(); a = a+a" in the for loop, but not if I add "a = 47*a+3".

I'm currently investigating that last case.

@jpflori
Copy link
Author

jpflori commented Oct 14, 2011

comment:9

For info, using "k(47)*a+k(3)" is solved,  so the problem left really comes from coercion and action of integers.

sage: cm = sage.structure. get_coercion_model()
sage: cm.reset_cache()

does not help.

@jpflori
Copy link
Author

jpflori commented Oct 14, 2011

comment:10

First, the second example above is missing the line "k(1);" in the for loop, otherwise it does nothing more than the first example.

Second, I guess the remaining references to the finite fields are in the different lists and dictionnaries of the integer ring named _coerce_from_list, _convert_from_list etc.

You can not directly access them from Python level, but there a function _introspect_coerce() (defined in parent.pyx) which returns them.

@jpflori
Copy link
Author

jpflori commented Oct 14, 2011

comment:11

In fact, everything is in _*_hash.

And to conclude, I'd say that right now you can not directly delete entries in this dictionaries from the Python level.

So for minimum changes, you should eitheir avoid coercion, or make the dictionaries "cdef public" in parent.pxd to be able to explicitely delete every created entries (be aware that it could be written in different places for example in ZZ but also in QQ and CC and I don't know where...).

And I could also have missed other dictionaries used by Sage.

@zimmermann6
Copy link

comment:12

Jean-Pierre, I cannot reproduce that:

sage: sys.path.append('/usr/lib/python2.6/dist-packages')
sage: import inspect, objgraph
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)

/users/caramel/zimmerma/Adm/Strass/SED/2011/<ipython console> in <module>()

ImportError: No module named objgraph

Paul

@jpflori
Copy link
Author

jpflori commented Oct 14, 2011

comment:13

Did you "apt-get install python-objgraph" on your system?

If yes, is it a version for python 2.6 ?

@jpflori
Copy link
Author

jpflori commented Oct 14, 2011

comment:14

The path I gave above might also be different on your system...

As the package manager.

@jpflori
Copy link
Author

jpflori commented Nov 7, 2011

comment:15

Any progress on your side?

If you found any other dicitonaries leading to cahing problems, it would be great to mention them here for the record.

Hence the day someone will finally decide to tackle ticket #715, it will speed up the search of the culprit.

@zimmermann6
Copy link

comment:16

Replying to @jpflori:

Any progress on your side?

no time so far. I will look at this during the SageFlint days in December, unless someone else has
some time before.

Paul

@sagetrac-johanbosman
Copy link
Mannequin

sagetrac-johanbosman mannequin commented Nov 14, 2011

comment:17

Replying to @jpflori:

As far as I'm concerned, this is nothing but a concrete example of the vague #715.
So I guess I put it to "needs review" so that it could be closed as "duplicate/won't fix".
Not sure it was the right way to do it.

If you think it should be closed, then I think you should set the milestone to duplicate/wontfix. Otherwise, it is probably better to change the status back to 'new'.

@zimmermann6
Copy link

comment:18

with Sage 4.7.2 I get the following:

sage: for p in prime_range(10^5):
....:     K = GF(p)
....:     a = K(0)
....:     
sage: import gc
sage: gc.collect()
0
sage: from sage.rings.finite_rings.finite_field_prime_modn import \
....: FiniteField_prime_modn as FF
sage: L = [x for x in gc.get_objects() if isinstance(x, FF)]
sage: len(L), L[0], L[len(L)-1]
(9592, Finite Field of size 2, Finite Field of size 99767)

thus whenever we use the finite field we get a memleak.
(If I remove the a=K(0) line, I get only two elements in L, for p=2 and p=99991.)

Paul

@zimmermann6
Copy link

comment:19

I confirm (cf comment [comment:8]) that if I comment out the line

    _cache[(X, Y, category)] = weakref.ref(H)

in categories/homset.py, then the memory leak from comment [comment:18] disappears.

Paul

@simon-king-jena
Copy link
Member

comment:20

I think we have a different problem here.

The finite fields themselves should be cached. The cache (see GF._cache) uses weak references, which should be fine.

Also, there are special methods zero_element and one_element which should do caching, because zero and one are frequently used and should not be created over and over again.

However, it seems that all elements of a finite field are cached - and that's bad!

sage: K = GF(5)
sage: K(2) is K(2)
True
sage: K.<a> = GF(17^2)
sage: K(5) is K(5)
True

I see absolutely no reason why all 17^2 elements should be cached.

Fortunately, we have no caching for larger finite fields:

sage: K.<a> = GF(17^10)
sage: K(5) is K(5)
False

@simon-king-jena
Copy link
Member

comment:21

In the following example, there is no memory leak that would be caused by the sage.categories.homset cache:

sage: len(sage.categories.homset._cache.keys())
100
sage: for p in prime_range(10^5):
....:     K = GF(p)
....:     
sage: len(sage.categories.homset._cache.keys())
100

However, when you do a conversion K(...) then a convert map is created, and apparently is cached:

sage: for p in prime_range(10^5):
....:     K = GF(p)
....:     a = K(0)
....:
sage: len(sage.categories.homset._cache.keys())
9692

The homset cache does use weak references. Hence, it is surprising that the unused stuff can not be garbage collected. Apparently there is some direct reference involved at some point.

I am very stronglyagainst removing the cache of sage.categories.homset. Namely, elliptic curve code uses finite fields and maps involving finite fields a lot, and killing the cache is likely to cause a massive regression.

However, since we actually have coercion maps (not just any odd map), I expect that the direct reference comes from the coercion model. I suggest to look into the coercion code and use weak references there.

By the way, I don't know why the status is "needs review". I think it clearly is "needs work".

@simon-king-jena
Copy link
Member

comment:185

Replying to @jdemeyer:

Replying to @nbruin:

All doctests pass for me with this, so back to positive review.

I think somebody needs to review that patch. Simon, can you do that?

attachment: trac_11521_doctestfix.patch looks fine to me. I'll try to build the latest Sage prerelease on bsd.math with these changes and do tests.

@simon-king-jena
Copy link
Member

Changed reviewer from Jean-Pierre Flori, Nils Bruin to Jean-Pierre Flori, Nils Bruin, Simon King

@simon-king-jena
Copy link
Member

Changed author from Simon King to Simon King, Nils Bruin

@simon-king-jena
Copy link
Member

comment:186

It works! All tests pass on bsd.math, and I also did sage -t devel/sage/sage/rings/polynomial/ separately, to verify that the ignored attribute error does not occur.

The new patch looks fine. Hence, it is a positive review.

@jdemeyer
Copy link

jdemeyer commented Oct 6, 2012

comment:187

I confirm that doctests pass.

@jdemeyer
Copy link

Merged: sage-5.5.beta0

@jdemeyer
Copy link

jdemeyer commented Nov 3, 2012

Changed merged from sage-5.5.beta0 to none

@jdemeyer
Copy link

jdemeyer commented Nov 3, 2012

comment:189

Sorry to bring bad news, but a trial sage-5.5.beta1 caused a Segmentation Fault in sage/schemes/elliptic_curves/ell_number_field.py on OS X 10.4 PPC. Removing #715 and #11521 made the problem go away. This really feels like a déjà vu, but I'm afraid I need to remove the patch from sage-5.5.beta0.

@jdemeyer jdemeyer reopened this Nov 3, 2012
@jdemeyer
Copy link

jdemeyer commented Nov 4, 2012

Merged: sage-5.5.beta0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants