
Memory leak in V2.0.2 #10

Closed
Gabi120 opened this issue Feb 9, 2011 · 7 comments · Fixed by #20

Comments

@Gabi120

Gabi120 commented Feb 9, 2011

Infinitely submitting jobs results in a big memory leak. The client's memory consumption keeps growing and growing when the queue is full:

while True:
    gm_client.submit_job("task1", "some data", background=True,
                         wait_until_complete=False, poll_timeout=0.020)

@chrisvaughn

Is there any update on this issue? I'm experiencing the same thing. I have a daemon that fetches jobs from a database and submits them to gearman. After a week of running, the daemon is consuming 1GB of memory and I have to restart it.

@daniyalzade

I am also having the same issue, and since I am adding jobs to the queue fairly frequently, the leak reaches a significant size fairly quickly.

@chrisvaughn

My application is a daemon that runs forever and the memory adds up fast, so I had to work around it. I set the reference to None and create a new connection every X jobs. This has worked well for me with no noticeable performance issues. It's not very elegant, but memory usage now stays around 10MB instead of climbing into the GBs:

while True:
    if job_count == max_job_count:
        # Drop the old client and reconnect; the leaked request objects go away with it
        gm_client = None
        gm_client = gearman.GearmanClient([gearman_server])
        job_count = 0
    # ... submit the job and increment job_count as usual ...

@daniyalzade

Hah, unfortunately, I also just came up with a very similar approach: I keep track of the last time I instantiated the gearman client and re-instantiate it if more than X minutes have passed. Since I get jobs at a pretty steady rate, the two approaches pretty much amount to the same thing :) [Thanks for the info, btw]
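
Roughly, the idea looks something like this (RECONNECT_INTERVAL and gearman_server are just placeholders, not real values from my setup):

import time
import gearman

RECONNECT_INTERVAL = 10 * 60  # "X minutes", value is arbitrary

gm_client = gearman.GearmanClient([gearman_server])
client_created_at = time.time()

while True:
    if time.time() - client_created_at > RECONNECT_INTERVAL:
        # Re-instantiate the client; the leaked request objects are freed with it
        gm_client = gearman.GearmanClient([gearman_server])
        client_created_at = time.time()
    gm_client.submit_job("task1", "some data", background=True,
                         wait_until_complete=False, poll_timeout=0.020)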

Also, as further info, I am pretty sure the memory leak is related to the 'wait_until_jobs_accepted' method in client.py. This method ensures that the gearman server acknowledges its receipt of the tasks, but it somehow does not dispose of the tasks properly afterwards.

@patricklucas
Contributor

I'm taking a look at this now - I'll update if I find anything.

@patricklucas
Contributor

daniyalzade: you're pretty much correct

There are two leaks of GearmanJobRequest objects:

When a job is first sent, GearmanClient.establish_request_connection() is called, which adds to the request_to_rotating_connection_queue dict using the request as the key. However, the only time items are removed from that dict is inside wait_until_jobs_completed(), and only if the blocking request did not time out. If wait_until_complete is set to False, or it is set to True and the request times out, then the request remains in request_to_rotating_connection_queue forever.

In addition, in the instance of GearmanClientCommandHandler, an entry is added to handle_to_request_map, mapping the job handle to the request object. These entries are only removed by _unregister_request() on complete or failure events.

The fix for the first problem is relatively easy: if wait_until_complete is False, then at the end of submit_multiple_requests, each request should be purged from request_to_rotating_connection_queue:

        time_remaining = stopwatch.get_time_remaining()
        if wait_until_complete and bool(time_remaining != 0.0):
            processed_requests = self.wait_until_jobs_completed(processed_requests, poll_timeout=time_remaining)
        else:
            # Remove jobs from the rotating connection queue to avoid a leak
            for current_request in processed_requests:
                self.request_to_rotating_connection_queue.pop(current_request, None)

However, the command handler object currently isn't accessible at all from the client. Perhaps calls to send_job_request should take an extra parameter indicating whether the request should be unregistered upon reaching the CREATED state.
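
A rough toy sketch of that idea, for illustration only (the FakeRequest/FakeCommandHandler classes and recv_created are stand-ins, not the real python-gearman code; only send_job_request, handle_to_request_map, and _unregister_request are actual names from the discussion above):

class FakeRequest(object):
    def __init__(self, handle):
        self.handle = handle
        self.unregister_on_created = False

class FakeCommandHandler(object):
    def __init__(self):
        self.handle_to_request_map = {}

    def send_job_request(self, request, unregister_on_created=False):
        # Proposed extra parameter: remember whether to drop the request once CREATED arrives
        request.unregister_on_created = unregister_on_created
        self.handle_to_request_map[request.handle] = request

    def recv_created(self, handle):
        # Server acknowledged the job (CREATED state)
        request = self.handle_to_request_map[handle]
        if request.unregister_on_created:
            # Fire-and-forget submit: no COMPLETE/FAIL event will ever arrive,
            # so clean up now instead of leaking the entry forever
            self._unregister_request(request)

    def _unregister_request(self, request):
        self.handle_to_request_map.pop(request.handle, None)

handler = FakeCommandHandler()
request = FakeRequest('H:localhost:1')
handler.send_job_request(request, unregister_on_created=True)
handler.recv_created('H:localhost:1')
assert 'H:localhost:1' not in handler.handle_to_request_map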

In any case, it seems that the "don't wait until completed" behavior of python-gearman needs to be completely reevaluated, since currently no care is taken to ensure request objects are cleaned up in the often-used non-blocking scenario.

@eskil
Contributor

eskil commented Mar 20, 2012

Merged fix. A loop of backgrounded jobs is steady on RSS now; before, it was growing rapidly.
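
For anyone who wants to check this locally, a rough verification along those lines (the server address and loop counts are arbitrary):

import resource
import gearman

gm_client = gearman.GearmanClient(['localhost:4730'])  # assumed server address

for i in range(100000):
    gm_client.submit_job("task1", "some data", background=True,
                         wait_until_complete=False, poll_timeout=0.020)
    if i % 10000 == 0:
        # ru_maxrss is the peak RSS (kilobytes on Linux); with the fix it should plateau
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print("%d jobs submitted, peak RSS: %d" % (i, peak))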

eskil closed this as completed Mar 20, 2012