Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve memory of producer retry queue #396

Merged
merged 1 commit into from
Apr 5, 2015
Merged

Improve memory of producer retry queue #396

merged 1 commit into from
Apr 5, 2015

Conversation

eapache
Copy link
Contributor

@eapache eapache commented Mar 25, 2015

The old version just used a simple slice as a queue, using append and slice[1:].
This had correct (amortized O(1)) running time, but did not behave well with the
garbage collector. In particular:

  • elements removed from the slice were still pointed to by the underlying array,
    preventing them from being collected until the array was resized
  • the array would only be resized when its current capacity was "full", even if
    it had a substantial amount of free space it was wasting

Switch instead to my queue implementation (which I wrote a while ago to solve
these exact problems somewhere else). It uses a ringbuffer that is guaranteed to
stay with a factor of 2 size of the actual number of messages stored, it
generates a lot less garbage when the input and output rates are about the same,
and it nils removed elements in the underlying array so the garbage collector
can pick them up at its leisure.

The only possible concern is that it uses interface{} but the usage is so
restrained that type-safety is trivial and the performance will not be noticable
given everything else the producer has to do.

@Shopify/kafka

The old version just used a simple slice as a queue, using append and slice[1:].
This had correct (amortized O(1)) running time, but did not behave well with the
garbage collector. In particular:
- elements removed from the slice were still pointed to by the underlying array,
  preventing them from being collected until the slice was resized
- the slice would only be resized when its current capacity was "full", even if
  it had a substantial amount of free space it was wasting

Switch instead to my queue implementation (which I wrote a while ago to solve
these exact problems somewhere else). It uses a ringbuffer that is guaranteed to
stay with a factor of 2 size of the actual number of messages stored, it
generates a lot less garbage when the input and output rates are about the same,
and it nils removed elements in the underlying array so the garbage collector
can pick them up at its leisure.

The only possible concern is that it uses `interface{}` but the usage is so
restrained that type-safety is trivial and the performance will not be noticable
given everything else the producer has to do.
@wvanbergen
Copy link
Contributor

How this this compare to container/heap?

@eapache
Copy link
Contributor Author

eapache commented Mar 30, 2015

A queue is not a heap... and container/heap is just an interface, not an implementation.

@wvanbergen
Copy link
Contributor

I find it interesting go doesn't have an efficient queue in its standard library. I guess it just part of the general lack of container types due to the lack of generics. Oh well.

Anyway, this is 👍 for me, if the queue implementation is properly tested.

@eapache
Copy link
Contributor Author

eapache commented Mar 30, 2015

@eapache
Copy link
Contributor Author

eapache commented Mar 30, 2015

There is http://golang.org/pkg/container/list/ that we could use, but it allocates a new struct for each element which makes it somewhat less efficient (in terms of both memory and gc)

@eapache
Copy link
Contributor Author

eapache commented Mar 30, 2015

If we'd rather stick with the stdlib though, I can't imagine the overhead would be too terrible (again, at least compared to all the other things that the producer has to do).

@wvanbergen
Copy link
Contributor

Given that it's only 89 lines of go, using this lib is OK I think 👍

@yagnik
Copy link

yagnik commented Apr 5, 2015

This code lgtm, I'm blindly trusting your queue implementation though.

@eapache
Copy link
Contributor Author

eapache commented Apr 5, 2015

Hmm, now I am tempted to switch to container/list just on the principle of "write less code".

@eapache
Copy link
Contributor Author

eapache commented Apr 5, 2015

Eh, the queue implementation is also used (and thus pretty heavily tested implicitly) in https://github.com/eapache/channels and it's not like it's going to change much.

eapache added a commit that referenced this pull request Apr 5, 2015
Improve memory of producer retry queue
@eapache eapache merged commit d7cc796 into master Apr 5, 2015
@eapache eapache deleted the fix-retry-queue branch April 5, 2015 12:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants