
Autobatching support #57

Merged: 3 commits merged into master on Apr 17, 2021
Conversation

@pfreixes (Collaborator) commented on Apr 14, 2021

Autobatching provides a way to fetch multiple keys using a single command; batching happens transparently behind the scenes without bothering the caller.

Gets are piled up until the next loop iteration. Once the next loop iteration is reached, all gets are transmitted using the same Memcached operation.
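
A minimal sketch of that idea, assuming a simplified client (illustrative only, not the actual emcache internals; fetch_many is a hypothetical transport helper):

import asyncio


class AutoBatcher:
    """Illustrative only: piles up gets and flushes them on the next loop iteration."""

    def __init__(self):
        self._pending = {}            # key -> futures waiting for that key
        self._flush_scheduled = False

    async def get(self, key: bytes):
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        self._pending.setdefault(key, []).append(fut)
        if not self._flush_scheduled:
            self._flush_scheduled = True
            # call_soon defers the flush to the next loop iteration, so every
            # get issued during the current iteration joins the same batch.
            loop.call_soon(self._flush)
        return await fut

    def _flush(self):
        pending, self._pending = self._pending, {}
        self._flush_scheduled = False
        asyncio.ensure_future(self._send_batch(pending))

    async def _send_batch(self, pending):
        # All piled-up keys travel in one multi-key Memcached get, e.g.
        # b"get key1 key2 ...\r\n". fetch_many() stands in for the real
        # protocol code and is assumed to return a dict of key -> value.
        results = await fetch_many(list(pending))
        for key, futures in pending.items():
            for fut in futures:
                fut.set_result(results.get(key))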

The total number of gets transmitted within the same operation is limited; if more gets were issued during the previous loop iteration, they will be chunked into different operations.
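
For illustration, the chunking boils down to splitting the piled-up keys by a per-operation limit (the constant name and value here are made up, not the real emcache limit):

MAX_KEYS_PER_OPERATION = 32  # assumed value


def chunk_keys(keys, size=MAX_KEYS_PER_OPERATION):
    # e.g. 70 keys with a limit of 32 become three operations of 32, 32 and 6 keys
    return [keys[i:i + size] for i in range(0, len(keys), size)]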

Behind the scenes the connection pool provided by the client is still used: free connections are reused, and new ones are created if there is still room; otherwise each batch will need to wait until a connection is released.
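
Roughly, the acquisition policy described above looks like this (a sketch assuming a fixed maximum pool size; create_connection is a hypothetical factory, not emcache's real API):

import asyncio


class ConnectionPool:
    def __init__(self, max_connections: int):
        self._free = []
        self._slots = asyncio.Semaphore(max_connections)

    async def acquire(self):
        # Blocks here when max_connections are already checked out, i.e. the
        # batch waits until a connection is released.
        await self._slots.acquire()
        if self._free:
            return self._free.pop()           # a free connection is reused
        return await create_connection()      # still room: open a new one

    def release(self, connection):
        self._free.append(connection)
        self._slots.release()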

Autobatching can boost throughput by 2x to 3x.

How it is used

Example of how autobatching is enabled:

client = await emcache.create_client([emcache.MemcachedHostAddress('localhost', 11211)], autobatch=True)
await client.get(b"key")
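
Given the description above, gets issued concurrently within the same loop iteration should be coalesced into a single multi-key operation, for example:

import asyncio

values = await asyncio.gather(
    client.get(b"key1"),
    client.get(b"key2"),
    client.get(b"key3"),
)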

What's missing:

  • Documentation
  • Some tests at client.py level

Implements feature #46.

@vangheem left a comment

Very nice!

My one comment is maybe autopipeline isn't the best wording to use; however, I'm not sure "batching" would be better and I don't have a better idea.

Would it be a good idea to just make regular get do this pipelining behind the scenes?

(Two inline review comments on emcache/autopipeline.py, both resolved.)
@pfreixes (Collaborator, Author)

@vangheem thanks for the review.

Yes, I agree that autopipeline is not the best name; it's too coupled to the Redis world. What about autobatching?

I was considering using the normal get and gets paths, but my concern was the restriction that all gets of a batch must be semantically equal, so all of them would need to ask for return_flags or ask for the cas value. The PR tried to address this by providing a factory which forces the user to choose how the keys are fetched behind the scenes, treating all of them equally later on.

But it comes with some cognitive burden and a more fragmented API, and it also raises some misalignments. For example, you would use the traditional gets from the client when cas is required, but use get when autobatching via an Autopipeline object that was built with the cas=True argument. Weird.

I'm about to change my mind again and go for your proposal, considering that the number of combinations (different kinds of batch operations) in terms of extra arguments is limited, I would say 4 at most. So each time the user executes a get or a gets, the operation would be routed to one batcher or another depending on the arguments provided. The only real change would be how the emcache client is constructed, which could provide an autobatching parameter for enabling this feature. Roughly something like the sketch below.
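
Hypothetical sketch of the routing (names made up, reusing the AutoBatcher idea sketched in the description; _single_get stands in for the regular path):

class Client:
    def __init__(self, autobatch=False):
        self._autobatch = autobatch
        self._batchers = {}  # (command, return_flags, ...) -> AutoBatcher

    async def get(self, key, return_flags=False):
        if not self._autobatch:
            return await self._single_get(key, return_flags)
        # gets sharing the same arguments share a batcher, so every key in a
        # batch is fetched in a semantically equal way
        variant = ("get", return_flags)
        if variant not in self._batchers:
            self._batchers[variant] = AutoBatcher()
        return await self._batchers[variant].get(key)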

WDYT?

@vangheem

So each time that the user would execute a get or a gets the operation would be routed to one or the other depending the arguments provided

Yes, seems reasonable. Most of the time the user will be doing the same type of operations each time.

So the unique real change would be how the Emcache client would be constructed, which could provide an autobatching parameter for enabling this feature

I think that's a good idea. I like it.

@pfreixes changed the title from "Autopipeline support" to "Autobatching support" on Apr 15, 2021
@pfreixes merged commit af8f626 into master on Apr 17, 2021