
Autobatching support #57

Merged: 3 commits merged into master on Apr 17, 2021
Conversation

@pfreixes (Collaborator) commented on Apr 14, 2021

Autobatching provides a way to fetch multiple keys using a single command; batching happens transparently behind the scenes without bothering the caller.

Gets are piled up until the next loop iteration. Once the next loop iteration is reached, all gets are transmitted using the same Memcached operation.
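
A minimal sketch of that idea, assuming a simplified client (illustrative only, not the actual emcache internals; fetch_many is a hypothetical transport helper):

import asyncio


class AutoBatcher:
    """Illustrative only: piles up gets and flushes them on the next loop iteration."""

    def __init__(self):
        self._pending = {}            # key -> futures waiting for that key
        self._flush_scheduled = False

    async def get(self, key: bytes):
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        self._pending.setdefault(key, []).append(fut)
        if not self._flush_scheduled:
            self._flush_scheduled = True
            # call_soon defers the flush to the next loop iteration, so every
            # get issued during the current iteration joins the same batch.
            loop.call_soon(self._flush)
        return await fut

    def _flush(self):
        pending, self._pending = self._pending, {}
        self._flush_scheduled = False
        asyncio.ensure_future(self._send_batch(pending))

    async def _send_batch(self, pending):
        # All piled-up keys travel in one multi-key Memcached get, e.g.
        # b"get key1 key2 ...\r\n". fetch_many() stands in for the real
        # protocol code and is assumed to return a dict of key -> value.
        results = await fetch_many(list(pending))
        for key, futures in pending.items():
            for fut in futures:
                fut.set_result(results.get(key))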

The total number of gets transmitted within the same operation is limited; if more gets were issued during the previous loop iteration, they will be chunked into different operations.
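
For illustration, the chunking boils down to splitting the piled-up keys by a per-operation limit (the constant name and value here are made up, not the real emcache limit):

MAX_KEYS_PER_OPERATION = 32  # assumed value


def chunk_keys(keys, size=MAX_KEYS_PER_OPERATION):
    # e.g. 70 keys with a limit of 32 become three operations of 32, 32 and 6 keys
    return [keys[i:i + size] for i in range(0, len(keys), size)]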

Behind the scenes the connection pool provided by the client is still used: free connections are reused, and new ones are created if there is still room; otherwise each batch will need to wait until a connection is released.
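
Roughly, the acquisition policy described above looks like this (a sketch assuming a fixed maximum pool size; create_connection is a hypothetical factory, not emcache's real API):

import asyncio


class ConnectionPool:
    def __init__(self, max_connections: int):
        self._free = []
        self._slots = asyncio.Semaphore(max_connections)

    async def acquire(self):
        # Blocks here when max_connections are already checked out, i.e. the
        # batch waits until a connection is released.
        await self._slots.acquire()
        if self._free:
            return self._free.pop()           # a free connection is reused
        return await create_connection()      # still room: open a new one

    def release(self, connection):
        self._free.append(connection)
        self._slots.release()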

Autobatching can boost throughput by 2x to 3x.

How it is used

Example of how autobatching is enabled:

client = await emcache.create_client([emcache.MemcachedHostAddress('localhost', 11211)], autobatch=True)
await client.get(b"key")
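
Given the description above, gets issued concurrently within the same loop iteration should be coalesced into a single multi-key operation, for example:

import asyncio

values = await asyncio.gather(
    client.get(b"key1"),
    client.get(b"key2"),
    client.get(b"key3"),
)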

What's missing:

  • Documentation
  • Some tests at client.py level

Implements feature #46.

@vangheem left a comment

Very nice!

My one comment is maybe autopipeline isn't the best wording to use; however, I'm not sure "batching" would be better and I don't have a better idea.

Would it be a good idea to just make regular get do this pipelining behind the scenes?

(Two inline review comments on emcache/autopipeline.py, both resolved.)
@pfreixes (Collaborator, Author)

@vangheem thanks for the review.

Yes, I agree that autopipeline is not the best name; it's too coupled to the Redis world. What about autobatching?

I was considering using the normal get and gets paths, but my concern was the restriction that all gets of a batch must be semantically equal, so all of them would need to ask for return_flags or ask for the cas value. The PR tried to address this by providing a factory which forces the user to choose how the keys are fetched behind the scenes, treating all of them equally later on.

But it comes with some cognitive burden and a more fragmented API, and it also raises some misalignments. For example, you would use the traditional gets from the client when cas is required, but use get when autobatching via an Autopipeline object that was built with the cas=True argument. Weird.

I'm about to change my mind again and go for your proposal, considering that the number of combinations (different kinds of batch operations) in terms of extra arguments is limited, I would say 4 at most. So each time the user executes a get or a gets, the operation would be routed to one batcher or another depending on the arguments provided. The only real change would be how the emcache client is constructed, which could provide an autobatching parameter for enabling this feature. Roughly something like the sketch below.
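
Hypothetical sketch of the routing (names made up, reusing the AutoBatcher idea sketched in the description; _single_get stands in for the regular path):

class Client:
    def __init__(self, autobatch=False):
        self._autobatch = autobatch
        self._batchers = {}  # (command, return_flags, ...) -> AutoBatcher

    async def get(self, key, return_flags=False):
        if not self._autobatch:
            return await self._single_get(key, return_flags)
        # gets sharing the same arguments share a batcher, so every key in a
        # batch is fetched in a semantically equal way
        variant = ("get", return_flags)
        if variant not in self._batchers:
            self._batchers[variant] = AutoBatcher()
        return await self._batchers[variant].get(key)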

WDYT?

@vangheem

So each time that the user would execute a get or a gets the operation would be routed to one or the other depending the arguments provided

Yes, seems reasonable. Most of the time the user will be doing the same type of operations each time.

So the unique real change would be how the Emcache client would be constructed, which could provide an autobatching parameter for enabling this feature

I think that's a good idea. I like it.

@pfreixes changed the title from "Autopipeline support" to "Autobatching support" on Apr 15, 2021
@pfreixes merged commit af8f626 into master on Apr 17, 2021