
Issue #1252: Fix creating a lot of new Jedis instances on unstable cluster, fix slots clearing without filling #1253

Merged
merged 2 commits into redis:master on Jul 11, 2016

Conversation

Spikhalskiy
Contributor

I started to investigate Issue #1252, especially the PING-PONG counts.

The most interesting thing I found in the stack traces from our runtime:

    at redis.clients.jedis.BinaryJedis.ping(BinaryJedis.java:106)
    at redis.clients.jedis.JedisSlotBasedConnectionHandler.getConnection(JedisSlotBasedConnectionHandler.java:41)
    at redis.clients.jedis.JedisSlotBasedConnectionHandler.getConnectionFromSlot(JedisSlotBasedConnectionHandler.java:64)
    at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:115)
    at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:30)
    at redis.clients.jedis.JedisCluster.setex(JedisCluster.java:292)

This looks absolutely abnormal: we have nothing cached for the slot and fall back to creating a new Jedis connection to a random node.

Here you can find my changes and fixes for the places that could cause this issue, plus some refactoring and removal of duplicate map lookups and key building.
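
For readers without the diff open, the shape of the change in JedisSlotBasedConnectionHandler#getConnectionFromSlot is roughly the following; this is a minimal sketch assuming the cache/handler API of that era (getSlotPool, renewSlotCache, getConnection), not necessarily the exact merged code:

    public Jedis getConnectionFromSlot(int slot) {
      JedisPool connectionPool = cache.getSlotPool(slot);
      if (connectionPool != null) {
        // A pool is already mapped to this slot: use it instead of pinging random nodes.
        return connectionPool.getResource();
      }
      // Nothing cached for this slot: rediscover the cluster state first...
      renewSlotCache();
      connectionPool = cache.getSlotPool(slot);
      if (connectionPool != null) {
        return connectionPool.getResource();
      }
      // ...and only fall back to a random node as a last resort.
      return getConnection();
    }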

System behavior before and after applying these changes:
Before
After

    List<Object> slots = jedis.clusterSlots();

    // We should clear slots after getting result about cluster slots, because if we got exception or timeout in
Contributor Author

most important part here
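
To spell out what this ordering buys us, here is a minimal sketch of the discovery step; the field and method names (w, slots, the parsing placeholder) loosely follow JedisClusterInfoCache and are assumptions, not a verbatim excerpt:

    public void discoverClusterSlots(Jedis jedis) {
      // This call can fail with a timeout or connection error, so do it before touching
      // the cache; a failure must never leave an empty slot map behind.
      List<Object> slotsReply = jedis.clusterSlots();

      w.lock();
      try {
        this.slots.clear();  // safe to clear now: we already have fresh data to refill it
        for (Object slotInfoObj : slotsReply) {
          // ...parse each entry and assign its slot range to the master's pool
        }
      } finally {
        w.unlock();
      }
    }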

Contributor

First of all, once we fix JedisSlotBasedConnectionHandler#getConnectionFromSlot, this comment becomes invalid.

And IMO, whether to clear slots first or later depends on the size of the slot changeset. For example, if most slots have changed, it would be better to try discovering the slots again instead of relying on outdated slot information, which just results in MOVED again. If only a few slots have changed, we should take the latter approach. Either way it's not a big deal, because we don't do a random connect afterwards.

Contributor Author

> once we fix JedisSlotBasedConnectionHandler#getConnectionFromSlot, this comment becomes invalid

Not completely.

  1. The first thread starts discovery, clears the nodes, and releases the lock while moving to the next node.
  2. At this point a second thread enters JedisSlotBasedConnectionHandler#getConnectionFromSlot. It is very possible that we could just use the old pool, but we see an empty collection and end up connecting to the discovering party instead.
    I would prefer not to publish an empty collection while we can keep the old one (see the sketch below).

> And IMO, whether to clear slots first or later depends on the size of the slot changeset

I'm not sure I follow. We clear it completely anyway; I just moved the clearing to after the external code, to avoid the situation where an exception in that external code leads to publishing an empty slots collection.
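
To make the interleaving above concrete, here is a hedged sketch of the problematic shape being described, where the cache is cleared up front and the write lock is released between node attempts; the names are assumptions, not the code at either revision:

    public void renewClusterSlots() {
      w.lock();
      try {
        this.slots.clear();  // published immediately: readers now see an empty map
      } finally {
        w.unlock();
      }
      for (JedisPool nodePool : getShuffledNodesPool()) {
        // Race window: another thread calling getConnectionFromSlot() here sees the empty
        // map and goes to a random node, even though the old mapping was still usable.
        w.lock();
        try (Jedis jedis = nodePool.getResource()) {
          discoverClusterSlots(jedis);
          return;
        } catch (JedisConnectionException e) {
          // try the next node
        } finally {
          w.unlock();
        }
      }
    }

The single long lock discussed further down closes this window.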

Contributor

> And IMO, whether to clear slots first or later depends on the size of the slot changeset

> I'm not sure I follow. We clear it completely anyway; I just moved the clearing to after the external code, to avoid the situation where an exception in that external code leads to publishing an empty slots collection.

Imagine the first trial of clusterSlots() fails (so the lock is released), and the next caller invokes getConnectionFromSlot(). If we don't clear the slots, it will get a Jedis instance from the cache which may be stale (because we failed to update, so it hasn't caught up with the latest information) or may still be valid (the node for the slot hasn't changed).

In the former case it will request the wrong node and receive MOVED, which also initiates a slot cache update. If we clear the cache first, it doesn't need to make that request and just updates the slot cache immediately.

Contributor Author

@HeartSaVioR Got you. I'm not sure which is better... It could hit the right node and not get stuck at all, and in the worst case we get the same result as a rediscovering call. I would prefer to give it a chance and assume that most slots are still in place. But if you think it's better to publish a cleared state, I can revert this part.

Contributor

Maybe we want to hold the lock while trying all nodes.
I see what you mean: we release the write lock before completing the slot cache update, so there's a race again.

Contributor Author

That would be more stable behavior and makes sense. I'll take a look at how to implement it.
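
For comparison with the sketch above, holding one write lock across all node attempts would look roughly like this (again a hedged sketch with assumed names, not the exact commit):

    public void renewClusterSlots() {
      w.lock();  // held for the whole rediscovery, not re-acquired per node
      try {
        for (JedisPool nodePool : getShuffledNodesPool()) {
          try (Jedis jedis = nodePool.getResource()) {
            discoverClusterSlots(jedis);  // fetch CLUSTER SLOTS, then clear and rebuild
            return;                       // the first healthy node wins
          } catch (JedisConnectionException e) {
            // try the next node; readers simply block until a consistent map is published
          }
        }
      } finally {
        w.unlock();
      }
    }

This is essentially what the follow-up commit does, at the cost of blocking readers for the duration of the rediscovery.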

@Spikhalskiy
Contributor Author

I'm going to put it under production load today and will let you know if it helps.

@Spikhalskiy Spikhalskiy changed the title Issue #1252: Fix creating lot of new Jedis instances on unstable cluster, fix slots clearing without filling Issue #1252: Fix creating a lot of new Jedis instances on unstable cluster, fix slots clearing without filling Apr 6, 2016
@Spikhalskiy Spikhalskiy force-pushed the issue-1252-1 branch 3 times, most recently from d9c7d52 to b24cb3a Compare April 6, 2016 15:25
@Spikhalskiy
Contributor Author

@HeartSaVioR Reworked with a single lock for trying all nodes; I'll squash before merging.

@marcosnils
Contributor

I'll check this and #1252 later today.

@HeartSaVioR
Contributor

@Spikhalskiy GitHub introduced a new feature, 'squash before merging', so all you need to worry about is rebasing to catch up with master.
I'll review later today.

@Spikhalskiy
Contributor Author

I don't like this feature, because it merges without merge commits :) So it's impossible to continue the branch and merge it once more. But since you merge into multiple branches, it's a cherry-pick in any case.

… cluster, fix slots clearing without filling
@Spikhalskiy
Contributor Author

Spikhalskiy commented Apr 6, 2016

This helped. It got rid of this type of error flow, and things became much better. Waiting for it in upstream before further improvements.

System behavior before and after applying this change:
Before
After

But I found one more broken flow. After finalizing and merging this PR, I'll start to think about an implementation.

In short (this trace is based on this code state):

    at redis.clients.jedis.BinaryJedis.ping(BinaryJedis.java:106)
    at redis.clients.jedis.JedisSlotBasedConnectionHandler.getConnection(JedisSlotBasedConnectionHandler.java:41)
    at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:113)
    at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:131)
    at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:30)
    at redis.clients.jedis.JedisCluster.setex(JedisCluster.java:292)

We get a timeout. After that we decide to try a random node 0_o, which is already strange. Then we get a random Jedis with a PING-PONG check. On this random node we, of course, get MOVED and... start cluster rediscovery. And all of this from just one timeout.

This could be fixed:

  1. Why do we try a random node on a timeout?
  2. Why with a PING-PONG check? There is no reason to distrust a Jedis from the random pool here; it's a waste of time and additional load.
  3. Why start rediscovery if we tried a random node intentionally and, of course, didn't find the key there?

All of this looks suspicious and could be fixed. Any suggestions on where to start?
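
For anyone who wants to trace the flow being questioned, the retry logic of that era looked roughly like the following. This is a hedged, abbreviated reconstruction of JedisClusterCommand#runWithRetries (the ASK branch is omitted and helper names are approximate), not a verbatim quote:

    private T runWithRetries(byte[] key, int redirections, boolean tryRandomNode, boolean asking) {
      if (redirections <= 0) {
        throw new JedisClusterMaxRedirectionsException("Too many Cluster redirections?");
      }
      Jedis connection = null;
      try {
        connection = tryRandomNode
            ? connectionHandler.getConnection()  // random node, verified with PING (question 2)
            : connectionHandler.getConnectionFromSlot(JedisClusterCRC16.getSlot(key));
        return execute(connection);
      } catch (JedisConnectionException jce) {
        // A timeout lands here and triggers a retry against a random node (question 1).
        releaseConnection(connection);
        connection = null;
        return runWithRetries(key, redirections - 1, true, asking);
      } catch (JedisMovedDataException jme) {
        // The random node then answers MOVED, which kicks off cluster rediscovery (question 3).
        connectionHandler.renewSlotCache(connection);
        releaseConnection(connection);
        connection = null;
        return runWithRetries(key, redirections - 1, false, asking);
      } finally {
        releaseConnection(connection);
      }
    }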

@Spikhalskiy
Contributor Author

Could we finalize this today to make further fixes easier?

@HeartSaVioR
Contributor

@Spikhalskiy Even better. LGTM.
@marcosnils Could you review new changeset?

@HeartSaVioR HeartSaVioR added this to the 2.8.2 milestone Apr 7, 2016
@HeartSaVioR
Contributor

Please note that I haven't looked deeply at your last comment; sorry, I don't have time to concentrate right now. I'll try to have a look and comment later.

@Spikhalskiy
Contributor Author

@HeartSaVioR Thank you! Sure, take your time. Once I've taken a deeper look at the code and possible solutions, I'll share my suggestions in a separate PR.

@marcosnils
Contributor

@Spikhalskiy @HeartSaVioR I'll take care of this today after work.

Amazing job!

@Spikhalskiy
Contributor Author

@HeartSaVioR @marcosnils Any updates here?

    for (Object slotInfoObj : slots) {
      List<Object> slotInfo = (List<Object>) slotInfoObj;

      if (slotInfo.size() <= MASTER_NODE_INDEX) {
Contributor

Why are we doing this? Masters might not have any replicas, right?

Contributor Author

Spikhalskiy commented Jul 8, 2016

That's old code (it was if (slotInfo.size() <= 2) { before), which is completely unrelated to this PR. You can ask the original author.
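
For anyone else puzzled by that line: each entry in the CLUSTER SLOTS reply is a nested list of the form [range start, range end, [master host, master port, ...], [replica host, replica port, ...] ...], so with MASTER_NODE_INDEX = 2 an entry of size <= 2 carries no master address at all and cannot be mapped. A hedged parsing sketch (the helper name is illustrative, not from the codebase):

    private static final int MASTER_NODE_INDEX = 2;

    @SuppressWarnings("unchecked")
    void assignSlotsFromReply(List<Object> slotsReply) {
      for (Object slotInfoObj : slotsReply) {
        List<Object> slotInfo = (List<Object>) slotInfoObj;
        // [0] = first slot of the range, [1] = last slot, [2] = master node, [3..] = replicas
        if (slotInfo.size() <= MASTER_NODE_INDEX) {
          continue;  // no master address present, nothing to map this range to
        }
        long start = (Long) slotInfo.get(0);
        long end = (Long) slotInfo.get(1);
        List<Object> master = (List<Object>) slotInfo.get(MASTER_NODE_INDEX);
        String host = new String((byte[]) master.get(0));
        int port = ((Long) master.get(1)).intValue();
        // ...assign slots start..end to the pool for host:port
      }
    }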

@xetorthio
Contributor

I checked your code and I understand it, but I have a hard time relating it to the issue because of all the refactoring. Would you mind briefly describing what this code is fixing?

@Spikhalskiy
Contributor Author

Spikhalskiy commented Jul 8, 2016

@xetorthio To make it easier to follow, I split the PR into two commits for review. The main part is the first commit: I highlighted the two main places there with comments (#1253 (comment) and #1253 (comment)).
The second commit is the refactoring and moving the lock to the top level, as suggested here: #1253 (comment)

In short, the most critical change: if we have nothing alive for the slot, we just rediscover the cluster state. We no longer establish a connection to a random node of the cluster, with a predictable MOVED after that followed by the same rediscovery anyway. That's what I remember.

@xetorthio
Contributor

Got it! After your clarifications and checking the code again, _I think it is OK to merge this_.

I would make further changes down the road, since the code (not yours) is hard to understand. Mainly, the return value of the clusterSlots command is extremely ugly, which makes it super hard to use and even harder to understand the code.

Another suggestion for future improvements is to do the cluster slot renewal in a background thread, which should make the locking unnecessary.

Created issues to discuss this: #1346 and #1347
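
A rough sketch of the background-renewal idea, assuming the handler exposes (or will expose) a no-argument renewSlotCache() refresh entry point; that method name is an assumption here, see #1346 and #1347 for the real discussion:

    // uses java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleWithFixedDelay(new Runnable() {
      @Override
      public void run() {
        try {
          connectionHandler.renewSlotCache();  // refresh the slot map off the request path
        } catch (Exception e) {
          // keep the old mapping on failure; the next scheduled run will simply retry
        }
      }
    }, 5, 5, TimeUnit.SECONDS);

With renewal moved off the request path, the locking in the cache could likely be simplified or dropped, as suggested above.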

@marcosnils marcosnils merged commit 69d4080 into redis:master Jul 11, 2016
marcosnils pushed a commit that referenced this pull request Jul 19, 2016
…uster, fix slots clearing without filling (#1253)

* Issue #1252: Fix creating lot of new Jedis instances on unstable cluster, fix slots clearing without filling

* Issue #1252: Acquire one long lock for trying all nodes when rediscover cluster

Conflicts:
	src/main/java/redis/clients/jedis/JedisClusterInfoCache.java
marcosnils pushed a commit that referenced this pull request Jul 19, 2016
…uster, fix slots clearing without filling (#1253)

* Issue #1252: Fix creating lot of new Jedis instances on unstable cluster, fix slots clearing without filling

* Issue #1252: Acquire one long lock for trying all nodes when rediscover cluster

Conflicts:
	src/main/java/redis/clients/jedis/JedisClusterInfoCache.java
@marcosnils
Contributor

Downmerged to 2.8 and 2.9 respectively. Thanks again to everyone involved in making this happen.
