Oplog processing stops working after MongoDB replica set election #19905
Comments
Same issue. I have a case open with support but have not received an answer. I think this happens when MongoDB is deployed with a PSA (Primary-Secondary-Arbiter) architecture; with PSS (Primary-Secondary-Secondary) it doesn't happen.
@rodrigok @sampaiodiego this one is super important for us to fix. It looks like something in our new oplog / change streams processing.
Alternatively, on a local cluster, just run rs.stepDown().
Yes, I only have a single-node replica set locally and didn't want to set up a multi-node replica set just to verify this. But if you can set up your replica set to function the same way that Atlas does, then you should be able to reproduce locally.
I can confirm I have the exact same issue. Messages keep being marked unread although I've read them. I decided to investigate in order to fix forward and implement change streams instead of the native oplog. The result of the investigation is this bug report:
UPDATE: Using the workaround above (USE_NATIVE_OPLOG=true) I haven't had any problems whatsoever with my RocketChat connection to Atlas MongoDB since I posted this issue almost a month ago. So the workaround seems to be very stable :)
Hello @dkoo761, you talked about adding readPreference=primaryPreferred to your oplog URI while keeping change stream mode. Does that help with MongoDB Atlas?
@cschockaert If I remember right, I don't think readPreference=primaryPreferred made any difference. My workaround of setting USE_NATIVE_OPLOG=true actually turns off the newer oplog processing that uses change streams.
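For readers unsure what "adding readPreference=primaryPreferred on your oplog uri" looks like in practice, here is a minimal sketch using only the Python standard library. The helper name `with_read_preference` and the example hosts are hypothetical, not taken from this thread, and as noted above the reporter did not recall this option making any difference.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_read_preference(uri: str, mode: str = "primaryPreferred") -> str:
    """Return `uri` with its readPreference option set to `mode`.

    Hypothetical helper for illustration only; it edits the query string
    and leaves hosts, credentials, and the database name untouched.
    """
    parts = urlsplit(uri)
    # Drop any existing readPreference (option names are case-insensitive).
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "readpreference"
    ]
    params.append(("readPreference", mode))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # Example hosts are placeholders, not a real deployment.
    print(with_read_preference(
        "mongodb://db1.example.com:27017,db2.example.com:27017/local?replicaSet=rs0"
    ))
```

The resulting string would then be supplied wherever the deployment configures its oplog connection URI.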
Change streams are a better way of watching the database for real-time data updates, but they have some drawbacks compared to the oplog. You can see them in this document: https://www.mongodb.com/docs/manual/administration/change-streams-production-recommendations/ I consider this the most important difference:
So this is not in fact a Rocket.Chat issue of being unable to properly connect to MongoDB to receive real-time data; it is actually MongoDB not sending that data because of some concern. With this in mind, in Rocket.Chat 5.3.0 we introduced a new endpoint
If for some reason you're not confident enough in your MongoDB setup to provide reliable change streams, you can use the environment variable
Starting with Rocket.Chat 5.0 we don't recommend using
FYI, the environment variable is
But in any case, it still does not work. Post-failover, the "stream-room-messages" and "stream-notify-user" collections cease to come over the websocket.
Description:
Oplog tailing / processing stops working correctly after a MongoDB replica set election, resulting in real-time data not being delivered to the browser.
This means:
When using a 3rd party MongoDB provider, such as Atlas, replica set elections can happen quite frequently (a few times a week). Also, there is often no error message in the RocketChat logs indicating something went wrong and standard HTTP web monitoring tools also wouldn't notify the site admin since the issue is specific to the oplog.
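Because the failure is silent and invisible to HTTP monitoring, one possible way to detect it is a heartbeat probe: periodically write a marker document to the database and check that it comes back through the real-time feed within a deadline. The sketch below models only the timing logic; the class name `StreamWatchdog` and its methods are hypothetical and are not part of Rocket.Chat or MongoDB.

```python
import time
from typing import Optional


class StreamWatchdog:
    """Flags a real-time feed as stalled when a heartbeat is not echoed in time.

    Hypothetical helper: the caller is responsible for writing the heartbeat
    document and for noticing it on the change stream or oplog feed.
    """

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        # Time at which the oldest unanswered heartbeat was written.
        self._pending_since: Optional[float] = None

    def probe_written(self, now: Optional[float] = None) -> None:
        """Call right after writing a heartbeat document to the database."""
        if self._pending_since is None:
            self._pending_since = time.monotonic() if now is None else now

    def probe_seen(self) -> None:
        """Call when a heartbeat arrives through the real-time feed."""
        self._pending_since = None

    def is_stalled(self, now: Optional[float] = None) -> bool:
        """True if a heartbeat has been unanswered for longer than the timeout."""
        if self._pending_since is None:
            return False
        current = time.monotonic() if now is None else now
        return current - self._pending_since > self.timeout_seconds
```

A monitoring job could call `is_stalled()` on a schedule and raise an alert when it returns `True`, which would surface the post-election failure even though no error is logged.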
It appears that this started failing after the new oplog processing using change streams was added in 3.7.0.
WORKAROUND: Setting environment variable USE_NATIVE_OPLOG=true prevents this issue from happening.
I previously left a detailed analysis in this RC forum post: https://forums.rocket.chat/t/messages-not-displaying-and-greyed-out/2035
Steps to reproduce:
Expected behavior:
The message should turn black after posting it. Other users viewing that channel should see the message appear immediately.
Actual behavior:
The message is saved to the DB but appears greyed out to the user that posted it. After refreshing their browser, the user that posted the message will then see that it now appears black. But if they post another message, the same problem will occur.
Other users viewing that channel can't see the new message until they refresh their browser.
Server Setup Information:
Client Setup Information
Additional context
Relevant logs: