This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Synapse showing unhealthy - might have something to do with "could not serialize access due to concurrent update" #9869

Closed
chagai95 opened this issue Apr 22, 2021 · 21 comments
Labels
X-Needs-Info This issue is blocked awaiting information from the reporter

Comments

@chagai95
Contributor

Description

synapse showing unhealthy

Steps to reproduce

just happened twice

Version information

  • Homeserver:

If not matrix.org: chagai.website

  • Version: `{"server_version":"1.30.1","python_version":"3.8.8"}`

  • Install method:

  • Platform: ansible playbook with docker on centos 7
@chagai95
Contributor Author

#9635

Perhaps related? Anyway, I updated to the newest version; let's see if it gets me any further...

@richvdh
Member

richvdh commented Apr 23, 2021

what does "showing unhealthy" mean?

please use ``` to format your logs so that they are legible

@richvdh richvdh added A-Spaces Hierarchical organization of rooms X-Needs-Info This issue is blocked awaiting information from the reporter and removed A-Spaces Hierarchical organization of rooms labels Apr 23, 2021
@chagai95
Contributor Author

`docker ps` shows it as unhealthy. Sorry about the logs; they were not rendering with ``` for some reason, so I pasted them like this...

@chagai95
Contributor Author

Anyway, it was offline for short periods yesterday, but now I've updated the version and it seems fine.

@ShadowJonathan
Contributor

(could you maybe wrap the logs in a "details" section? using the following icon in the github composer:)
[screenshot]

@chagai95

This comment has been minimized.

@ShadowJonathan
Contributor

ShadowJonathan commented Apr 23, 2021

Hmm, this looks like a variant of #9635, but AppService-related. What appservices do you have hooked into Synapse?

@chagai95
Contributor Author

I could not find that icon... I managed to now use the ``` and I hid it as spam, hopefully you can still see it?

@ShadowJonathan
Contributor

> I could not find that icon... I managed to now use the ``` and I hid it as spam, hopefully you can still see it?

It's fine as-is, dw 👍

@chagai95
Contributor Author

```yaml
app_service_config_files:
  - /matrix-appservice-slack-registration.yaml
  - /matrix-mautrix-telegram-registration.yaml
  - /matrix-mautrix-whatsapp-registration.yaml
```

@chagai95
Contributor Author

With the newest Synapse I'm still getting these errors:

```
imeout=30000' clientproto='HTTP/1.0' site='8008'>, already disconnected.
Apr 23 18:04:49 localhost.localdomain matrix-synapse[17842]: 2021-04-23 16:04:49,697 - synapse.logging.context - 70 - WARNING - PUT-2990 - Re-starting finished log context PUT-2990
Apr 23 18:04:49 localhost.localdomain matrix-synapse[17842]: 2021-04-23 16:04:49,735 - synapse.storage.txn - 538 - WARNING - runWithConnection - [TXN OPERROR] {set_type_stream_id_for_appservice-4728b} could not serialize access due to concurrent update
Apr 23 18:04:49 localhost.localdomain matrix-synapse[17842]: 0/5
Apr 23 18:04:49 localhost.localdomain matrix-synapse[17842]: 2021-04-23 16:04:49,885 - synapse.storage.txn - 538 - WARNING - runWithConnection - [TXN OPERROR] {complete_appservice_txn-472b3} could not serialize access due to concurrent update
Apr 23 18:04:49 localhost.localdomain matrix-synapse[17842]: 0/5
Apr 23 18:04:50 localhost.localdomain matrix-synapse[17842]: 2021-04-23 16:04:50,118 - synapse.storage.txn - 538 - WARNING - runWithConnection - [TXN OPERROR] {complete_appservice_txn-473cc} could not serialize access due to concurrent update
Apr 23 18:04:50 localhost.localdomain matrix-synapse[17842]: 0/5
```

@ShadowJonathan
Contributor

ShadowJonathan commented Apr 23, 2021

Either;

  • Your database is very slow
  • Your appservices are hammering synapse as if it's a wall of nails
    Actually, this is probably not true, as the TXN errors are happening when appservices are being notified of read receipts, or presence updates.
  • There's a proper bug that's being lost throughout all the noise.

I dug out a traceback from that log you posted;

(Full traceback)

```
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/synapse/metrics/background_process_metrics.py", line 208, in run
    return await maybe_awaitable(func(*args, **kwargs))
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/appservice.py", line 249, in _notify_interested_services_ephemeral
    await self.store.set_type_stream_id_for_appservice(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/databases/main/appservice.py", line 434, in set_type_stream_id_for_appservice
    await self.db_pool.runInteraction(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 660, in runInteraction
    result = await self.runWithConnection(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 743, in runWithConnection
    return await make_deferred_yieldable(
  File "/usr/local/lib/python3.8/site-packages/twisted/python/threadpool.py", line 238, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.8/site-packages/twisted/python/threadpool.py", line 254, in
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.8/site-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/context.py", line 83, in callWithContext
    return func(*args, **kw)
  File "/usr/local/lib/python3.8/site-packages/twisted/enterprise/adbapi.py", line 293, in _runWithConnection
    compat.reraise(excValue, excTraceback)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/deprecate.py", line 298, in deprecatedFunction
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/compat.py", line 403, in reraise
    raise exception.with_traceback(traceback)
  File "/usr/local/lib/python3.8/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 738, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 532, in new_transaction
    r = func(cursor, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/databases/main/appservice.py", line 428, in set_type_stream_id_for_appservice_txn
    txn.execute(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 294, in execute
    self._do_execute(self.txn.execute, sql, *args)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 320, in _do_execute
    return func(sql, *args)
psycopg2.errors.SerializationFailure: could not serialize access due to concurrent update
```

Could you maybe enable DEBUG logs for synapse.handlers.appservice? And upload the logs as a separate file. Small warning: this will be extremely verbose and will probably generate many gigabytes of docker or other application logs.
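For anyone following along, enabling that logger is usually a small edit to Synapse's log config YAML (a standard Python logging dictConfig; the file's name and location depend on the install method, so treat this as a sketch rather than the playbook's exact layout):

```yaml
# Sketch of a Synapse log-config fragment: raise one logger to DEBUG
# while leaving everything else at the default level.
loggers:
    synapse.handlers.appservice:
        level: DEBUG

root:
    level: INFO
```

Synapse needs a restart (or a SIGHUP, on installs that support config reload) for the change to take effect.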

@chagai95
Contributor Author

Sure, will do, thanks! BTW I turned off Slack, so that's one less to worry about, and I suspect something is wrong with the Telegram bridge; would its logs also help?

@chagai95
Contributor Author

I added it like this, hope this is correct
[screenshot]

@chagai95
Contributor Author

Also, why would my db be slow, and how would I check that? I'm running Postgres in a container (ansible playbook).
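One generic way to start answering "is the database slow?" is to time a trivial round-trip query. A sketch using stdlib `sqlite3` so it runs anywhere; against this deployment you would point a Postgres client at the container instead (e.g. `psql`, or Postgres's `pg_stat_activity` view for in-flight queries):

```python
# Generic sketch: time a trivial query round-trip to gauge whether the
# database itself is responding slowly. Uses sqlite3 only so the example
# is self-contained; the same idea applies to a Postgres connection.
import sqlite3
import time

def roundtrip_ms(conn, query="SELECT 1"):
    start = time.perf_counter()
    conn.execute(query).fetchone()
    return (time.perf_counter() - start) * 1000.0

conn = sqlite3.connect(":memory:")
latency = roundtrip_ms(conn)
```

A trivial query taking tens of milliseconds or more would point at a loaded or I/O-starved database rather than at Synapse.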

@chagai95
Contributor Author

I can't see any DEBUG messages so I'm assuming I did something wrong....

@chagai95
Contributor Author

chagai95 commented Apr 23, 2021

Changed this as well, but still no DEBUG logs...

[screenshot]

@ShadowJonathan
Contributor

> I added it like this, hope this is correct

That does look correct, yeah.

> I can't see any DEBUG messages so I'm assuming I did something wrong....

Hmm, there should be a log file by itself somewhere? I haven't worked with the deb file, only containers, so I can't remember where it'd be.

@chagai95
Contributor Author

Ok thanks! I guess I'll just wait until I notice a few GB somewhere 🤣

@ShadowJonathan
Contributor

ShadowJonathan commented Apr 24, 2021

Ah, it seems the logs are located in /var/log/matrix-synapse. Edit: Nvm, through DM it became clear that this is a docker container, which I should have realised before 🤦

I've also sent you a DM on your backup account, so I can more quickly help get to the root cause of this.

@richvdh
Member

richvdh commented Apr 29, 2021

I'm going to go ahead and close this, since it's unclear what the problem is. It doesn't seem to be a problem with Synapse itself but possibly a slow or overloaded database.

@richvdh richvdh closed this as completed Apr 29, 2021