Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query Times Out with Pad List #25

Open
ArcticSnowman opened this issue Nov 5, 2021 · 13 comments
Open

Query Times Out with Pad List #25

ArcticSnowman opened this issue Nov 5, 2021 · 13 comments

Comments

@ArcticSnowman
Copy link

ArcticSnowman commented Nov 5, 2021

How do I get past this timeout issue trying to list the pads?

Fatal MySQL error: Error: Query inactivity timeout
   at Query.<anonymous> (/opt/etherpad-lite/src/node_modules/mysql/lib/protocol/Protocol.js:160:17)
   at Query.emit (events.js:375:28)
   at Query.emit (domain.js:470:12)
   at Query._onTimeout (/opt/etherpad-lite/src/node_modules/mysql/lib/protocol/sequences/Sequence.js:124:8)
    at Timer._onTimeout (/opt/etherpad-lite/src/node_modules/mysql/lib/protocol/Timer.js:32:23)
    at listOnTimeout (internal/timers.js:557:17)
   at processTimers (internal/timers.js:500:7)
    --------------------
    at Pool.query (/opt/etherpad-lite/src/node_modules/mysql/lib/Pool.js:199:23)
   at /opt/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:46:20
    at new Promise (<anonymous>)
    at exports.Database._query (/opt/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:44:20)
    at exports.Database.findKeys (/opt/etherpad-lite/src/node_modules/ueberdb2/databases/mysql_db.js:137:34)
    at exports.Database.findKeys (/opt/etherpad-lite/src/node_modules/ueberdb2/lib/CacheAndBufferLayer.js:338:44)
    at bound findKeysCallbackified (util.js:292:5)
    at exports.Database.findKeys (/opt/etherpad-lite/src/node_modules/ueberdb2/index.js:107:53)
    at internal/util.js:308:30
    at new Promise (<anonymous>)
@JohnMcLear
Copy link
Member

JohnMcLear commented Nov 5, 2021

Can you tell us a little about your environment? Why are you experiencing this problem? What is your MySQL config? OS, versions etc. etc.

@ArcticSnowman
Copy link
Author

ArcticSnowman commented Nov 5, 2021

@JohnMcLear - I did not get the template when I clicked 'New Issue'

I was just going to https://etherpad.example.com/admin/pads and waiting for it to load the list of pads.

Etherpad 1.8.14
MySQL 5.7 is a Galera cluster

MySQL has about 25+ millions rows
Running a the etherpad docker container in a Kubernetes cluster

Version number: 1.8.14

Latest available version: 1.8.14

Git sha: 571b37b

Installed plugins
ep_adminpads2
ep_cursortrace
ep_default-pad-text
ep_etherpad-lite
ep_font_size
ep_headings2
ep_jira_links
ep_markdown
ep_previewimages
ep_prompt_for_name
ep_spellcheck
ep_standups

Installed parts
ep_standups/ep_standups
ep_spellcheck/ep_spellcheck
ep_prompt_for_name/prompt_for_name
ep_previewimages/ep_previewimages
ep_markdown/markdown
ep_jira_links/main
ep_headings2/main
ep_font_size/main
ep_etherpad-lite/openapi
ep_etherpad-lite/adminsettings
ep_etherpad-lite/adminplugins
ep_etherpad-lite/admin
ep_etherpad-lite/tests
ep_etherpad-lite/socketio
ep_etherpad-lite/errorhandling
ep_etherpad-lite/importexport
ep_etherpad-lite/apicalls
ep_etherpad-lite/webaccess
ep_etherpad-lite/padreadonly
ep_etherpad-lite/padurlsanitize
ep_etherpad-lite/specialpages
ep_etherpad-lite/i18n
ep_etherpad-lite/stats
ep_etherpad-lite/static
ep_etherpad-lite/express
ep_etherpad-lite/Minify
ep_etherpad-lite/DB
ep_default-pad-text/ep_defaultPadText
ep_cursortrace/cursortrace
ep_adminpads2/ep_adminpads2

Installed hooks
Server-side hooks
collectContentPost:
ep_headings2:
main:
ep_headings2/static/js/shared
collectContentPre:
ep_font_size:
main:
ep_font_size/static/js/shared
ep_headings2:
main:
ep_headings2/static/js/shared
createServer:
ep_etherpad-lite:
express:
ep_etherpad-lite/node/hooks/express
eejsBlock_adminMenu:
ep_adminpads2:
ep_adminpads2:
ep_adminpads2
eejsBlock_dd_format:
ep_font_size:
main:
ep_font_size/index
eejsBlock_dd_view:
ep_spellcheck:
ep_spellcheck:
ep_spellcheck/spellcheck
eejsBlock_editbarMenuLeft:
ep_font_size:
main:
ep_font_size/index
ep_headings2:
main:
ep_headings2/index
eejsBlock_exportColumn:
ep_markdown:
markdown:
ep_markdown/index
eejsBlock_indexWrapper:
ep_standups:
ep_standups:
ep_standups
eejsBlock_mySettings:
ep_markdown:
markdown:
ep_markdown/index
ep_spellcheck:
ep_spellcheck:
ep_spellcheck/spellcheck
eejsBlock_scripts:
ep_markdown:
markdown:
ep_markdown/index
exportHtmlAdditionalTagsWithData:
ep_font_size:
main:
ep_font_size/exportHTML
expressCloseServer:
ep_etherpad-lite:
socketio:
ep_etherpad-lite/node/hooks/express/socketio
expressConfigure:
ep_etherpad-lite:
webaccess:
ep_etherpad-lite/node/hooks/express/webaccess
expressCreateServer:
ep_adminpads2:
ep_adminpads2:
ep_adminpads2
ep_etherpad-lite:
admin:
ep_etherpad-lite/node/hooks/express/admin
adminplugins:
ep_etherpad-lite/node/hooks/express/adminplugins
adminsettings:
ep_etherpad-lite/node/hooks/express/adminsettings
apicalls:
ep_etherpad-lite/node/hooks/express/apicalls
errorhandling:
ep_etherpad-lite/node/hooks/express/errorhandling
i18n:
ep_etherpad-lite/node/hooks/i18n
importexport:
ep_etherpad-lite/node/hooks/express/importexport
openapi:
ep_etherpad-lite/node/hooks/express/openapi
padreadonly:
ep_etherpad-lite/node/hooks/express/padreadonly
padurlsanitize:
ep_etherpad-lite/node/hooks/express/padurlsanitize
socketio:
ep_etherpad-lite/node/hooks/express/socketio
specialpages:
ep_etherpad-lite/node/hooks/express/specialpages
static:
ep_etherpad-lite/node/hooks/express/static
tests:
ep_etherpad-lite/node/hooks/express/tests
ep_markdown:
markdown:
ep_markdown/express
ep_standups:
ep_standups:
ep_standups
getLineHTMLForExport:
ep_font_size:
main:
ep_font_size/exportHTML
ep_headings2:
main:
ep_headings2/index
handleMessage:
ep_cursortrace:
cursortrace:
ep_cursortrace/handleMessage
import:
ep_markdown:
markdown:
ep_markdown/index
padCreate:
ep_adminpads2:
ep_adminpads2:
ep_adminpads2
ep_default-pad-text:
ep_defaultPadText:
ep_default-pad-text/index:padCreate
padInitToolbar:
ep_font_size:
main:
ep_font_size/index
padRemove:
ep_adminpads2:
ep_adminpads2:
ep_adminpads2
restartServer:
ep_etherpad-lite:
express:
ep_etherpad-lite/node/hooks/express
shutdown:
ep_etherpad-lite:
DB:
ep_etherpad-lite/node/db/DB
express:
ep_etherpad-lite/node/hooks/express
Minify:
ep_etherpad-lite/node/utils/Minify
stats:
ep_etherpad-lite/node/stats
socketio:
ep_adminpads2:
ep_adminpads2:
ep_adminpads2
ep_etherpad-lite:
adminplugins:
ep_etherpad-lite/node/hooks/express/adminplugins
adminsettings:
ep_etherpad-lite/node/hooks/express/adminsettings
socketio:
ep_etherpad-lite/node/handler/PadMessageHandler
stylesForExport:
ep_font_size:
main:
ep_font_size/exportHTML
ep_headings2:
main:
ep_headings2/index
Client-side hooks
aceAttribsToClasses:
ep_font_size:
main:
ep_font_size/static/js/index
ep_headings2:
main:
ep_headings2/static/js/index
aceCreateDomLine:
ep_font_size:
main:
ep_font_size/static/js/index
ep_jira_links:
main:
ep_jira_links/static/js/index:aceCreateDomLine
ep_previewimages:
ep_previewimages:
ep_previewimages/static/js/index:aceCreateDomLine
aceDomLineProcessLineAttributes:
ep_headings2:
main:
ep_headings2/static/js/index
aceEditEvent:
ep_cursortrace:
cursortrace:
ep_cursortrace/static/js/main:aceEditEvent
ep_headings2:
main:
ep_headings2/static/js/index
aceEditorCSS:
ep_cursortrace:
cursortrace:
ep_cursortrace/static/js/css
ep_font_size:
main:
ep_font_size/static/js/index
ep_headings2:
main:
ep_headings2/static/js/index
ep_markdown:
markdown:
ep_markdown/static/js/markdown
aceGetFilterStack:
ep_jira_links:
main:
ep_jira_links/static/js/index:aceGetFilterStack
ep_previewimages:
ep_previewimages:
ep_previewimages/static/js/index:aceGetFilterStack
aceInitialized:
ep_font_size:
main:
ep_font_size/static/js/index
ep_headings2:
main:
ep_headings2/static/js/index
ep_prompt_for_name:
prompt_for_name:
ep_prompt_for_name/static/js/index
aceInitInnerdocbodyHead:
ep_cursortrace:
cursortrace:
ep_cursortrace/static/js/main:aceInitInnerdocbodyHead
aceRegisterBlockElements:
ep_headings2:
main:
ep_headings2/static/js/index
collectContentPre:
ep_font_size:
main:
ep_font_size/static/js/shared
ep_headings2:
main:
ep_headings2/static/js/shared
handleClientMessage_CUSTOM:
ep_cursortrace:
cursortrace:
ep_cursortrace/static/js/main
postAceInit:
ep_cursortrace:
cursortrace:
ep_cursortrace/static/js/main:postAceInit
ep_font_size:
main:
ep_font_size/static/js/index
ep_headings2:
main:
ep_headings2/static/js/index
ep_markdown:
markdown:
ep_markdown/static/js/markdown
ep_spellcheck:
ep_spellcheck:
ep_spellcheck/static/js/spellcheck:postAceInit
postToolbarInit:
ep_font_size:
main:
ep_font_size/static/js/index

@JohnMcLear
Copy link
Member

JohnMcLear commented Nov 5, 2021

MySQL is finicky about data structure and is easy to deploy in a non-optimal state. With that in mind, can you show results of describing the database and tables of your etherpad database? afaik the command is something like describe etherpad-lite but I forget.

I'm wondering if you have MyISAM when InnoDB would be required, but that's just a hunch atm.

I'm wondering if ep_adminpads2 findKeys command is optimized for lots of rows in a database.

We do have testing for large dataset queries afaik https://github.com/ether/ueberDB/blob/master/test/lib/mysql.sql#L38

We also support pooling: https://github.com/ether/ueberDB/blob/master/test/test_mysql.js#L42

@ArcticSnowman
Copy link
Author

image

@ArcticSnowman
Copy link
Author

Running the SQL direct

select `key` from store where `key` like 'pad:%' and `key` not like '%:%:%';

Takes about 30 seconds and returns 19763 rows.

@JohnMcLear
Copy link
Member

JohnMcLear commented Nov 5, 2021

Cc @rhansen

Should findKeys be this inefficient here? Is it returning pad contents for each pad or padid? If should just be key which should be very low computational cost no?

Can/should adminpads build its own pad database record on each new / delete / fork / copy event to make this query low cost?

@rhansen
Copy link
Member

rhansen commented Dec 5, 2021

The database is probably taking so long because it has to examine every pad:% key and filter out every pad:%:% key. If there are a ton of pad:%:% keys—which there probably are because each rev is saved as `pad:${padId}:revs:${revId}`—then this will take a while.

I'm not sure how to fix this without changing the db schema or adding complex features to ueberdb.

Instead of storing revisions at `pad:${padId}:revs:${revId}`, we could instead store them at `padrev:${padId}:${revId}` and simply query all keys matching pad:%. This should be much more efficient.

To scale to an arbitrary number of pads, we would need a solution that allows us to fetch keys a page at a time (e.g., 0 through 49, then 50 through 99, and so on). One way to do this is to add cursor support to ueberdb. Another way, which might not be possible, is to add a generic WHERE-like feature that can limit results based on properties of the JSON value (without fetching the value from the db server first!) along with an ORDER BY-like feature.

I think it would probably be better for admins of large Etherpad deployments to shard pads across multiple Etherpad instances and databases (e.g., the reverse proxy uses rendezvous hashing on the pad ID to pick the appropriate Etherpad backend).

@ArcticSnowman
Copy link
Author

One short coming of the sharding approach is that it breaks plugins like ep_padadmin2, as the list of pads you get would be only those on the instance that the result to /admin/pads get serviced.

@webzwo0i
Copy link
Member

webzwo0i commented Jan 3, 2022

Instead of introducing padrev prefix it might be easier to serialize a list of all pad ids when pads are created/deleted and update a single db entry like allpads.

@rhansen
Copy link
Member

rhansen commented Jan 6, 2022

That might work as a temporary solution, but there are some issues:

  • How do we initially populate the value from existing pads? For a large deployment the query will time out and might cause Etherpad to crash.
  • It would be harder to scale Etherpad. For example, I know of one user that runs multiple Etherpad instances that all share the same database. This use case isn't officially supported but mostly works right now; if each instance attempted to maintain the allpads list without distributed coordination then the value would quickly become corrupt.
  • How do we guarantee that the list is always in sync with the actual list of pads? Some cases to worry about: HTTP API, plugin actions, and crashes between pad creation/deletion and list update.

@webzwo0i
Copy link
Member

Actually, I'm mostly concerned about the size of the db entry and the frequency it gets updated. This is only useful for very large instances with a large number of pads and a lot of pad creations (which result in a large number of pads). So updating such a huge entry probably causes more trouble then it solves.

regarding 1: on first startup after this code is live, check if the allpads entry exists and if not, populate the database. Etherpad shouldn't crash of course...

regarding 2: I know, but iirc I never saw public code (except john's proxy/sharding work). Those setups can cause some trouble, e.g. with ueberdb's buffer. Maybe we should start collecting a list of problems to take care of when running large instances. What I mean is after you scale your database cluster, setup distribution to different nodes based on padid, setup failover/HA for every group of nodes, what to expect if you don't setup different databases for every node group. Is there anything that Etherpad should do to better support this?

regarding 3: Unless we drop support for any non-transactional databases, we probably won't be able to handle the crash case. "Normal" pad creation via API or UI shouldn't be too complicated to get right. When you say plugin actions, do you mean plugins that directly write to database?

Instead of adding one allpads entry, we could add multiple padname:padid entries on pad creation (and remove them on deletion), ie introduce a new prefix padname or similar - but only if you agree that we can handle your concerns.

@xshadow
Copy link

xshadow commented Apr 30, 2022

The database is probably taking so long because it has to examine every pad:% key and filter out every pad:%:% key. If there are a ton of pad:%:% keys—which there probably are because each rev is saved as `pad:${padId}:revs:${revId}`—then this will take a while.

I'm not sure how to fix this without changing the db schema or adding complex features to ueberdb.

Instead of storing revisions at `pad:${padId}:revs:${revId}`, we could instead store them at `padrev:${padId}:${revId}` and simply query all keys matching pad:%. This should be much more efficient.

To scale to an arbitrary number of pads, we would need a solution that allows us to fetch keys a page at a time (e.g., 0 through 49, then 50 through 99, and so on). One way to do this is to add cursor support to ueberdb. Another way, which might not be possible, is to add a generic WHERE-like feature that can limit results based on properties of the JSON value (without fetching the value from the db server first!) along with an ORDER BY-like feature.

I think it would probably be better for admins of large Etherpad deployments to shard pads across multiple Etherpad instances and databases (e.g., the reverse proxy uses rendezvous hashing on the pad ID to pick the appropriate Etherpad backend).

This is correct, on large instance, this query is taking too long hit the query timeout in ueberdb. After optimizing the mysql (increasing the key_buffer_size, join_buffer_size and query_cache_limit), I could speed up the query by 25% from 75 seconds to 53seconds.

The query itself:

MariaDB [etherpad]> SELECT COUNT(`key`) FROM store WHERE `key` LIKE 'pad:%' AND `key` NOT LIKE '%:%:%';
+--------------+
| COUNT(`key`) |
+--------------+
|        18923 |
+--------------+
1 row in set (53.713 sec)

All entries in the database:

MariaDB [etherpad]> SELECT COUNT(`key`) FROM store;
+--------------+
| COUNT(`key`) |
+--------------+
|    100787441 |
+--------------+

As a side note: the etherpad instance runs on MYISAM engine and not in InnoDB, due to performance recommendations from the wiki [1][2]. Is this still the recommended engine to use for more performance, or would InnoDB be the better choice?

[1] - https://github.com/ether/etherpad-lite/wiki/Converting-from-InnoDB-to-MyISAM
[2] - https://github.com/ether/etherpad-lite/wiki/Optimizing-Etherpad

@SamTV12345
Copy link
Member

I'm curious. Did you update to the latest version of Etherpad? Did something change? I updated many drivers so it could work now faster. Maybe Scylladb https://www.scylladb.com/ might be interesting. That is also used in Fortune 500 companies when performance is key and because it is already a key value store it might be even faster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants