Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

User dashboard takes a long time to load since 1.17 #21611

Closed
mpeter50 opened this issue Oct 26, 2022 · 36 comments · Fixed by #23532
Closed

User dashboard takes a long time to load since 1.17 #21611

mpeter50 opened this issue Oct 26, 2022 · 36 comments · Fixed by #23532
Labels
issue/needs-feedback For bugs, we need more details. For features, the feature must be described in more detail type/bug

Comments

@mpeter50
Copy link
Contributor

Description

When I updated from 1.16.5 to 1.17.0, the main page of gitea (the user dashboard, hosting the contribution activity thing, some recent events and the repositories list) started to take a very long time to load.

It only takes a long time (80 - 100 s) if I didn't load it recently. If I waited for it to load, and then reloaded it, it would load quickly.
While I'm waiting for the dashboard to load, I see both increased database process CPU load and increased iowait.
I did not update the database server at the time of updating Gitea.

The logs were made slightly after restarting Gitea. I've waited with opening the dashboard until repository indexing finished.

Gitea Version

v1.17.3

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

https://gist.github.com/mpeter50/1fa49a4e9c9f05536baedef0668e8c92

Screenshots

No response

Git Version

 2.36.2

Operating System

Raspbian

How are you running Gitea?

In self-built Docker container, built according to the Docker.rootless Dockerfile.

Database

MySQL

@silverwind
Copy link
Member

#21045 may be related, but also maybe not as I didn't see such massive load times, maybe an 200-400ms extra for the heatmap data.

Do you also see it on https://yourhost/youruser?tab=activity?

@mpeter50
Copy link
Contributor Author

No, this page loaded pretty quickly, along with the heatmap.

@wxiaoguang
Copy link
Contributor

Check your action table if it is large, it could be emptied if you do not need the activity logs.

Maybe related to:

@mpeter50
Copy link
Contributor Author

Do you mean the mentioned database table in the database server, or somewhere else?

Also, this is about the section marked on the following picture, right?

image

Oh I wanted to ask about it for some time, but always forgot it. I would really like if this could be filtered, for example so that commit sync events are not shown, because they happen so frequently (this is not a problem) that they hide everything else (this is a problem).
But, that is out of scope for this issue, I'll look up if theres a similar feature request.

So well, yeah, I don't need most of these.

But then also, on 1.16.5 it did not take much time to load this page.
I'm not familiar with the codebase, but could you please take a quick look if you have modified the action query in a way that is now less efficient?

@lunny
Copy link
Member

lunny commented Oct 29, 2022

It's very strange there is no router log in your file.

2022/10/26 21:28:11 ...it/repo_attribute.go:416:func2() [E] [635988d9] Unable to open checker for 0ff2b64c3ed34350d72aeb68cd0e372fc5a0a47a. Error: failed to run attr-check. Error: context deadline exceeded
        Stderr:
2022/10/26 21:29:55 ...s/context/context.go:219:HTML() [D] [63598a60] Template: user/dashboard/dashboard

@mpeter50
Copy link
Contributor Author

That is because the log I made was made by a non-default logger, developer_debug here:

image

Would having router logs help? If so, let me know and I'll make a new log file that also includes the router log.

@lunny
Copy link
Member

lunny commented Oct 30, 2022

That is because the log I made was made by a non-default logger, developer_debug here:

image

Would having router logs help? If so, let me know and I'll make a new log file that also includes the router log.

Yes, router log will be helpful

@mpeter50
Copy link
Contributor Author

mpeter50 commented Oct 30, 2022

I'm having a hard time in figuring out how to set up the router logger to log into the developer_debug file logger.
Could you please help with it?

I'll try to describe what did I try. After every change, I used the "Log Configuration" section of the https://myinstance.lan/admin/config page to check if Gitea has started using my logger, and also I checked the log files in case they included router logs after loading the mentioned page.
I used these resources to read about how to configure logging:


This is what I had when creating the log gist linked in the issue:

[log]
MODE = console,developer_debug

[log.console]
COLORIZE = true

[log.developer_debug]
LEVEL = debug
MODE = file
FILE_NAME = /tmp/gitea/developer_debug.log
# 1 << 18 == 4.718.592 B - 4,5 MB
# forrás: https://github.com/go-gitea/gitea/blob/abd59cd5cdd94dae844d3fcf7bc89e9d455b363c/custom/conf/app.example.ini#L591
MAX_SIZE_SHIFT = 18

The following is how I tried to change the above.

I tried to configure the router logger of the developer_debug sublogger, so I added this new section, and restarted Gitea:

[log.developer_debug.router]
LEVEL = TRACE

According to Gitea (https://myinstance.lan/admin/config, Log Configuration) this did not change or add a new router logger.

Then I thought maybe the problem is that I didn't define the target (where should this be output to), so I added ROUTER = log.developer_debug to the above section in the hopes that it will be combined with that, which now looked like this:

[log.developer_debug.router]
LEVEL = TRACE
ROUTER = log.developer_debug

Later I started to suspect this is the same ROUTER as what can be found directly in [log].

Still no change, so I thought maybe this logger isn't detected by Gitea at all, and remembered that I had to add even my developer_debug sublogger to [log].MODE.
So I tried configuring [log].MODE so that the router logger also logs to this new sublogger:

[log]
MODE = console,developer_debug
ROUTER = console,developer_debug.router

With this, now I see that a new Router logger is visible in the log configuration (https://myinstance.lan/admin/config, Log Configuration), but I still don't see router logs in this log file.
However I think I see now that this variable ([log].ROUTER) is key to my sublogger being detected, and that it is similar to [log].MODE, and probably [log].ACCESS and possibly others work this way too.

Then went back to this: https://docs.gitea.io/en-us/config-cheat-sheet/#router-log-log
Noticed I also need to set DISABLE_ROUTER_LOG = false in [log] for the [log].ROUTER to be taken into account, so did that. Restarted Gitea, and still no difference.
Later I reverted this by commentint out the line, as I no longer know how to interpret this. I don't know anymore if this will disable router logging totally, or if it will make the [log].ROUTER be taken into account.

I had a hard time in setting up the developer_debug logger too, but I did manage that.
However I don't know what am I doing wrong. The docs on logging are confusing me. In desperation I redeclared the MODE and FILE_PATH for this sub-sublogger, but this just doesn't seem to work for some reason.

This is all of the configuration that I now have related to logging:

[log]
MODE = console,developer_debug
ROUTER = console,developer_debug.router
#DISABLE_ROUTER_LOG = false

[log.console]
COLORIZE = true

[log.developer_debug]
LEVEL = debug
MODE = file
FILE_NAME = /tmp/gitea/developer_debug.log
# 1 << 18 == 4.718.592 B - 4,5 MB
# forrás: https://github.com/go-gitea/gitea/blob/abd59cd5cdd94dae844d3fcf7bc89e9d455b363c/custom/conf/app.example.ini#L591
MAX_SIZE_SHIFT = 18

[log.developer_debug.router]
LEVEL = TRACE
#ROUTER = log.developer_debug
MODE = file
FILE_NAME = /tmp/gitea/developer_debug_router.log

@mpeter50
Copy link
Contributor Author

I think I got it.
Some important things I missed:

  • [log.name.xy] where xy is the name of a logger, is how loggers are actually configured for a specific output named 'name'
  • [log].ROUTER needs to have a list of all subloggers that wants to receive router events, if it differs from console only
  • [log.name.router] always needs to have MODE specified, as it will not inherit it from [log.name]
  • FILE_PATH-s can be shared between loggers of outputs, but this won't be inherited either, as it has a default value

@lunny here is a new log gist with router logs. It includes logs from recent startup of Gitea. Last event was made after the user dashboard finished loading.

@lunny
Copy link
Member

lunny commented Oct 31, 2022

2022/10/31 00:18:35 [635f064b] router: started   GET / for 192.168.56.20:0
2022/10/31 00:18:39 [635f043c-24] router: slow      GET / for 192.168.56.20:0, elapsed 3845.7ms @ web/home.go:26(web.Home)
2022/10/31 00:19:48 .../queue/workerpool.go:138:zeroBoost() [D] [635f0694] WorkerPool: 14 (for mirror-channel) has zero workers - adding 1 temporary workers for 5m0s
2022/10/31 00:19:48 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= remote get-url origin
2022/10/31 00:19:48 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= remote update --prune origin
2022/10/31 00:19:50 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= commit-graph write
2022/10/31 00:19:58 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= rev-parse
2022/10/31 00:19:58 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= cat-file --batch
2022/10/31 00:19:58 ...s/repository/repo.go:271:SyncReleasesWithTags() [D] [635f0694-2] SyncReleasesWithTags: in Repo[283:Migrations/Simple-Gallery]
2022/10/31 00:19:58 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= cat-file --batch-check
2022/10/31 00:19:58 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= for-each-ref --format objecttype %(objecttype)%00refname:short %(refname:short)%00object %(object)%00objectname %(objectname)%00creator %(creator)%00contents %(contents)%00contents:signature %(contents:signature)%00%00 --sort -*creatordate refs/tags
2022/10/31 00:19:59 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= rev-parse
2022/10/31 00:19:59 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= cat-file --batch
2022/10/31 00:19:59 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= cat-file --batch-check
2022/10/31 00:19:59 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= show-ref --heads
2022/10/31 00:19:59 ...dules/git/command.go:179:Run() [D] [635f0694-2] /var/lib/gitea/git/repositories/migrations/simple-gallery.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= for-each-ref --sort=-committerdate refs/heads/ --count 1 --format=%(committerdate)
2022/10/31 00:20:00 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= remote get-url origin
2022/10/31 00:20:00 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= remote update --prune origin
2022/10/31 00:20:02 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= commit-graph write
2022/10/31 00:20:09 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= rev-parse
2022/10/31 00:20:09 ...s/repository/repo.go:271:SyncReleasesWithTags() [D] [635f06a0] SyncReleasesWithTags: in Repo[282:Migrations/Simple-Commons]
2022/10/31 00:20:09 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= cat-file --batch
2022/10/31 00:20:09 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= cat-file --batch-check
2022/10/31 00:20:09 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= for-each-ref --format objecttype %(objecttype)%00refname:short %(refname:short)%00object %(object)%00objectname %(objectname)%00creator %(creator)%00contents %(contents)%00contents:signature %(contents:signature)%00%00 --sort -*creatordate refs/tags
2022/10/31 00:20:09 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= rev-parse
2022/10/31 00:20:09 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= show-ref --heads
2022/10/31 00:20:09 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= cat-file --batch-check
2022/10/31 00:20:09 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= cat-file --batch
2022/10/31 00:20:09 ...dules/git/command.go:179:Run() [D] [635f06a0] /var/lib/gitea/git/repositories/migrations/simple-commons.git: /usr/bin/git -c protocol.version=2 -c uploadpack.allowfilter=true -c uploadpack.allowAnySHA1InWant=true -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= for-each-ref --sort=-committerdate refs/heads/ --count 1 --format=%(committerdate)
2022/10/31 00:20:11 ...s/context/context.go:219:HTML() [D] [635f064b] Template: user/dashboard/dashboard
2022/10/31 00:20:11 [635f064b] router: completed GET / for 192.168.56.20:0, 200 OK in 96058.0ms @ web/home.go:26(web.Home)

@lunny
Copy link
Member

lunny commented Jan 18, 2023

Could you test v1.18.1 resolved the problem?

@mpeter50
Copy link
Contributor Author

It doesn't yet. Page load time of the main page was 157949ms according to the footer.

image

@lunny
Copy link
Member

lunny commented Jan 19, 2023

How many records in your action table? Could you give more context?

@mpeter50
Copy link
Contributor Author

mpeter50 commented Jan 19, 2023

I have around 257.000, according to the summary on the admin dashboard at https://gitea.instance/admin:

image

@mpeter50
Copy link
Contributor Author

If there is anything more I can help with (like additional statistics, logs, etc), please let me know

@lunny
Copy link
Member

lunny commented Jan 19, 2023

And could you paste your logs when visit dashaboard? Include xorm log and router log.

@mpeter50
Copy link
Contributor Author

Sure, here is the log: https://gist.github.com/mpeter50/4fa95c0d7a575b273f0a9369c0073929

I have cut off what was happening while Gitea was starting up, but by timestamp I made sure to include everything from when I started loading the dashboard.

@lunny
Copy link
Member

lunny commented Jan 20, 2023

2023/01/19 22:02:39 ...activities/action.go:361:GetFeeds() [I] [63c9af54] [SQL] SELECT `action`.* FROM `action` INNER JOIN `repository` ON `repository`.id = `action`.repo_id WHERE act_user_id IN (SELECT `user`.id FROM `user` WHERE (keep_activity_private=? AND visibility IN (?,?)) OR id=?) AND repo_id IN (SELECT id FROM repository WHERE (`repository`.is_private=? AND `repository`.owner_id NOT IN (SELECT id FROM `user` WHERE type=? AND visibility IN (?))) OR `repository`.id IN (SELECT repo_id FROM `access` WHERE `access`.user_id=? AND `access`.mode>?) OR `repository`.id IN (SELECT `team_repo`.repo_id FROM team_repo INNER JOIN team_user ON `team_user`.team_id = `team_repo`.team_id WHERE `team_user`.uid=?) OR `repository`.owner_id=? OR (`repository`.is_private=? AND `repository`.owner_id IN (SELECT `org_user`.org_id FROM org_user WHERE `org_user`.uid=?))) AND user_id=? AND is_deleted=? ORDER BY `action`.`created_unix` DESC LIMIT 20 [false public limited 4 false 1 private 4 0 4 4 false 4 4 false] - 2m35.228546803s

So this is a database index problem.

@rsq424
Copy link

rsq424 commented Mar 2, 2023

add new index to action table
create index IDX_action_u_d_c on action (user_id asc, is_deleted asc, created_unix desc);

@lunny
Copy link
Member

lunny commented Mar 2, 2023

add new index to action table create index IDX_action_u_d_c on action (user_id asc, is_deleted asc, created_unix desc);

We need a new index for this.

@Zettat123
Copy link
Contributor

@mpeter50 I have added about 300000 records to the action table but I can't reproduce this bug in my dev environment.

add new index to action table create index IDX_action_u_d_c on action (user_id asc, is_deleted asc, created_unix desc);

Could you please add this index to your action table and check if the index works?

@sillyguodong sillyguodong added the issue/needs-feedback For bugs, we need more details. For features, the feature must be described in more detail label Mar 6, 2023
@hiifong
Copy link
Member

hiifong commented Mar 13, 2023

@mpeter50 I have added about 300000 records to the action table but I can't reproduce this bug in my dev environment.

add new index to action table create index IDX_action_u_d_c on action (user_id asc, is_deleted asc, created_unix desc);

Could you please add this index to your action table and check if the index works?

image
image
image
image
image
image

@hiifong
Copy link
Member

hiifong commented Mar 13, 2023

So far, I don't think the effect is very obvious

@mpeter50
Copy link
Contributor Author

@Zettat123

@mpeter50 I have added about 300000 records to the action table but I can't reproduce this bug in my dev environment.

The database server runs on an ARMv7 device (a Raspberry Pi 4), and is backed by portable hard drive as storage. These may be necessary for this to occur. But, other pages dont take nearly as long to load.

Could you please add this index to your action table and check if the index works?

I'll try to do so, but it might take some more time for me to get to that.

@hiifong

So far, I don't think the effect is very obvious

I'm not sure I understand you, but as far as I know a fix has not yet been implemented.

@Zettat123
Copy link
Contributor

I'll try to do so, but it might take some more time for me to get to that.

@mpeter50 Thanks! Please let us know the result if you get the work done.

lunny pushed a commit that referenced this issue Mar 24, 2023
Close #21611 

Add the index mentioned in
#21611 (comment) .
Since we already have an index for `("created_unix", "user_id",
"is_deleted")` columns on PostgreSQL, I removed the database type check
to apply this index to all types of databases.
@mpeter50
Copy link
Contributor Author

I'm sorry for the late reply!

Today I have updated my instance to 1.19.0, and the latency is still very similar as before:

image

According to the database server, I had the following indexes on the action table at this time:

MariaDB [gitea]> show index FROM action;
+--------+------------+-----------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table  | Non_unique | Key_name              | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+--------+------------+-----------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| action |          0 | PRIMARY               |            1 | id           | A         |      459337 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| action |          1 | IDX_action_comment_id |            1 | comment_id   | A         |        4547 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_au_r_c_u_d |            1 | act_user_id  | A         |           6 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_au_r_c_u_d |            2 | repo_id      | A         |        1258 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_au_r_c_u_d |            3 | created_unix | A         |      153112 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_au_r_c_u_d |            4 | user_id      | A         |      459337 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_au_r_c_u_d |            5 | is_deleted   | A         |      459337 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| action |          1 | IDX_action_r_u_d      |            1 | repo_id      | A         |        1085 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_r_u_d      |            2 | user_id      | A         |        2595 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_r_u_d      |            3 | is_deleted   | A         |        2595 |     NULL | NULL   |      | BTREE      |         |               | NO      |
+--------+------------+-----------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
10 rows in set (0.001 sec)

So then I created the suggested index:

MariaDB [gitea]> create index IDX_action_u_d_c on action (user_id asc, is_deleted asc, created_unix desc);
Query OK, 0 rows affected (24.369 sec)
Records: 0  Duplicates: 0  Warnings: 0

And the list of indexes now look like this:

MariaDB [gitea]> show index FROM action;
+--------+------------+-----------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table  | Non_unique | Key_name              | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+--------+------------+-----------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| action |          0 | PRIMARY               |            1 | id           | A         |      459337 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| action |          1 | IDX_action_comment_id |            1 | comment_id   | A         |        4547 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_au_r_c_u_d |            1 | act_user_id  | A         |           6 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_au_r_c_u_d |            2 | repo_id      | A         |        1258 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_au_r_c_u_d |            3 | created_unix | A         |      153112 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_au_r_c_u_d |            4 | user_id      | A         |      459337 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_au_r_c_u_d |            5 | is_deleted   | A         |      459337 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| action |          1 | IDX_action_r_u_d      |            1 | repo_id      | A         |        1085 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_r_u_d      |            2 | user_id      | A         |        2595 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_r_u_d      |            3 | is_deleted   | A         |        2595 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| action |          1 | IDX_action_u_d_c      |            1 | user_id      | A         |           8 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| action |          1 | IDX_action_u_d_c      |            2 | is_deleted   | A         |          14 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| action |          1 | IDX_action_u_d_c      |            3 | created_unix | A         |      459337 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
+--------+------------+-----------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
13 rows in set (0.002 sec)

However, the dashboard takes around the same amount of time to load as before:

image

@rsq424
Copy link

rsq424 commented Apr 13, 2023

I think you can look at gitea.log and find out the longest time SQL AND explain "SQL"

@mpeter50
Copy link
Contributor Author

I have re-enabled sql logging, and uploaded the logs here that were gathered the next time I tried loading the dashboard.

The longest SQL statement was this one, it is on the 10th line in the gist:

2023/04/16 11:52:57 ...activities/action.go:402:GetFeeds() [I] [643bc489] [SQL] SELECT `action`.* FROM `action` INNER JOIN `repository` ON `repository`.id = `action`.repo_id WHERE act_user_id IN (SELECT `user`.id FROM `user` WHERE (keep_activity_private=? AND visibility IN (?,?)) OR id=?) AND repo_id IN (SELECT id FROM repository WHERE (`repository`.is_private=? AND `repository`.owner_id NOT IN (SELECT id FROM `user` WHERE type=? AND visibility IN (?))) OR `repository`.id IN (SELECT repo_id FROM `access` WHERE `access`.user_id=? AND `access`.mode>?) OR `repository`.id IN (SELECT `team_repo`.repo_id FROM team_repo INNER JOIN team_user ON `team_user`.team_id = `team_repo`.team_id WHERE `team_user`.uid=?) OR `repository`.owner_id=? OR (`repository`.is_private=? AND `repository`.owner_id IN (SELECT `org_user`.org_id FROM org_user WHERE `org_user`.uid=?))) AND user_id=? AND is_deleted=? ORDER BY `action`.`created_unix` DESC LIMIT 20 [false public limited 4 false 1 private 4 0 4 4 false 4 4 false] - 3m58.691245393s

I'm sorry but I did not understand the second part of your comment.
What should I explain?

@lunny
Copy link
Member

lunny commented Apr 16, 2023

What's your Gitea instance version and how many records on your Gitea database table action?

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Apr 16, 2023

What's your Gitea instance version and how many records on your Gitea database table action?

I guess these information is in #21611 (comment)

  • Gitea: 1.19.0
  • Records: around 459337 ?
SELECT `action`.* FROM `action` 
INNER JOIN `repository` ON `repository`.id = `action`.repo_id
WHERE act_user_id IN (
    SELECT `user`.id FROM `user` 
    WHERE (keep_activity_private=? AND visibility IN (?,?)) OR id=?
) 
AND repo_id IN (
    SELECT id FROM repository 
    WHERE (`repository`.is_private=? 
        AND `repository`.owner_id NOT IN (SELECT id FROM `user` WHERE type=? AND visibility IN (?))
    )
    OR `repository`.id IN (SELECT repo_id FROM `access` WHERE `access`.user_id=? AND `access`.mode>?)
    OR `repository`.id IN (
        SELECT `team_repo`.repo_id FROM team_repo 
        INNER JOIN team_user ON `team_user`.team_id = `team_repo`.team_id WHERE `team_user`.uid=?
    )
    OR `repository`.owner_id=? 
    OR (`repository`.is_private=? AND `repository`.owner_id IN (
        SELECT `org_user`.org_id FROM org_user WHERE `org_user`.uid=?)
    )
)
AND user_id=? AND is_deleted=? ORDER BY `action`.`created_unix` DESC
LIMIT 20 

--- args: [false public limited 4 false 1 private 4 0 4 4 false 4 4 false]

If it's a MySQL index problem, it's possible try to use analyze table action to reset the table statistics data, or modify code to "FORCE INDEX".

Otherwise, if the MySQL server is slow (eg: hardware limitation), at the moment it's difficult to optimize this SQL (the "activity" mechanism would be totally rewritten). I used to truncate my action table regularly ......

@mpeter50
Copy link
Contributor Author

Thats right, Gitea 1.19.0 and ~459340 action records according to the admin dashboard.

The analyze table action command while connected to the database did not help on the problem.

Otherwise, if the MySQL server is slow (eg: hardware limitation)

This is probably the case, while waiting for the dashboard to load I can see in a resource monitor that the mysqld process is waiting for io most of its time.
I'll look into truncating the action table. I guess it is normally just dropping and recreating it, but I would prefer to only delete the <org> synced commits to <branch> at <org>/<repo> from mirror type of actions.

Also, a related question: is there currently a mechanism for ignoring (as in not saving to db) certain types of actions?
For example, those I referred to above..
My action table mostly consists of these, but I don't need them, and actually these kind of bothered me on the user dashboard, but only because they squeeze out more interesting events.

@mpeter50
Copy link
Contributor Author

Update: the admin dashboard now counts ~547767 actions, probably the analyze table command recounted them.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Apr 16, 2023

while waiting for the dashboard to load I can see in a resource monitor that the mysqld process is waiting for io most of its time.

I guess you are using HDD/Slow storage and/or MySQL's memory is pretty low.

I would prefer to only delete the synced commits to at / from mirror type of actions.

That's ActionMirrorSyncPush // 18 , op_type=18

is there currently a mechanism for ignoring (as in not saving to db) certain types of actions?

IIRC, no ...... unfortunately

@lunny
Copy link
Member

lunny commented Apr 16, 2023

Or you can enable [cron.delete_old_actions] and change OLDER_THAN to a short value.

@mpeter50
Copy link
Contributor Author

I guess you are using HDD/Slow storage and/or MySQL's memory is pretty low

Yes, it is a portable HDD that is shared with other services and even the OS.
But for the action query ~200 seconds delay no other load measurable load is needed.

That's ActionMirrorSyncPush // 18 , op_type=18

Thanks a lot, this will be helpful!

Or you can enable [cron.delete_old_actions] and change OLDER_THAN to a short value.

I'll consider that too, thanks!

@mpeter50
Copy link
Contributor Author

For anyone (and future me) reading this in the future:

BEFORE EXECUTING THESE COMMANDS you probably want to shut down Gitea and make a backup of the database.
Deleting rows with op_type=18 ("synced from" actions, as seen on the user dashboard) only helps if you are a data hoarder and you mirror dozens of active repositories.

According to SELECT COUNT(id) FROM action WHERE op_type = 18;, I had 352867 "synced from" records.
After deleting those with the DELETE FROM action WHERE op_type=18; command, 227443 rows remained in the table, and the dashboard now loads under 10 seconds, at first 9300 ms, but most times its lower (e.g. 3400), and also if you load it once it will be faster for a few minutes (maybe until new actions are created?).

If this delay is still too much, you can check what other types of actions are amassed in the table with SELECT COUNT(id),op_type FROM action GROUP BY op_type;. The query will return op_type IDs and their amounts, but you can look them up here.
In my case the majority of the remaining actions are of type ActionMirrorSyncCreate and ActionMirrorSyncDelete, which are probably "synced new reference" and "synced and deleted reference" actions. It's on you whether you keep them, AFAIK no rows of this table are important for Gitea to work correctly.

Be aware though that you will need to repeat this regularly, as "synced from" actions are still created at every automatic mirror sync when changes are pulled.
How often? You'll notice it when you need to do it, but probably you wont need to do it more than once a few months or a year.
If you change the mirrors to sync less often (default is 8 hours), these will pile up slower.

You can also set up automatic deletion of old actions as Lunny said above, but this will probably also mess with your and other users' activity graph (on the profile page, as seen below) if you set it to a value too low.
This applies especially if you delete all actions.

image.

@go-gitea go-gitea locked and limited conversation to collaborators May 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
issue/needs-feedback For bugs, we need more details. For features, the feature must be described in more detail type/bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants