Slow to open a repo if there’s many branches in it #14180
Comments
So I did a little debugging with Gitea code and found that the reason is the number of branches. |
The root cause of this seems to be |
Could you use pprof to confirm where the delay is? |
I’m new to golang. I’ll try pprof later. Based on my debugging using logging, I found most of the time was spent in By the way because |
Yes. Before I broke my hand this was precisely the kind of thing I was working on.
Ok. pprof can be enabled by setting ENABLE_PPROF=true in [server] in app.ini. Once you have that running on your server you can run:
And get an SVG in your browser with: |
I've just reread the issue opening comment - please could you try again on current master. You will likely find it much faster |
I was using the latest master on my Mac, and it is still not good. I also deliberately tried building it with the 'gogit' flag. But changing 'show-ref' to 'branch/tag' without gogit gives me the best performance.
I'm having a hard time setting up a golang dev environment on my Windows machine. But I'll keep trying.
|
Interestingly I am not able to duplicate the slow down on linux - ah found it |
On Linux there's buff/cache, which will aggressively cache everything used on the file system. So I think on Linux it should be better. But still, 'show-ref' feels like an unnecessarily slow path to me. All the hashes retrieved end up being just ignored. There has to be a better way, right? Even if it's not 'branch/tag'. |
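To make the point about ignored hashes concrete, here is a minimal sketch, not Gitea's actual code. It assumes input in the `<hash> refs/heads/<name>` line format that `git show-ref --heads` prints; the function name `branchNamesFromShowRef` is hypothetical. Every hash is read and split off, then dropped.

```go
package main

import (
	"fmt"
	"strings"
)

// branchNamesFromShowRef extracts branch names from text in the
// "<hash> refs/heads/<name>" line format. The hash column is split off
// and then discarded, which is the wasted work described above.
func branchNamesFromShowRef(output string) []string {
	const prefix = "refs/heads/"
	var names []string
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue // skip blank or malformed lines
		}
		if strings.HasPrefix(fields[1], prefix) {
			names = append(names, strings.TrimPrefix(fields[1], prefix))
		}
	}
	return names
}

func main() {
	// Placeholder hashes, for illustration only.
	sample := "1111111 refs/heads/main\n2222222 refs/heads/feature/x\n3333333 refs/tags/v1\n"
	fmt.Println(branchNamesFromShowRef(sample))
}
```

A listing that asks git for names only would avoid transferring and parsing the hash column in the first place.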
is it this path that is the slow down http://localhost/gitea/administrator/pytorch/branches/ ? |
The one with 4000+ branches on the same page? TBH I never once successfully opened the page until I added pagination myself. The repo homepage feels considerably slow to me, which is the first thing I noticed. |
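The pagination mentioned above can be sketched as a simple slice window. This is an illustrative assumption about how such paging might look, not the commenter's patch or Gitea's implementation; `paginate` and its parameters are hypothetical names, and pages are 1-based.

```go
package main

import "fmt"

// paginate returns the branches on the given 1-based page.
// It returns nil when the page is out of range or the arguments are invalid.
func paginate(branches []string, page, pageSize int) []string {
	if page < 1 || pageSize < 1 {
		return nil
	}
	start := (page - 1) * pageSize
	if start >= len(branches) {
		return nil
	}
	end := start + pageSize
	if end > len(branches) {
		end = len(branches) // last page may be shorter
	}
	return branches[start:end]
}

func main() {
	branches := []string{"a", "b", "c", "d", "e"}
	fmt.Println(paginate(branches, 2, 2))
}
```

With 4000+ branches, rendering only one such window per request keeps the page size bounded regardless of how many branches the repository has.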
OK I've managed to get pprof results.
callshowref is not the problem |
Interesting. That explains why using 'branch/tag' still feels slower, compared to repos that have a smaller size. Way to go, PyTorch. |
it's Gitea's fault - not PyTorch's tbh. |
OK - this means that your PR will definitely be helpful - but if we're doing optimisations we should be guided by what actually takes time. I'm not quite sure why:
takes so long - I'd have to check - but it's likely that anything that reduces the number of branches will improve that. In terms of main repo showing being slow that's likely to do with generating the history for each file. That is unfortunately a slightly difficult problem - and is necessarily slow - we need to ajax getting that info instead of delaying render. However you would likely benefit from enabling the cache: https://docs.gitea.io/en-us/config-cheat-sheet/#cache---lastcommitcache-settings-cachelast_commit |
@zeripath I was joking about PyTorch. I'm glad my findings turned out to be useful to gitea. |
Last commit cache is only for treepath view page but not branches page ? |
So I was trying pprof but got something like this:
I don't know what I did wrong. The steps I took are:
Anything else I need to do to get the method trace? |
We will be watching this too. Our repo has 1600+ branches and as we evaluate Gitea we are seeing 8-10 second page load times. Git clones don't seem to be affected. |
The branches list on repo home page should be loaded asynchronously. |
any progress for this issue? |
When you say progress have you tried main recently? There have been some improvements there. However, we still need to stop loading all of the branches for every page. |
There is also another example https://gitea1.dev.blender.org/blender-foundation/blender/branches |
To accelerate the branches list, maybe we have to sync all branches into database like we did with tags. |
I agree. This would also help on the PR creation page, where branches need to be selected, and in other places where we currently show all branches: the dropdown could show only the top branches and be asynchronously searchable.
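A minimal sketch of the "top branches plus search" idea, under the assumption that the server filters a name list and caps the result size. `searchBranches` and its parameters are hypothetical and not part of Gitea's API. An empty query returns the first `limit` names, which stands in for "top branches" here.

```go
package main

import (
	"fmt"
	"strings"
)

// searchBranches returns at most limit branch names containing query,
// compared case-insensitively. An empty query matches every name, so the
// result is simply the first limit entries.
func searchBranches(all []string, query string, limit int) []string {
	q := strings.ToLower(query)
	matches := []string{}
	for _, name := range all {
		if len(matches) >= limit {
			break
		}
		if strings.Contains(strings.ToLower(name), q) {
			matches = append(matches, name)
		}
	}
	return matches
}

func main() {
	all := []string{"main", "release/1.0", "release/2.0", "feature/search"}
	fmt.Println(searchBranches(all, "release", 10))
	fmt.Println(searchBranches(all, "", 2))
}
```

A dropdown backed by a call like this only ever receives `limit` names per request, so its cost no longer grows with the total number of branches.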
Yes, I can somewhat understand that proposal, but I do have to say that this will be difficult and error-prone to implement/ maintain: |
Yes, that's why I hesitated so long to post the comment. Since we have stored tags in the database and it works well, what's the difference from storing branch names? And if we have a better method, I would be happy to give up the idea. |
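Keeping branch names in the database means reconciling the stored set with what git reports. The sketch below shows one way that reconciliation step could look, assuming both sides are available as plain string slices. `diffBranches` is a hypothetical helper, not Gitea's implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// diffBranches compares the branch names stored in the database with the
// names currently present in git. It returns the names to insert and the
// names to mark as deleted, each sorted for deterministic output.
func diffBranches(stored, inGit []string) (toAdd, toRemove []string) {
	storedSet := make(map[string]struct{}, len(stored))
	for _, name := range stored {
		storedSet[name] = struct{}{}
	}
	gitSet := make(map[string]struct{}, len(inGit))
	for _, name := range inGit {
		gitSet[name] = struct{}{}
		if _, ok := storedSet[name]; !ok {
			toAdd = append(toAdd, name)
		}
	}
	for _, name := range stored {
		if _, ok := gitSet[name]; !ok {
			toRemove = append(toRemove, name)
		}
	}
	sort.Strings(toAdd)
	sort.Strings(toRemove)
	return toAdd, toRemove
}

func main() {
	add, remove := diffBranches(
		[]string{"main", "old-feature"},
		[]string{"main", "new-feature"},
	)
	fmt.Println("add:", add, "remove:", remove)
}
```

The maintenance risk raised above is visible even here: any path that changes refs without running a step like this leaves the table out of date.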
Related #14180
Related #25233
Related #22639
Close #19786
Related #12763

This PR changes all branch retrieval methods from reading git data to reading the database, to reduce git read operations.

- [x] Sync git branch information into the database when git data is pushed
- [x] Create a new table `Branch`, merge some columns of `DeletedBranch` into the `Branch` table and drop the table `DeletedBranch`.
- [x] Read the `Branch` table when visiting the `code` -> `branch` page
- [x] Read the `Branch` table when listing branch names in the `code` page dropdown
- [x] Read the `Branch` table when listing refs on the git compare page
- [x] Provide a button in the admin page to manually sync all branches.
- [x] Sync branches if the repository is not empty but the database branches are empty when visiting pages with a branch list
- [x] Use `commit_time desc` as the default FindBranch order-by to stay consistent with the previous behaviour; deleted branches will always be at the end.

---------

Co-authored-by: Jason Song <i@wolfogre.com>
I think this could be closed per #22743. I have tested pytorch, which has over 8700 branches, and the home page takes about 1200ms on my MacBook Pro. |
- Send a request to get the branch/tag list, and use a loading icon while waiting for the response.
- Only fetch the first time the branch/tag list is shown.
- For the backend, removed the assignment to `ctx.Data["Branches"]` and `ctx.Data["Tags"]` from `context/repo.go` and passed this data wherever needed.
- Changed some `v-if` to `v-show` and used native `svg` as mentioned in #25719 (comment) to improve performance when there are a lot of branches.
- Places that use the dropdown component:
  Repo Home Page
  <img width="1429" alt="Screen Shot 2023-07-06 at 12 17 51" src="https://github.com/go-gitea/gitea/assets/17645053/6accc7b6-8d37-4e88-ae1a-bd2b3b927ea0">
  Commits Page
  <img width="1431" alt="Screen Shot 2023-07-06 at 12 18 34" src="https://github.com/go-gitea/gitea/assets/17645053/2d0bf306-d1e2-45a8-a784-bc424879f537">
  Specific commit -> operations -> cherry-pick
  <img width="758" alt="Screen Shot 2023-07-06 at 12 23 28" src="https://github.com/go-gitea/gitea/assets/17645053/1e557948-3881-4e45-a625-8ef36d45ae2d">
  Release Page
  <img width="1433" alt="Screen Shot 2023-07-06 at 12 25 05" src="https://github.com/go-gitea/gitea/assets/17645053/3ec82af1-15a4-4162-a50b-04a9502161bb">
- Demo https://github.com/go-gitea/gitea/assets/17645053/d45d266b-3eb0-465a-82f9-57f78dc5f9f3
- Note: the UI of the dropdown menu could be improved in another PR, as it should apply to more dropdown menus.

Fix #14180

---------

Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Description
The title may be a bit misleading.
I was building a local mirror of PyTorch manually using `git push --mirror`, and I found that opening the repo is exceptionally slow (~11 seconds). I also built a clone of gitea itself and the time to open the repo is reasonable (~2 seconds). I think the reason is that PyTorch has over 4000 branches in it.
Screenshots