Rework repository archive (#14723)
* Use storage to store archive files

* Fix backend lint

* Add archiver table on database

* Finish archive download

* Fix test

* Add database migrations

* Add status for archiver

* Fix lint

* Add queue

* Add doctor to check and delete old archives

* Improve archive queue

* Fix tests

* improve archive storage

* Delete repo archives

* Add missing fixture

* fix fixture

* Fix fixture

* Fix test

* Fix archiver cleaning

* Fix bug

* Add docs for repository archive storage

* remove repo-archive configuration

* Fix test

* Fix test

* Fix lint

Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
3 people authored Jun 23, 2021
1 parent c9c7afd commit b223d36
Showing 25 changed files with 628 additions and 460 deletions.
10 changes: 10 additions & 0 deletions custom/conf/app.example.ini
@@ -2048,6 +2048,16 @@ PATH =
;; storage type
;STORAGE_TYPE = local

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; settings for repository archives, will override storage setting
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;[storage.repo-archive]
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; storage type
;STORAGE_TYPE = local

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; lfs storage will override storage
17 changes: 17 additions & 0 deletions docs/content/doc/advanced/config-cheat-sheet.en-us.md
@@ -995,6 +995,23 @@ MINIO_USE_SSL = false

The name can then be used as `STORAGE_TYPE` in `[attachment]`, `[lfs]`, etc.

## Repository Archive Storage (`storage.repo-archive`)

Configuration for repository archive storage. It inherits from the default `[storage]` section,
or from `[storage.xxx]` when `STORAGE_TYPE` is set to `xxx`. The default `PATH`
is `data/repo-archive` and the default `MINIO_BASE_PATH` is `repo-archive/`.

- `STORAGE_TYPE`: **local**: Storage type for repository archives: `local` for local disk, `minio` for an S3-compatible object storage service, or another name defined with `[storage.xxx]`.
- `SERVE_DIRECT`: **false**: Allows the storage driver to redirect to authenticated URLs to serve files directly. Currently, only Minio/S3 is supported via signed URLs; `local` does nothing.
- `PATH`: **./data/repo-archive**: Where to store archive files. Only available when `STORAGE_TYPE` is `local`.
- `MINIO_ENDPOINT`: **localhost:9000**: Minio endpoint to connect to. Only available when `STORAGE_TYPE` is `minio`.
- `MINIO_ACCESS_KEY_ID`: Minio accessKeyID used to connect. Only available when `STORAGE_TYPE` is `minio`.
- `MINIO_SECRET_ACCESS_KEY`: Minio secretAccessKey used to connect. Only available when `STORAGE_TYPE` is `minio`.
- `MINIO_BUCKET`: **gitea**: Minio bucket in which to store the repository archives. Only available when `STORAGE_TYPE` is `minio`.
- `MINIO_LOCATION`: **us-east-1**: Minio location in which to create the bucket. Only available when `STORAGE_TYPE` is `minio`.
- `MINIO_BASE_PATH`: **repo-archive/**: Minio base path on the bucket. Only available when `STORAGE_TYPE` is `minio`.
- `MINIO_USE_SSL`: **false**: Whether Minio uses SSL. Only available when `STORAGE_TYPE` is `minio`.

## Other (`other`)

- `SHOW_FOOTER_BRANDING`: **false**: Show Gitea branding in the footer.
15 changes: 15 additions & 0 deletions docs/content/doc/advanced/config-cheat-sheet.zh-cn.md
@@ -382,6 +382,21 @@ MINIO_USE_SSL = false

You can then use this name as the value of `STORAGE_TYPE` in `[attachment]`, `[lfs]`, and so on.

## Repository Archive Storage (`storage.repo-archive`)

Storage configuration for repository archives. If `STORAGE_TYPE` is empty, this configuration inherits from `[storage]`. If it is neither `local` nor `minio` but `xxx`, it inherits from `[storage.xxx]`. When inheriting, `PATH` defaults to `data/repo-archive` and `MINIO_BASE_PATH` defaults to `repo-archive/`.

- `STORAGE_TYPE`: **local**: Storage type for repository archives. `local` stores to disk; `minio` stores to an S3-compatible object storage service.
- `SERVE_DIRECT`: **false**: Allows redirecting directly to the storage system. Currently, only Minio/S3 is supported.
- `PATH`: Where to store uploaded repository archive files. Defaults to `data/repo-archive`.
- `MINIO_ENDPOINT`: **localhost:9000**: Minio endpoint. Only effective when `STORAGE_TYPE` is `minio`.
- `MINIO_ACCESS_KEY_ID`: Minio accessKeyID. Only effective when `STORAGE_TYPE` is `minio`.
- `MINIO_SECRET_ACCESS_KEY`: Minio secretAccessKey. Only effective when `STORAGE_TYPE` is `minio`.
- `MINIO_BUCKET`: **gitea**: Minio bucket. Only effective when `STORAGE_TYPE` is `minio`.
- `MINIO_LOCATION`: **us-east-1**: Minio location. Only effective when `STORAGE_TYPE` is `minio`.
- `MINIO_BASE_PATH`: **repo-archive/**: Minio base path. Only effective when `STORAGE_TYPE` is `minio`.
- `MINIO_USE_SSL`: **false**: Whether Minio uses SSL. Only effective when `STORAGE_TYPE` is `minio`.

## Other (`other`)

- `SHOW_FOOTER_BRANDING`: If true, show Gitea branding at the bottom of the page.
@@ -0,0 +1 @@
aacbdfe9e1c4b47f60abe81849045fa4e96f1d75
1 change: 1 addition & 0 deletions models/fixtures/repo_archiver.yml
@@ -0,0 +1 @@
[] # empty
2 changes: 2 additions & 0 deletions models/migrations/migrations.go
@@ -319,6 +319,8 @@ var migrations = []Migration{
	NewMigration("Create PushMirror table", createPushMirrorTable),
	// v184 -> v185
	NewMigration("Rename Task errors to message", renameTaskErrorsToMessage),
	// v185 -> v186
	NewMigration("Add new table repo_archiver", addRepoArchiver),
}

// GetCurrentDBVersion returns the current db version
1 change: 1 addition & 0 deletions models/migrations/v181.go
@@ -1,3 +1,4 @@
// Copyright 2021 The Gitea Authors. All rights reserved.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.

22 changes: 22 additions & 0 deletions models/migrations/v185.go
@@ -0,0 +1,22 @@
// Copyright 2021 The Gitea Authors. All rights reserved.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.

package migrations

import (
	"xorm.io/xorm"
)

func addRepoArchiver(x *xorm.Engine) error {
	// RepoArchiver represents all archivers
	type RepoArchiver struct {
		ID          int64  `xorm:"pk autoincr"`
		RepoID      int64  `xorm:"index unique(s)"`
		Type        int    `xorm:"unique(s)"`
		Status      int
		CommitID    string `xorm:"VARCHAR(40) unique(s)"`
		CreatedUnix int64  `xorm:"INDEX NOT NULL created"`
	}
	return x.Sync2(new(RepoArchiver))
}
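
The three columns tagged `unique(s)` in the migration share one composite unique index, so a repository can hold at most one archive row per archive type and commit. A minimal sketch of that constraint follows, using an in-memory map in place of the database; `archiveKey`, `table`, and `errDuplicate` are hypothetical names for illustration and are not part of the commit.

```go
package main

import (
	"errors"
	"fmt"
)

// archiveKey holds the three columns tagged unique(s) in the migration.
// It is a hypothetical illustration type, not part of the commit.
type archiveKey struct {
	RepoID   int64
	Type     int
	CommitID string
}

var errDuplicate = errors.New("archive already registered")

// table is a hypothetical in-memory stand-in for the repo_archiver table.
// The map value plays the role of the Status column.
type table map[archiveKey]int

// insert rejects a second row with the same (RepoID, Type, CommitID),
// which is what a composite unique index enforces in the database.
func (t table) insert(k archiveKey, status int) error {
	if _, ok := t[k]; ok {
		return errDuplicate
	}
	t[k] = status
	return nil
}

func main() {
	tbl := table{}
	k := archiveKey{RepoID: 1, Type: 1, CommitID: "aacbdfe9e1c4b47f60abe81849045fa4e96f1d75"}
	fmt.Println(tbl.insert(k, 0)) // first insert succeeds
	fmt.Println(tbl.insert(k, 0)) // same key is rejected
	k.Type = 2
	fmt.Println(tbl.insert(k, 0)) // a different archive type is a new row
}
```
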
1 change: 1 addition & 0 deletions models/models.go
@@ -136,6 +136,7 @@ func init() {
		new(RepoTransfer),
		new(IssueIndex),
		new(PushMirror),
		new(RepoArchiver),
	)

	gonicNames := []string{"SSL", "UID"}
97 changes: 47 additions & 50 deletions models/repo.go
@@ -1587,6 +1587,22 @@ func DeleteRepository(doer *User, uid, repoID int64) error {
		return err
	}

	// Remove archives
	var archives []*RepoArchiver
	if err = sess.Where("repo_id=?", repoID).Find(&archives); err != nil {
		return err
	}

	for _, v := range archives {
		v.Repo = repo
		p, _ := v.RelativePath()
		removeStorageWithNotice(sess, storage.RepoArchives, "Delete repo archive file", p)
	}

	if _, err := sess.Delete(&RepoArchiver{RepoID: repoID}); err != nil {
		return err
	}

	if repo.NumForks > 0 {
		if _, err = sess.Exec("UPDATE `repository` SET fork_id=0,is_fork=? WHERE fork_id=?", false, repo.ID); err != nil {
			log.Error("reset 'fork_id' and 'is_fork': %v", err)
@@ -1768,64 +1784,45 @@ func DeleteRepositoryArchives(ctx context.Context) error {
func DeleteOldRepositoryArchives(ctx context.Context, olderThan time.Duration) error {
log.Trace("Doing: ArchiveCleanup")

if err := x.Where("id > 0").Iterate(new(Repository), func(idx int, bean interface{}) error {
return deleteOldRepositoryArchives(ctx, olderThan, idx, bean)
}); err != nil {
log.Trace("Error: ArchiveClean: %v", err)
return err
}

log.Trace("Finished: ArchiveCleanup")
return nil
}

func deleteOldRepositoryArchives(ctx context.Context, olderThan time.Duration, idx int, bean interface{}) error {
repo := bean.(*Repository)
basePath := filepath.Join(repo.RepoPath(), "archives")

for _, ty := range []string{"zip", "targz"} {
select {
case <-ctx.Done():
return ErrCancelledf("before deleting old repository archives with filetype %s for %s", ty, repo.FullName())
default:
}

path := filepath.Join(basePath, ty)
file, err := os.Open(path)
if err != nil {
if !os.IsNotExist(err) {
log.Warn("Unable to open directory %s: %v", path, err)
return err
}

// If the directory doesn't exist, that's okay.
continue
}

files, err := file.Readdir(0)
file.Close()
for {
var archivers []RepoArchiver
err := x.Where("created_unix < ?", time.Now().Add(-olderThan).Unix()).
Asc("created_unix").
Limit(100).
Find(&archivers)
if err != nil {
log.Warn("Unable to read directory %s: %v", path, err)
log.Trace("Error: ArchiveClean: %v", err)
return err
}

minimumOldestTime := time.Now().Add(-olderThan)
for _, info := range files {
if info.ModTime().Before(minimumOldestTime) && !info.IsDir() {
select {
case <-ctx.Done():
return ErrCancelledf("before deleting old repository archive file %s with filetype %s for %s", info.Name(), ty, repo.FullName())
default:
}
toDelete := filepath.Join(path, info.Name())
// This is a best-effort purge, so we do not check error codes to confirm removal.
if err = util.Remove(toDelete); err != nil {
log.Trace("Unable to delete %s, but proceeding: %v", toDelete, err)
}
for _, archiver := range archivers {
if err := deleteOldRepoArchiver(ctx, &archiver); err != nil {
return err
}
}
if len(archivers) < 100 {
break
}
}

log.Trace("Finished: ArchiveCleanup")
return nil
}

var delRepoArchiver = new(RepoArchiver)

func deleteOldRepoArchiver(ctx context.Context, archiver *RepoArchiver) error {
p, err := archiver.RelativePath()
if err != nil {
return err
}
_, err = x.ID(archiver.ID).Delete(delRepoArchiver)
if err != nil {
return err
}
if err := storage.RepoArchives.Delete(p); err != nil {
log.Error("delete repo archive file failed: %v", err)
}
return nil
}
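
The reworked cleanup queries expired rows in batches of 100, oldest first, and stops once a batch comes back short. The sketch below shows only that loop shape with an in-memory slice standing in for the `repo_archiver` table; `record`, `store`, `findOlderThan`, and `cleanup` are hypothetical names, and the real code additionally removes the archive file from storage.

```go
package main

import (
	"fmt"
	"sort"
)

const batchSize = 100

// record is a hypothetical stand-in for a repo_archiver row.
type record struct {
	ID          int64
	CreatedUnix int64
}

// store is a hypothetical in-memory replacement for the database table.
type store struct{ rows []record }

// findOlderThan returns up to limit rows created before cutoff, oldest first.
func (s *store) findOlderThan(cutoff int64, limit int) []record {
	var out []record
	for _, r := range s.rows {
		if r.CreatedUnix < cutoff {
			out = append(out, r)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].CreatedUnix < out[j].CreatedUnix })
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

// delete removes the row with the given ID, if present.
func (s *store) delete(id int64) {
	for i, r := range s.rows {
		if r.ID == id {
			s.rows = append(s.rows[:i], s.rows[i+1:]...)
			return
		}
	}
}

// cleanup deletes expired rows batch by batch and returns how many it removed.
// A batch shorter than batchSize means nothing older than cutoff is left.
func cleanup(s *store, cutoff int64) int {
	removed := 0
	for {
		batch := s.findOlderThan(cutoff, batchSize)
		for _, r := range batch {
			s.delete(r.ID)
			removed++
		}
		if len(batch) < batchSize {
			return removed
		}
	}
}

func main() {
	s := &store{}
	for i := int64(1); i <= 250; i++ {
		s.rows = append(s.rows, record{ID: i, CreatedUnix: i})
	}
	fmt.Println(cleanup(s, 201), len(s.rows)) // prints "200 50"
}
```

Because each pass deletes what it just fetched, the next query naturally returns the following batch without an offset. In the hunk above, the database row is deleted before the stored file, and a failed file deletion is only logged.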

86 changes: 86 additions & 0 deletions models/repo_archiver.go
@@ -0,0 +1,86 @@
// Copyright 2021 The Gitea Authors. All rights reserved.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.

package models

import (
	"fmt"

	"code.gitea.io/gitea/modules/git"
	"code.gitea.io/gitea/modules/timeutil"
)

// RepoArchiverStatus represents repo archive status
type RepoArchiverStatus int

// enumerate all repo archive statuses
const (
	RepoArchiverGenerating = iota // the archiver is generating
	RepoArchiverReady             // it's ready
)

// RepoArchiver represents all archivers
type RepoArchiver struct {
	ID          int64           `xorm:"pk autoincr"`
	RepoID      int64           `xorm:"index unique(s)"`
	Repo        *Repository     `xorm:"-"`
	Type        git.ArchiveType `xorm:"unique(s)"`
	Status      RepoArchiverStatus
	CommitID    string             `xorm:"VARCHAR(40) unique(s)"`
	CreatedUnix timeutil.TimeStamp `xorm:"INDEX NOT NULL created"`
}

// LoadRepo loads repository
func (archiver *RepoArchiver) LoadRepo() (*Repository, error) {
	if archiver.Repo != nil {
		return archiver.Repo, nil
	}

	var repo Repository
	has, err := x.ID(archiver.RepoID).Get(&repo)
	if err != nil {
		return nil, err
	}
	if !has {
		return nil, ErrRepoNotExist{
			ID: archiver.RepoID,
		}
	}
	return &repo, nil
}

// RelativePath returns relative path
func (archiver *RepoArchiver) RelativePath() (string, error) {
	repo, err := archiver.LoadRepo()
	if err != nil {
		return "", err
	}

	return fmt.Sprintf("%s/%s/%s.%s", repo.FullName(), archiver.CommitID[:2], archiver.CommitID, archiver.Type.String()), nil
}

// GetRepoArchiver get an archiver
func GetRepoArchiver(ctx DBContext, repoID int64, tp git.ArchiveType, commitID string) (*RepoArchiver, error) {
	var archiver RepoArchiver
	has, err := ctx.e.Where("repo_id=?", repoID).And("`type`=?", tp).And("commit_id=?", commitID).Get(&archiver)
	if err != nil {
		return nil, err
	}
	if has {
		return &archiver, nil
	}
	return nil, nil
}

// AddRepoArchiver adds an archiver
func AddRepoArchiver(ctx DBContext, archiver *RepoArchiver) error {
	_, err := ctx.e.Insert(archiver)
	return err
}

// UpdateRepoArchiverStatus updates archiver's status
func UpdateRepoArchiverStatus(ctx DBContext, archiver *RepoArchiver) error {
	_, err := ctx.e.ID(archiver.ID).Cols("status").Update(archiver)
	return err
}
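
`RelativePath` builds the storage key as the repository's full name, then the first two characters of the commit ID, then the full commit ID with the archive type as the extension. A standalone sketch of that layout follows. `archivePath` is a hypothetical helper; the repository name `user2/repo1` and the extension `zip` are assumed values standing in for `repo.FullName()` and `archiver.Type.String()`; the length check is an addition that the original method does not perform.

```go
package main

import "fmt"

// archivePath mirrors the layout used by RelativePath:
//   <owner>/<repo>/<first two chars of commit>/<commit>.<ext>
// fullName and ext are plain strings here, hypothetical stand-ins for
// repo.FullName() and archiver.Type.String().
func archivePath(fullName, commitID, ext string) (string, error) {
	// Guard added for this sketch only; slicing a shorter string would panic.
	if len(commitID) < 2 {
		return "", fmt.Errorf("commit ID %q is too short", commitID)
	}
	return fmt.Sprintf("%s/%s/%s.%s", fullName, commitID[:2], commitID, ext), nil
}

func main() {
	p, err := archivePath("user2/repo1", "aacbdfe9e1c4b47f60abe81849045fa4e96f1d75", "zip")
	if err != nil {
		panic(err)
	}
	fmt.Println(p) // user2/repo1/aa/aacbdfe9e1c4b47f60abe81849045fa4e96f1d75.zip
}
```
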
2 changes: 2 additions & 0 deletions models/unit_tests.go
@@ -74,6 +74,8 @@ func MainTest(m *testing.M, pathToGiteaRoot string) {

	setting.RepoAvatar.Storage.Path = filepath.Join(setting.AppDataPath, "repo-avatars")

	setting.RepoArchive.Storage.Path = filepath.Join(setting.AppDataPath, "repo-archive")

	if err = storage.Init(); err != nil {
		fatalTestError("storage.Init: %v\n", err)
	}
15 changes: 15 additions & 0 deletions modules/context/context.go
@@ -380,6 +380,21 @@ func (ctx *Context) ServeFile(file string, names ...string) {
	http.ServeFile(ctx.Resp, ctx.Req, file)
}

// ServeStream serves file via io stream
func (ctx *Context) ServeStream(rd io.Reader, name string) {
	ctx.Resp.Header().Set("Content-Description", "File Transfer")
	ctx.Resp.Header().Set("Content-Type", "application/octet-stream")
	ctx.Resp.Header().Set("Content-Disposition", "attachment; filename="+name)
	ctx.Resp.Header().Set("Content-Transfer-Encoding", "binary")
	ctx.Resp.Header().Set("Expires", "0")
	ctx.Resp.Header().Set("Cache-Control", "must-revalidate")
	ctx.Resp.Header().Set("Pragma", "public")
	_, err := io.Copy(ctx.Resp, rd)
	if err != nil {
		ctx.ServerError("Download file failed", err)
	}
}

// Error returned an error to web browser
func (ctx *Context) Error(status int, contents ...string) {
var v = http.StatusText(status)