etcd3 Locker: First pass #202

Merged
merged 18 commits into from
Nov 10, 2018

@chen-anders chen-anders commented Aug 29, 2018

Resolves: #157

This is a first pass at an etcd3 locker that uses the built-in locks (from the etcd/clientv3/concurrency package). We create sessions with valid leases to lock an upload. Sessions are created with a 60s lease, which KeepAlive renews for as long as the process is alive. These sessions are reused to unlock uploads when the upload completes. Once an upload has been unlocked, we close the session.

This code has been tested locally against installs of etcd3 (3.1, 3.2, 3.3) using go-etcd-harness.

We're also running this locker with some small-scale traffic to our tusd server, and we haven't seen issues so far.

Example Usage (pseudocode):

import (
	"fmt"
	"time"

	"github.com/coreos/etcd/clientv3"
	"github.com/tus/tusd"
	"github.com/tus/tusd/etcd3locker"
)
...

// Connect to a local etcd3 node.
etcdClient, err := clientv3.New(clientv3.Config{
	Endpoints:   []string{"127.0.0.1:2379"},
	DialTimeout: 2 * time.Second,
})
if err != nil {
	return nil, fmt.Errorf("failed to create etcd client: %v", err)
}

// Create the etcd3 locker and register it with the store composer.
composer := tusd.NewStoreComposer()
locker, err := etcd3locker.New(etcdClient)
if err != nil {
	return nil, fmt.Errorf("failed to create etcd locker: %v", err)
}
locker.UseIn(composer)

@kvz
Member

kvz commented Aug 29, 2018

Thanks a lot for this work @chen-anders! Just a quick note that our maintainer is on holidays so it might take a bit before you'll have this thoroughly reviewed.

@chen-anders chen-anders force-pushed the anders/etcd3locker branch 7 times, most recently from f7c282e to 15f624c Compare August 30, 2018 01:38
Member

@Acconut Acconut left a comment

Thank you very much for this amazing PR. Apologies for my delayed response, but as @kvz said, I have been quite busy these last few weeks. That said, I have a few questions about this, and I am eager to see it land in tusd.

@chen-anders chen-anders force-pushed the anders/etcd3locker branch 2 times, most recently from 6ba04a0 to c25194e Compare September 20, 2018 21:53
@Acconut
Member

Acconut commented Sep 28, 2018

Thanks for the updates and comments. I will have a look at them in the next few days. Would you mind looking into the test suite failures (https://travis-ci.org/tus/tusd/builds/431249928) in the meantime?

@chen-anders
Contributor Author

@Acconut - Tests are now passing.

@chen-anders
Contributor Author

@Acconut - this is ready for another look-over.

@Acconut Acconut merged commit 9af87d5 into tus:master Nov 10, 2018
@Acconut
Member

Acconut commented Nov 10, 2018

Thank you very much for this amazing PR, and especially for dealing with my questions. We appreciate this a lot 👍

Only one question: Is there an update regarding #202 (comment)?

@chen-anders
Contributor Author

> Is there an update regarding #202 (comment)?

The goroutine leak here was a red herring: we ended up finding that writing straight to the container filesystem (instead of to a mounted volume) in Docker caused heavy page cache use when using the S3 plugin. In our Kubernetes clusters, this registered as continuously increasing memory usage.

The simplification made some time ago did stop unnecessary leases from being opened, but I don't think those were causing any major memory usage. As part of our investigation, we SSH'd into the container and ran ps aux, which showed that our tusd process was using at most 30-90MB of RAM (depending on the number of open connections/active uploads), rather than the 1-2GB reported by Kubernetes.
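For anyone chasing a similar discrepancy, here is a rough sketch of how page cache can be separated from process memory. It assumes a cgroup v1 host, where the container's `memory.stat` file (typically `/sys/fs/cgroup/memory/memory.stat`, which is an assumption about the environment) contains `cache` and `rss` lines. The helper name `parseMemoryStat` and the sample numbers are made up for illustration; they are not measurements from this PR.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMemoryStat parses "key value" lines in the style of a cgroup v1
// memory.stat file into a map of byte counts. Malformed lines are skipped.
func parseMemoryStat(content string) map[string]uint64 {
	stats := make(map[string]uint64)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) != 2 {
			continue
		}
		value, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			continue
		}
		stats[fields[0]] = value
	}
	return stats
}

func main() {
	// Made-up sample values that only mirror the shape of the problem:
	// a large page cache next to a small resident set.
	sample := "cache 1610612736\nrss 62914560\n"

	stats := parseMemoryStat(sample)
	const mib = 1024 * 1024
	fmt.Printf("page cache: %d MiB\n", stats["cache"]/mib) // page cache: 1536 MiB
	fmt.Printf("process rss: %d MiB\n", stats["rss"]/mib)  // process rss: 60 MiB
}
```

In a real container the string would come from reading the file itself. A large `cache` next to a small `rss` would point at file I/O rather than a leak in the process.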

@chen-anders chen-anders deleted the anders/etcd3locker branch November 10, 2018 20:58
@Acconut
Member

Acconut commented Nov 11, 2018

I am glad to hear that there was no goroutine leak, thank you for the detailed answer.

3 participants