Goofys is a high-performance, POSIX-ish Amazon S3 file system written in Go
Goofys allows you to mount an S3 bucket as a filey system.
It's a Filey System instead of a File System because goofys strives for performance first and POSIX second. In particular, operations that are difficult to support on S3, or that would translate into more than one round trip, either fail (random writes) or are faked (no per-file permissions). Goofys does not have an on-disk data cache (check out catfs), and its consistency model is close-to-open.
- On Linux, install via pre-built binaries. You may also need to install fuse if you want to mount it on startup.
- On macOS, install via Homebrew:
$ brew cask install osxfuse
$ brew install goofys
- Or build from source with Go 1.10 or later:
$ export GOPATH=$HOME/work
$ go get github.com/StatCan/goofys
$ go install github.com/StatCan/goofys
$ cat ~/.aws/credentials
[default]
aws_access_key_id = AKID1234567890
aws_secret_access_key = MY-SECRET-KEY
$ $GOPATH/bin/goofys <bucket> <mountpoint>
$ $GOPATH/bin/goofys <bucket:prefix> <mountpoint> # if you only want to mount objects under a prefix
Users can also configure credentials via the AWS CLI or the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
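As a minimal sketch of the environment-variable route: the key values below are the placeholders from the credentials file example above, and the bucket name and mount point are hypothetical. The guard around the goofys call is only there so the script does not error out on a machine where the binary is not on PATH.

```shell
# Placeholder credentials copied from the example above; substitute your own.
export AWS_ACCESS_KEY_ID=AKID1234567890
export AWS_SECRET_ACCESS_KEY=MY-SECRET-KEY

# Hypothetical bucket name and mount point, for illustration only.
BUCKET=my-bucket
MOUNTPOINT="${TMPDIR:-/tmp}/goofys-demo"

# The mount point must exist before mounting.
mkdir -p "$MOUNTPOINT"

# Mount only if goofys is installed and on PATH.
if command -v goofys >/dev/null 2>&1; then
    goofys "$BUCKET" "$MOUNTPOINT"
else
    echo "goofys not found on PATH; skipping mount" >&2
fi
```

Exported variables only apply to the current shell and its child processes, so this route suits one-off mounts better than mounts at startup.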
To mount an S3 bucket on startup, make sure the credentials are configured for root, then add this line to /etc/fstab:
goofys#bucket /mnt/mountpoint fuse _netdev,allow_other,--file-mode=0666,--dir-mode=0777 0 0
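To adapt that entry to another bucket without retyping the options, a small helper can print the line for review. This is only a sketch: fstab_entry is a hypothetical function, not part of goofys, and the bucket and mount point passed to it are made-up names. The mount options are copied unchanged from the line above.

```shell
# fstab_entry is a hypothetical helper, not something goofys ships.
# Usage: fstab_entry <bucket> <mountpoint>
fstab_entry() {
    printf 'goofys#%s %s fuse _netdev,allow_other,--file-mode=0666,--dir-mode=0777 0 0\n' "$1" "$2"
}

# Print the entry so it can be checked before it is appended to /etc/fstab as root.
fstab_entry my-bucket /mnt/my-bucket
```

Printing the entry rather than writing it directly keeps the edit to /etc/fstab a deliberate manual step.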
See also: Instructions for Azure Blob Storage, Azure Data Lake Gen1, and Azure Data Lake Gen2.
Got more questions? Check out questions other people asked.
The benchmark uses --stat-cache-ttl 1s --type-cache-ttl 1s for goofys and -ostat_cache_expire=1 for s3fs to simulate cold runs. Details of the benchmark can be found in bench.sh, and the raw data is available as well. The test was run on an EC2 m5.4xlarge in us-west-2a connected to a bucket in us-west-2. Units are seconds.
To run the benchmark, configure the EC2 instance role so that it can write to $TESTBUCKET, then run:
$ sudo docker run -e BUCKET=$TESTBUCKET -e CACHE=false --rm --privileged --net=host -v /tmp/cache:/tmp/cache StatCan/goofys-bench
# result will be written to $TESTBUCKET
See also: cached benchmark results and results on Azure.
Copyright (C) 2015 - 2019 Ka-Hing Cheung
Licensed under the Apache License, Version 2.0
goofys has been tested under Linux and macOS.
List of non-POSIX behaviors/limitations:
- only sequential writes supported
- does not store file mode/owner/group
  - use --(dir|file)-mode or --(uid|gid) options
- does not support symlink or hardlink
- ctime, atime are always the same as mtime
- cannot rename directories with more than 1000 children
- unlink returns success even if file is not present
- fsync is ignored, files are only flushed on close
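The difference between a sequential write and a random write can be sketched as follows. TARGET_DIR is a hypothetical variable introduced for this example: it defaults to a scratch directory so the script runs anywhere, and can be pointed at a goofys mount to try the same steps there. On an ordinary local file system both writes succeed; on a goofys mount the in-place overwrite is the kind of random write that is expected to fail, per the list above.

```shell
# TARGET_DIR is a hypothetical variable; set it to a goofys mount point to
# try this there. By default a local scratch directory is used.
TARGET_DIR="${TARGET_DIR:-$(mktemp -d)}"
FILE="$TARGET_DIR/sequential-demo.txt"

# Sequential write: bytes arrive in order starting at offset 0.
printf 'hello world\n' > "$FILE"

# Random write: overwrite 5 bytes at offset 6 without truncating the file.
if printf 'WORLD' | dd of="$FILE" bs=1 seek=6 conv=notrunc 2>/dev/null; then
    RANDOM_WRITE=ok
else
    RANDOM_WRITE=failed
fi
echo "random write: $RANDOM_WRITE"
```

Workloads that need in-place updates would typically rewrite the whole file sequentially instead.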
goofys has been tested with the following non-AWS S3 providers:
- Amplidata / WD ActiveScale
- Ceph (ex: Digital Ocean Spaces, DreamObjects, gridscale)
- EdgeFS
- EMC Atmos
- Google Cloud Storage
- Minio (limited)
- OpenStack Swift
- S3Proxy
- Scaleway
- Wasabi
Additionally, goofys also works with the following non-S3 object stores:
- Azure Blob Storage
- Azure Data Lake Gen1
- Azure Data Lake Gen2
Integration with meta-fuse-csi-plugin
This repository also contains files from the meta-fuse-csi-plugin repository; more information can be found here. The initial commit that added them is here, with subsequent commits refining the process. The purpose was to simplify the process described here.
- Data is stored on Amazon S3
- Amazon SDK for Go
- Other related fuse filesystems
  - catfs: caching layer that can be used with goofys
  - s3fs: another popular filesystem for S3
  - gcsfuse: filesystem for Google Cloud Storage. Goofys borrowed some skeleton code from this project.
- S3Proxy is used for go test
- fuse binding, also used by gcsfuse