Updated documentation for S3-compatible object stores #592

Merged
merged 6 commits into from
Feb 13, 2024
41 changes: 35 additions & 6 deletions docs/source/how_to_guides/configure_cloud_storage_credentials.md
@@ -35,7 +35,7 @@ Your config and credentials files should follow the standard structure output by

```
[default]
-region=us-west-2
+region=<your region, e.g. us-west-2>
output=json

```
@@ -44,13 +44,31 @@ output=json

```
[default]
-aws_access_key_id=AKIAIOSFODNN7EXAMPLE
-aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
+aws_access_key_id=<key ID>
+aws_secret_access_key=<application key>

```

More details about the authentication can be found [here](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html).
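
Because the credentials file uses INI syntax, one quick way to sanity-check its structure is to parse it with Python's standard `configparser` module. The following is a minimal sketch, not part of the documented workflow: the helper name `read_default_credentials` is hypothetical, and it parses an in-memory string with placeholder values rather than a real credentials file.

```python
import configparser


def read_default_credentials(text: str) -> dict:
    """Parse credentials-file text and return the [default] profile as a dict.

    Hypothetical helper for illustration only.
    """
    # interpolation=None keeps any '%' characters in a value literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # Lower-case "default" is an ordinary section name for configparser;
    # only the upper-case "DEFAULT" section is treated specially.
    return dict(parser["default"])


# Placeholder values, matching the structure shown above.
sample = """
[default]
aws_access_key_id=<key ID>
aws_secret_access_key=<application key>
"""

creds = read_default_credentials(sample)
print(sorted(creds))
```

If a key is misspelled or the `[default]` header is missing, the lookup fails early instead of surfacing later as an authentication error.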

Alternatively, this can also be set through [environment variables](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html).

````{tabs}
```{code-tab} py
import os
os.environ["AWS_ACCESS_KEY_ID"] = '<key ID>'
os.environ["AWS_SECRET_ACCESS_KEY"] = '<application key>'
os.environ["AWS_DEFAULT_REGION"] = '<your region, e.g. us-west-2>'
```

```{code-tab} sh
export AWS_ACCESS_KEY_ID='<key ID>'
export AWS_SECRET_ACCESS_KEY='<application key>'
export AWS_DEFAULT_REGION='<your region, e.g. us-west-2>'
```
````
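
When relying on environment variables, it can help to confirm they are actually set before any download starts. The sketch below is an illustration only: `missing_aws_env_vars` and `EXPECTED_VARS` are hypothetical names, and the check covers just the three variables set in the tabs above.

```python
import os

# The three variables set in the example above.
EXPECTED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_env_vars(env=None) -> list:
    """Return the names of expected variables that are unset or empty.

    Hypothetical helper; pass a mapping to check something other than os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]


missing = missing_aws_env_vars()
if missing:
    print("Unset variables:", ", ".join(missing))
```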


### Requester Pays Bucket

If the bucket you are accessing is a [Requester Pays](https://docs.aws.amazon.com/AmazonS3/latest/userguide/RequesterPaysBuckets.html) bucket, then set the below environment variable by providing a bucket name. If there are more than one requester pays bucket, provide each one separated by a comma.
@@ -87,25 +105,36 @@ export S3_CANNED_ACL='authenticated-read'
````

## Any S3 compatible object store
-For any S3 compatible object store such as [Cloudflare R2](https://www.cloudflare.com/products/r2/), [Coreweave](https://docs.coreweave.com/storage/object-storage), [Backblaze b2](https://www.backblaze.com/b2/cloud-storage.html), etc., setup your credentials as mentioned in the above `Amazon S3` section. The only difference is you must set your object store endpoint url. To do this, you need to set the ``S3_ENDPOINT_URL`` environment variable.
+For any S3 compatible object store such as [Cloudflare R2](https://www.cloudflare.com/products/r2/), [Coreweave](https://docs.coreweave.com/storage/object-storage), [Backblaze b2](https://www.backblaze.com/b2/cloud-storage.html), etc., set up your credentials as mentioned in the above `Amazon S3` section. Alternatively, you may use the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variable names to specify your credentials, even though you are not using AWS. The only difference is that you must set your object store endpoint url. To do this, you need to set the ``S3_ENDPOINT_URL`` environment variable.

-Below is one such example, which sets a R2 `endpoint url` in your run environment.
+Below are the examples of setting an R2 or Backblaze `endpoint url` in your run environment.

```{note}
-Your endpoint url is `https://<accountid>.r2.cloudflarestorage.com`. The account ID can be retrieved through your [Cloudflare console](https://dash.cloudflare.com/).
+R2: Your endpoint url is `https://<accountid>.r2.cloudflarestorage.com`. The account ID can be retrieved through your [Cloudflare console](https://dash.cloudflare.com/).
+Backblaze: Your endpoint url is 'https://s3.<your region>.backblazeb2.com'. The region can be retrieved through your [Backblaze console](https://secure.backblaze.com/b2_buckets.htm).
```

````{tabs}
```{code-tab} py
import os
# If using R2
os.environ['S3_ENDPOINT_URL'] = 'https://<accountid>.r2.cloudflarestorage.com'
# If using Backblaze
os.environ['S3_ENDPOINT_URL'] = 'https://s3.<your region>.backblazeb2.com'
```

```{code-tab} sh
# If using R2
export S3_ENDPOINT_URL='https://<accountid>.r2.cloudflarestorage.com'
# If using Backblaze
export S3_ENDPOINT_URL='https://s3.<your region>.backblazeb2.com'
```
````
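
Only one `S3_ENDPOINT_URL` value can be active at a time, so in practice you keep just the assignment for the provider you use. As a rough sketch, the two URL patterns from the note above can be wrapped in small helpers. The function names `r2_endpoint` and `backblaze_endpoint` are hypothetical, and the arguments are placeholders that you would replace with your own account ID or region.

```python
import os


def r2_endpoint(account_id: str) -> str:
    """Build a Cloudflare R2 endpoint URL from an account ID (hypothetical helper)."""
    return f"https://{account_id}.r2.cloudflarestorage.com"


def backblaze_endpoint(region: str) -> str:
    """Build a Backblaze endpoint URL from a region name (hypothetical helper)."""
    return f"https://s3.{region}.backblazeb2.com"


# Set exactly one endpoint; "<accountid>" is a placeholder, not a real ID.
os.environ["S3_ENDPOINT_URL"] = r2_endpoint("<accountid>")
```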


Note that even with S3 compatible object stores, URLs should be of the form `s3://<bucket name>/<path within the bucket>` and use the `s3://` path prefix, instead of `<endpoint url>/<bucket name>/<path within the bucket>`.
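
To make the expected URL shape concrete, the sketch below splits an `s3://` URL into its bucket name and path using the standard `urllib.parse` module, and rejects endpoint-style `https://` URLs. The helper `split_s3_url` and the bucket name `my-bucket` are hypothetical and used only for illustration.

```python
from urllib.parse import urlparse


def split_s3_url(url: str) -> tuple:
    """Split s3://<bucket name>/<path> into (bucket, path).

    Hypothetical helper; raises ValueError for any other scheme.
    """
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URL, got {url!r}")
    # The bucket is the network location; the path loses its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, path = split_s3_url("s3://my-bucket/datasets/train")
print(bucket, path)
```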


## Google Cloud Storage

### MosaicML platform