RFC-0423: Command Line Interface (#423)
* Add draft for cli

Signed-off-by: Xuanwo <github@xuanwo.io>

* Assign number

Signed-off-by: Xuanwo <github@xuanwo.io>

* Add README of rfcs

Signed-off-by: Xuanwo <github@xuanwo.io>

* Don't run ci if only docs updated

Signed-off-by: Xuanwo <github@xuanwo.io>

* Update docs

Signed-off-by: Xuanwo <github@xuanwo.io>

* Fix CI

Signed-off-by: Xuanwo <github@xuanwo.io>

* Remove not needed cache

Signed-off-by: Xuanwo <github@xuanwo.io>

* Fix CI of skywalking-eyes

Signed-off-by: Xuanwo <github@xuanwo.io>

* Add tracking issues

Signed-off-by: Xuanwo <github@xuanwo.io>
Xuanwo authored Jul 8, 2022
1 parent 7994bfb commit ae00ec2
Showing 10 changed files with 300 additions and 6 deletions.
8 changes: 2 additions & 6 deletions .github/actions/check/action.yml
@@ -7,10 +7,6 @@ inputs:
 runs:
   using: "composite"
   steps:
-    - uses: Swatinem/rust-cache@v1
-      with:
-        sharedKey: base-v1
-
     - name: Format
       uses: actions-rs/cargo@v1
       with:
@@ -24,7 +20,7 @@ runs:
         args: --no-deps
 
     - name: Check License Header
-      uses: apache/skywalking-eyes@main
+      uses: apache/skywalking-eyes/header@main
       env:
         GITHUB_TOKEN: ${{ inputs.github_token }}
       with:
@@ -45,4 +41,4 @@ runs:
       uses: actions-rs/cargo@v1
       with:
         command: clippy
-        args: --all-targets --all-features -- -D warnings
+        args: --all-targets --all-features -- -D warnings
2 changes: 2 additions & 0 deletions .github/workflows/ci.yml
@@ -7,6 +7,8 @@ on:
   pull_request:
     branches:
       - main
+    paths-ignore:
+      - "docs/**"
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
2 changes: 2 additions & 0 deletions .github/workflows/service_test_azblob.yml
@@ -7,6 +7,8 @@ on:
   pull_request:
     branches:
       - main
+    paths-ignore:
+      - "docs/**"
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
2 changes: 2 additions & 0 deletions .github/workflows/service_test_fs.yml
@@ -7,6 +7,8 @@ on:
   pull_request:
     branches:
       - main
+    paths-ignore:
+      - "docs/**"
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
2 changes: 2 additions & 0 deletions .github/workflows/service_test_hdfs.yml
@@ -7,6 +7,8 @@ on:
   pull_request:
     branches:
       - main
+    paths-ignore:
+      - "docs/**"
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
2 changes: 2 additions & 0 deletions .github/workflows/service_test_memory.yml
@@ -7,6 +7,8 @@ on:
   pull_request:
     branches:
       - main
+    paths-ignore:
+      - "docs/**"
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
2 changes: 2 additions & 0 deletions .github/workflows/service_test_s3.yml
@@ -7,6 +7,8 @@ on:
   pull_request:
     branches:
       - main
+    paths-ignore:
+      - "docs/**"
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
1 change: 1 addition & 0 deletions docs/SUMMARY.md
@@ -34,3 +34,4 @@
 - [0337-dir-entry](rfcs/0337-dir-entry.md)
 - [0409-accessor-capabilities](rfcs/0409-accessor-capabilities.md)
 - [0413-presign](rfcs/0413-presign.md)
+- [0423-command-line-interface](rfcs/0423-command-line-interface.md)
268 changes: 268 additions & 0 deletions docs/rfcs/0423-command-line-interface.md
@@ -0,0 +1,268 @@
- Proposal Name: `command_line_interface`
- Start Date: 2022-07-08
- RFC PR: [datafuselabs/opendal#423](https://github.com/datafuselabs/opendal/pull/423)
- Tracking Issue: [datafuselabs/opendal#422](https://github.com/datafuselabs/opendal/issues/422)

# Summary

Add a command line interface for OpenDAL.

# Motivation

> **Q**: There are so many CLIs out there, why do we still need one for OpenDAL?
>
> **A**: Because there are so many CLIs out there.

To manipulate our data stored across different cloud services, we need to install a different CLI for each:

- [`aws-cli`]/[`s3cmd`]/... for AWS (S3)
- [`azcopy`] for Azure Storage Service
- [`gcloud`] for Google Cloud

Those CLIs provide native and seamless experiences for their own products, but they also lock in us and our data.

However, in 80% of cases, we just want to do simple jobs like `cp`, `mv`, and `rm`. It's tedious to figure out how to use each of them:

- `aws --endpoint-url http://127.0.0.1:9900/ s3 cp data s3://testbucket/data --recursive`
- `azcopy copy 'C:\myDirectory' 'https://mystorageaccount.blob.core.windows.net/mycontainer' --recursive`
- `gsutil cp data gs://testbucket/`

Can we use them in the same way? Can we let the data flow freely?

Let's look back at OpenDAL's slogan:

**Open Data Access Layer that connect the whole world together**

This is a natural extension for OpenDAL: providing a command line interface!

# Guide-level explanation

OpenDAL will provide a new CLI called `oli`, short for `OpenDAL Command Line Interface`.

Users can install this CLI via:

```shell
cargo install oli
```

Or via their favourite package manager:

```shell
# Archlinux
pacman -S oli
# Debian / Ubuntu
apt install oli
# Rocky Linux / Fedora
dnf install oli
# macOS
brew install oli
```

With `oli`, users can:

- Upload files to s3: `oli cp books.csv s3://bucket/books.csv`
- Download files from azblob: `oli cp azblob://bucket/books.csv /tmp/books.csv`
- Move data between storage services: `oli mv s3://bucket/dir azblob://bucket/dir`
- Delete all files: `oli rm -rf s3://bucket`

`oli` will also provide aliases to make manipulating cloud data feel even more natural:

- `ocp` for `oli cp`
- `ols` for `oli ls`
- `omv` for `oli mv`
- `orm` for `oli rm`
- `ostat` for `oli stat`
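
One way to wire up these aliases, sketched here as an assumption rather than a committed design, is busybox-style dispatch on `argv[0]`: ship a single `oli` binary and install `ocp`, `ols`, and friends as hard links or symlinks to it. The function name below is illustrative:

```rust
/// Map the invoked binary name to an `oli` subcommand (busybox-style).
/// Hypothetical sketch, not oli's actual implementation.
fn subcommand_for(argv0: &str) -> Option<&'static str> {
    // Strip any leading directory components from argv[0].
    let name = argv0.rsplit('/').next().unwrap_or(argv0);
    match name {
        "ocp" => Some("cp"),
        "ols" => Some("ls"),
        "omv" => Some("mv"),
        "orm" => Some("rm"),
        "ostat" => Some("stat"),
        // Invoked as plain `oli` (or anything else): parse subcommands normally.
        _ => None,
    }
}

fn main() {
    // In the real binary, argv0 would come from std::env::args().next().
    assert_eq!(subcommand_for("/usr/bin/ocp"), Some("cp"));
    assert_eq!(subcommand_for("oli"), None);
}
```

With this layout, adding a new alias is just another link on disk plus one match arm.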

`oli` will provide profile management so users don't need to provide credentials every time:

- `oli profile add my_s3 --bucket test --access-key-id=example --secret-access-key=example`
- `ocp my_s3://dir /tmp/dir`

# Reference-level explanation

`oli` will be a separate crate apart from `opendal`, so we will not pollute the dependencies of `opendal`. But `oli` will be released at the same time and with the same version as `opendal`. That means `oli` will always use the same (latest) version of opendal.

Most operations of `oli` should be trivial; we will propose new RFCs for anything requiring big changes.
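
To illustrate why most operations are trivial, here is a minimal sketch of `cp` over a common accessor abstraction. The `Accessor` trait and `Memory` backend below are illustrative stand-ins, not opendal's actual API:

```rust
use std::collections::HashMap;

/// Illustrative stand-in for a storage abstraction; NOT opendal's real trait.
trait Accessor {
    fn read(&self, path: &str) -> Option<Vec<u8>>;
    fn write(&mut self, path: &str, data: Vec<u8>);
}

/// An in-memory backend standing in for s3, azblob, fs, ...
#[derive(Default)]
struct Memory(HashMap<String, Vec<u8>>);

impl Accessor for Memory {
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.0.get(path).cloned()
    }
    fn write(&mut self, path: &str, data: Vec<u8>) {
        self.0.insert(path.to_string(), data);
    }
}

/// `oli cp src dst` boils down to: read from one accessor, write to another.
fn cp(src: &dyn Accessor, dst: &mut dyn Accessor, from: &str, to: &str) -> bool {
    match src.read(from) {
        Some(data) => {
            dst.write(to, data);
            true
        }
        None => false,
    }
}

fn main() {
    let mut s3 = Memory::default();
    s3.write("books.csv", b"title,author".to_vec());

    let mut azblob = Memory::default();
    assert!(cp(&s3, &mut azblob, "books.csv", "books.csv"));
    assert_eq!(azblob.read("books.csv"), Some(b"title,author".to_vec()));
}
```

Once every service sits behind the same abstraction, `mv` is `cp` plus a delete, and `rm` is a delete alone.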

`oli` won't keep its own configuration files. All config will go through environment variables, for example:

- `OLI_COLOR=always`
- `OLI_CONCURRENCY=16`

Besides, `oli` will read profiles from environment variables, like `cargo` does:

- `OLI_PROFILE_TEST_TYPE=s3`
- `OLI_PROFILE_TEST_ENDPOINT=http://127.0.0.1:1090`
- `OLI_PROFILE_TEST_BUCKET=test_bucket`
- `OLI_PROFILE_TEST_ACCESS_KEY_ID=access_key_id`
- `OLI_PROFILE_TEST_SECRET_ACCESS_KEY=secret_access_key`

With those environments, we can:

```shell
ocp path/to/dir test://test/to/dir
```
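
Profile discovery from the environment could look like the following sketch. It assumes the `OLI_` prefix (matching the binary name) and a simple split convention; a real implementation would also need a rule for profile names containing `_`:

```rust
use std::collections::HashMap;

/// Group `OLI_PROFILE_<NAME>_<KEY>` variables into per-profile maps.
/// Takes an iterator so it is easy to test; the real thing would walk
/// `std::env::vars()`. Hypothetical sketch, not oli's actual code.
fn parse_profiles<'a, I>(vars: I) -> HashMap<String, HashMap<String, String>>
where
    I: IntoIterator<Item = (&'a str, &'a str)>,
{
    let mut profiles: HashMap<String, HashMap<String, String>> = HashMap::new();
    for (key, value) in vars {
        if let Some(rest) = key.strip_prefix("OLI_PROFILE_") {
            // First segment = profile name, remainder = config key.
            if let Some((name, field)) = rest.split_once('_') {
                profiles
                    .entry(name.to_lowercase())
                    .or_default()
                    .insert(field.to_lowercase(), value.to_string());
            }
        }
    }
    profiles
}

fn main() {
    let profiles = parse_profiles([
        ("OLI_PROFILE_TEST_TYPE", "s3"),
        ("OLI_PROFILE_TEST_BUCKET", "test_bucket"),
        ("OLI_COLOR", "always"), // a global option, not a profile entry
    ]);
    assert_eq!(profiles["test"]["type"], "s3");
    assert_eq!(profiles["test"]["bucket"], "test_bucket");
}
```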

# Drawbacks

None

# Rationale and alternatives

## s3cmd

[s3cmd](https://s3tools.org/s3cmd) is a command-line S3 client for Linux and macOS.

```shell
Usage: s3cmd [options] COMMAND [parameters]

S3cmd is a tool for managing objects in Amazon S3 storage. It allows for
making and removing "buckets" and uploading, downloading and removing
"objects" from these buckets.

Commands:
Make bucket
s3cmd mb s3://BUCKET
Remove bucket
s3cmd rb s3://BUCKET
List objects or buckets
s3cmd ls [s3://BUCKET[/PREFIX]]
List all object in all buckets
s3cmd la
Put file into bucket
s3cmd put FILE [FILE...] s3://BUCKET[/PREFIX]
Get file from bucket
s3cmd get s3://BUCKET/OBJECT LOCAL_FILE
Delete file from bucket
s3cmd del s3://BUCKET/OBJECT
Delete file from bucket (alias for del)
s3cmd rm s3://BUCKET/OBJECT
Restore file from Glacier storage
s3cmd restore s3://BUCKET/OBJECT
Synchronize a directory tree to S3 (checks files freshness using
size and md5 checksum, unless overridden by options, see below)
s3cmd sync LOCAL_DIR s3://BUCKET[/PREFIX] or s3://BUCKET[/PREFIX] LOCAL_DIR
Disk usage by buckets
s3cmd du [s3://BUCKET[/PREFIX]]
Get various information about Buckets or Files
s3cmd info s3://BUCKET[/OBJECT]
Copy object
s3cmd cp s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]
Modify object metadata
s3cmd modify s3://BUCKET1/OBJECT
Move object
s3cmd mv s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]
Modify Access control list for Bucket or Files
s3cmd setacl s3://BUCKET[/OBJECT]
Modify Bucket Policy
s3cmd setpolicy FILE s3://BUCKET
Delete Bucket Policy
s3cmd delpolicy s3://BUCKET
Modify Bucket CORS
s3cmd setcors FILE s3://BUCKET
Delete Bucket CORS
s3cmd delcors s3://BUCKET
Modify Bucket Requester Pays policy
s3cmd payer s3://BUCKET
Show multipart uploads
s3cmd multipart s3://BUCKET [Id]
Abort a multipart upload
s3cmd abortmp s3://BUCKET/OBJECT Id
List parts of a multipart upload
s3cmd listmp s3://BUCKET/OBJECT Id
Enable/disable bucket access logging
s3cmd accesslog s3://BUCKET
Sign arbitrary string using the secret key
s3cmd sign STRING-TO-SIGN
Sign an S3 URL to provide limited public access with expiry
s3cmd signurl s3://BUCKET/OBJECT <expiry_epoch|+expiry_offset>
Fix invalid file names in a bucket
s3cmd fixbucket s3://BUCKET[/PREFIX]
Create Website from bucket
s3cmd ws-create s3://BUCKET
Delete Website
s3cmd ws-delete s3://BUCKET
Info about Website
s3cmd ws-info s3://BUCKET
Set or delete expiration rule for the bucket
s3cmd expire s3://BUCKET
Upload a lifecycle policy for the bucket
s3cmd setlifecycle FILE s3://BUCKET
Get a lifecycle policy for the bucket
s3cmd getlifecycle s3://BUCKET
Remove a lifecycle policy for the bucket
s3cmd dellifecycle s3://BUCKET
List CloudFront distribution points
s3cmd cflist
Display CloudFront distribution point parameters
s3cmd cfinfo [cf://DIST_ID]
Create CloudFront distribution point
s3cmd cfcreate s3://BUCKET
Delete CloudFront distribution point
s3cmd cfdelete cf://DIST_ID
Change CloudFront distribution point parameters
s3cmd cfmodify cf://DIST_ID
Display CloudFront invalidation request(s) status
s3cmd cfinvalinfo cf://DIST_ID[/INVAL_ID]
```

## aws-cli

[aws-cli](https://aws.amazon.com/cli/) is the official CLI provided by AWS.

```shell
$ aws s3 ls s3://mybucket
LastWriteTime Length Name
------------ ------ ----
PRE myfolder/
2013-09-03 10:00:00 1234 myfile.txt

$ aws s3 cp myfolder s3://mybucket/myfolder --recursive
upload: myfolder/file1.txt to s3://mybucket/myfolder/file1.txt
upload: myfolder/subfolder/file1.txt to s3://mybucket/myfolder/subfolder/file1.txt

$ aws s3 sync myfolder s3://mybucket/myfolder --exclude *.tmp
upload: myfolder/newfile.txt to s3://mybucket/myfolder/newfile.txt
```

## azcopy

[azcopy](https://github.com/Azure/azure-storage-azcopy) is the new Azure Storage data transfer utility.

```shell
azcopy copy 'C:\myDirectory\myTextFile.txt' 'https://mystorageaccount.blob.core.windows.net/mycontainer/myTextFile.txt'

azcopy copy 'https://mystorageaccount.blob.core.windows.net/mycontainer/myTextFile.txt' 'C:\myDirectory\myTextFile.txt'

azcopy sync 'C:\myDirectory' 'https://mystorageaccount.blob.core.windows.net/mycontainer' --recursive
```

## gsutil

[gsutil](https://cloud.google.com/storage/docs/gsutil) is a Python application that lets you access Cloud Storage from the command line.

```shell
gsutil cp [OPTION]... src_url dst_url
gsutil cp [OPTION]... src_url... dst_url
gsutil cp [OPTION]... -I dst_url

gsutil mv [-p] src_url dst_url
gsutil mv [-p] src_url... dst_url
gsutil mv [-p] -I dst_url

gsutil rm [-f] [-r] url...
gsutil rm [-f] [-r] -I
```

# Unresolved questions

None.

# Future possibilities

None.

[`aws-cli`]: https://github.com/aws/aws-cli
[`s3cmd`]: https://s3tools.org/s3cmd
[`azcopy`]: https://github.com/Azure/azure-storage-azcopy
[`gcloud`]: https://cloud.google.com/sdk/docs/install
17 changes: 17 additions & 0 deletions docs/rfcs/README.md
@@ -1 +1,18 @@
# RFCs of OpenDAL

RFCs power OpenDAL's development.

To add new features or make big refactors:

- Start a new RFC from the template [0000-example](./0000-example.md).
- Submit a PR and use the PR number as the RFC number.
- Add the RFC to the [index of RFCs](../SUMMARY.md) so it gets rendered.
- Request reviews from OpenDAL maintainers.
- After approval, create a tracking issue and update the links in the RFC.

The complete list of approved RFCs is available [here](https://opendal.databend.rs/rfcs/index.html).

Some useful tips:

- Start a pre-proposal in [Discussions](https://github.com/datafuselabs/opendal/discussions/categories/ideas) to get feedback quickly.
- The proposer of an RFC may not be the same person as its implementor, so try to include enough information in the RFC itself.
