Disclaimer: This project is still in alpha, so there will be bugs. Use at your own risk! But if you find bugs or have feature requests, open an issue :)
metaframe is a CLI data documentation tool (and catalog). It leverages junegunn/fzf and lyft/amundsen to create a blazingly fast CLI framework to:
- Easily document your tables, using an organizational structure where tables are first-class citizens.
- Run ETL jobs from the command-line (or manually document your datasets).
- Search through your tables.
brew install rsyi/tap/metaframe
If you are not on macOS, clone this repository, then run the following in the base directory of the repo (make sure ./dist does not exist, or pyinstaller won't rebuild):
make && make install
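As a minimal sketch of the note about ./dist, the build can be wrapped so that a stale ./dist is always removed first. The function name rebuild_metaframe is hypothetical and not part of metaframe; it assumes you run it from the base directory of the repo.

```shell
# Hypothetical helper (not shipped with metaframe): run from the repo root.
# Removes a stale ./dist so pyinstaller rebuilds, then builds and installs.
rebuild_metaframe() {
  rm -rf ./dist
  make && make install
}
```

Calling rebuild_metaframe is then equivalent to deleting ./dist by hand before running make && make install.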
If there are errors, it's often because the version of Python referenced by pip3 on your machine is incompatible (metaframe is tested against Python 3.7 and 3.8 only). To troubleshoot this, try using a virtual environment with Python 3.7 or 3.8, or modify the pip3 reference in the makefile to point to a specific binary path on your filesystem. Or open an issue!
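To see whether pip3 is the likely culprit, a quick check along these lines may help. Both function names are hypothetical helpers, and the sed pattern assumes pip3 --version ends with a suffix such as "(python 3.8)", which is the usual format but may differ on your system.

```shell
# Hypothetical helpers (not part of metaframe).

# Succeeds only for the versions metaframe is tested against.
is_tested_python() {
  case "$1" in
    3.7|3.8) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints the "major.minor" Python version that pip3 is bound to.
# Assumes output like: pip 21.0 from /path/to/pip (python 3.8)
pip3_python_version() {
  pip3 --version | sed -n 's/.*(python \([0-9][0-9]*\.[0-9][0-9]*\)).*/\1/p'
}

# Example usage:
#   is_tested_python "$(pip3_python_version)" || echo "pip3 uses an untested Python"
```

If the check fails and you have Python 3.8 installed, creating a virtual environment with python3.8 -m venv and activating it before running make is one way to point pip3 at a tested interpreter.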
We don't explicitly add an alias for the mf binary, so you'll want to either add ~/.metaframe/bin/ to your PATH, or add the following alias to your .bash_profile or .zshrc file:
alias mf=~/.metaframe/bin/mf
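If you prefer the PATH route over the alias, something like the following could go in the same .bash_profile or .zshrc file instead. This is a sketch that assumes the default ~/.metaframe/bin location mentioned above; the guard only avoids adding the directory twice.

```shell
# Prepend the metaframe bin directory to PATH unless it is already present.
case ":$PATH:" in
  *":$HOME/.metaframe/bin:"*) ;;
  *) export PATH="$HOME/.metaframe/bin:$PATH" ;;
esac
```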
Start by running:
mf init
which will generate a file structure in ~/.metaframe.
If you want to manually document tables, create a new table stub by running:
mf new <TABLE_NAME>
Then run mf to search over these docs! See the Manual usage section for more information.
If you want to run ETL jobs to automatically populate this metadata, keep reading.
If you want metadata to be scraped and populated automatically, you'll next need to add an entry to your connections.yaml file, which can be accessed by running mf connections edit. For example:
- name: presto  # optional
  type: presto
  host: host.mysite.com:8889
  username:  # optional
  password:  # optional
  cluster: system  # optional
The only required arguments are the host and the type. See Connection setup for more details (including information on type-specific syntax).
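For reference, an entry reduced to just the two required keys might look like the sketch below. It writes the entry to a scratch file so you can paste it into the editor opened by mf connections edit; the scratch path is hypothetical and is not where metaframe stores its configuration, and the host value is the placeholder from the example above.

```shell
# Write a minimal connection entry (required keys only) to a scratch file.
# The path is a hypothetical scratch location, not metaframe's own config file.
scratch="${TMPDIR:-/tmp}/metaframe-connection-example.yaml"
cat > "$scratch" <<'EOF'
- type: presto
  host: host.mysite.com:8889
EOF

# Show the entry so it can be copied into connections.yaml.
cat "$scratch"
```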
Once this configuration is complete, you can run your ETL job by running:
mf etl
By default this only pulls tables that haven't already been pulled. For more details, see ETL.
Run:
mf
to search over all metadata. Hitting enter will open the editable part of the docs in your default text editor, defined by the environment variable $EDITOR.
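If $EDITOR is not set in your shell, you may want to define it in your .bash_profile or .zshrc. The sketch below keeps any existing value and otherwise falls back to vi, which is an arbitrary choice for illustration; substitute whichever editor you prefer.

```shell
# Keep an existing $EDITOR; otherwise fall back to vi (an arbitrary default).
export EDITOR="${EDITOR:-vi}"
```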