This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

First version of new CONTRIBUTING.md #508

Merged (1 commit) on Apr 25, 2022
130 changes: 130 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,130 @@
# Contributing

First, thank you for contributing to Meilisearch! The goal of this document is to provide everything you need to start contributing to Milli, the search engine of Meilisearch.

Remember that there are many ways to contribute other than writing code: writing [tutorials or blog posts](https://github.com/meilisearch/awesome-meilisearch), improving [the documentation](https://github.com/meilisearch/documentation), submitting [bug reports](https://github.com/meilisearch/milli/issues/new) and [feature requests](https://github.com/meilisearch/product/discussions/categories/feedback-feature-proposal), and more.

## Table of Contents
- [Assumptions](#assumptions)
- [How to Contribute](#how-to-contribute)
- [Development Workflow](#development-workflow)
- [Git Guidelines](#git-guidelines)
- [Release Process (for internal team only)](#release-process-for-internal-team-only)

## Assumptions

1. **You're familiar with [GitHub](https://github.com) and the [Pull Requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests) (PR) workflow.**
2. **You've read the Meilisearch [documentation](https://docs.meilisearch.com).**
3. **You know about the [Meilisearch community](https://docs.meilisearch.com/learn/what_is_meilisearch/contact.html).
Please use this for help.**

## How to Contribute

1. Ensure your change has an issue! Find an
[existing issue](https://github.com/meilisearch/milli/issues/) or [open a new issue](https://github.com/meilisearch/milli/issues/new).
* This is where you can get a sense of whether the change will be accepted.
2. Once approved, [fork the Milli repository](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) in your own GitHub account.
3. [Create a new Git branch](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-and-deleting-branches-within-your-repository)
4. Review the [Development Workflow](#development-workflow) section that describes the steps to maintain the repository.
5. Make your changes on your branch.
6. [Submit the branch as a Pull Request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request-from-a-fork) pointing to the `main` branch of the Milli repository. A maintainer should comment and/or review your Pull Request within a few days, although it may take longer depending on the circumstances.

## Development Workflow

_WIP section_

### Setup and run

```bash
cargo run --release
```

We recommend using the `--release` flag so that you test with full optimizations enabled.

### Test

```bash
cargo test
```

### Querying the engine via the web interface

While developing a feature, you may find a web interface useful. Once the HTTP debug server described in the next section is running, you can query the engine by opening [the HTML page itself](http://127.0.0.1:9700).

### Compile and run the HTTP debug server

You can specify the number of threads used to index documents, along with many other settings.

```bash
cd http-ui
cargo run --release -- --db my-database.mdb -vvv --indexing-jobs 8
```

### Index your documents

It can index a massive number of documents in little time. I have already managed to index:
- 115m songs (song and artist name) in \~48min, taking 81GiB on disk.
- 12m cities (name, timezone and country ID) in \~4min, taking 6GiB on disk.

These measurements were taken on a MacBook Pro with the M1 processor.

You can feed the engine with your CSV (comma-separated, yes) data like this:

```bash
printf "id,name,age\n1,hello,32\n2,kiki,24\n" | http POST 127.0.0.1:9700/documents content-type:text/csv
```

Don't forget to specify the `id` of the documents. Also, note that it supports JSON and JSON
streaming: you can send them to the engine by using the `content-type:application/json` and
`content-type:application/x-ndjson` headers respectively.
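
As a rough sketch of the JSON and NDJSON variants, the snippet below builds the same two documents as the CSV example in both formats. It assumes that the `/documents` route used for CSV also accepts these payloads, that a JSON payload is a single array of documents, and that `http` is the HTTPie command-line client. The helper names `json_payload` and `ndjson_payload` are made up for illustration.

```bash
# Hypothetical helpers that print the sample documents in each format.

# JSON: assumed here to be a single array containing every document.
json_payload() {
  printf '%s\n' '[{"id":1,"name":"hello","age":32},{"id":2,"name":"kiki","age":24}]'
}

# NDJSON (JSON streaming): one JSON object per line.
ndjson_payload() {
  printf '%s\n' \
    '{"id":1,"name":"hello","age":32}' \
    '{"id":2,"name":"kiki","age":24}'
}

# With the debug server running, the payloads could be sent like the CSV one
# (assumption: the same /documents route accepts them):
#   json_payload   | http POST 127.0.0.1:9700/documents content-type:application/json
#   ndjson_payload | http POST 127.0.0.1:9700/documents content-type:application/x-ndjson
```

The `http` calls are left as comments because they need a running server; uncomment them once the HTTP debug server is up.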

## Git Guidelines

### Git Branches

All changes must be made in a branch and submitted as a PR.

We do not enforce any branch naming style, but please use something descriptive of your changes.

### Git Commits

At a minimum, your commit message should:
- be capitalized
- not end with a period or any other punctuation character (!,?)
- start with a verb so that we can read your commit message this way: "This commit will ...", where "..." is the commit message.
e.g.: "Fix the home page button" or "Add more tests for create_index method"
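
The first two rules above can be checked mechanically. The function below is a minimal sketch, not something this repository ships: `check_commit_subject` is a hypothetical name, and it only rejects a trailing `.`, `!` or `?`. The "start with a verb" rule still needs a human reader.

```bash
# Hypothetical helper: returns 0 when the subject line is capitalized
# and does not end with ".", "!" or "?".
check_commit_subject() {
  local subject="$1"
  case "$subject" in
    [[:upper:]]*) ;;   # starts with a capital letter
    *) return 1 ;;     # empty, or starts with anything else
  esac
  case "$subject" in
    *[.!?]) return 1 ;;  # ends with one of the punctuation characters
  esac
  return 0
}

# Example use on the latest commit of the current repository:
#   check_commit_subject "$(git log -1 --pretty=%s)" || echo "Please reword the commit message"
```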

We don't follow any other convention, but if you want to use one, we recommend [the Chris Beams one](https://chris.beams.io/posts/git-commit/).

### GitHub Pull Requests

Some notes on GitHub PRs:

- All PRs must be reviewed and approved by at least one maintainer.
- The PR title should be accurate and descriptive of the changes. It is automatically added to the next [release changelogs](https://github.com/meilisearch/milli/releases/).
- [Convert your PR to a draft](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/changing-the-stage-of-a-pull-request) if your changes are a work in progress: no one will review it until you mark it as ready for review.<br>
Draft PRs are recommended when you want to show that you are working on something and make your work visible.
- The branch related to the PR must be **up-to-date with `main`** before merging. Fortunately, this project uses [Bors](https://github.com/bors-ng/bors-ng) to automatically enforce this requirement without the PR author having to rebase manually.

## Release Process (for internal team only)

Meilisearch tools follow the [Semantic Versioning Convention](https://semver.org/).

### Automation to rebase and Merge the PRs <!-- omit in toc -->

This project integrates a bot that helps us manage the merging of pull requests.<br>
_[Read more about this](https://github.com/meilisearch/integration-guides/blob/main/resources/bors.md)._

### Automated changelogs <!-- omit in toc -->

This project integrates a tool to create automated changelogs: the [release-drafter](https://github.com/release-drafter/release-drafter/).

### How to Publish the Release <!-- omit in toc -->

Make a PR modifying all the `Cargo.toml` files with the right version.

Once the changes are merged on `main`, you can publish the current draft release via the [GitHub interface](https://github.com/meilisearch/milli/releases): on this page, click on `Edit` (related to the draft release) > update the description if needed > when you are ready, click on `Publish release`.
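
The version bump across all `Cargo.toml` files could be scripted along these lines. This is only a sketch under assumptions: `bump_cargo_versions` is a hypothetical helper, and it assumes every crate declares its version on its own `version = "..."` line. A dependency table written as `[dependencies.foo]` with a `version` line carrying the same old version would be rewritten too, so review `git diff` before committing.

```bash
# Hypothetical helper: rewrite `version = "<old>"` to `version = "<new>"`
# in every Cargo.toml found under a directory (default: current directory).
bump_cargo_versions() {
  local old="$1" new="$2" root="${3:-.}"
  local escaped_old
  # Escape dots so they match literally in the sed pattern.
  escaped_old=$(printf '%s' "$old" | sed 's/\./\\./g')
  find "$root" -name Cargo.toml ! -path '*/target/*' | while IFS= read -r manifest; do
    # Write to a temporary file first: `sed -i` differs between GNU and BSD sed.
    sed "s/^version = \"$escaped_old\"/version = \"$new\"/" "$manifest" > "$manifest.tmp" \
      && mv "$manifest.tmp" "$manifest"
  done
}

# Example (placeholder version numbers), run from the repository root:
#   bump_cargo_versions 1.2.3 1.2.4
```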

<hr>

Thank you again for reading this through. We cannot wait to start working with you if you have made your way through this contributing guide ❤️
43 changes: 4 additions & 39 deletions README.md
@@ -20,50 +20,15 @@ This repository contains crates to quickly debug the engine:
- The `search` crate is a simple command-line that helps run [flamegraph] on top of it.
- The `helpers` crate is only used to modify the database inplace, sometimes.

### Compile and run the HTTP debug server
## How to use it?

You can specify the number of threads to use to index documents and many other settings too.

```bash
cd http-ui
cargo run --release -- --db my-database.mdb -vvv --indexing-jobs 8
```

### Index your documents

It can index a massive number of documents in little time. I have already managed to index:
- 115m songs (song and artist name) in \~48min, taking 81GiB on disk.
- 12m cities (name, timezone and country ID) in \~4min, taking 6GiB on disk.

These measurements were taken on a MacBook Pro with the M1 processor.

You can feed the engine with your CSV (comma-separated, yes) data like this:

```bash
printf "id,name,age\n1,hello,32\n2,kiki,24\n" | http POST 127.0.0.1:9700/documents content-type:text/csv
```

Don't forget to specify the `id` of the documents. Also, note that it supports JSON and JSON
streaming: you can send them to the engine by using the `content-type:application/json` and
`content-type:application/x-ndjson` headers respectively.

### Querying the engine via the website

You can query the engine by going to [the HTML page itself](http://127.0.0.1:9700).
_Section in WIP_

## Contributing

You can set up a `git-hook` to stop you from making a commit too fast. It'll stop you if:
- Any of the workspaces does not build
- Your code is not well-formatted

These two things are also checked in the CI, so ignoring the hook won't help you merge your code.
But if you need to, you can still add `--no-verify` when creating your commit to ignore the hook.
We're glad you're thinking about contributing to this repository! Feel free to pick an issue, and to ask any question you need. Some points might not be clear and we are available to help you!

To enable the hook, run the following command from the root of the project:
```
cp script/pre-commit .git/hooks/pre-commit
```
Also, we recommend following the [CONTRIBUTING.md](/CONTRIBUTING.md) to create your PR.

[Meilisearch]: https://github.com/meilisearch/meilisearch
[flamegraph]: https://github.com/flamegraph-rs/flamegraph