Deduplicate README.md (#79)
* Deduplicate README.md

* Remove CONTRIBUTING.md as it is no longer relevant

Co-authored-by: Sathis Kumar <sathis.kumar@udemy.com>
msathis authored Apr 26, 2021
1 parent 8380c5d commit e973e30
Showing 6 changed files with 26 additions and 458 deletions.
77 changes: 0 additions & 77 deletions CONTRIBUTING.md

This file was deleted.

28 changes: 14 additions & 14 deletions datafusion/DEVELOPERS.md → DEVELOPERS.md
@@ -35,40 +35,40 @@ DataFusion is written in Rust and it uses a standard rust toolkit:
Below is a checklist of what you need to do to add a new scalar function to DataFusion:

* Add the actual implementation of the function:
-* [here](src/physical_plan/string_expressions.rs) for string functions
-* [here](src/physical_plan/math_expressions.rs) for math functions
-* [here](src/physical_plan/datetime_expressions.rs) for datetime functions
-* create a new module [here](src/physical_plan) for other functions
-* In [src/physical_plan/functions](src/physical_plan/functions.rs), add:
+* [here](datafusion/src/physical_plan/string_expressions.rs) for string functions
+* [here](datafusion/src/physical_plan/math_expressions.rs) for math functions
+* [here](datafusion/src/physical_plan/datetime_expressions.rs) for datetime functions
+* create a new module [here](datafusion/src/physical_plan) for other functions
+* In [src/physical_plan/functions](datafusion/src/physical_plan/functions.rs), add:
* a new variant to `BuiltinScalarFunction`
* a new entry to `FromStr` with the name of the function as called by SQL
* a new line in `return_type` with the expected return type of the function, given an incoming type
* a new line in `signature` with the signature of the function (number and types of its arguments)
* a new line in `create_physical_expr` mapping the built-in to the implementation
* tests to the function.
-* In [tests/sql.rs](tests/sql.rs), add a new test where the function is called through SQL against well known data and returns the expected result.
-* In [src/logical_plan/expr](src/logical_plan/expr.rs), add:
+* In [tests/sql.rs](datafusion/tests/sql.rs), add a new test where the function is called through SQL against well known data and returns the expected result.
+* In [src/logical_plan/expr](datafusion/src/logical_plan/expr.rs), add:
* a new entry of the `unary_scalar_expr!` macro for the new function.
-* In [src/logical_plan/mod](src/logical_plan/mod.rs), add:
+* In [src/logical_plan/mod](datafusion/src/logical_plan/mod.rs), add:
* a new entry in the `pub use expr::{}` set.
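The checklist above amounts to one implementation function plus a handful of match arms in the registry. As a rough, standalone illustration (not DataFusion's actual source: the enum contents, the function signatures, and the hypothetical `reverse` function are invented for the sketch), the shape of those additions looks something like this:

```rust
// Standalone sketch of the registration pattern described above. The real
// code lives in datafusion/src/physical_plan/functions.rs and works over
// Arrow types, but each addition has the same shape.
use std::str::FromStr;

#[derive(Debug, PartialEq)]
enum BuiltinScalarFunction {
    // ...existing variants elided...
    /// hypothetical new function: reverse(utf8) -> utf8
    Reverse,
}

impl FromStr for BuiltinScalarFunction {
    type Err = String;
    fn from_str(name: &str) -> Result<Self, Self::Err> {
        match name {
            // the name as it is called from SQL
            "reverse" => Ok(BuiltinScalarFunction::Reverse),
            other => Err(format!("unknown scalar function: {}", other)),
        }
    }
}

/// Maps the function (and its argument type) to the type it returns.
fn return_type(fun: &BuiltinScalarFunction, arg_type: &str) -> String {
    match fun {
        // reverse returns the same string type it receives
        BuiltinScalarFunction::Reverse => arg_type.to_string(),
    }
}

/// The implementation itself, analogous to an entry in string_expressions.rs.
fn reverse(args: &[String]) -> Vec<String> {
    args.iter().map(|s| s.chars().rev().collect()).collect()
}

fn main() {
    let fun: BuiltinScalarFunction = "reverse".parse().unwrap();
    assert_eq!(return_type(&fun, "Utf8"), "Utf8");
    assert_eq!(reverse(&["datafusion".to_string()]), vec!["noisufatad"]);
    println!("{:?} registered", fun);
}
```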

## How to add a new aggregate function

Below is a checklist of what you need to do to add a new aggregate function to DataFusion:

* Add the actual implementation of an `Accumulator` and `AggregateExpr`:
-* [here](src/physical_plan/string_expressions.rs) for string functions
-* [here](src/physical_plan/math_expressions.rs) for math functions
-* [here](src/physical_plan/datetime_expressions.rs) for datetime functions
-* create a new module [here](src/physical_plan) for other functions
-* In [src/physical_plan/aggregates](src/physical_plan/aggregates.rs), add:
+* [here](datafusion/src/physical_plan/string_expressions.rs) for string functions
+* [here](datafusion/src/physical_plan/math_expressions.rs) for math functions
+* [here](datafusion/src/physical_plan/datetime_expressions.rs) for datetime functions
+* create a new module [here](datafusion/src/physical_plan) for other functions
+* In [src/physical_plan/aggregates](datafusion/src/physical_plan/aggregates.rs), add:
* a new variant to `BuiltinAggregateFunction`
* a new entry to `FromStr` with the name of the function as called by SQL
* a new line in `return_type` with the expected return type of the function, given an incoming type
* a new line in `signature` with the signature of the function (number and types of its arguments)
* a new line in `create_aggregate_expr` mapping the built-in to the implementation
* tests to the function.
-* In [tests/sql.rs](tests/sql.rs), add a new test where the function is called through SQL against well known data and returns the expected result.
+* In [tests/sql.rs](datafusion/tests/sql.rs), add a new test where the function is called through SQL against well known data and returns the expected result.
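For aggregates, the heart of the `Accumulator`/`AggregateExpr` step is folding input values into intermediate state, merging partial states produced by other partitions, and emitting a final value. The following standalone toy sketches that update/merge/evaluate shape with a hypothetical `sum_squares` aggregate over plain `f64`s; DataFusion's real trait operates on Arrow scalar values and arrays.

```rust
// Toy accumulator mirroring the shape an AggregateExpr needs: fold input
// values into state, merge partial states from other partitions, and
// produce a final value. The aggregate itself (sum of squares) is invented.
#[derive(Debug, Default)]
struct SumSquaresAccumulator {
    sum: f64,
}

impl SumSquaresAccumulator {
    /// Fold a batch of input values into the running state.
    fn update_batch(&mut self, values: &[f64]) {
        self.sum += values.iter().map(|v| v * v).sum::<f64>();
    }

    /// Merge the partial state produced by another partition.
    fn merge(&mut self, other_state: f64) {
        self.sum += other_state;
    }

    /// Produce the final aggregate value.
    fn evaluate(&self) -> f64 {
        self.sum
    }
}

fn main() {
    // two "partitions" aggregated independently, then merged
    let mut acc_a = SumSquaresAccumulator::default();
    acc_a.update_batch(&[1.0, 2.0]);

    let mut acc_b = SumSquaresAccumulator::default();
    acc_b.update_batch(&[3.0]);

    acc_a.merge(acc_b.evaluate());
    assert_eq!(acc_a.evaluate(), 14.0); // 1 + 4 + 9
    println!("sum_squares = {}", acc_a.evaluate());
}
```

The merge step is what allows parallel, per-partition execution before the final result is produced.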

## How to display plans graphically

18 changes: 10 additions & 8 deletions README.md
@@ -97,8 +97,8 @@ async fn main() -> datafusion::error::Result<()> {
let df = ctx.read_csv("tests/example.csv", CsvReadOptions::new())?;

let df = df.filter(col("a").lt_eq(col("b")))?
-.aggregate(vec![col("a")], vec![min(col("b"))])?
-.limit(100)?;
+.aggregate(vec![col("a")], vec![min(col("b"))])?
+.limit(100)?;

// execute and print results
let results: Vec<RecordBatch> = df.collect().await?;
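The hunk above shows only the middle of the README's DataFrame example (a whitespace-only change to the `.aggregate`/`.limit` lines). For context, a self-contained version might read roughly as follows; the imports, the `ExecutionContext` setup, and the result summary are reconstructed assumptions, not part of the diff.

```rust
// Assumed imports: the `arrow` crate as a direct dependency alongside
// `datafusion` (the hunk does not show the use statements).
use arrow::record_batch::RecordBatch;
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    // create a DataFrame over a CSV file
    let mut ctx = ExecutionContext::new();
    let df = ctx.read_csv("tests/example.csv", CsvReadOptions::new())?;

    // filter, aggregate and limit, as in the hunk above
    let df = df.filter(col("a").lt_eq(col("b")))?
        .aggregate(vec![col("a")], vec![min(col("b"))])?
        .limit(100)?;

    // execute and print a small summary of the results
    let results: Vec<RecordBatch> = df.collect().await?;
    let rows: usize = results.iter().map(|batch| batch.num_rows()).sum();
    println!("{} batches, {} rows", results.len(), rows);
    Ok(())
}
```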
@@ -141,11 +141,11 @@ DataFusion also includes a simple command-line interactive SQL utility. See the
- [x] SQL Parser
- [x] SQL Query Planner
- [x] Query Optimizer
-- [x] Constant folding
-- [x] Join Reordering
-- [x] Limit Pushdown
-- [x] Projection push down
-- [x] Predicate push down
+- [x] Constant folding
+- [x] Join Reordering
+- [x] Limit Pushdown
+- [x] Projection push down
+- [x] Predicate push down
- [x] Type coercion
- [x] Parallel query execution

@@ -213,7 +213,9 @@ DataFusion also includes a simple command-line interactive SQL utility. See the
- [ ] MINUS
- [x] Joins
- [x] INNER JOIN
-- [ ] CROSS JOIN
+- [x] LEFT JOIN
+- [x] RIGHT JOIN
+- [x] CROSS JOIN
- [ ] OUTER JOIN
- [ ] Window
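The newly checked join entries can be exercised through the SQL interface. Below is a hypothetical snippet: the table names and CSV paths are invented, and the `ExecutionContext` calls are assumed to match the 4.0-era API used in the README example above.

```rust
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    let mut ctx = ExecutionContext::new();
    // hypothetical CSV files; any registered tables work the same way
    ctx.register_csv("t1", "tests/t1.csv", CsvReadOptions::new())?;
    ctx.register_csv("t2", "tests/t2.csv", CsvReadOptions::new())?;

    // LEFT JOIN keeps every row of t1 even when t2 has no match
    let df = ctx.sql("SELECT t1.id, t2.name FROM t1 LEFT JOIN t2 ON t1.id = t2.id")?;
    let results = df.collect().await?;
    println!("{} result batches", results.len());
    Ok(())
}
```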

1 change: 1 addition & 0 deletions datafusion/Cargo.toml
@@ -21,6 +21,7 @@ description = "DataFusion is an in-memory query engine that uses Apache Arrow as
version = "4.0.0-SNAPSHOT"
homepage = "https://github.com/apache/arrow"
repository = "https://github.com/apache/arrow"
+readme = "../README.md"
authors = ["Apache Arrow <dev@arrow.apache.org>"]
license = "Apache-2.0"
keywords = [ "arrow", "query", "sql" ]