From f79058eecfb1c9d0d0f29f36ed03e2d1df3fe365 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C5=81ukasz=20Osipiuk?= <lukasz@osipiuk.net>
Date: Mon, 13 Aug 2018 18:33:07 +0200
Subject: [PATCH] Add high level CBO documentation

Extracted-From: https://github.com/starburstdata/presto
---
 presto-docs/src/main/sphinx/optimizer.rst     |  1 +
 .../optimizer/cost-based-optimizations.rst    | 76 +++++++++++++++++++
 2 files changed, 77 insertions(+)
 create mode 100644 presto-docs/src/main/sphinx/optimizer/cost-based-optimizations.rst

diff --git a/presto-docs/src/main/sphinx/optimizer.rst b/presto-docs/src/main/sphinx/optimizer.rst
index 483df3d2375c2..ac160e3f92717 100644
--- a/presto-docs/src/main/sphinx/optimizer.rst
+++ b/presto-docs/src/main/sphinx/optimizer.rst
@@ -7,3 +7,4 @@ Query Optimizer
 
     optimizer/statistics
     optimizer/cost-in-explain
+    optimizer/cost-based-optimizations
diff --git a/presto-docs/src/main/sphinx/optimizer/cost-based-optimizations.rst b/presto-docs/src/main/sphinx/optimizer/cost-based-optimizations.rst
new file mode 100644
index 0000000000000..f76f47dce53ce
--- /dev/null
+++ b/presto-docs/src/main/sphinx/optimizer/cost-based-optimizations.rst
@@ -0,0 +1,76 @@
+========================
+Cost based optimizations
+========================
+
+Presto supports several cost based optimizations, described below.
+
+Join Enumeration
+----------------
+
+The order in which joins are executed in a query can have a significant impact
+on the query's performance. The aspect of join ordering that has the largest
+impact on performance is the size of the data being processed and transferred
+over the network. If a join that produces a lot of data is performed early in
+the execution, then subsequent stages will need to process large amounts of
+data for longer than necessary, increasing the time and resources needed for
+the query.
+
+With cost based join enumeration, Presto uses
+:doc:`/optimizer/statistics` provided by connectors to estimate
+the costs for different join orders and automatically pick the
+join order with the lowest computed costs.
+
+The join enumeration strategy is governed by the ``join_reordering_strategy``
+session property, with the ``optimizer.join-reordering-strategy``
+configuration property providing the default value.
+
+The valid values are:
+ * ``AUTOMATIC`` (default) - full automatic join enumeration enabled
+ * ``ELIMINATE_CROSS_JOINS`` - eliminate unnecessary cross joins
+ * ``NONE`` - purely syntactic join order
+
+If using ``AUTOMATIC`` and statistics are not available, or if for any other
+reason a cost could not be computed, the ``ELIMINATE_CROSS_JOINS`` strategy is
+used instead.
+
+Join Distribution Selection
+---------------------------
+
+Presto uses a hash based join algorithm. That implies that for each join
+operator a hash table must be created from one join input (called build side).
+The other input (probe side) is then iterated and for each row the hash table is
+queried to find matching rows.
+
+There are two types of join distributions:
+ * Partitioned: each node participating in the query builds a hash table
+   from only fraction of the data
+ * Broadcast: each node participating in the query builds a hash table
+   from all of the data (data is replicated to each node)
+
+Each type have their trade offs. Partitioned joins require redistributing both
+tables using a hash of the join key. This can be slower (sometimes
+substantially) than broadcast joins, but allows much larger joins. In
+particular, broadcast joins will be faster if the build side is much smaller
+than the probe side. However, broadcast joins require that the tables on the
+build side of the join after filtering fit in memory on each node, whereas
+distributed joins only need to fit in distributed memory across all nodes.
+
+With cost based join distribution selection, Presto automatically chooses to
+use a partitioned or broadcast join. With cost based join enumeration, Presto
+automatically chooses which side is the probe and which is the build.
+
+The join distribution strategy is governed by the ``join_distribution_type ``
+session property, with the ``join-distribution-type`` configuration property
+providing the default value.
+
+The valid values are:
+ * ``AUTOMATIC`` (default) - join distribution type is determined automatically
+   for each join
+ * ``BROADCAST`` - broadcast join distribution is used for all joins
+ * ``PARTITIONED`` - partitioned join distribution is used for all join
+
+Connector Implementations
+-------------------------
+
+In order for the Presto optimizer to use the cost based strategies,
+the connection implementation must provide :doc:`statistics`.