From f79058eecfb1c9d0d0f29f36ed03e2d1df3fe365 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C5=81ukasz=20Osipiuk?= Date: Mon, 13 Aug 2018 18:33:07 +0200 Subject: [PATCH] Add high level CBO documentation Extracted-From: https://github.com/starburstdata/presto --- presto-docs/src/main/sphinx/optimizer.rst | 1 + .../optimizer/cost-based-optimizations.rst | 76 +++++++++++++++++++ 2 files changed, 77 insertions(+) create mode 100644 presto-docs/src/main/sphinx/optimizer/cost-based-optimizations.rst diff --git a/presto-docs/src/main/sphinx/optimizer.rst b/presto-docs/src/main/sphinx/optimizer.rst index 483df3d2375c2..ac160e3f92717 100644 --- a/presto-docs/src/main/sphinx/optimizer.rst +++ b/presto-docs/src/main/sphinx/optimizer.rst @@ -7,3 +7,4 @@ Query Optimizer optimizer/statistics optimizer/cost-in-explain + optimizer/cost-based-optimizations diff --git a/presto-docs/src/main/sphinx/optimizer/cost-based-optimizations.rst b/presto-docs/src/main/sphinx/optimizer/cost-based-optimizations.rst new file mode 100644 index 0000000000000..f76f47dce53ce --- /dev/null +++ b/presto-docs/src/main/sphinx/optimizer/cost-based-optimizations.rst @@ -0,0 +1,76 @@ +======================== +Cost based optimizations +======================== + +Presto supports several cost based optimizations, described below. + +Join Enumeration +---------------- + +The order in which joins are executed in a query can have a significant impact +on the query's performance. The aspect of join ordering that has the largest +impact on performance is the size of the data being processed and transferred +over the network. If a join that produces a lot of data is performed early in +the execution, then subsequent stages will need to process large amounts of +data for longer than necessary, increasing the time and resources needed for +the query. + +With cost based join enumeration, Presto uses +:doc:`/optimizer/statistics` provided by connectors to estimate +the costs for different join orders and automatically pick the +join order with the lowest computed costs. + +The join enumeration strategy is governed by the ``join_reordering_strategy`` +session property, with the ``optimizer.join-reordering-strategy`` +configuration property providing the default value. + +The valid values are: + * ``AUTOMATIC`` (default) - full automatic join enumeration enabled + * ``ELIMINATE_CROSS_JOINS`` - eliminate unnecessary cross joins + * ``NONE`` - purely syntactic join order + +If using ``AUTOMATIC`` and statistics are not available, or if for any other +reason a cost could not be computed, the ``ELIMINATE_CROSS_JOINS`` strategy is +used instead. + +Join Distribution Selection +--------------------------- + +Presto uses a hash based join algorithm. That implies that for each join +operator a hash table must be created from one join input (called build side). +The other input (probe side) is then iterated and for each row the hash table is +queried to find matching rows. + +There are two types of join distributions: + * Partitioned: each node participating in the query builds a hash table + from only fraction of the data + * Broadcast: each node participating in the query builds a hash table + from all of the data (data is replicated to each node) + +Each type have their trade offs. Partitioned joins require redistributing both +tables using a hash of the join key. This can be slower (sometimes +substantially) than broadcast joins, but allows much larger joins. In +particular, broadcast joins will be faster if the build side is much smaller +than the probe side. However, broadcast joins require that the tables on the +build side of the join after filtering fit in memory on each node, whereas +distributed joins only need to fit in distributed memory across all nodes. + +With cost based join distribution selection, Presto automatically chooses to +use a partitioned or broadcast join. With cost based join enumeration, Presto +automatically chooses which side is the probe and which is the build. + +The join distribution strategy is governed by the ``join_distribution_type `` +session property, with the ``join-distribution-type`` configuration property +providing the default value. + +The valid values are: + * ``AUTOMATIC`` (default) - join distribution type is determined automatically + for each join + * ``BROADCAST`` - broadcast join distribution is used for all joins + * ``PARTITIONED`` - partitioned join distribution is used for all join + +Connector Implementations +------------------------- + +In order for the Presto optimizer to use the cost based strategies, +the connection implementation must provide :doc:`statistics`.