-
Notifications
You must be signed in to change notification settings - Fork 237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix regression in cost-based optimizer when calculating cost for Window operations #2491
Conversation
…w operations Signed-off-by: Andy Grove <andygrove@nvidia.com>
@@ -351,6 +359,13 @@ object MemoryCostHelper { | |||
def calculateCost(dataSize: Long, memorySpeed: Double): Double = { | |||
(dataSize / GIGABYTE) / memorySpeed | |||
} | |||
|
|||
def isWindowExpr[INPUT <: Expression](expr: BaseExprMeta[INPUT]) = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we change the name of this? A WindowExpression
is something different that does have a type. It has children that might now, but the name to be feels broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would isUnevaluableWindowExpr
be appropriate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this method can be a catch all for all expressions whose cost cannot be evaluated, and the name should not refer to a specific flavor of expressions?
@@ -351,6 +359,13 @@ object MemoryCostHelper { | |||
def calculateCost(dataSize: Long, memorySpeed: Double): Double = { | |||
(dataSize / GIGABYTE) / memorySpeed | |||
} | |||
|
|||
def isWindowExpr[INPUT <: Expression](expr: BaseExprMeta[INPUT]) = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this method can be a catch all for all expressions whose cost cannot be evaluated, and the name should not refer to a specific flavor of expressions?
@@ -308,6 +312,10 @@ class GpuCostModel(conf: RapidsConf) extends CostModel { | |||
} | |||
|
|||
private def exprCost[INPUT <: Expression](expr: BaseExprMeta[INPUT], rowCount: Double): Double = { | |||
if (MemoryCostHelper.isWindowExpr(expr)) { | |||
// Window expressions are Unevaluable and accessing dataType causes an exception | |||
return 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use 0d
for consistency with the uses of zero just below it. I think0.0
is actually more Scala-esque but consistency is most important.
@@ -308,6 +312,10 @@ class GpuCostModel(conf: RapidsConf) extends CostModel { | |||
} | |||
|
|||
private def exprCost[INPUT <: Expression](expr: BaseExprMeta[INPUT], rowCount: Double): Double = { | |||
if (MemoryCostHelper.isWindowExpr(expr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also make CBO best effort to minimize effect on production systems:
def exprCost(expr) = Try(doExprCost(expr)).recover {
case _ if (!RapidsConf.isTestEnabled) => 0.0
// add some diagnostics logging too, or have a counter for unsuccessful cost estimates
}.get
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feature is experimental and disabled by default. The implementation is likely to change a lot still so I think it might be premature for this type of hardening.
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Signed-off-by: Andy Grove <andygrove@nvidia.com>
build |
Signed-off-by: Andy Grove andygrove@nvidia.com
Closes #2489