Add stream-time metric #756
build
Can one of the admins verify this patch?
@@ -134,15 +135,17 @@ trait GpuHashJoin extends GpuExec with HashJoin {
  override def hasNext: Boolean = {
    while (nextCb.isEmpty && (first || stream.hasNext)) {
As I mentioned in the previous PR, stream.hasNext() can be expensive. Some iterators do their work in hasNext() rather than next() (i.e., they need to compute the next batch to see if there is a next batch, so the call may be attempting to grab the semaphore, fetch from disk, etc.). We need to account for the time spent in stream.hasNext() or the time could be significantly underreported, similar to what was reported in #658. The total time could be underreported in the same way. I think we need to change where startTime is being set up to capture the time spent in stream.hasNext().
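To illustrate the concern above, here is a minimal, self-contained sketch. It is not code from this PR: `LazyBatchIterator`, `drain`, and the `Thread.sleep` stand-in for expensive work are all hypothetical names and assumptions chosen for illustration. It compares a timer started after `hasNext` with one started before it.

```scala
object HasNextTimingSketch {
  // Hypothetical iterator that does its expensive work in hasNext, not in next().
  final class LazyBatchIterator(batches: Int, costMillis: Long) extends Iterator[Int] {
    private var produced = 0
    private var pending: Option[Int] = None

    override def hasNext: Boolean = {
      if (pending.isEmpty && produced < batches) {
        // Stand-in for acquiring a semaphore, fetching from disk, etc.
        Thread.sleep(costMillis)
        pending = Some(produced)
        produced += 1
      }
      pending.isDefined
    }

    override def next(): Int = {
      if (!hasNext) throw new NoSuchElementException("no more batches")
      val batch = pending.get
      pending = None
      batch
    }
  }

  // Drains the iterator and returns the nanoseconds attributed to the stream.
  // includeHasNext = false starts the timer only after hasNext has returned.
  def drain(stream: Iterator[Int], includeHasNext: Boolean): Long = {
    var streamNanos = 0L
    var more = true
    while (more) {
      val beforeHasNext = System.nanoTime()
      more = stream.hasNext
      val afterHasNext = System.nanoTime()
      if (more) {
        stream.next()
        val afterNext = System.nanoTime()
        val start = if (includeHasNext) beforeHasNext else afterHasNext
        streamNanos += afterNext - start
      } else if (includeHasNext) {
        // The final hasNext call that returns false is still stream time.
        streamNanos += afterHasNext - beforeHasNext
      }
    }
    streamNanos
  }
}
```

With an iterator like this, the measurement that starts after `hasNext` misses almost all of the work, which is the kind of underreporting the comment describes.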
@jlowe I have made a change to include the time spent in hasNext(). Please review the code. Thanks.
754f5d0 to 92a1a3a
shims/spark300/src/main/scala/com/nvidia/spark/rapids/shims/spark300/GpuHashJoin.scala
-    while (nextCb.isEmpty && (first || stream.hasNext)) {
+    var may_continue = true
+    while (nextCb.isEmpty && may_continue) {
+      val startTime = System.nanoTime()
+      if (stream.hasNext) {
nit: It would be nice to take the chance to clean up the metrics collection using some of our convenience methods/classes. I think this makes the code much more readable.
    withResource(new MetricRange(totalTime)) { _ =>
      val upstreamHasNext = withResource(new MetricRange(streamTime)) { _ =>
        stream.hasNext
      }
      if (upstreamHasNext) {
        ...
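The pattern suggested above can be sketched in a self-contained form. This is only an illustration under stated assumptions: `NanoMetric`, `TimedRange`, and the local `withResource` below are hypothetical stand-ins written for this example, not the project's actual metric class, `MetricRange`, or `withResource` helper, whose real definitions may differ.

```scala
object MetricRangeSketch {
  // Hypothetical stand-in for a SQL metric that accumulates nanoseconds.
  final class NanoMetric {
    private var total = 0L
    def add(nanos: Long): Unit = total += nanos
    def value: Long = total
  }

  // Hypothetical stand-in for MetricRange: records elapsed time on close().
  final class TimedRange(metric: NanoMetric) extends AutoCloseable {
    private val start = System.nanoTime()
    override def close(): Unit = metric.add(System.nanoTime() - start)
  }

  // Local loan-pattern helper (assumed shape): always closes the resource.
  def withResource[R <: AutoCloseable, T](resource: R)(body: R => T): T =
    try body(resource) finally resource.close()

  // Drains the stream, charging both hasNext and next() to streamTime and
  // everything inside the loop body to totalTime. Returns the element count.
  def drain[A](stream: Iterator[A], totalTime: NanoMetric, streamTime: NanoMetric): Int = {
    var count = 0
    var more = true
    while (more) {
      withResource(new TimedRange(totalTime)) { _ =>
        more = withResource(new TimedRange(streamTime)) { _ => stream.hasNext }
        if (more) {
          withResource(new TimedRange(streamTime)) { _ => stream.next() }
          count += 1
        }
      }
    }
    count
  }
}
```

Because each range is closed in a `finally`, the metric is updated even if the wrapped call throws, and the nesting makes it easy to see which calls are charged to which metric.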
Will you merge the current commit first, or wait for me to change it to withResource?
I would prefer to have it fixed first unless it is blocking you in some way.
Signed-off-by: houyu <houyu02@baidu.com>
b53bc17 to e12b707
build
Signed-off-by: houyu <houyu02@baidu.com>
Sorry, the commits got a little chaotic.
build
@JustPlay is this intended to be your final version of the patch? or were you going to try and use |
Sorry, I am still new to Scala; I cannot change it into
Thanks for reviewing the code and answering my other question(s), @revans2.
That is fine. I think the code is good, and I can take a crack at cleaning it up in a follow-on PR.
* Add stream-time metric Signed-off-by: houyu <houyu02@baidu.com> Co-authored-by: zhangjishun <zhangjishun@baidu.com>
…IDIA#756) Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com> Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>
Signed-off-by: houyu houyu02@baidu.com
Add stream-time metric