-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adaptive Parallelism #1726
Adaptive Parallelism #1726
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Posting early comments regarding structure and high level implementation. Will continue review the internal implementation details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, it looks good to me. But will let @sanyamsinghal also review the changes
yb-voyager/src/tgtdb/yugabytedb.go
Outdated
} | ||
defer rows.Close() | ||
for rows.Next() { | ||
var uuid, metrics, status, errorStr string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can there be a possibility of Null for these vars? (use sql.NullString
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think so. At worst it would be an empty string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think still using sql.NullString
will still be safe and fullproof
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense. made the changes. While making this change, I also realised that GetClusterMetrics needs to return the Status/Error as well (it is only returning the metrics right now). Will take this up in a subsequent PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few overall comments:
- Please revisit the log statements, which are going to be printed every 10sec, remove if few of those seems redundant.
- Can we add more top-level comments in the
conn_pool.go
file. The code is getting a bit more advanced now and detailed comments would help in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please test the new code in conn pool is not getting triggered if adaptive parallelism is not enabled.
Felt risky because new code is interleaved with current code.
Can we refactor WithConn()
function and break it into smaller functions for simplicity and keep adaptive logic separate.
yb-voyager/src/tgtdb/conn_pool.go
Outdated
if pool.pendingConnsToClose >= 0 { | ||
// since we are increasing, we can just reduce the pending close count | ||
if pool.pendingConnsToClose > delta { | ||
log.Infof("adaptive: Decreasing pendingConnsToClose by %d", delta) | ||
pool.pendingConnsToClose -= delta | ||
delta = 0 | ||
} else { | ||
log.Infof("adaptive: Decreasing pendingConnsToClose to 0 by %d", pool.pendingConnsToClose) | ||
delta -= pool.pendingConnsToClose | ||
pool.pendingConnsToClose = 0 | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if pool.pendingConnsToClose > 0 {
// since we are increasing, we can just reduce the pending close count
connCount := Math.min(pool.pendingConnsToClose, delta)
delta -= connCount
pool.pendingConnsToClose -= connCount
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this a bit confusing to be honest. 😅
Even though an if-else block is more verbose, I prefer it because it's simpler to reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Zero equality in the condition is still of no use.
- I dont think it will be confusing or hard to read. I feel it was quite verbose, You can add a comment saying -
take as much connections as possible from pool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is subjective. However, on second thoughts, it seems alright. I've made changes!
The only code that is interleaved with existing code is to do with |
Deferring this, as adaptive parallelism is not enabled by default. Moreover, we'll get a better understanding of which logs to keep once we start testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
1 failing test but it was because of a DB-side issue. Ignoring that. |
import-data will periodically fetch resource usage metrics from the target yugabyteDB, and adapt the parallelism used, (i.e. parallel-jobs value) accordingly. The aim is to ensure that the cluster remains stable throughout, while ensuring that the data ingestion is as efficient as possible.
https://docs.google.com/document/d/1beD7zNtpmfYflXV1hVJ9mq_uqyCTJ9Es4titPEksSNE/edit#heading=h.3c3bf00hwf