This repository has been archived by the owner on Jul 24, 2024. It is now read-only.
Part of #1365. Lightning usually inserts multiple rows at once, but batch mode makes it hard to record and skip individual row errors. We need to support downgrading to row-by-row inserts when a batch insert hits an error.
Describe the feature you'd like:
Normally, Lightning imports multiple rows in a single batched statement, e.g. insert into t1 values (111), (222), (333), (444);
To record and skip any row error that occurs during this insert without interrupting the other, valid rows, the insert needs to look like this:
start transaction;
insert into t1 values (111);
insert into t1 values (222);
insert into t1 values (333);
insert into t1 values (444);
commit;
Although this may degrade performance, we only downgrade to row-by-row inserts when a batch insert hits an error.
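The downgrade described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Lightning's code: `rowExecutor`, `rowError`, `writeWithFallback`, and `fakeDB` are hypothetical names, rows are plain integers, and the surrounding transaction is not modeled.

```go
package main

import (
	"errors"
	"fmt"
)

// rowExecutor is a hypothetical stand-in for the database connection.
type rowExecutor interface {
	InsertBatch(rows []int) error // one multi-row INSERT
	InsertRow(row int) error      // one single-row INSERT
}

// rowError records a row that could not be inserted.
type rowError struct {
	Index int // position of the row inside the batch
	Value int
	Err   error
}

// writeWithFallback tries the fast batch path first. Only when the batch
// fails does it replay the rows one by one, collecting the rows that fail
// and letting the remaining rows through.
func writeWithFallback(exec rowExecutor, rows []int) []rowError {
	if err := exec.InsertBatch(rows); err == nil {
		return nil
	}
	var failed []rowError
	for i, r := range rows {
		if err := exec.InsertRow(r); err != nil {
			failed = append(failed, rowError{Index: i, Value: r, Err: err})
		}
	}
	return failed
}

// errBadRow is a hypothetical non-retryable row error.
var errBadRow = errors.New("bad row")

// fakeDB is an in-memory executor that rejects a fixed set of values.
type fakeDB struct {
	bad    map[int]bool
	stored []int
}

// InsertBatch is all-or-nothing: one bad value rejects the whole batch.
func (db *fakeDB) InsertBatch(rows []int) error {
	for _, r := range rows {
		if db.bad[r] {
			return fmt.Errorf("batch rejected by value %d: %w", r, errBadRow)
		}
	}
	db.stored = append(db.stored, rows...)
	return nil
}

// InsertRow stores a single value unless it is marked bad.
func (db *fakeDB) InsertRow(row int) error {
	if db.bad[row] {
		return fmt.Errorf("value %d: %w", row, errBadRow)
	}
	db.stored = append(db.stored, row)
	return nil
}
```

With a `fakeDB` that rejects 333, calling `writeWithFallback(db, []int{111, 222, 333, 444})` stores the other three values and reports one failed row at index 2. The slow path is only taken after the batch has already failed, which keeps the common case fast.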
		return errors.Annotatef(err, "[%s] write rows reach max retry %d and still failed", tableName, writeRowsMaxRetryTimes)
	}
	return nil
}
(*tidbBackend).WriteRows splits the data and checks whether an error is retryable. Retryable errors are often the result of, e.g., network problems, in which case retrying is feasible. The errors we need to record and skip, however, usually have more fundamental causes, such as a mismatched column type; retrying will not help, so they need further handling, which means implementing new code here.
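To make the distinction between the two error classes concrete, here is a minimal sketch of a retry loop that routes them differently. Everything in it is assumed for illustration: `errNetwork`, `errColumnType`, `isRetryable`, `outcome`, and `writeChunk` are hypothetical and are not the names or the classification logic that Lightning uses.

```go
package main

import "errors"

// Hypothetical error values standing in for the two classes of failure.
var (
	errNetwork    = errors.New("network problem")      // worth retrying
	errColumnType = errors.New("mismatched column type") // retrying cannot help
)

// isRetryable is a stand-in classifier for this sketch.
func isRetryable(err error) bool {
	return errors.Is(err, errNetwork)
}

// outcome tells the caller what to do with a chunk after writeChunk returns.
type outcome int

const (
	written          outcome = iota // chunk inserted successfully
	retriesExhausted                // retryable error persisted; give up
	needsRowByRow                   // non-retryable error; downgrade to row-by-row
)

// writeChunk calls write up to maxRetries times. Retryable errors trigger
// another attempt; any other error stops immediately so that the caller
// can fall back to row-by-row inserts and record the offending rows.
// It returns the outcome, the number of attempts made, and the last error.
func writeChunk(write func() error, maxRetries int) (outcome, int, error) {
	var err error
	for attempt := 1; attempt <= maxRetries; attempt++ {
		err = write()
		switch {
		case err == nil:
			return written, attempt, nil
		case isRetryable(err):
			// Transient failure: try again on the next iteration.
		default:
			return needsRowByRow, attempt, err
		}
	}
	return retriesExhausted, maxRetries, err
}
```

The point of the three-way result is that a mismatched column type is reported after a single attempt instead of consuming the whole retry budget.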
For recording, we need to know the position and content of the failing data in the source files being imported.
For skipping, we need to skip only the row with the error and make sure the other rows are inserted successfully.
Furthermore, metrics and tracking information, such as error counts, are needed.
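One possible shape for the recording and counting requirements is sketched below. The type and field names (`rowErrorRecord`, `errorLog`, `Path`, `Offset`, and so on) are hypothetical, and keeping the records in memory is an assumption made only to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
)

// rowErrorRecord captures where a bad row came from and what it contained.
type rowErrorRecord struct {
	Path    string // source data file
	Offset  int64  // byte offset of the row within the file
	RowText string // raw content of the row
	Message string // error reported by the database
}

// errorLog collects skipped rows and keeps a per-table error count.
// The mutex makes it safe to share between concurrent writers.
type errorLog struct {
	mu      sync.Mutex
	records []rowErrorRecord
	counts  map[string]int
}

func newErrorLog() *errorLog {
	return &errorLog{counts: make(map[string]int)}
}

// Record stores one skipped row and bumps the counter for its table.
func (l *errorLog) Record(table string, rec rowErrorRecord) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.records = append(l.records, rec)
	l.counts[table]++
}

// Count returns how many rows have been skipped for a table so far.
func (l *errorLog) Count(table string) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.counts[table]
}

// Summary formats the count for logs or progress output.
func (l *errorLog) Summary(table string) string {
	return fmt.Sprintf("%s: %d row(s) skipped", table, l.Count(table))
}
```

A caller would invoke `Record` once per row that fails during the row-by-row pass, and the same counters could later feed whatever metrics system is in use.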
Implementation
The Go fragment quoted above is from br/pkg/lightning/backend/tidb/tidb.go, lines 366 to 384 in a6f471e.