Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a script to bisect to a culprit commit. #8442

Merged
merged 1 commit into from
Feb 21, 2018

Conversation

panyx0718
Copy link
Contributor

@panyx0718 panyx0718 commented Feb 14, 2018

Fix #8449

I expect that we need it to:

  1. Find flaky test commit.
  2. Find regression commit.
    Disable test is dangerous. We should first try to find the culprit commit
    and fix it or revert it.

I managed to find 3d1ac72 (between 279aa62 and 6773129) that causes
the recent failure of test_rnn_encoder_decoder.

# See the License for the specific language governing permissions and
# limitations under the License.

# A script to bisect the mainline commits and find the culprit commit.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to understand the difference between this tool and git bisect. According to this comment, it seems that this tool checks only commits on the $BRANCH branch? Also, When I read the code, it seems that this tool also calls ctest repeatedly -- is that for debug unit tests with random behavior? Would it be easier for users to understand about this tool if we have a Github issue listing the design purposes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it only check BRANCH branch. (mainline). git bisect also checks feature branch. I found commits in Paddle's feature branch are not guaranteed to pass tests or compile, so it bisect there won't work anyway.

Yes, the repeat is used to debug flaky behavior. I'll create a issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GOOD = sys.argv[3]
BAD = sys.argv[4]
TEST = sys.argv[5]
BRANCH = 'develop'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to merge this tool, we actually suppose that many users would use it. If so, it seems command line flags would be more convenient than constant definitions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

# Update PARALLEL for build paralleism.
#
# Example:
# python bisect.py $PWD/../Paddle2 $PWD e80059 6bba41 test_rnn_encoder_decoder
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the first command line parameter (the git repo dir) by looking for the nearest ancestor directory of $PWD which has a .git subdirectory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sometimes create the build directory in a different directory outside the git repo. That way, I can mount an in-memory filesystem for faster build.

@panyx0718
Copy link
Contributor Author

Thanks, PTAL?

I expect that we need it to:
1. Find flaky test commit.
2. Find regression commit.
Disable test is dangerous. We should first try to find the culprit commit
and fix it or revert it.

I managed to find 3d1ac72 (between 279aa62 and 6773129) that causes
the recent failure of test_rnn_encoder_decoder.
@panyx0718
Copy link
Contributor Author

@wangkuiyi It seems I don't have "write access". So I can't merge this PR myself.

@panyx0718 panyx0718 merged commit 88b870e into PaddlePaddle:develop Feb 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tool to bisect culprit commits that cause test failure or regression.
2 participants