Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ECOM-3177 fix flaky js tests on dashboard.TrackEvents #11416

Merged
merged 1 commit into from
Feb 16, 2016
Merged

Conversation

schenedx
Copy link
Contributor

@schenedx schenedx commented Feb 3, 2016

@adampalay @benpatterson Please review
FYI @rock345
This is a pure js unit test change.
It's implemented according to the suggestion provided by @adampalay on the jira ticket ECOM-3177

I am not able to really answer why these test cases don't fail every single time before this change, but my hope is after this change, the test would pass all the time unless it catches a real bug.

@benpatterson
Copy link
Contributor

@jzoldak do you have time to review this?

window.edx.dashboard.generateProgramProperties]
];

for (var cnt=0; cnt< expectedArray.length; cnt ++){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: index instead of cnt

@adampalay
Copy link
Contributor

@schenedx , first pass complete

@adampalay
Copy link
Contributor

👍 , thanks!

@benpatterson
Copy link
Contributor

Hmmm tests weren't kicked off for this PR.

jenkins ok to test

@benpatterson
Copy link
Contributor

jenkins run all

@benpatterson
Copy link
Contributor

@schenedx is there a reason you're not in the edx github org? could you put in a request to #devops to make that happen? once you do that (and make sure your profile is public), your PRs will automatically kick off for you in this repo.

findCourseButtonArgs,
xseriersActionButtonArgs];

for (var index=0; index< expectedArgsForTrackLinkCall.length; index ++){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm wondering if this line will result in a jshint violation. if it does, please fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benpatterson I was able to fix the jshint errors. Thanks for pointing out.

@benpatterson
Copy link
Contributor

I don't fully understand the fix here; I don't see the spy call being removed, for example. It just seems to me it's grouped the 7 tests and is running them at once? I think that approach might be easier from a coding perspective, but if there is a failure for one of the tests, it will take more cycles to troubleshoot which click was not tracked.

Or, perhaps I'm overlooking something....Thoughts?

@schenedx
Copy link
Contributor Author

schenedx commented Feb 3, 2016

@benpatterson:
The root cause here is that within the code at https://github.com/edx/edx-platform/blob/master/lms/static/js/dashboard/track_events.js#L34, the function to be tested "edx.dashboard.trackEvents" called the "window.analytics.trackLink" function 7 times.
When this happens, the jasmine spy object which is mocking "window.analytics.trackLink" will be called 7 times in the test case. When the spy object is called 7 times, there will be 7 set of arguments associated with the spy mock object. Therefore sometimes the test case fails when "toHaveBeenCalledWith" is used with just 1 out of the seven set of the arguments.

What I am doing in my fix is to

  1. Make sure the "window.analytics.trackLink" did get called 7 times in the "edx.dashboard.trackEvents" function
  2. Make sure the arguments are correct with each of the call to the trackLink function.
    Notice with my change, I am no longer using "toHaveBeenCalledWith" function which causes the flakiness.

@schenedx
Copy link
Contributor Author

schenedx commented Feb 3, 2016

jenkins run all

1 similar comment
@schenedx
Copy link
Contributor Author

schenedx commented Feb 3, 2016

jenkins run all

@benpatterson
Copy link
Contributor

jenkins ok to test

@benpatterson
Copy link
Contributor

@schenedx Sorry if I'm being dense here, but based on your description, I would think that the spy was started/stopped at the wrong time (started before all the tests, stopped after all the tests, as opposed to started/stopped one at a time).

I'm still concerned that, in the event of one of the tests failing, it would not be obvious which one failed, because the only assertion that will fail is that the target was called 6 times instead of 7....

@schenedx
Copy link
Contributor Author

schenedx commented Feb 4, 2016

@benpatterson
I understand your concern, therefore I did some tests on my local machine.
If the argument of one of seven "window.analytics.trackLink" calls is wrong, here is the error message from the test:

edx.dashboard.trackEvents: sends analytics events when the user interacts with dashboard [fail] Expected [ { 0 : HTMLNode, 1 : HTMLNode, length : 2, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.wrapper-action-more' }, 'edx.bismon.dashboard.course_options_dropdown.clicked', Function ] to equal [ { 0 : HTMLNode, 1 : HTMLNode, length : 2, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.wrapper-action-more' }, 'edx.bi.dashboard.course_options_dropdown.clicked', Function ].

If one of the "window.analytics.trackLink" calls is missing, here is the error from the unit test:

edx.dashboard.trackEvents: sends analytics events when the user interacts with dashboard [fail] Expected 6 to be 7. Expected [ { 0 : HTMLNode, 1 : HTMLNode, 2 : HTMLNode, length : 3, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.verified-info' }, 'edx.bi.dashboard.verified_info_link.clicked', Function ] to equal [ { 0 : HTMLNode, 1 : HTMLNode, length : 2, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.wrapper-action-more' }, 'edx.bi.dashboard.course_options_dropdown.clicked', Function ]. Expected [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.btn-find-courses' }, 'edx.bi.dashboard.find_courses_button.clicked', { category : 'dashboard', label : null } ] to equal [ { 0 : HTMLNode, 1 : HTMLNode, 2 : HTMLNode, length : 3, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.verified-info' }, 'edx.bi.dashboard.verified_info_link.clicked', Function ]. Expected [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.xseries-action .btn' }, 'edx.bi.dashboard.xseries_cta_message.clicked', Function ] to equal [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.btn-find-courses' }, 'edx.bi.dashboard.find_courses_button.clicked', { category : 'dashboard', label : null } ]. Expected undefined to equal [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.xseries-action .btn' }, 'edx.bi.dashboard.xseries_cta_message.clicked', Function ].

Base on the message above, and if you look at the code at https://github.com/edx/edx-platform/blob/master/lms/static/js/dashboard/track_events.js , it's clear to me what the problem might be.

I feel like the trouble you are having is how the code is written, not how the unit test is written. Do you have a better way to write the unit tests based on the code in track_events.js?

@benpatterson
Copy link
Contributor

hi @schenedx thanks for taking a closer look. In part, I am concerned about the output (according to the 2nd example you posted, a user must grok more logs to figure out which one failed).

But, more generally, the js tests get collected in an xunit-style output file, which clearly calls out which testcase fails. So, a developer that might be looking over a failed test suite, will very easily understand which test failed. For example, in today's world, if the course image link no longer sends events, we'd get a test failure like this: JavaScript.firefox.edx.dashboard.TrackEvents: sends an analytics event when the user clicks course image link But if that same problem manifested itself with this code, there would be a test failure for JavaScript.firefox.edx.dashboard.trackEvents: sends analytics events when the user interacts with dashboard

I do agree that your change may help the actual flaky condition, but IMO we are taking things in the wrong direction in terms of making tests easy to understand for someone that might be new to the code. This would be accomplished through discrete tests, as opposed to groups. So, I'm not talking about the underlying production code; I'm talking about the way the tests are composed.

Now, again I'm curious about the spying mechanism here, because perhaps that's where you and I are missing each other's points. Can a spy for track_events be setup and torn down per-test, as opposed to per-test-class? I think that might be part of the issue in terms of writing discrete tests...

@schenedx
Copy link
Contributor Author

schenedx commented Feb 4, 2016

The spy is a type of mock. We are basically substituting the "window.analytics.trackLink" function with a spy function within "edx.dashboard.trackEvents". If the function we are testing calls the "trackLink" 7 times, the spy will be used 7 times. I don't see a way to unit test the "edx.dashboard.trackEvents" statement by statement. As far as I know, jasmine unit tests are meant to test the whole function, not individual logic statements within the function.
The only way I know how to test the 7 "trackLink" call is we'd have to refactor the actual code within "edx.dashboard.trackEvents" to split that into 7 different methods. If we are just changing the unit test code by itself, I don't see a good way to test each of the 7 "trackLink" call separately.

@benpatterson
Copy link
Contributor

Ok now I've had a chance to take a closer look. I see how the 2nd expect call can make it more clear which link was not defined with an event emission tied to it. So, thank you for explaining. However, I'm still not sold on grouping them together. If these are flaky, then why do they all fail together? (Because, the spy is set up on each spec, not on the entire file...so I'd expect if there was flakiness in setup, etc, that only on testcase would fail; however, all of them have failed in the past when the tests fail.)

So perhaps I don't understand how this was flaky to begin with. Were you able to reproduce the flaky condition? In an earlier comment you'd mentioned toHaveBeenCalledWith is causing the flakiness. Why?

I know I have a lot of questions on this code review. Thank you for your patience in addressing them.

@schenedx
Copy link
Contributor Author

schenedx commented Feb 5, 2016

@benpatterson
The flakiness of the original test is something I am not 100% sure about. That being said, here are the things I know:
The original test cases would call the "edx.dashboard.trackEvents" function 7 separate times. Each time it calls the function, it would only ask the "toHaveBeenCalledWith" 1 of the 7 set of arguments. I am surprised most of the time, the "toHaveBeenCalledWith" call would pass. When "toHaveBeenCalledWith" does throw error, here is the actual error message:
Expected spy analytics.trackLink to have been called with [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.course-title > a' }, 'edx.bi.dashboard.course_title.clicked', Function ] but actual calls were [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.course-title > a' }, 'edx.bi.dashboard.course_title.clicked', Function ], [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.cover' }, 'edx.bi.dashboard.course_image.clicked', Function ], [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.enter-course' }, 'edx.bi.dashboard.enter_course.clicked', Function ], [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.wrapper-action-more' }, 'edx.bi.dashboard.course_options_dropdown.clicked', Function ], [ { 0 : HTMLNode, 1 : HTMLNode, length : 2, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.verified-info' }, 'edx.bi.dashboard.verified_info_link.clicked', Function ], [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.btn-find-courses' }, 'edx.bi.dashboard.find_courses_button.clicked', { category : 'dashboard', label : null } ]

If you read the message above carefully, you will see the "toHaveBeenCalledWith" call is complaining the test expected just 1 set of arguments, but the "trackLink" function was called with 7 different sets of arguments. This make sense because the "trackLink" function was called 7 times within "edx.dashboard.trackEvents".
This error is actually happening much less than expected. On my own computer, I was not able to see this error happening with the "toHaveBeenCalledWith" function. I did not spend the time to dig into the jasmine implementation code to see why this error is not happening all the time. What I believe from the error message above, it make sense for the "toHaveBeenCalledWith" to throw this error every time. Therefore I think the suggestion by @adampalay make sense to not use the test "toHaveBeenCalledWith" to avoid this flakiness.

$('.xseries-action .btn'),
'edx.bi.dashboard.xseries_cta_message.clicked',
window.edx.dashboard.generateProgramProperties
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reading of toHaveBeenCalledWith seems to say that this call should be fine, since it's just checking that those arguments were called on the spy at least once.

http://jasmine.github.io/2.0/introduction.html#section-Spies

So, I think there may be a different root cause, here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpennington Thank you for the closer look.
First, I believe we are using Jasmine 1.3 instead of Jasmine 2. Therefore the documentation should be http://jasmine.github.io/1.3/introduction.html#section-Spies
Nevertheless, the 1.3 documentation says the same that "The toHaveBeenCalledWith matcher will return true if the argument list matches any of the recorded calls to the spy."
This explains why I cannot reproduce the issue, and why the original test fails only a couple of times for the past few month.
However, if you eye ball the last error message I collected in the above comments, there should not be any reason for the test to fail.
At this point, I am also questioning what the root cause actually is. Do you think there will be enough data to figure out the root cause?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I had to guess, I'd guess that the HTMLNode objects weren't evaluating equal to the spy, perhaps because of a timing problem.

This is the expected call

[
    {
        0 : HTMLNode,
        length : 1,
        prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 },
        context : HTMLNode,
        selector : '.course-title > a'
    },
    'edx.bi.dashboard.course_title.clicked',
    Function
]

one of the elements in the list of actual calls is

[
    {
        0 : HTMLNode,
        length : 1,
        prevObject: { 0 : HTMLNode, context : HTMLNode, length : 1 },
        context : HTMLNode,
        selector : '.course-title > a'
    },
    'edx.bi.dashboard.course_title.clicked',
    Function
]

The only places that could be different are the various HTMLNodes and the Function calls (neither of which have any identifying info in the print statement).

@schenedx
Copy link
Contributor Author

@benpatterson @adampalay @AlasdairSwan @cpennington
Please review my latest attempt at fix this flaky test.
With suggestions from @adampalay and @AlasdairSwan, we feel it's necessary to update the original code to fix the flakiness of the test.

@cpennington
Copy link
Contributor

@schenedx Can you briefly explain why changing the test this way will fix the flakiness? I'm not following enough of the fix to understand what was varying when the test broke.

'edx.bi.dashboard.verified_info_link.clicked',
window.edx.dashboard.generateTrackProperties
);
});

it('sends an analytics event when the user clicks find courses button', function() {
var $findCourse = $('.btn-find-courses');
var property = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should only be 1 var per function, comma separating multiple variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Will fix.

@schenedx
Copy link
Contributor Author

@cpennington
So, after much investigate and debugging, I believe the root cause of the flakiness is the place where jquery is called twice on the same DOM node to create the jquery objects. These two different jquery objects are compared within the "toHaveBeenCalledWith" function.
The issue might be some elements in the bowel of the jquery objects, they are different sometimes.
Without able to successfully reproduce this difference, I cannot tell exactly what is different. This is also the area you suspected to be the root cause for the flakiness.
With this thought, I think it would be better if we inject the jquery object, as well as the property function, into the function we are testing, instead of just constructing it. This is a classic dependency injection pattern that fits unit testing well.
I have confidence with this change, the unit test should no longer be flaky.

@AlasdairSwan
Copy link
Contributor

Once tests pass 👍

@benpatterson
Copy link
Contributor

I'd like to hear what @cpennington thinks, but I like this solution better insofar as keeping the specs as separate ones. Thanks for giving this a closer look.

@cpennington
Copy link
Contributor

👍, although I wonder if this is likely to bite us again in another test in the future, and whether https://github.com/velesin/jasmine-jquery might help mitigate that across the whole test suite.

@AlasdairSwan
Copy link
Contributor

@cpennington while I can't talk to whether jasmine-jquery would have helped us here I have found it very useful in other projects. +1 for adding it to edx-platform.

@adampalay
Copy link
Contributor

@AlasdairSwan
Copy link
Contributor

Excellent

@schenedx
Copy link
Contributor Author

@benpatterson do you have further comments or questions? If not, I'll merge

@cpennington
Copy link
Contributor

Yeah, I thought we might have it. I guess I just wonder if there's something we could be using with it to be more resilient to equality comparison of jquery objects being flaky.

@benpatterson
Copy link
Contributor

@schenedx you should fix your quality failure.

@schenedx
Copy link
Contributor Author

@benpatterson Right. Rookie mistake. Now fixed.

@benpatterson
Copy link
Contributor

@schenedx thanks.

I agree with @cpennington that I'm not 100% convinced this fixes the flakiness; however, it's certainly worth putting into circulation. We can revisit this if the flakiness reappears.

Also, I realize you just fixed things up, but your commit message should be more indicative of what you're commiting. I'm seeing "this is a combination of 2 commits." Could it instead be something that would be easy to understand for someone that's reading the logs?

@schenedx
Copy link
Contributor Author

@benpatterson Got it. I have updated the commit message to something easier to read.

@benpatterson
Copy link
Contributor

Thanks @schenedx

This is a 👍 from me once all tests pass.

schenedx added a commit that referenced this pull request Feb 16, 2016
ECOM-3177 fix flaky js tests on dashboard.TrackEvents
@schenedx schenedx merged commit 3fa62a6 into master Feb 16, 2016
@schenedx schenedx deleted the schen/ECOM-3177 branch February 16, 2016 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants