ECOM-3177 fix flaky js tests on dashboard.TrackEvents #11416

schenedx · 2016-02-03T17:24:22Z

@adampalay @benpatterson Please review
FYI @rock345
This is a pure js unit test change.
It's implemented according to the suggestion provided by @adampalay on the jira ticket ECOM-3177

I am not able to really answer why these test cases don't fail every single time before this change, but my hope is after this change, the test would pass all the time unless it catches a real bug.

benpatterson · 2016-02-03T18:04:46Z

@jzoldak do you have time to review this?

adampalay · 2016-02-03T18:08:41Z

lms/static/js/spec/dashboard/track_events_spec.js

+                    window.edx.dashboard.generateProgramProperties]
+                ];
+
+                for (var cnt=0; cnt< expectedArray.length; cnt ++){


nit: index instead of cnt

adampalay · 2016-02-03T18:11:36Z

@schenedx , first pass complete

adampalay · 2016-02-03T18:48:51Z

👍 , thanks!

benpatterson · 2016-02-03T18:54:28Z

Hmmm tests weren't kicked off for this PR.

jenkins ok to test

benpatterson · 2016-02-03T18:54:48Z

jenkins run all

benpatterson · 2016-02-03T18:56:10Z

@schenedx is there a reason you're not in the edx github org? could you put in a request to #devops to make that happen? once you do that (and make sure your profile is public), your PRs will automatically kick off for you in this repo.

benpatterson · 2016-02-03T18:56:49Z

lms/static/js/spec/dashboard/track_events_spec.js

+                                                    findCourseButtonArgs,
+                                                    xseriersActionButtonArgs];
+
+                for (var index=0; index< expectedArgsForTrackLinkCall.length; index ++){


i'm wondering if this line will result in a jshint violation. if it does, please fix it.

@benpatterson I was able to fix the jshint errors. Thanks for pointing out.

benpatterson · 2016-02-03T19:01:36Z

I don't fully understand the fix here; I don't see the spy call being removed, for example. It just seems to me it's grouped the 7 tests and is running them at once? I think that approach might be easier from a coding perspective, but if there is a failure for one of the tests, it will take more cycles to troubleshoot which click was not tracked.

Or, perhaps I'm overlooking something....Thoughts?

schenedx · 2016-02-03T19:40:31Z

@benpatterson:
The root cause here is that within the code at https://github.com/edx/edx-platform/blob/master/lms/static/js/dashboard/track_events.js#L34, the function to be tested "edx.dashboard.trackEvents" called the "window.analytics.trackLink" function 7 times.
When this happens, the jasmine spy object which is mocking "window.analytics.trackLink" will be called 7 times in the test case. When the spy object is called 7 times, there will be 7 set of arguments associated with the spy mock object. Therefore sometimes the test case fails when "toHaveBeenCalledWith" is used with just 1 out of the seven set of the arguments.

What I am doing in my fix is to

Make sure the "window.analytics.trackLink" did get called 7 times in the "edx.dashboard.trackEvents" function
Make sure the arguments are correct with each of the call to the trackLink function.
Notice with my change, I am no longer using "toHaveBeenCalledWith" function which causes the flakiness.

schenedx · 2016-02-03T19:53:37Z

jenkins run all

schenedx · 2016-02-03T20:02:54Z

jenkins run all

benpatterson · 2016-02-03T22:38:59Z

jenkins ok to test

benpatterson · 2016-02-03T22:40:52Z

@schenedx Sorry if I'm being dense here, but based on your description, I would think that the spy was started/stopped at the wrong time (started before all the tests, stopped after all the tests, as opposed to started/stopped one at a time).

I'm still concerned that, in the event of one of the tests failing, it would not be obvious which one failed, because the only assertion that will fail is that the target was called 6 times instead of 7....

schenedx · 2016-02-04T02:18:59Z

@benpatterson
I understand your concern, therefore I did some tests on my local machine.
If the argument of one of seven "window.analytics.trackLink" calls is wrong, here is the error message from the test:

edx.dashboard.trackEvents: sends analytics events when the user interacts with dashboard [fail] Expected [ { 0 : HTMLNode, 1 : HTMLNode, length : 2, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.wrapper-action-more' }, 'edx.bismon.dashboard.course_options_dropdown.clicked', Function ] to equal [ { 0 : HTMLNode, 1 : HTMLNode, length : 2, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.wrapper-action-more' }, 'edx.bi.dashboard.course_options_dropdown.clicked', Function ].

If one of the "window.analytics.trackLink" calls is missing, here is the error from the unit test:

edx.dashboard.trackEvents: sends analytics events when the user interacts with dashboard [fail] Expected 6 to be 7. Expected [ { 0 : HTMLNode, 1 : HTMLNode, 2 : HTMLNode, length : 3, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.verified-info' }, 'edx.bi.dashboard.verified_info_link.clicked', Function ] to equal [ { 0 : HTMLNode, 1 : HTMLNode, length : 2, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.wrapper-action-more' }, 'edx.bi.dashboard.course_options_dropdown.clicked', Function ]. Expected [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.btn-find-courses' }, 'edx.bi.dashboard.find_courses_button.clicked', { category : 'dashboard', label : null } ] to equal [ { 0 : HTMLNode, 1 : HTMLNode, 2 : HTMLNode, length : 3, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.verified-info' }, 'edx.bi.dashboard.verified_info_link.clicked', Function ]. Expected [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.xseries-action .btn' }, 'edx.bi.dashboard.xseries_cta_message.clicked', Function ] to equal [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.btn-find-courses' }, 'edx.bi.dashboard.find_courses_button.clicked', { category : 'dashboard', label : null } ]. Expected undefined to equal [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.xseries-action .btn' }, 'edx.bi.dashboard.xseries_cta_message.clicked', Function ].

Base on the message above, and if you look at the code at https://github.com/edx/edx-platform/blob/master/lms/static/js/dashboard/track_events.js , it's clear to me what the problem might be.

I feel like the trouble you are having is how the code is written, not how the unit test is written. Do you have a better way to write the unit tests based on the code in track_events.js?

benpatterson · 2016-02-04T13:37:22Z

hi @schenedx thanks for taking a closer look. In part, I am concerned about the output (according to the 2nd example you posted, a user must grok more logs to figure out which one failed).

But, more generally, the js tests get collected in an xunit-style output file, which clearly calls out which testcase fails. So, a developer that might be looking over a failed test suite, will very easily understand which test failed. For example, in today's world, if the course image link no longer sends events, we'd get a test failure like this: JavaScript.firefox.edx.dashboard.TrackEvents: sends an analytics event when the user clicks course image link But if that same problem manifested itself with this code, there would be a test failure for JavaScript.firefox.edx.dashboard.trackEvents: sends analytics events when the user interacts with dashboard

I do agree that your change may help the actual flaky condition, but IMO we are taking things in the wrong direction in terms of making tests easy to understand for someone that might be new to the code. This would be accomplished through discrete tests, as opposed to groups. So, I'm not talking about the underlying production code; I'm talking about the way the tests are composed.

Now, again I'm curious about the spying mechanism here, because perhaps that's where you and I are missing each other's points. Can a spy for track_events be setup and torn down per-test, as opposed to per-test-class? I think that might be part of the issue in terms of writing discrete tests...

schenedx · 2016-02-04T14:37:45Z

The spy is a type of mock. We are basically substituting the "window.analytics.trackLink" function with a spy function within "edx.dashboard.trackEvents". If the function we are testing calls the "trackLink" 7 times, the spy will be used 7 times. I don't see a way to unit test the "edx.dashboard.trackEvents" statement by statement. As far as I know, jasmine unit tests are meant to test the whole function, not individual logic statements within the function.
The only way I know how to test the 7 "trackLink" call is we'd have to refactor the actual code within "edx.dashboard.trackEvents" to split that into 7 different methods. If we are just changing the unit test code by itself, I don't see a good way to test each of the 7 "trackLink" call separately.

benpatterson · 2016-02-05T00:05:51Z

Ok now I've had a chance to take a closer look. I see how the 2nd expect call can make it more clear which link was not defined with an event emission tied to it. So, thank you for explaining. However, I'm still not sold on grouping them together. If these are flaky, then why do they all fail together? (Because, the spy is set up on each spec, not on the entire file...so I'd expect if there was flakiness in setup, etc, that only on testcase would fail; however, all of them have failed in the past when the tests fail.)

So perhaps I don't understand how this was flaky to begin with. Were you able to reproduce the flaky condition? In an earlier comment you'd mentioned toHaveBeenCalledWith is causing the flakiness. Why?

I know I have a lot of questions on this code review. Thank you for your patience in addressing them.

schenedx · 2016-02-05T01:31:05Z

@benpatterson
The flakiness of the original test is something I am not 100% sure about. That being said, here are the things I know:
The original test cases would call the "edx.dashboard.trackEvents" function 7 separate times. Each time it calls the function, it would only ask the "toHaveBeenCalledWith" 1 of the 7 set of arguments. I am surprised most of the time, the "toHaveBeenCalledWith" call would pass. When "toHaveBeenCalledWith" does throw error, here is the actual error message:
Expected spy analytics.trackLink to have been called with [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.course-title > a' }, 'edx.bi.dashboard.course_title.clicked', Function ] but actual calls were [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.course-title > a' }, 'edx.bi.dashboard.course_title.clicked', Function ], [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.cover' }, 'edx.bi.dashboard.course_image.clicked', Function ], [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.enter-course' }, 'edx.bi.dashboard.enter_course.clicked', Function ], [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.wrapper-action-more' }, 'edx.bi.dashboard.course_options_dropdown.clicked', Function ], [ { 0 : HTMLNode, 1 : HTMLNode, length : 2, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.verified-info' }, 'edx.bi.dashboard.verified_info_link.clicked', Function ], [ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.btn-find-courses' }, 'edx.bi.dashboard.find_courses_button.clicked', { category : 'dashboard', label : null } ]

If you read the message above carefully, you will see the "toHaveBeenCalledWith" call is complaining the test expected just 1 set of arguments, but the "trackLink" function was called with 7 different sets of arguments. This make sense because the "trackLink" function was called 7 times within "edx.dashboard.trackEvents".
This error is actually happening much less than expected. On my own computer, I was not able to see this error happening with the "toHaveBeenCalledWith" function. I did not spend the time to dig into the jasmine implementation code to see why this error is not happening all the time. What I believe from the error message above, it make sense for the "toHaveBeenCalledWith" to throw this error every time. Therefore I think the suggestion by @adampalay make sense to not use the test "toHaveBeenCalledWith" to avoid this flakiness.

cpennington · 2016-02-05T16:14:30Z

lms/static/js/spec/dashboard/track_events_spec.js

-                    $('.xseries-action .btn'),
-                    'edx.bi.dashboard.xseries_cta_message.clicked',
-                    window.edx.dashboard.generateProgramProperties
-                );


My reading of toHaveBeenCalledWith seems to say that this call should be fine, since it's just checking that those arguments were called on the spy at least once.

http://jasmine.github.io/2.0/introduction.html#section-Spies

So, I think there may be a different root cause, here.

@cpennington Thank you for the closer look.
First, I believe we are using Jasmine 1.3 instead of Jasmine 2. Therefore the documentation should be http://jasmine.github.io/1.3/introduction.html#section-Spies
Nevertheless, the 1.3 documentation says the same that "The toHaveBeenCalledWith matcher will return true if the argument list matches any of the recorded calls to the spy."
This explains why I cannot reproduce the issue, and why the original test fails only a couple of times for the past few month.
However, if you eye ball the last error message I collected in the above comments, there should not be any reason for the test to fail.
At this point, I am also questioning what the root cause actually is. Do you think there will be enough data to figure out the root cause?

If I had to guess, I'd guess that the HTMLNode objects weren't evaluating equal to the spy, perhaps because of a timing problem.

This is the expected call

[ { 0 : HTMLNode, length : 1, prevObject : { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.course-title > a' }, 'edx.bi.dashboard.course_title.clicked', Function ]

one of the elements in the list of actual calls is

[ { 0 : HTMLNode, length : 1, prevObject: { 0 : HTMLNode, context : HTMLNode, length : 1 }, context : HTMLNode, selector : '.course-title > a' }, 'edx.bi.dashboard.course_title.clicked', Function ]

The only places that could be different are the various HTMLNodes and the Function calls (neither of which have any identifying info in the print statement).

schenedx · 2016-02-11T16:16:18Z

@benpatterson @adampalay @AlasdairSwan @cpennington
Please review my latest attempt at fix this flaky test.
With suggestions from @adampalay and @AlasdairSwan, we feel it's necessary to update the original code to fix the flakiness of the test.

cpennington · 2016-02-11T16:20:53Z

@schenedx Can you briefly explain why changing the test this way will fix the flakiness? I'm not following enough of the fix to understand what was varying when the test broke.

AlasdairSwan · 2016-02-11T16:21:07Z

lms/static/js/spec/dashboard/track_events_spec.js

                    'edx.bi.dashboard.verified_info_link.clicked',
                    window.edx.dashboard.generateTrackProperties
                );
            });

            it('sends an analytics event when the user clicks find courses button', function() {
+                var $findCourse = $('.btn-find-courses');
+                var property = {


There should only be 1 var per function, comma separating multiple variables.

Got it. Will fix.

schenedx · 2016-02-11T16:35:16Z

@cpennington
So, after much investigate and debugging, I believe the root cause of the flakiness is the place where jquery is called twice on the same DOM node to create the jquery objects. These two different jquery objects are compared within the "toHaveBeenCalledWith" function.
The issue might be some elements in the bowel of the jquery objects, they are different sometimes.
Without able to successfully reproduce this difference, I cannot tell exactly what is different. This is also the area you suspected to be the root cause for the flakiness.
With this thought, I think it would be better if we inject the jquery object, as well as the property function, into the function we are testing, instead of just constructing it. This is a classic dependency injection pattern that fits unit testing well.
I have confidence with this change, the unit test should no longer be flaky.

AlasdairSwan · 2016-02-12T19:31:53Z

Once tests pass 👍

benpatterson · 2016-02-13T17:36:20Z

I'd like to hear what @cpennington thinks, but I like this solution better insofar as keeping the specs as separate ones. Thanks for giving this a closer look.

cpennington · 2016-02-16T13:37:03Z

👍, although I wonder if this is likely to bite us again in another test in the future, and whether https://github.com/velesin/jasmine-jquery might help mitigate that across the whole test suite.

AlasdairSwan · 2016-02-16T13:50:31Z

@cpennington while I can't talk to whether jasmine-jquery would have helped us here I have found it very useful in other projects. +1 for adding it to edx-platform.

adampalay · 2016-02-16T14:01:14Z

wait, we have jasmine-jquery: https://github.com/edx/edx-platform/blob/master/cms/static/js_test.yml#L51

AlasdairSwan · 2016-02-16T14:01:50Z

Excellent

schenedx · 2016-02-16T14:13:18Z

@benpatterson do you have further comments or questions? If not, I'll merge

cpennington · 2016-02-16T14:14:31Z

Yeah, I thought we might have it. I guess I just wonder if there's something we could be using with it to be more resilient to equality comparison of jquery objects being flaky.

benpatterson · 2016-02-16T15:04:43Z

@schenedx you should fix your quality failure.

schenedx · 2016-02-16T15:58:28Z

@benpatterson Right. Rookie mistake. Now fixed.

benpatterson · 2016-02-16T16:01:39Z

@schenedx thanks.

I agree with @cpennington that I'm not 100% convinced this fixes the flakiness; however, it's certainly worth putting into circulation. We can revisit this if the flakiness reappears.

Also, I realize you just fixed things up, but your commit message should be more indicative of what you're commiting. I'm seeing "this is a combination of 2 commits." Could it instead be something that would be easy to understand for someone that's reading the logs?

schenedx · 2016-02-16T16:05:41Z

@benpatterson Got it. I have updated the commit message to something easier to read.

benpatterson · 2016-02-16T16:15:05Z

Thanks @schenedx

This is a 👍 from me once all tests pass.

ECOM-3177 fix flaky js tests on dashboard.TrackEvents

adampalay reviewed Feb 3, 2016
View reviewed changes

benpatterson reviewed Feb 3, 2016
View reviewed changes

cpennington reviewed Feb 5, 2016
View reviewed changes

schenedx force-pushed the schen/ECOM-3177 branch from 9d0d539 to 1543340 Compare February 11, 2016 15:58

AlasdairSwan reviewed Feb 11, 2016
View reviewed changes

schenedx force-pushed the schen/ECOM-3177 branch from 1543340 to 6679915 Compare February 11, 2016 16:37

schenedx force-pushed the schen/ECOM-3177 branch from 6679915 to 5716f03 Compare February 16, 2016 15:21

ECOM-3177 fix the flaky js test by doing dependency injection

f6fc711

schenedx force-pushed the schen/ECOM-3177 branch from 5716f03 to f6fc711 Compare February 16, 2016 16:04

schenedx added a commit that referenced this pull request Feb 16, 2016

Merge pull request #11416 from edx/schen/ECOM-3177

3fa62a6

ECOM-3177 fix flaky js tests on dashboard.TrackEvents

schenedx merged commit 3fa62a6 into master Feb 16, 2016

schenedx deleted the schen/ECOM-3177 branch February 16, 2016 17:45

ECOM-3177 fix flaky js tests on dashboard.TrackEvents #11416

ECOM-3177 fix flaky js tests on dashboard.TrackEvents #11416

Conversation

schenedx commented Feb 3, 2016

benpatterson commented Feb 3, 2016

Choose a reason for hiding this comment

adampalay commented Feb 3, 2016

adampalay commented Feb 3, 2016

benpatterson commented Feb 3, 2016

benpatterson commented Feb 3, 2016

benpatterson commented Feb 3, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

benpatterson commented Feb 3, 2016

schenedx commented Feb 3, 2016

schenedx commented Feb 3, 2016

schenedx commented Feb 3, 2016

benpatterson commented Feb 3, 2016

benpatterson commented Feb 3, 2016

schenedx commented Feb 4, 2016

benpatterson commented Feb 4, 2016

schenedx commented Feb 4, 2016

benpatterson commented Feb 5, 2016

schenedx commented Feb 5, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

schenedx commented Feb 11, 2016

cpennington commented Feb 11, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

schenedx commented Feb 11, 2016

AlasdairSwan commented Feb 12, 2016

benpatterson commented Feb 13, 2016

cpennington commented Feb 16, 2016

AlasdairSwan commented Feb 16, 2016

adampalay commented Feb 16, 2016

AlasdairSwan commented Feb 16, 2016

schenedx commented Feb 16, 2016

cpennington commented Feb 16, 2016

benpatterson commented Feb 16, 2016

schenedx commented Feb 16, 2016

benpatterson commented Feb 16, 2016

schenedx commented Feb 16, 2016

benpatterson commented Feb 16, 2016