Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BackstopJS hangs in Docker container #603

Closed
questionmarc opened this issue Nov 21, 2017 · 18 comments
Closed

BackstopJS hangs in Docker container #603

questionmarc opened this issue Nov 21, 2017 · 18 comments

Comments

@questionmarc
Copy link

I seem to keep having this problem where the chrome headless hangs forever until the script ends. Here's the output I'm seeing:

Starting Chromy: {"chromeFlags":["--disable-gpu","--force-device-scale-factor=1","--disable-infobars=true","--no-sandbox","--window-size=320,480"],"port":9222,"waitTimeout":30000,"visible":false}
Starting Chromy: {"chromeFlags":["--disable-gpu","--force-device-scale-factor=1","--disable-infobars=true","--no-sandbox","--window-size=1024,768"],"port":9223,"waitTimeout":30000,"visible":false}
Starting Chromy: {"chromeFlags":["--disable-gpu","--force-device-scale-factor=1","--disable-infobars=true","--no-sandbox","--window-size=1600,900"],"port":9224,"waitTimeout":30000,"visible":false}
9223 Chrome v61 detected.
9223 ***WARNING! CHROME VERSION 62 OR GREATER IS REQUIRED. PLEASE UPDATE YOUR CHROME APP!***
9222 Chrome v61 detected.
9222 ***WARNING! CHROME VERSION 62 OR GREATER IS REQUIRED. PLEASE UPDATE YOUR CHROME APP!***
9224 Chrome v61 detected.
9224 ***WARNING! CHROME VERSION 62 OR GREATER IS REQUIRED. PLEASE UPDATE YOUR CHROME APP!***
bash-4.3#

This is utilizing the same dockerfile as the one listed in the repo. I'm not sure why it's hanging but I don't get any real output. Changing the engine to PhantomJS works, and from looking at the debug output from the debug flag, it is reaching the webpage. I tried adding some flags to the docker run from reading this project: https://github.com/yukinying/chrome-headless-browser-docker . I also tried to add the --no-sandbox option to the chromeFlags but to no avail.

So I'm not sure if this is a chrome-headless issue? An issue with chromium v61? Alpine? I'm also not sure how to debug this further.

If you have any insight as to things I could be doing to make progress, please let me know.

@questionmarc
Copy link
Author

New Development: with debug on and setting the asyncCaptureLimit to 1, I got one of the scenarios to complete. But then the other two time out.

@questionmarc
Copy link
Author

questionmarc commented Nov 21, 2017

output for that run was something like

Starting Chromy: {"chromeFlags":["--disable-gpu","--force-device-scale-factor=1","--disable-infobars=true","--no-sandbox","--window-size=320,480"],"port":9222,"waitTimeout":30000,"visible":false}
9222 Chrome v61 detected.
9222 ***WARNING! CHROME VERSION 62 OR GREATER IS REQUIRED. PLEASE UPDATE YOUR CHROME APP!***
9222 INFO >  BackstopTools have been installed.
Starting Chromy: {"chromeFlags":["--disable-gpu","--force-device-scale-factor=1","--disable-infobars=true","--no-sandbox","--window-size=1024,768"],"port":9223,"waitTimeout":30000,"visible":false}
9223 Chrome v61 detected.
9223 ***WARNING! CHROME VERSION 62 OR GREATER IS REQUIRED. PLEASE UPDATE YOUR CHROME APP!***
Starting Chromy: {"chromeFlags":["--disable-gpu","--force-device-scale-factor=1","--disable-infobars=true","--no-sandbox","--window-size=1600,900"],"port":9224,"waitTimeout":30000,"visible":false}
9224 Chrome v61 detected.
9224 ***WARNING! CHROME VERSION 62 OR GREATER IS REQUIRED. PLEASE UPDATE YOUR CHROME APP!***

So it looks like the BackstopTools finished on the first one, but then timed out on the other two?

@questionmarc questionmarc changed the title Questions about Docker BackstopJS BackstopJS hangs in Docker container Nov 21, 2017
@garris
Copy link
Owner

garris commented Nov 21, 2017

What does the report look like?

@questionmarc
Copy link
Author

      COMMAND | Executing core for `report`
      compare | Chromy error: GotoTimeoutError. See scenario – scenario 1 (tablet)
      compare | Chromy error: GotoTimeoutError. See scenario – scenario 1 (desktop)
   See: /usr/src/app/backstop_data/bitmaps_test/20171121-215652/failed_diff_test_scenario_1_0_document_0_phone.png
      compare | ERROR { requireSameDimensions: false, size: isDifferent, content: undefined%, threshold: 0.1% }: scenario 1 test_scenario_1_0_document_0_phone.png
       report | Test completed...
       report | 0 Passed
       report | 3 Failed
       report | Writing jUnit Report
       report | Writing browser report
       report | jUnit report written to: backstop_data/ci_report/xunit.xml
       report | Browser reported copied
       report | Copied configuration to: /usr/src/app/backstop_data/html_report/config.js
       report | *** Mismatch errors found ***
      COMMAND | Command `report` ended with an error after [0.731s]
      COMMAND | Command `test` ended with an error after [66.935s]

One of the failures was a correct failure (the ui did infact change so I was expecting that) but the other two compares end in GotoTimeoutError.

The html report just has blank spaces where the tests failed.

@kiran-redhat
Copy link
Contributor

Change GoToTimeout to 360000 and see if still coming?

@garris, There seems to be some issue with gototimeout which is fixed in chromy .5.6 can we upgraded Backstop chromy dependency? I have done that in my fork and all seems to work fine

@questionmarc
Copy link
Author

I think the issue is because I'm trying to do too much at once. I have a 8gb backstop_data folder with the complete set of reference images I want to use for my.. 3000 some odd test cases. 😅 I tried it out locally with just the sample init file using a fork. Sure enough it didn't implode like my previous test case had. 😃

Now the real question is how I can reduce what I'm doing or break up the backstop_data into different containers.. 🤔

@questionmarc
Copy link
Author

Hm.. I think I'm wrong again about that.. seems it's just my scenarios that are failing somehow. I don't get this failure when I run locally, just in the docker container. So strange...

@questionmarc
Copy link
Author

Looks like there's an internal timer that after a certain number of seconds pass the application quits? Is this different from waitTimeout ? Can that be edited with an option?

@questionmarc
Copy link
Author

questionmarc commented Nov 22, 2017

So.. After some debug statementing.. I can confirm that it is reaching the process.on('exit') event when it exits the application. Even though the headless chrome hasn't returned from my scenario. 🤔

Edit: Maybe there's something wrong with my scenario? I'll add it below.

{
  "label": "en-us WoW Homepage Full Page check",
  "url": "https://worldofwarcraft.com/en-us/",
  "hideSelectors": [
    "body > div.body > div > div.page-Home > div.page-Home-masthead > div > div.VideoPane-bg > video",
    "body > div.body > div > div.page-Home > div.page-Home-masthead > div > div.VideoPane-content > div:nth-child(7)"
  ]
}

@questionmarc
Copy link
Author

questionmarc commented Nov 22, 2017

It seems that setting the timer on Chromy has no effect. No matter what value I set it to the scenario always ends at ~30s.

Edit: Didn't realize gotoTimeout and waitTimeout were 2 different timeout settings in chromy. 😞

@questionmarc
Copy link
Author

questionmarc commented Nov 22, 2017

Even with a gotoTimeout of 5min+ the process silently fails and eventually receives the exit signal with code 0.

Added to cli/index.js:50:

  process.on('exit', function (code) {
    console.log(`Exit signal recieved: ${code}`);
    process.exit(exitCode);
  });

And then in my docker container's bash:

Starting Chromy: {"chromeFlags":["--disable-gpu","--force-device-scale-factor=1","--disable-infobars=true","--window-size=1024,768"],"port":9226,"waitTimeout":120000,"visible":false,"gotoTimeout":300000}
Cookie state restored with: []
SCENARIO > en-us WoW Homepage Full Page check
9226 Chrome v61 detected.
9226 ***WARNING! CHROME VERSION 62 OR GREATER IS REQUIRED. PLEASE UPDATE YOUR CHROME APP!***
Starting Chromy: {"chromeFlags":["--disable-gpu","--force-device-scale-factor=1","--disable-infobars=true","--window-size=1600,900"],"port":9227,"waitTimeout":120000,"visible":false,"gotoTimeout":300000}
Cookie state restored with: []
SCENARIO > en-us WoW Homepage Full Page check
9227 Chrome v61 detected.
9227 ***WARNING! CHROME VERSION 62 OR GREATER IS REQUIRED. PLEASE UPDATE YOUR CHROME APP!***
Exit signal recieved: 0

Note that the process doesn't even enter the report portion.

Edit: Kept tracking all the way down into Chromy itself. I'm not sure why but the process just hangs on the this.client.Page.navigate({url: url}) call. That's controlled by the chrome-remote-interface which just fails after the timeout and causes chromy to throw GotoTimeoutError. But it still ends the process, which is annoying. :/

@questionmarc
Copy link
Author

Ok.. I solved the issue. It was a docker issue all along. 😓 Linking the reference material that got me to solving the problem (which was ironically because I was doing research into alternatives to chromy and found puppeteer): https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md#tips

essentially adding --shm-size=2gb and --cap-add=SYS_ADMIN to my docker run command let chrome have enough shared memory to finish running the test.

What was probably happening was that the chrome process was running out of shared memory and couldn't finish loading the page (huge page, lots of assets, etc...) So then when that occurred, it was being treated as a gotoTimeout because it couldn't load the page with the shared space provided by docker by default (which is 64mb by default).

After increasing the shared memory size to 2gbs the process completed successfully. 😄

@garris
Copy link
Owner

garris commented Nov 22, 2017

@questionmarc Great -- thank you for posting a fix!

see also... #537 (comment)

@garris
Copy link
Owner

garris commented Nov 22, 2017

I just added a link to this issue in the Docs.

@kiran-redhat
Copy link
Contributor

@questionmarc @kevineye @garris I am still randomly getting GotoTimeoutError: goto() timeout when I used asyncCaptureLimit 10 (all my tests have lots of content similar to www.amazon.com) but I don't see issue when only selectors or pages with small content is tested. Is this same with you guys?

@kevineye
Copy link

kevineye commented Dec 12, 2017

@kiran-redhat When I saw this issue, either larger pages or large screencap sizes seemed to require more shared memory -- I don't remember which or if I ever tried to figure out the difference. Obviously more running at once (higher asyncCaptureLimit) also made it happen more often, but I remember I would have the problems with some pages/captures even with asyncCaptureLimite = 1 until I raised the docker shared memory available.

I did not need 2gb for my tests to run at asyncCaptureLimit=10, just 512mb, but if your pages/captures are large enough maybe you need even more, or maybe there's another problem.

I don't believe the actual error I got involved "GotoTimeoutError". I think it was WaitTimeoutError: evaluate(). I could be wrong -- I no longer have the error in front of me, but I remember it matching #537 pretty closely.

When debugging, I was able to run df and see the usage of /dev/shm going up and run ps to see chrome processes running and then getting killed (all in the backstop container while backstop was running). I also turned on enough logging (some by tweaking the src of installed js libs -- again, in the container) to actually see chrome being killed with a bus error trying to address invalid or out of bounds memory addresses.

Good luck!

@kiran-redhat
Copy link
Contributor

Hey @kevineye thanks for you analysis. I also observed that it is not working even with gotToTimeOut of 7 mins, as @questionmarc mentioned above in 'Edit' section, it is just silently failing by throwing GotoTimeoutError: goto() timeout when it can't open given url.

iain17 added a commit to iain17/BackstopJS that referenced this issue Feb 13, 2018
- Changed base image to none alpine. Allowing new chrome version.
- Updated Chrome to the latest (64). Potentially fixing issues garris#603 garris#537
garris pushed a commit that referenced this issue Feb 13, 2018
- Changed base image to none alpine. Allowing new chrome version.
- Updated Chrome to the latest (64). Potentially fixing issues #603 #537
domq pushed a commit to epfl-si/elements that referenced this issue Jul 20, 2021
Does:
- Require a decent version of Node (v0.10.48 or something is... a tad old)
- Plan for sufficient resources for Chrome (garris/BackstopJS#603 (comment))
- Set up the show
- Wait for the show to be ready (i.e. busy-wait for port 3000 to serve)
- Run the test suite

Does not (yet):
- Auto-promote new Backstop state
- Regenerate dist
- Regenerate gh-pages
domq pushed a commit to epfl-si/elements that referenced this issue Jul 20, 2021
Does:
- Require a decent version of Node (v0.10.48 or something is... a tad old)
- Plan for sufficient resources for Chrome (garris/BackstopJS#603 (comment))
- Set up the show
- Wait for the show to be ready (i.e. busy-wait for port 3000 to serve)
- Run the test suite

Does not:
- Cut the test suite short when `yarn test --docker` returns nonzero exit code (as it most likely will, every single time)

Does not (yet):
- Auto-promote new Backstop state
- Regenerate dist
- Regenerate gh-pages
domq pushed a commit to epfl-si/elements that referenced this issue Jul 21, 2021
Does:
- Require a decent version of Node (v0.10.48 or something is... a tad old)
- Plan for sufficient resources for Chrome (garris/BackstopJS#603 (comment))
- Set up the show
- Wait for the show to be ready (i.e. busy-wait for port 3000 to serve)
- Run the test suite

Does not:
- Cut the test suite short when `yarn test --docker` returns nonzero exit code (as it most likely will, every single time)

Does not (yet):
- Auto-promote new Backstop state
- Regenerate dist
- Regenerate gh-pages
@AndreKoepke
Copy link

I must also activate "Rosetta" on my Mac M1.
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants