Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VIP support for additional linters (html-proofer link checker) #410

Merged
merged 4 commits into from
Jun 6, 2018

Conversation

budziq
Copy link
Collaborator

@budziq budziq commented May 9, 2018

This is very much work in progress so please do not merge for now.

  • added additional test matrix items just for linters
  • install and build mdbook only when needed to speedup CI
  • reorganized the travis scripts
  • deployment is run from linter build
  • linkcheck build that is likely to have spurious failures - is currently set as "allow_failure" to not block builds
  • added spellchecker to the ci
  • moved link checking to https://github.com/budziq/link-checker.
    • local link problems are a hard error
    • external link arrors are just printed for informational purposes (there are false positives on docs.rs sites chrono and same_file which have javascript redirects)

TODO:

  • update contributing and PR template with information about the linters workflow
  • find a link checker that works with nested mdBook directory output (or fix mdBook)

fixes #356
fixes #358

@budziq budziq force-pushed the html-proofer branch 6 times, most recently from f5f1379 to 1f2e42f Compare May 10, 2018 08:35
@@ -0,0 +1,327 @@
personal_ws-1.1 en 0 utf-8
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file looks like it will become a challenge to maintain. What is the plan to keep this up to date?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I know that this is not ideal.
This is actually how rust-book manages spellchecking in their CI.
A contributor would have to add missing words as lines to somewgere in the dictionary.txt . The spellchecker already uses aspell-en dictionary so most English vocabulary is already present and some technical lingo and variable names still need to be added.

@AndyGauge
Copy link
Collaborator

I really like this idea. I'm not sure if CI is the right place for it though. Doesn't Travis prevent you from making external calls?

@budziq
Copy link
Collaborator Author

budziq commented May 10, 2018

Doesn't Travis prevent you from making external calls?

Surprisingly it does not! This approach to CI linkchecking seams to be fairly common with jeckyl blogs. The only problem is that the communication is not fully deterministic and large number of concurrent requests also causes spurious failures (html-proofer does not seam to have an easy way to throttle the connection count).

@budziq budziq force-pushed the html-proofer branch 3 times, most recently from c1d9ada to 04dac3f Compare May 14, 2018 23:33
@budziq
Copy link
Collaborator Author

budziq commented May 15, 2018

I'm relatively happy with the setup now.

  • Spellchecking is mandatory via separate Travis CI build target. It should not be that hard to maintain. Although we would need update the contributing and PR template to mention it.
  • I'm not totally satisfied with Link checking yet. It is made optional (its spurious failures will not cause whole CI build to fail). But it is easily accessible by going into given Travis build.

Do you have any thoughts @AndyGauge?

@AndyGauge
Copy link
Collaborator

AndyGauge commented May 15, 2018

the spell checker works pretty good locally. I would suggest adding a message explaining how to add entries to the dictionary that displays in CI if there's a failure.

htmlproofer found 2556 dead links in my reorganize branch, and I'm trying to figure out what is going wrong. It has to do with relative paths, so I'd like to figure that out.

EDIT The only workaround I could find for the spurious errors was --external-only

@budziq
Copy link
Collaborator Author

budziq commented May 16, 2018

EDIT The only workaround I could find for the spurious errors was --external-only

Hmm it looks like the html-proofer does not handle the internal sidebar links of html files in subdirectories (only these exhibit problems mentioned by you). The sidebar links are relative to the index.html location while the html-proofer seams to assume that these would be relative to the origin html location
so links in rust-cookbook/web/mime.html pointing to rust-cookbook/about.html become links to rust-cookbook/web/about.html and links to rust-cookbook/web/mime.html from the web dir turn to rust- cookbook/web/web/mime.html and so on.

I do not experience the problem due to flat structure of the original cookbook.

On the other side using --external-only option causes us to miss broken internal links like the one to
complex.html#serialize-and-deserialize-unstructured-json in index.html, intro.html and encoding.html in our build.

We probably need to choose another linkchecker tool.


By the way I love hov the reorganization work starts to look like ❤️ ! @AndyGauge

EDIT:
looks like the official w3.org validator also does not like the nested local links the mdBook generates
https://validator.w3.org/checklink?uri=http%3A%2F%2Fwww.yetanother.site%2Frust-cookbook%2F&hide_type=all&recursive=on&depth=&check=Check

@budziq
Copy link
Collaborator Author

budziq commented May 16, 2018

It looks like the mdBook uses <base href="../"> which link checkers do not follow. I'm experimenting with writing my own link checker script based on https://github.com/mike42/mdcheckr

@AndyGauge
Copy link
Collaborator

That does look like the problem, but I'm seeing this gjtorikian/html-proofer#358

Let me do some experimentation to see if html proofer just needs a PR to fix the issue.

@budziq
Copy link
Collaborator Author

budziq commented May 17, 2018

I've written a small link checker in python https://github.com/budziq/link-checker that seams to work ok (minus few false positives in docs.rs most likely due to to many connections) on restructured cookbook (supports <base> and internal anchors which does not seam to be the case for most other linkcheckers)

@AndyGauge
Copy link
Collaborator

AndyGauge commented May 19, 2018

@budziq What about rust's linkchecker? https://github.com/rust-lang/rust/tree/master/src/tools/linkchecker

Edit: It actually wouldn't be a good fit for the CI environment, even adapted. I found it looking around rust-lang/mdBook#279

@budziq
Copy link
Collaborator Author

budziq commented May 19, 2018

@AndyGauge the linkchecker I've written works fairly well now and should have nonfalse positives for local links https://github.com/budziq/link-checker

I'll try to find some time next week to workout the remaining kinks and integrate it into this PR. I guess I'll split the linkchecking tontwo phases

  • local links only - which should be reliable and can fail CI
  • remote links - that cannot be made fully reliable and would be only printed for informational purposes.

- added additional test matrix item just for linters
- install and build mdbook only when needed to speedup CI
- reorganized the travis scripts
@budziq budziq force-pushed the html-proofer branch 2 times, most recently from d5dc51e to 86aae58 Compare May 21, 2018 20:18
Copy link
Collaborator

@AndyGauge AndyGauge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran pip3 install link-checker and then ran link-checker and it says command not found. I'll dig into it more.

@budziq
Copy link
Collaborator Author

budziq commented May 22, 2018

@AndyGauge if youbhave any outputbfrom instalation process then please post it as an issuebto the link-checker repo. It might be error with packaging. But most likely you have installed itbas a user and nit system wide and the location of user installed pip executables is nit addedbto your PATH

@AndyGauge
Copy link
Collaborator

I was able to get it to install with a few hoops. It was my first pip install so some of the workflow wasn't obvious. I was able to get all the dead links resolved in reorganize. I like the link-checker that you made, it even catches bad named anchors and redirection.

Do you want me to do the updates to the template/Contributing to support this?

@AndyGauge AndyGauge mentioned this pull request Jun 5, 2018
11 tasks
@budziq
Copy link
Collaborator Author

budziq commented Jun 6, 2018

I like the link-checker that you made, it even catches bad named anchors and redirection.

🙂

Do you want me to do the updates to the template/Contributing to support this?

That would be awesome. I'm really low on time these days :/

Copy link
Collaborator

@AndyGauge AndyGauge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to approve this PR as is and add another PR for the templates.

@AndyGauge AndyGauge merged commit 0c167cb into rust-lang-nursery:master Jun 6, 2018
AndyGauge added a commit to AndyGauge/rust-cookbook that referenced this pull request Jul 2, 2018
support for additional linters (link-checker & aspell) (rust-lang-nursery#410)

* Added support for additional linters (html-proofer link checker)

- added additional test matrix item just for linters
- install and build mdbook only when needed to speedup CI
- reorganized the travis scripts

* Added spellchecking script from rust-book to CI

also fixed minor typos

* updated and moved serde_json links to propper position

* move link checking to link-checker

Move errors reorganize datetime add contributing instructions merge master

Renamed categories to avoid confusion fixed broken links

Use mdbook master branch until 0.1.8 releases

Revert "Use mdbook master branch until 0.1.8 releases"

This reverts commit a8dd8e3.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Have a spellchecker in CI Have a link checker in the CI
2 participants