Make doc builds more modular #249
Comments
Would be great to have intelligent spell checking (based on a dictionary we define), so we can catch misspellings of product names and such while avoiding false positives (code, API names, etc.). |
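To make the dictionary-based idea above concrete, here is a minimal sketch, not an existing tool. The function name `find_unknown_words` and the sample dictionary are hypothetical. It assumes code appears either in backtick spans or in listing blocks delimited by lines of four or more hyphens, and it skips both so that API names do not show up as false positives.

```python
import re

# Inline code such as `_search`; assumed to be marked with backticks.
INLINE_CODE = re.compile(r"`[^`\n]+`")
# A line of four or more hyphens opens or closes a listing block.
LISTING_DELIMITER = re.compile(r"-{4,}")
# Plain words, allowing apostrophes ("don't").
WORD = re.compile(r"[A-Za-z][A-Za-z']*")


def find_unknown_words(text, dictionary):
    """Return (line_number, word) pairs for prose words not in the dictionary."""
    known = {word.lower() for word in dictionary}
    unknown = []
    in_listing = False
    for line_number, line in enumerate(text.splitlines(), start=1):
        if LISTING_DELIMITER.fullmatch(line.strip()):
            in_listing = not in_listing  # toggle on each delimiter line
            continue
        if in_listing:
            continue  # skip code samples entirely
        prose = INLINE_CODE.sub(" ", line)  # drop inline code spans
        for word in WORD.findall(prose):
            if word.lower() not in known:
                unknown.append((line_number, word))
    return unknown


# Hypothetical project dictionary: common words plus product names.
project_dictionary = {"use", "the", "api", "in", "kibana"}
sample = "Use the `_search` API in Kibanna.\n----\nGET /_search\n----\n"
report = find_unknown_words(sample, project_dictionary)
```

The comparison is case-insensitive, so this sketch would not catch wrong capitalization of a product name; that would need a separate, case-sensitive list.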
Re: Run pre-build checks on the Asciidoc files: There are additional things we should check for that we do not check today:
If not part of the initial scope of this issue, then at least as something we'd want to check for in the future. |
It would also be helpful if the full builds could be launched as needed on Jenkins/CI or some other managed resource, since the full build and push is very slow when I have to run it on my local machine. In particular, I'm referring to the process listed as "Pushing new versions or releases of documentation to the web" here: https://wiki.elastic.co/display/DOC. For some reason we seem to need to do this for almost every release. |
One more that occurred to me after a long night of tracking down broken links: It would be good if a link checker could be run before changes are merged. Now even when I try to proactively prevent them, it's common that more pop up only after the PR is merged, since that's when the full build is done and links are checked. |
@mgreau I believe you have some thoughts around improving our asciidoc performance? |
Two of our biggest pain points are not discovering broken cross doc links before we merge, and pushing updates that are missing content due to undefined attributes. @tsantos You were working on a script to check for undefined attributes in asciidoc files, right? Having a way to run a pre-flight check to catch them before we build would be super helpful. |
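As an illustration of what such a pre-flight check could look like, here is a minimal sketch. It is not the script mentioned above, and the function name `find_undefined_attributes` is hypothetical. It assumes attributes are defined with `:name: value` entries and referenced as `{name}`, ignores definition order, and knows nothing about built-in attributes or attributes passed on the command line, so those would have to be supplied through `predefined`.

```python
import re

# Attribute entry at the start of a line, such as ":version: 6.3".
ATTR_DEF = re.compile(r"^:([A-Za-z0-9_][A-Za-z0-9_-]*):(?:\s|$)", re.MULTILINE)
# Attribute reference such as "{version}".
ATTR_REF = re.compile(r"\{([A-Za-z0-9_][A-Za-z0-9_-]*)\}")


def find_undefined_attributes(text, predefined=()):
    """Return (line_number, name) pairs for references with no definition."""
    # Everything defined anywhere in the text counts as defined
    # (a simplification: definition order is not checked).
    defined = set(predefined) | set(ATTR_DEF.findall(text))
    missing = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name in ATTR_REF.findall(line):
            if name not in defined:
                missing.append((line_number, name))
    return missing


sample = ":version: 6.3\n\nSee {version} and {branch}.\n"
missing = find_undefined_attributes(sample)
```

A brace-wrapped word inside a code sample would also be reported, so a real check would probably need to skip listing blocks or keep an allowlist.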
yes, sure but I have always worked with the Asciidoctor toolchain:
Having said that, this issue is not about using Asciidoctor (I don't know if this is something that has already been studied) so I can give you some links (not tested personally) about discovering broken cross doc links:
The section "Catch a Missing or Undefined Attribute" can help you understand how this is implemented in Asciidoctor, but again, the behavior is not the same as in the Python implementation. Let me know if I can help you on this subject. |
@mgreau I've tried asciidoctor in the past. Although it claims to be compatible with asciidoc, there are a number of differences. Also, when I tried to run it on the definitive guide, it just hung. (this may have improved since then). Plus we've made a number of customizations to asciidoc that we'd have to migrate across.
we already have a tool for this, the question is more about the need to build all the docs before being able to run it.
Correct. Being able to have undefined attributes is considered a feature in asciidoc. Not sure I agree, but it is pretty deeply ingrained. |
Correct, there are some differences and they are documented in this Changed Syntax documentation. Also, there is a |
I've just tested out the latest asciidoctor and it builds the def guide (which it didn't before) and it is significantly faster than asciidoc. To produce docbook 4.5 output, it took 1 second while asciidoc took 19s. So it is definitely worth exploring. (That said, a significant amount of time is also taken by the XSL transformations: a further 38 seconds). Docbook output is required as the asciidoctor html output doesn't support html chunking, plus docbook gives us strict link checking. It would still be a big speedup. That said, I don't know about the difference in output and error checking behaviour. Plus the customizations need to be ported over. This is a huge undertaking. |
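For a sense of scale, the timings quoted above work out as follows. This is only arithmetic on the reported numbers, and it assumes the 38-second XSL step stays the same whichever tool produces the docbook.

```python
# Timings reported above for the def guide, in seconds.
asciidoc_docbook = 19
asciidoctor_docbook = 1
xsl_transform = 38  # assumed unchanged by the choice of tool

asciidoc_total = asciidoc_docbook + xsl_transform        # 57
asciidoctor_total = asciidoctor_docbook + xsl_transform  # 39

saved = asciidoc_total - asciidoctor_total               # 18
saved_percent = round(100 * saved / asciidoc_total, 1)   # 31.6
```

So the docbook step itself is about 19 times faster, while the end-to-end time for this one book drops by roughly a third because the XSL step dominates.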
@clintongormley cool that it works now! Also I just saw this on the asciidoc.org website:
from this commit on the asciidoc project. I was not aware of that (the commit is from September), I have sent a message to Dan Allen to know more about it.
yes sure, there is some work to do. I can take a look if we plan to work on it (cc @drewr) |
From today's festivities, this is a big one: Automate testing of conf.yaml changes--add a CI check that runs the full build from the PR for the update. |
@debadair between now and having a CI check, wouldn't things be much better if we ran a manual full build a few days before the release? Then we could fight through the issues with less time pressure. |
One general suggestion I have is to have 1 branch per minor version. This would match with the branching we use for the rest of the stack, and it means improvements can be made to unreleased versions without destabilizing the build for older versions. If we did this, I think we could integrate the docs build into unified release, so that the docs are fully built and verified along with all the artifacts. At minimum, a docs build could be triggered along with the commit shas of the projects so that we can know they will build (see https://github.com/elastic/release-manager/issues/217). |
@kevinkluge We absolutely should run a full build ahead of time--that's part of our regular procedure & should have happened this time around. The trick is that given the volume of changes that generally occur in the days leading up to a release, running it a few days ahead doesn't guarantee smooth sailing. So we tend to leave it to the bitter end because it's a time-consuming, resource-intensive task to repeat. (At a time when we are super-busy trying to test and merge updates--and we can only run one doc build at a time locally.) This is really a matter of getting the docs side of things up to speed with the level of testing & automation we've put in place for everything else. @rjernst Do you mean branch the doc repo for each release, as we do on the product side? There would certainly be advantages to doing that. I'm all for making changes so that we can release specific docs and specific versions. However, not all of the docs use the same versioning as the stack. And some of our infra requires everything to be updated at once when we add new versions. That said, I would love it if we could get the docs integrated with the unified release. |
Related to #249 (comment), it would be helpful to have some way to run https://github.com/elastic/docs/blob/master/release_docs.sh on Jenkins or some other remote server, since it takes a very long time to run on my local machine. We seem to need to run it on most release days. I also had to disable the automatic doc builds, since they were finishing sooner and preventing release_docs.sh from uploading successfully. This overlap between the two jobs might need to be considered. |
@debadair For spell checking, check out this contribution: https://github.com/checkstyle/checkstyle/blob/master/.ci/test-spelling-unknown-words.sh (referenced in elastic/beats#7456). |
A number of these things have been addressed:
Closing this issue in favor of individual issues for the remaining pieces:
|
We've outgrown the monolithic approach to building, verifying, and publishing the docs. Along the way, the doc repo has grown to the point where it's problematic for people to clone.
The most obvious signs of this are our release day delays. People don't always build the docs before merging and very rarely build everything, which is the only way to check cross doc links. Last-minute and unrelated changes frequently require us to restart the doc build, and because we have to build everything, each restart takes a non-trivial amount of time.
This also affects our day to day workflow, which features frequent diversions to chase down unrelated build failures so we can finish the tasks at hand.
Ideally, we'd be able to:
In addition to making the doc build itself more modular, we need to figure out how to make the docs repo easier to deal with--perhaps by splitting the build infrastructure and generated docs apart. It can literally take people hours to clone the repo, so people avoid putting other doc tools there and opt out of contributing to the docs.
We've started to chip at these problems in the following PRs: