-
Notifications
You must be signed in to change notification settings - Fork 335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow building all docs with a local repo #659
Conversation
We current offer two ways to build the docs: 1. Build an arbitrary book with `--doc` and its companion arguments like `--resource` and `--asciidoctor`. 2. Build *all* of the books that Elastic publishes with `--all`. Building an arbitrary book is nice and quick because it is have very little overhead on top of the time it takes to build the actual book. But it isn't always the best test because the command that you use to build the book might not exactly match the invocation that `--all` uses to build the book. Furthermore, building a single book is not usually enough! When you are testing documentation changes that you've made locally you often don't know all of the books that your changes might effect. The Elasticsearch repository is used in about a dozen books in combination with a hand full of other repositories. Building all of the docs is quite slow and you might run into errors that you didn't cause. It is exactly what we need to build the docs for the public web site but it isn't right for testing some changes that you make locally. It also doesn't line up properly with local development as it only knows how to pull repos from the Elastic organization at github. This change adds a "third way" to build test the docs that splits the difference between the two options. It uses the same mechanisms that we usually use to build all of the docs but allows you to substitute local directories in place of some branch of some repo: ``` ./build_docs --all --target_repo git@github.com:elastic/built-docs.git \ --keep_hash \ --sub_dir elasticsearch:master:~/Code/elasticsearch ``` This is nice because it: 1. Rebuilds all of the books that your local directory is involved with 2. Uses standard switches when building each of those books 3. Does not rebuild books that don't use your local directory That third thing is actually caused by that new `--keep_hash` flag. It causes the build's up-to-date checking to use the same hash that was used the last time the book was built. `--keep_hash` has uses beyond being paired with `--sub_dir`. In particular, it is useful when testing what happens when you switch a book to Asciidoctor. You can switch the book to Asciidoctor in `conf.yaml` and rebuild with `--keep_hash` and the build will *only* have the asciidoctor change in it. This also drop `--no_fetch` because `--keep_hash` is better in almost every way. The goal of `--sub_dir` isn't so much that people will execute it locally, but that we can execute it on CI. I'm sure folks will execute it locally but it pretty heavy because, just like a normal `--all` build it has to: 1. Clone the `built-docs` repo or fetch updates if it is already cloned. 2. Check out the `built-docs` repo. This is truly slow because it has many, many files in it. 3. Clone all of the repos used to build every book or fetch updates if they are already cloned. I suspect in time we can fix some of these. But for now, this is a heavy thing.
build_docs
Outdated
@@ -103,8 +104,7 @@ def run_build_docs(args): | |||
'-v', | |||
'%s:/doc/%s:ro,cached' % (repo_root, repo_name) | |||
]) | |||
build_docs_args.append( | |||
'/doc/' + repo_name + path.replace(repo_root, '')) | |||
return '/doc/' + repo_name + path.replace(repo_root, '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dropped automatically adding it to the list because we don't want to do that for --sub_dir
.
@@ -489,6 +489,7 @@ sub init_repos { | |||
user => $Opts->{user}, | |||
url => $Opts->{target_repo}, | |||
reference => $reference_dir, | |||
keep_hash => 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't keep_hash
on the target repo because the target_repo
contains the list of hashes to keep!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's worth a review comment, is it worth a code comment?
@@ -532,7 +534,7 @@ sub init_repos { | |||
else { | |||
$pm->start($name) and next; | |||
eval { | |||
$repo->update_from_remote() unless $Opts->{no_fetch}; | |||
$repo->update_from_remote(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I drop this feature because --keep_hash
is better in most cases.
integtest/Makefile
Outdated
.PHONY: simple_all_expected_files | ||
simple_all_expected_files: | ||
# Test the simplest possible `--all` invocation | ||
$(call BUILD_MINIMAL_ALL,/tmp/simple_all) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched to these call
invocations because they don't bump into make's broken up-to-date checking for directories.
@@ -44,10 +44,10 @@ contents: | |||
prefix: test | |||
current: master | |||
branches: [ master ] | |||
index: minimal.asciidoc | |||
index: index.asciidoc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
index.asciidoc
allows me to reuse this file in other tests.
$new .= '|asciidoctor' if $asciidoctor; | ||
|
||
return $old ne $new if $self->keep_hash; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're keeping the hash then we don't need to check to see if any of the files that we're using changed so we can just return here.
} | ||
|
||
#=================================== | ||
sub tree { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tree
subroutine was not used.
my ($msg) = ( $log =~ /^\w+\s+([^\n]+)/ ); | ||
my $msg; | ||
if ( $sha eq 'local' ) { | ||
$msg = 'local changes'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is worth looking at the substitution branch to try and find a commit message here. We don't even know that the substitution directory is in a git repo....
@@ -316,6 +341,7 @@ sub checkout_to { | |||
#=================================== | |||
my ( $self, $destination ) = @_; | |||
|
|||
die 'sub_dir not supported with checkout_to' if %{ $self->{sub_dirs}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is only used by the target_repo
so we don't need to support sub_dirs
. In fact, it doesn't make sense to have it!
This seems like it helps with some of the issues described in #249 I initially thought that it could help with the scenario where we need to know which books to rebuild when we make changes to shared files like https://github.com/elastic/docs/blob/master/shared/attributes.asciidoc. On closer reading, however, I don't think that's right--these parameters wouldn't catch that scenario and we'll still need to forcefully rebuild all. |
I think that is true.
I think it catches the "Check cross-doc links and perform other post-build verification on specific books using the already-built versions of everything else." bit. I don't think it covers the rest though! |
@jarpy, could you have a look at this one when you get a chance please? |
Woops, there are a lot of conflicts! I'll see about fixing those as soon as I finish what I've got locally. |
integtest/Makefile
Outdated
# First build a repository to use as the source. | ||
# REPLACE the minimal documentation in the source repo | ||
cp ../README.asciidoc /tmp/source/index.asciidoc | ||
cd /tmp/source && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this ever be run in parallel with itself? Might need a proper tmpdir.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be difficult to get it to run in parallel with itself but I may as well make a proper tmpdir. You are right about that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess Docker kinda takes care of that, doesn't it? I may have been thinking too old-school.
mount_docs_repo_and_dockerify_path(resource_dir, resource_dir) | ||
build_docs_args.append(mount_docs_repo_and_dockerify_path( | ||
resource_dir, resource_dir)) | ||
elif arg == '--sub_dir': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you add features, all the manual parsing of argv
is going to cost you more and more. argparse
isn't great, but is there any value in considering it over full DIY?
(Normally, I would suggest Click, but I know you can't do packages.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've considered it on and off for a while. I think the value that DIY gets me is being able to pass through arguments that I don't know about. I only rewrite the arguments that have anything to do with paths. The cost of DIY is that I need this everywhere. At this point I think it is net simpler, but I'd be receptive to any argument to the contrary. Like, I wonder if it'd be helpful to force me to list all of the supported arguments in the python file because we plan to drop the perl file one day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"I've considered it"
Cool. I trust your judgement.
@@ -489,6 +489,7 @@ sub init_repos { | |||
user => $Opts->{user}, | |||
url => $Opts->{target_repo}, | |||
reference => $reference_dir, | |||
keep_hash => 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's worth a review comment, is it worth a code comment?
We current offer two ways to build the docs:
--doc
and its companion arguments like--resource
and--asciidoctor
and--chunk
.--all
.Building an arbitrary book is nice and quick because it is have very
little overhead on top of the time it takes to build the actual book.
But it isn't always the best test because the command that you use to
build the book might not exactly match the invocation that
--all
usesto build the book. Furthermore, building a single book is not usually
enough! When you are testing documentation changes that you've made
locally you often don't know all of the books that your changes might
effect. The Elasticsearch repository is used in about a dozen books in
combination with a hand full of other repositories. Also! If you want
to build any books that require multiple repositories then you have to
set all of them up appropriately.
Building all of the docs doesn't have any of those problems but it isn't
wasn't compatible with building from the local filesystem. This change
fixes that by adding a
--sub_dir
option to "substitute" and "directory"in place of some branch of some repository when building all of the docs.
This overrides the standard up-to-date checks to force all books being
built with these subsitutions to be rebuilt every time. It looks like:
This is nice because it:
That third thing is actually caused by that new
--keep_hash
flag. Itcauses the build's up-to-date checking to use the same hash that was
used the last time the book was built.
--keep_hash
has uses beyondbeing paired with
--sub_dir
. In particular, it is useful when testingwhat happens when you switch a book to Asciidoctor. You can switch the
book to Asciidoctor in
conf.yaml
and rebuild with--keep_hash
andthe build will only have the asciidoctor change in it.
This also drops
--no_fetch
because--keep_hash
is better in almostevery way.
The goal of
--sub_dir
isn't so much that people will execute itlocally, but that we can execute it on CI. I'm sure folks will execute
it locally but it pretty heavy because, just like a normal
--all
buildit has to:
built-docs
repo or fetch updates if it is already cloned.many, many files in it.
built-docs
repo. This is truly slow because it hasthey are already cloned.
I suspect in time we can fix some of these. But for now, this is a heavy
thing.