Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build: proof of concept for pre/post build commands (build.jobs) #9002

Merged
merged 12 commits into from
Mar 15, 2022

Conversation

humitos
Copy link
Member

@humitos humitos commented Mar 7, 2022

I started this POC by reading all the notes we have and taking a look at the code to understand where are the changes we require doing to make our ideas fit on the build's flow. I'm not yet happy with it and I need to keep thinking if the structure/pattern I'm proposing is good. I'm happy with the result now! 🤗

My main idea here is to move all the code related to the build itself outside the Celery task to isolate it on its own class. This new class (now named DocumentationBuilder) will be in charge of running all the commands required to build the documentation. On the other hand, the Celery task (and its helpers) will be in charge of the logic required to Read the Docs itself (e.g. updating the build's status, generating the search files, etc)

Follows config's file design doc from: https://github.com/readthedocs/readthedocs.org/pull/8190/files#diff-977beae664fbfb05837d2a830fe7432be0e4ed66c2c26317b1bb1645245c35e7R123

@ericholscher
Copy link
Member

I haven't thought about this deeply, but it seems like a good way to abstract things. I'd really like to avoid the term builder, since that's already highly overloaded in our discussions. I don't know the right word for it, but I'm 👍 on having stronger abstractions and assignment of responsibility in the code.

This does feel like a larger refactor thats sneaking in, which I don't think it is required. I'm curious if there's additional value for the user-facing feature from this refactor, or if it's mostly just doing 2 things at once since we're already touching the code?

@humitos
Copy link
Member Author

humitos commented Mar 8, 2022

This does feel like a larger refactor thats sneaking in, which I don't think it is required. I'm curious if there's additional value for the user-facing feature from this refactor, or if it's mostly just doing 2 things at once since we're already touching the code?

I started thinking this with the abstraction in mind, yes. There shouldn't be any difference for the user with or without this abstraction but I think it's better to put this related code all together than splitting it in multiple places.

I'll keep reading the current code and thinking what's the best way to implement this. Thanks for your early feedback!

@humitos
Copy link
Member Author

humitos commented Mar 8, 2022

OK. I did some progress here and it does not seem to be a lot of work and the result it's a lot clearer. It's easy to see where the code for the pre/post steps is run.

I did some QA already and multiple versions from test-builds repository are building properly. The next steps would be:

  • make sync_repository_task to work with the new structure
  • find out a better name for DocumentationBuilder class
  • take a look at the tests in case there are some ones that need to be adapted
  • implement config file pre/post keys (another PR based on this one)
  • add an integration test for the whole process (another PR based on this one)

Most build-related steps are migrated into DocumentationBuilder class.
self.vcs_repository.update_submodules(self.data.config)

def post_checkout(self):
commands = [] # self.data.config.build.jobs.post_checkout
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These config.build.job.* keys should be the only thing required to implement to make everything work. It may make sense to start this work in another PR branched off from this one so the diff is smaller and easier to review.

Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. I've looked through most of it and it cleans up the tasks nicely, and bring all the various build-related code into a specific place. I'd like a bit more clarity on what belongs in the tasks vs. builder, I think to better understand how to move forward (and name it :D)

readthedocs/projects/tasks/builds.py Outdated Show resolved Hide resolved
# - self.install()
# - self.build()
# - self.upload()
# (finished)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels really clean. I think the big question is the seperate of concerns. I see how we're updating the build data, and we're doing that in the task instead of the builder -- I'd like to be a bit more clear on why that is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally wrote this because I thought we would have more things to do on each state, but in the end, it's just to update the build's status and call one method from DocumentationBuilder class. So, now I think that splitting them into methods may add more overhead than another thing.

I updated this chunk of code just with small comments for each "section" and I think that may be enough for now. In the future, if we add more extra logic here, we can split them. What do you think?

readthedocs/doc_builder/builder.py Outdated Show resolved Hide resolved
readthedocs/doc_builder/builder.py Outdated Show resolved Hide resolved
readthedocs/doc_builder/builder.py Outdated Show resolved Hide resolved
It causes a MaximumRecursionError and it does not test anything different from
other tests that also uses `load_yaml_config` to perform their checks.
@humitos humitos marked this pull request as ready for review March 9, 2022 15:39
@humitos humitos requested a review from a team as a code owner March 9, 2022 15:39
@humitos
Copy link
Member Author

humitos commented Mar 9, 2022

Making this PR ready to review. I think the only thing missing here is to figure out the name we want to give to this class/concept to make it more readable. Then, we can merge it as it is and open a new PR to work on the config file required changes. That way, we are not blocking this work and we avoid generating conflicts with new changes. (see #9002 (comment))

Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me. I haven't really been able to tell what changed on the tests, but I think the overall approach is great. I'm pretty happy merging it, but also hoping for at least 1 more person to review it before we ship it..

readthedocs/doc_builder/builder.py Outdated Show resolved Hide resolved
readthedocs/doc_builder/builder.py Outdated Show resolved Hide resolved
@humitos humitos enabled auto-merge March 15, 2022 09:50
@humitos humitos merged commit 2e3f208 into master Mar 15, 2022
@humitos humitos deleted the humitos/build-jobs-config branch March 15, 2022 10:02
jsquyres added a commit to jsquyres/readthedocs.org that referenced this pull request Mar 29, 2022
These env vars were accidentally removed in
readthedocs#9002.  This commit
naievely restores the 2 missing env variables in the new
get_rtd_env_vars() location.

Signed-off-by: Jeff Squyres <jeff@squyres.com>
jsquyres added a commit to jsquyres/readthedocs.org that referenced this pull request Mar 29, 2022
These env vars were accidentally removed in
readthedocs#9002.  This commit
naievely restores the 2 missing env variables in the new
get_rtd_env_vars() location.

Signed-off-by: Jeff Squyres <jeff@squyres.com>
jsquyres added a commit to jsquyres/readthedocs.org that referenced this pull request Mar 29, 2022
These env vars were accidentally removed in
readthedocs#9002.  This commit
naively restores the 2 missing env variables in the new
get_rtd_env_vars() location.

Signed-off-by: Jeff Squyres <jeff@squyres.com>
stsewd pushed a commit that referenced this pull request Mar 30, 2022
These env vars were accidentally removed in
#9002.  This commit
naively restores the 2 missing env variables in the new
get_rtd_env_vars() location.

Signed-off-by: Jeff Squyres <jeff@squyres.com>
ericholscher pushed a commit that referenced this pull request Mar 30, 2022
These env vars were accidentally removed in
#9002.  This commit
naively restores the 2 missing env variables in the new
get_rtd_env_vars() location.

Signed-off-by: Jeff Squyres <jeff@squyres.com>
ericholscher pushed a commit that referenced this pull request Mar 30, 2022
These env vars were accidentally removed in
#9002.  This commit
naively restores the 2 missing env variables in the new
get_rtd_env_vars() location.

Signed-off-by: Jeff Squyres <jeff@squyres.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants