Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize doc builds #14856

Merged
merged 9 commits into from
Jan 26, 2024
18 changes: 12 additions & 6 deletions ci/build_docs.sh
Original file line number Diff line number Diff line change
Expand Up @@ -41,19 +41,25 @@ popd
rapids-logger "Build Python docs"
pushd docs/cudf
make dirhtml
make text
mkdir -p "${RAPIDS_DOCS_DIR}/cudf/"{html,txt}
mkdir -p "${RAPIDS_DOCS_DIR}/cudf/html"
mv build/dirhtml/* "${RAPIDS_DOCS_DIR}/cudf/html"
mv build/text/* "${RAPIDS_DOCS_DIR}/cudf/txt"
if [[ "${RAPIDS_BUILD_TYPE}" != "pull-request" ]]; then
make text
mkdir -p "${RAPIDS_DOCS_DIR}/cudf/txt"
mv build/text/* "${RAPIDS_DOCS_DIR}/cudf/txt"
fi
popd

rapids-logger "Build dask-cuDF Sphinx docs"
pushd docs/dask_cudf
make dirhtml
make text
mkdir -p "${RAPIDS_DOCS_DIR}/dask-cudf/"{html,txt}
mkdir -p "${RAPIDS_DOCS_DIR}/dask-cudf/html"
mv build/dirhtml/* "${RAPIDS_DOCS_DIR}/dask-cudf/html"
mv build/text/* "${RAPIDS_DOCS_DIR}/dask-cudf/txt"
if [[ "${RAPIDS_BUILD_TYPE}" != "pull-request" ]]; then
make text
mkdir -p "${RAPIDS_DOCS_DIR}/dask-cudf/txt"
mv build/text/* "${RAPIDS_DOCS_DIR}/dask-cudf/txt"
fi
popd

rapids-upload-docs
1 change: 1 addition & 0 deletions conda/environments/all_cuda-118_arch-x86_64.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,7 @@ dependencies:
- sphinx-autobuild
- sphinx-copybutton
- sphinx-markdown-tables
- sphinx-remove-toctrees
- sphinxcontrib-websupport
- streamz
- sysroot_linux-64==2.17
Expand Down
1 change: 1 addition & 0 deletions conda/environments/all_cuda-120_arch-x86_64.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,7 @@ dependencies:
- sphinx-autobuild
- sphinx-copybutton
- sphinx-markdown-tables
- sphinx-remove-toctrees
- sphinxcontrib-websupport
- streamz
- sysroot_linux-64==2.17
Expand Down
1 change: 1 addition & 0 deletions dependencies.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -472,6 +472,7 @@ dependencies:
- sphinx-autobuild
- sphinx-copybutton
- sphinx-markdown-tables
- sphinx-remove-toctrees
- sphinxcontrib-websupport
notebooks:
common:
Expand Down
13 changes: 12 additions & 1 deletion docs/cudf/source/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,10 +16,12 @@
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import filecmp
import glob
import os
import re
import sys
import tempfile
import xml.etree.ElementTree as ET

from docutils.nodes import Text
Expand Down Expand Up @@ -62,13 +64,16 @@ class PseudoLexer(RegexLexer):
"sphinx.ext.autodoc",
"sphinx.ext.autosummary",
"sphinx_copybutton",
"sphinx_remove_toctrees",
"numpydoc",
"IPython.sphinxext.ipython_console_highlighting",
"IPython.sphinxext.ipython_directive",
"PandasCompat",
"myst_nb",
]

remove_from_toctrees = ["user_guide/api_docs/api/*"]


# Preprocess doxygen xml for compatibility with latest Breathe
def clean_definitions(root):
Expand Down Expand Up @@ -126,7 +131,13 @@ def clean_all_xml_files(path):
for fn in glob.glob(os.path.join(path, "*.xml")):
tree = ET.parse(fn)
clean_definitions(tree.getroot())
tree.write(fn)
with tempfile.NamedTemporaryFile() as tmp_fn:
tree.write(tmp_fn.name)
Comment on lines +134 to +135
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this save time? How is it faster to write to a tempfile, compare to fn, and then write fn versus just writing fn?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that this particular part is slower, but sphinx probably has some make-like timestamp-based "do I rebuild dependents of this file" logic. So by avoiding updating a file with a newer timestamp and the same contents, something downstream does less work.

Copy link
Contributor Author

@vyasr vyasr Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. This change no longer has a meaningful impact in PR CI because we don't build text docs, but in branch/nightly builds (because we build html and then text) or local rebuilds it will make things faster because without this Sphinx tries to read the entire set of files again thinking that they have changed. Instead of calling tree.write here I could also have used os.rename/shutil.move etc but the amount of time taken by this part of the code is negligible and I didn't feel like dealing with issues around manual clean up of the temp file etc.

# Only write files that have actually changed.
if not filecmp.cmp(tmp_fn.name, fn):
tree.write(fn)




# Breathe Configuration
Expand Down
Loading