convertDoxygen: Add docstrings to more python objects #2526
This also provides changes necessary to support creation of python type stubs.
Changes:
- do not filter "file" xml type: these are necessary to find info on non-class functions
- capture function default arguments
- add slots for efficiency: there are a lot of xml nodes
- fix a bug with Writer.__init__
@@ -52,29 +52,6 @@ class Writer:
    # we can combine property docstrings for getters/setters.
    propertyTable = {}

    # Default constructor that assumes we're processing a single module
    def __init__(self):
        #
Python doesn't support overloads like C++: you can't simply define a function twice. This __init__ was being replaced by the one below it, so calling Writer() without args never actually worked. I've fixed the use of this class in convertDoxygen.
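A minimal sketch of the behavior described above. The attribute names are hypothetical and this is not the actual Writer class; it only shows that a second def with the same name rebinds it, so the first definition is lost:

```python
class Writer:
    # First definition: intended as a no-argument "overload".
    def __init__(self):
        self.packageName = None
        self.moduleName = None

    # Second definition: rebinds the name __init__ in the class body,
    # silently replacing the definition above.
    def __init__(self, packageName, moduleName):
        self.packageName = packageName
        self.moduleName = moduleName


# Calling without arguments fails, because only the second __init__ exists.
try:
    Writer()
    no_arg_call_worked = True
except TypeError:
    no_arg_call_worked = False

# The two-argument form is the only one that works.
writer = Writer("pxr", "Usd")
```

If both call styles are wanted, a single __init__ with default argument values is the usual Python approach.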
-    logfile.write('\n'.join(lines))
-    logfile.close()
+    with open(output_file, 'w') as logfile:
+        logfile.write('\n'.join(lines))
use context manager to ensure that the file is closed in case of an exception (also, more idiomatic)
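As a small illustration of the point above, assuming a temporary output path and made-up log lines, the with statement closes the file as soon as the block exits, including when an exception is raised inside it:

```python
import os
import tempfile

# Hypothetical stand-ins for the script's real values.
lines = ["first line", "second line"]
output_file = os.path.join(tempfile.mkdtemp(), "docstrings.log")

# The file is closed when the block exits, even if write() raises.
with open(output_file, "w") as logfile:
    logfile.write("\n".join(lines))

file_was_closed = logfile.closed

# Read the file back to confirm the content was flushed to disk.
with open(output_file) as readback:
    written = readback.read()
```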
@@ -42,13 +42,18 @@ class XMLNode:
    """
    Rrepresent a single node in the XML tree.
    """
    __slots__ = ("parent", "name", "attrs", "text", "childNodes")

Added slots for performance: we create a lot of instances of this class.
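A rough sketch of what __slots__ changes, using the slot names from the diff. The constructor signature here is an assumption made for illustration. With __slots__, instances have no per-instance __dict__, which is where the memory saving comes from when many nodes are created:

```python
class XMLNode:
    # Slot names taken from the diff above; the __init__ signature is assumed.
    __slots__ = ("parent", "name", "attrs", "text", "childNodes")

    def __init__(self, parent, name, attrs, text):
        self.parent = parent
        self.name = name
        self.attrs = attrs
        self.text = text
        self.childNodes = []


node = XMLNode(None, "doxygen", {}, "")

# Slotted instances carry no per-instance __dict__.
has_instance_dict = hasattr(node, "__dict__")

# A side effect: attributes outside the declared slots are rejected.
try:
    node.extra = 1
    extra_attribute_rejected = False
except AttributeError:
    extra_attribute_rejected = True
```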
    def isStatic(self):
        """Is this doc element static?"""
        return self.static is not None and self.static == 'yes'

Added isStatic and __repr__ methods for convenience.
     writer = Writer(packageName, moduleName)
     # Parser.traverse builds the docElement tree for all the
     # doxygen XML files, so we only need to call it once if we're
     # processing multiple modules
-    if (docList == None):
+    if (docList is None):
Using is instead of == is considered idiomatic because None is a singleton.
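One way to see the difference, using a deliberately contrived class that is not part of the PR: == dispatches to a user-defined __eq__, which can return anything, while is compares object identity and cannot be overridden.

```python
class AlwaysEqual:
    """Contrived example: claims equality with every object."""

    def __eq__(self, other):
        return True


docList = AlwaysEqual()

# == calls AlwaysEqual.__eq__, so this comparison is misleading.
equals_none = (docList == None)  # noqa: E711

# is checks identity against the single None object.
is_none = (docList is None)
```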
Filed as internal issue #USD-8482
.gitignore
Outdated
@@ -1,4 +1,4 @@
.p4*
.DS_Store
.AppleDouble

*.pyc
Since you're adding this - don't suppose you could add __pycache__ to the list as well?
Had that queued up in another PR, but if you just add that to your MR, it would avoid a merge conflict...
Hi @chadrik, is the .gitignore change critical for the main goal of this PR? For tracking changes for USD releases, it helps if all the changes for a PR are related.
If the .gitignore change isn't specifically needed for this PR, would it be possible to move the .gitignore change (maybe combined with @pmolodo's change) to a separate PR?
Removed!
-    if (kind == "file"):
-        continue
So I think this change is the heart of this PR - unfortunately, when we tested it, it had some negative side effects.
Specifically, many of our classes lost their docstrings after running convertDoxygen.py with this change.
One example was pxr.Usd.ZipFileWriter - in Usd/__DOC.py, we now get:
result["ZipFileWriter"].__doc__ = """"""
instead of:
result["ZipFileWriter"].__doc__ = """
Class for writing a zip file. This class is primarily intended to
support the .usdz file format. It is not a general-purpose zip writer,
as it does not implement the full zip file specification. However, all
files written by this class should be valid zip files and readable by
external zip modules and utilities.
"""
We're using Doxygen 1.8.18... which is old, but also newer than the 1.8.14 listed in VERSIONS.md.
@chadrik, can you confirm whether or not you're seeing the same, and what version of Doxygen you're using?
Hey @pmolodo , you made us realize we neglected to update VERSIONS.md for the addition of doxygen-awesome, which requires 1.9.x (we are using 1.9.6 internally). We will be pushing out updates to VERSIONS.md shortly, but can you retry your experiments with a 1.9.x version?
I tried a test with only removing these lines and comparing build output (using doxygen 1.8.x). The new __DOC.py appears to have duplicate (empty) __doc__ values for some things, like ZipFileWriter as @pmolodo notes, which end up overwriting the earlier defined __doc__ values.
This is fixed.
I did test the results before and after my change to ensure proper results
but it’s quite possible that I missed something. I was doing a synthetic
test of just the docstring build rather than a full cmake build so that
could also have led to differences. I will double check this on my end.
On Thu, Jul 27, 2023 at 5:45 PM F. Sebastian (spiff) Grassia wrote:
In docs/python/doxygenlib/cdParser.py:
> - if (kind == "file"):
> -     continue
Hey @pmolodo, you made us realize we neglected to update VERSIONS.md for the addition of doxygen-awesome, which requires 1.9.x (we are using 1.9.6 internally). We will be pushing out updates to VERSIONS.md shortly, but can you retry your experiments with a 1.9.x version?
…arsing the additional xml entries
These elements shadow class elements of the same name, but they lack any valuable information, and thus create empty docstrings.
I pushed a fix for this.
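The shadowing problem discussed in this thread can be sketched as follows. This is only an illustration with a hypothetical helper, collect_docstrings, and made-up entries; it is not the fix that was pushed. When entries are written in order, a later empty entry with the same name overwrites the earlier documented one unless it is skipped:

```python
def collect_docstrings(entries, skip_empty):
    """Build a name -> docstring table from (name, doc) pairs, in order.

    Hypothetical helper: with skip_empty=True, an empty docstring never
    replaces an entry that has already been recorded under the same name.
    """
    result = {}
    for name, doc in entries:
        if skip_empty and not doc.strip() and name in result:
            continue
        result[name] = doc
    return result


# A documented class entry followed by an empty entry of the same name.
entries = [
    ("ZipFileWriter", "Class for writing a zip file."),
    ("ZipFileWriter", ""),
]

before = collect_docstrings(entries, skip_empty=False)
after = collect_docstrings(entries, skip_empty=True)
```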
On Thu, Jul 27, 2023 at 6:56 PM Dan Y wrote:
In docs/python/doxygenlib/cdParser.py:
> - if (kind == "file"):
> -     continue
I tried a test with only removing these lines and comparing build output (using doxygen 1.8.x). The new __DOC.py appears to have duplicate (empty) __doc__ values for some things, like ZipFileWriter as @pmolodo notes, which end up overwriting the earlier defined __doc__ values.
Description of Change(s)
With these changes, the convertDoxygen build script adds docstrings for many additional python objects. For example, from Sdf these objects are now documented:
The primary change is that we no longer filter "file" xml type: these are necessary to find info on non-class functions.
In addition, this PR also provides a change necessary to support creation of python type stubs: the parser now captures function default arguments.
Performance impact
Casting a wider net during parsing increased parse time from 25s to 1m45s. The addition of slots shaved off ~10s and reduced memory usage.
I've run this through a profiler, and the majority of the time is spent in the xml parsing, with about half of the parsing time spent in pxr code and the other half in the parser base class. I think there's room to make this more efficient if we can avoid instantiating XMLNode unnecessarily. I noticed that for every class/function there is a useless "doxygen" root node, so I tried collapsing that down to a single root, but it only saved 10s and it creates some other problems.
I also tried using multiprocessing to speed up parsing, but even distributing the parsing of files across 16 processes did not speed it up. I suspect this was due to the number of DocElement instances that needed to be serialized and marshaled across to the primary process after parsing, due to the way that the multiprocessing module works in python. If that inference is correct, this approach could be improved by filtering the resulting DocElement instances while parsing, as there are many extraneous functions being captured.
tl;dr with this change, the doc generation process takes longer, but there's not a simple solution.
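To illustrate the serialization cost mentioned above: multiprocessing pickles results to send them back to the parent process, so filtering before returning reduces the payload. The sketch below uses plain dicts as stand-ins for DocElement and an invented filter rule (keep only elements with a non-empty docstring); both are assumptions, and it measures pickle size only, not actual parse time.

```python
import pickle

# Stand-ins for parsed elements: 1 in 10 has documentation (made-up data).
elements = [
    {"name": f"func{i}", "doc": "Some documentation." if i % 10 == 0 else ""}
    for i in range(1000)
]


def keep(element):
    # Hypothetical filter: drop elements that carry no docstring.
    return bool(element["doc"])


filtered = [element for element in elements if keep(element)]

# The payload a worker would have to send back, with and without filtering.
full_size = len(pickle.dumps(elements))
filtered_size = len(pickle.dumps(filtered))
```

Whether this helps in practice depends on how much of the captured data is actually extraneous, which would need to be measured on the real doxygen output.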