convertDoxygen: Add docstrings to more python objects #2526

chadrik · 2023-07-07T14:09:46Z

Description of Change(s)

With these changes, the convertDoxgen build script adds docstrings for many additional python objects.

For example, from Sdf these objects are now documented:

SpecType
Specifier
Permission
Variability
AuthoringError
GetNameForUnit
GetUnitFromName
ValueHasValidType
GetTypeForValueTypeName
GetValueTypeNameForValue
ConvertToValidMetadataDictionary
JustCreatePrimAttributeInLayer
CopySpec
ComputeAssetPathRelativeToLayer
ListOpType
CreatePrimInLayer
JustCreatePrimInLayer
CreateVariantInLayer

The primary change is that we no longer filter "file" xml type: these are necessary to find info on non-class functions.

In addition, this PR also provides a change necessary to support creation of python type stubs: the parser now captures function default arguments.

Performance impact

Casting a wider net during parsing increased parse time from 25s to 1m45s. The addition of slots shaved off ~10s and reduced memory usage.

I've run this through a profiler, and the majority of the time is spent in the xml parsing, with about half of the parsing-time spent in pxr code and the other half in the parser base class. I think there's room to make this more efficient if we can avoid instantiating XMLNode unnecessarily. I noticed that for every class/function there is a useless "doxygen" root node, so I tried collapsing that down to a single root, but it only saved 10s and it creates some other problems.

I also tried using multiprocessing to speed up parsing, but even distributing the parsing of files across 16 processes did not speed it up -- I suspect this was due to the amount of DocElement instances that needed to be serialized and marshaled across to the primary process after parsing, due to the way that the multiprocessing module works in python. If that inference is correct, this approach could be improved by filtering the resulting DocElement classes while parsing, as there are many extraneous functions being captured.

tl;dr with this change, the doc generation process takes longer, but there's not a simple solution.

This also provides changes necessary to support creation of python type stubs Changes: - do not filter "file" xml type: these are necessary to find info on non-class functions - capture function default arguments - add slots for efficiency: there are a lot of xml nodes - fix a bug with Writer.__init__

chadrik · 2023-07-07T14:12:11Z

docs/python/doxygenlib/cdWriterDocstring.py

@@ -52,29 +52,6 @@ class Writer:
    # we can combine property docstrings for getters/setters.
    propertyTable = {}

-    # Default constructor that assumes we're processing a single module
-    def __init__(self):
-        #


python doesn't support overloads like C++: you can't simply define a function twice. This __init__ was being replaced by the one below it, so calling Writer() without args never actually worked. I've fixed the use of this class in convertDoxygen

chadrik · 2023-07-07T14:12:46Z

docs/python/doxygenlib/cdWriterDocstring.py

-            logfile.write('\n'.join(lines))
-            logfile.close()
+            with open(output_file, 'w') as logfile:
+                logfile.write('\n'.join(lines))


use context manager to ensure that the file is closed in case of an exception (also, more idiomatic)

chadrik · 2023-07-07T14:13:23Z

docs/python/doxygenlib/cdParser.py

@@ -42,13 +42,18 @@ class XMLNode:
    """
    Rrepresent a single node in the XML tree.
    """
+    __slots__ = ("parent", "name", "attrs", "text", "childNodes")
+


Added slots for performance: we make a lot of these classes.

chadrik · 2023-07-07T14:13:49Z

docs/python/doxygenlib/cdDocElement.py

+    def isStatic(self):
+        """Is this doc element static?"""
+        return self.static is not None and self.static == 'yes'
+


Added isStatic and __repr__ methods for convenience

chadrik · 2023-07-07T14:15:01Z

docs/python/convertDoxygen.py

        writer = Writer(packageName, moduleName)
        # Parser.traverse builds the docElement tree for all the 
        # doxygen XML files, so we only need to call it once if we're
        # processing multiple modules
-        if (docList == None):
+        if (docList is None):


using is instead of == is considered idiomatic because None is a singleton.

jesschimein · 2023-07-07T14:48:01Z

Filed as internal issue #USD-8482

pmolodo · 2023-07-19T18:24:55Z

.gitignore

@@ -1,4 +1,4 @@
 .p4*
 .DS_Store
 .AppleDouble
-
+*.pyc


Since you're adding this - don't suppose you could add __pycache__ to the list as well?
Had that queued up in another PR, but if you just add that to your MR, would avoid a merge conflict...

Hi @chadrik, is the .gitignore change critical for the main goal of this PR? For tracking changes for USD releases, it helps if all the changes for a PR are related.

If the .gitignore change isn't specifically needed for this PR, would it be possible to move the .gitignore change (maybe combined with @pmolodo's change) to a separate PR?

pmolodo · 2023-07-26T17:51:32Z

docs/python/doxygenlib/cdParser.py

-                if (kind == "file"):
-                    continue


So I think this change is the heart of this PR - unfortunately, when we tested it, it had some negative side effects.
Specifically, many of our classes lost their docstrings after running convertDoxygen.py with this change.

One example was pxr.Usd.ZipFileWriter - in Usd/__DOC.py, we now get:

result["ZipFileWriter"].__doc__ = """"""

instead of:

result["ZipFileWriter"].__doc__ = """ Class for writing a zip file. This class is primarily intended to support the .usdz file format. It is not a general-purpose zip writer, as it does not implement the full zip file specification. However, all files written by this class should be valid zip files and readable by external zip modules and utilities. """

We're using Doxygen 1.8.18... which is old, but also newer than the 1.8.14 listed in VERSIONS.md.

@chadrik, can you confirm whether or not you're seeing the same, and what version of Doxygen you're using?

Hey @pmolodo , you made us realize we neglected to update VERSIONS.md for the addition of doxygen-awesome, which requires 1.9.x (we are using 1.9.6 internally). We will be pushing out updates to VERSIONS.md shortly, but can you retry your experiments with a 1.9.x version?

I tried a test with only removing these lines and comparing build output (using doxygen 1.8.x). The new __DOC.py appears to have duplicate (empty) values for some things, like ZipFileWriter as @pmolodo notes, which end up overwriting the earlier defined values.

This is fixed.

chadrik · 2023-07-28T00:50:41Z

I did test the results before and after my change to ensure proper results but it’s quite possible that I missed something. I was doing a synthetic test of just the docstring build rather than a full cmake build so that could also have led to differences. I will double check this on my end.

…

On Thu, Jul 27, 2023 at 5:45 PM F. Sebastian (spiff) Grassia < ***@***.***> wrote: ***@***.**** commented on this pull request. ------------------------------ In docs/python/doxygenlib/cdParser.py <#2526 (comment)> : > - if (kind == "file"): - continue Hey @pmolodo <https://github.com/pmolodo> , you made us realize we neglected to update VERSIONS.md for the addition of doxygen-awesome, which requires 1.9.x (we are using 1.9.6 internally). We will be pushing out updates to VERSIONS.md shortly, but can you retry your experiments with a 1.9.x version? — Reply to this email directly, view it on GitHub <#2526 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAPOE34CWKYUYNUHDZD453XSL4TNANCNFSM6AAAAAA2B32OIU> . You are receiving this because you were mentioned.Message ID: ***@***.***>

…arsing the additional xml entries These elements shadow class elements of the same name, but they lack any valuable information, and thus create empty docstrings.

chadrik · 2023-07-28T14:23:30Z

I pushed a fix for this.

…

On Thu, Jul 27, 2023 at 6:56 PM Dan Y ***@***.***> wrote: ***@***.**** commented on this pull request. ------------------------------ In docs/python/doxygenlib/cdParser.py <#2526 (comment)> : > - if (kind == "file"): - continue I tried a test with only removing these lines and comparing build output (using doxygen 1.8.x). The new __DOC.py appears to have duplicate (empty) *doc* values for some things, like ZipFileWriter as @pmolodo <https://github.com/pmolodo> notes, which end up overwriting the earlier defined *doc* values. — Reply to this email directly, view it on GitHub <#2526 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAPOE3ZMO6WTCMP3VX6VLLXSME5XANCNFSM6AAAAAA2B32OIU> . You are receiving this because you were mentioned.Message ID: ***@***.***>

chadrik commented Jul 7, 2023

View reviewed changes

chadrik changed the title ~~Doc stubs~~ convertDoxygen: Add docstrings to more python objects Jul 7, 2023

pmolodo reviewed Jul 19, 2023

View reviewed changes

pmolodo reviewed Jul 26, 2023

View reviewed changes

doxygenlib: filter empty innerclass elements that were generated by p…

6bfee0a

…arsing the additional xml entries These elements shadow class elements of the same name, but they lack any valuable information, and thus create empty docstrings.

Remove change to .gitignore

5ada2eb

sunyab changed the base branch from release to dev August 8, 2023 22:51

sunyab added the pending push label Aug 8, 2023

pixar-oss merged commit 5f55f2d into PixarAnimationStudios:dev Aug 16, 2023
28 of 36 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

convertDoxygen: Add docstrings to more python objects #2526

convertDoxygen: Add docstrings to more python objects #2526

chadrik commented Jul 7, 2023 •

edited

Loading

chadrik Jul 7, 2023 •

edited

Loading

chadrik Jul 7, 2023

chadrik Jul 7, 2023

chadrik Jul 7, 2023

chadrik Jul 7, 2023

jesschimein commented Jul 7, 2023

pmolodo Jul 19, 2023

dsyu-pixar Aug 2, 2023

chadrik Aug 5, 2023

pmolodo Jul 26, 2023

spiffmon Jul 27, 2023

dsyu-pixar Jul 28, 2023 •

edited

Loading

chadrik Aug 5, 2023

chadrik commented Jul 28, 2023 via email

chadrik commented Jul 28, 2023 via email

convertDoxygen: Add docstrings to more python objects #2526

convertDoxygen: Add docstrings to more python objects #2526

Conversation

chadrik commented Jul 7, 2023 • edited Loading

Description of Change(s)

Performance impact

chadrik Jul 7, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jesschimein commented Jul 7, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dsyu-pixar Jul 28, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chadrik commented Jul 28, 2023 via email

chadrik commented Jul 28, 2023 via email

chadrik commented Jul 7, 2023 •

edited

Loading

chadrik Jul 7, 2023 •

edited

Loading

dsyu-pixar Jul 28, 2023 •

edited

Loading