-
-
Notifications
You must be signed in to change notification settings - Fork 643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python dependency inference: add/emit debugging information #17039
Comments
I think there's a lot of information that will be output, so something like |
There is some (informal) prior art here in the JVM backend (and the go backend, as well, actually):
Those goals dump the sources of thirdparty dependencies as JSON, and the exact extracted symbols per file (respectively). If symbol extraction were standardized across languages behind a |
I think that a |
Oh, hm! It just occurred to me that another connection to your |
See #17039. Given a testbed of <details> <summary>input</summary> ```python # Copyright 2022 Pants project contributors (see CONTRIBUTORS.md). # Licensed under the Apache License, Version 2.0 (see LICENSE). import json # unownable, root level import os.path # unownable, not root level import watchdog # dependency not included import yaml # dependency included import yamlpath # owned by other resolve try: import weakimport # weakimport missing except ImportError: ... open("src/python/configs/prod.json") # asset open("testprojects/pants-plugins/src/python/test_pants_plugin/__init__.py") ``` </details> we get <details> <summary>output</summary> ``` { "src/python/pants/backend/python/dependency_inference/t.py": { "imports": [ { "name": "weakimport", "reference": { "lineno": 12, "weak": true }, "resolved": { "status": "ImportOwnerStatus.weak_ignore", "address": [] }, "possible_resolve": null }, { "name": "json", "reference": { "lineno": 4, "weak": false }, "resolved": { "status": "ImportOwnerStatus.unownable", "address": [] }, "possible_resolve": null }, { "name": "os.path", "reference": { "lineno": 5, "weak": false }, "resolved": { "status": "ImportOwnerStatus.unownable", "address": [] }, "possible_resolve": null }, { "name": "watchdog", "reference": { "lineno": 7, "weak": false }, "resolved": { "status": "ImportOwnerStatus.unowned", "address": [] }, "possible_resolve": null }, { "name": "yaml", "reference": { "lineno": 8, "weak": false }, "resolved": { "status": "ImportOwnerStatus.unambiguous", "address": [ "3rdparty/python#PyYAML", "3rdparty/python#types-PyYAML" ] }, "possible_resolve": null }, { "name": "yamlpath", "reference": { "lineno": 9, "weak": false }, "resolved": { "status": "ImportOwnerStatus.unowned", "address": [] }, "possible_resolve": [ [ "src/python/pants/backend/helm/subsystems:yamlpath", "helm-post-renderer" ] ] } ], "assets": [ { "name": "src/python/configs/prod.json", "reference": "src/python/configs/prod.json", "resolved": { "status": "ImportOwnerStatus.unowned", "address": [] }, "possible_resolve": null }, { "name": "testprojects/pants-plugins/src/python/test_pants_plugin/__init__.py", "reference": "testprojects/pants-plugins/src/python/test_pants_plugin/__init__.py", "resolved": { "status": "ImportOwnerStatus.unambiguous", "address": [ "testprojects/pants-plugins/src/python/test_pants_plugin/__init__.py:../../../../pants_plugins_directory" ] }, "possible_resolve": null } ] } } ``` </details> Telling you, for each file, for each import, what dependencies pants thought it could have, and what it decided to do with them. This uses almost all the same code as the main dependency inference code, with the exception of the top-level orchestration of it. I think that's pretty close, there's about 100 lines of semi-duplicate code. There's also a more advanced mode that dumps information about each stage of the process. I think this might be useful for people digging through the dependency inference process but not really for end-users. we get it for free, though. Fixes #17039. --- this is fairly critical for performance, so here are benchmarks (with comparison-of-means t-test) | | main | this | difference | P-score | | --- | --- | --- | --- | --- | | `hyperfine --runs=10 './pants --no-pantsd dependencies --transitive ::'` | 21.839 s ± 0.326 s | 22.142 s ± 0.283 s | 1.38% | 0.0395 | | `hyperfine --warmup=1 --runs=10 './pants dependencies --transitive ::'` | 1.798 s ± 0.074 s | 1.811 s ± 0.076 s | 0.72% | 0.7029 | | `hyperfine --runs=10 './pants --no-pantsd dependencies ::'` | 21.547 s ± 0.640 s | 21.863 s ± 1.072 s | 1.47% | 0.4339 | | `hyperfine --warmup=1 --runs=10 './pants dependencies ::'` | 1.828 s ± 0.091 s | 1.844 s ± 0.105 s | 0.88% | 0.7200 | So it looks like this MR might impact performance, by about 1%, although those p-values are mighty unconvincing. LMK if we want to increase runs and get more statistics, I've run the stats a few times throughout and this looks about right, so I think we can proceed with the review under the assumption that there is currently a 1% performance overhead. I'm open to suggestions on improving performance.
Is your feature request related to a problem? Please describe.
There currently isn't any information emitted about what dependencies were inferred. If dependency inference fails, developers don't have much to pinpoint the error.
For concrete usecases:
Describe the solution you'd like
A good first pass would be having the ability to output,
This would help developers confirm that the imports were found and understand if the targets of the import were found or ignored. For the example, with the second usecase, the imports found I would expect something like the following to let me know that it was deliberate and acceptable that there wasn't an error not resolving the imports:
Describe alternatives you've considered
The current output is just checkpointing. With
-ltrace
, these are what is printed (many messages elided):Additional context
rel: #13283 ; this requests information on dependency inference, that requests graphing the normal dependencies. Discussion in that mentions graphing at different scopes, this mentions a scope below what could ordinarily be graphed by that.
The text was updated successfully, but these errors were encountered: