[#9570] Fix fixtures in fixtures/subfolders throwing parsing error #9714
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.
Hello @graciegoheen, I believe this one fixes the issue you encountered, with only about 5 lines of code changed. Would you be so kind as to give it a review? Or perhaps @jtcohen6 would be willing and able to share a thumbs up / thumbs down? 🙈 Cheers
Codecov Report

@@            Coverage Diff             @@
##             main    #9714      +/-   ##
==========================================
+ Coverage   88.08%   88.10%   +0.01%
==========================================
  Files         178      178
  Lines       22433    22443      +10
==========================================
+ Hits        19761    19774      +13
+ Misses       2672     2669       -3
I'll deal with coverage for the one line that is missing shortly~
* Transform skip_parsing (private variable of ManifestLoader.load()) into an instance attribute of ManifestLoader(), with default value False (to enable splitting of ManifestLoader.load())
* Split ManifestLoader.load() to extract the PartialParsing operation into a new method called ManifestLoader.safe_update_project_parser_files_partially() (to reduce cognitive complexity in the codebase and to simplify mocking in unit tests)
* Add "ignore" type comments in the new ManifestLoader.safe_update_project_parser_files_partially() (to silence mypy warnings regarding instance attributes which can be initialized as None or as something else, e.g. self.saved_manifest) [1]

[1] Although I wanted to avoid "ignore" type comments, it seems that addressing these mypy warnings in a stricter sense requires technical alignment and broader code changes. For example, self.saved_manifest might need to be initialized as Manifest instead of Optional[Manifest], so that PartialParsing gets inputs of the type it currently expects. Perhaps that is too far beyond the scope of this fix?
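The shape of that refactor can be sketched roughly as follows. This is a toy illustration only: ToyLoader, _diff_against_saved, the dictionary-based manifest and the input.csv file name are hypothetical stand-ins, and only the method name safe_update_project_parser_files_partially and the skip_parsing attribute are taken from the commit message above. It is not dbt's actual implementation.

```python
class ToyLoader:
    """Toy stand-in for a manifest loader; hypothetical, not dbt's class."""

    def __init__(self, saved_manifest=None):
        # Maps file path -> checksum; None when no manifest was saved
        self.saved_manifest = saved_manifest
        # Instance attribute (default False) so helper methods can set it
        self.skip_parsing = False

    def load(self, project_files):
        """Return the sorted paths that would be parsed in this run."""
        to_parse = project_files
        if self.saved_manifest is not None:
            to_parse = self.safe_update_project_parser_files_partially(project_files)
        return [] if self.skip_parsing else sorted(to_parse)

    def safe_update_project_parser_files_partially(self, project_files):
        """Try a partial parse; on any error fall back to all files."""
        try:
            changed = self._diff_against_saved(project_files)
        except Exception:
            self.skip_parsing = False
            return project_files
        self.skip_parsing = not changed
        return changed

    def _diff_against_saved(self, project_files):
        # Hypothetical diff that fails when a saved file has moved or vanished
        for path in self.saved_manifest:
            if path not in project_files:
                raise KeyError(path)
        return {
            path: checksum
            for path, checksum in project_files.items()
            if self.saved_manifest.get(path) != checksum
        }


saved = {"tests/fixtures/input.csv": "abc"}
moved = {"tests/fixtures/my_model/input.csv": "abc"}
print(ToyLoader(saved).load(moved))
```

Because the partial step lives in its own method, a unit test can replace just that method (for instance with unittest.mock.patch.object) and force the failure path without building a real manifest.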
Hello again @graciegoheen, hitting the missing line in the unit test was more difficult than I originally thought, hence the delay since my last comment.
I am happy to hear any feedback~ Regards
Looks good! Nice test case. Moving that code out into a separate function is definitely an improvement.
Resolves #9570
Problem
Given that:
* A csv fixture is located in the tests/fixtures directory
* dbt build succeeds (target/manifest.json is created)
* dbt test also succeeds
* The csv fixture is then moved to tests/fixtures/my_model
Then:
When the user attempts to run dbt test once more, the execution fails with an unexpected error.
Although at first glance this looked to me like the PartialParsing class was not doing a good job at detecting changes or deleting fixture nodes in between compilations, after further testing I managed to reproduce the issue in two different scenarios involving the tests/fixtures directory.
So the issue is somewhat more general than was originally reported. That leads me to believe it is rather related to the logic that triggers partial parsing and what happens after that logic exits, e.g. error handling within the ManifestLoader class. In said class, the traceback is being parsed under the assumption that frames in the stack conform to a strict format (dbt-core/core/dbt/parser/manifest.py, line 407 in e4fe839).
However, in this particular case the assumption does not seem to hold, plausibly due to the exception type, how or where it was thrown, or even the presence of extra newline ('\n') or caret ('^') characters in the traceback. As a result, the exception is not gracefully caught.
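To illustrate why position-based parsing of a formatted traceback is fragile, here is a minimal sketch using only the standard library. It is not the code from manifest.py; the helper names and the "<fixture>" file name are made up for the example. A SyntaxError raised by compile() is one known case where the formatted output carries an extra source-text line and a caret line before the final message, so the lines near the end no longer sit where a plain exception would put them.

```python
import traceback


def formatted_lines(func):
    """Call func and return traceback.format_exc() split into lines."""
    try:
        func()
    except Exception:
        return traceback.format_exc().splitlines()
    return []


def raise_value_error():
    raise ValueError("fixture not found")


def raise_syntax_error():
    # compile() attaches the offending source text, which adds a caret line
    compile("select (", "<fixture>", "exec")


def has_caret_line(lines):
    """True if any line consists solely of '^' (and '~') markers."""
    return any(line.strip() and set(line.strip()) <= {"^", "~"} for line in lines)


plain_lines = formatted_lines(raise_value_error)
syntax_lines = formatted_lines(raise_syntax_error)

print(plain_lines[-1])               # final line names the exception
print(has_caret_line(syntax_lines))  # caret marker present in this case
```

Code that indexes into the split output at fixed offsets would read a source or caret line where it expects a frame header in the second case.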
In this state, the user is left with only a few options, with the easier ones perhaps being:
a) Run dbt clean before proceeding (to trick the ManifestLoader into skipping partial parsing altogether)
b) Move the csv fixture back to its original directory (needless to say, a downgrade of the feature scope)
Solution
Instead of making too many assumptions about what the traceback.format_exc() statement in line 405 could return (with regard to list item order or string content and form), I suggest changing the ManifestLoader to use Exception object attributes and traceback object functions/attributes to zero in on the last frame in the stack of the traceback, and to extract from it the details of the last exception, i.e. the values of code, line and method that should be added to the exc_info dictionary for graceful exception catching and logging.
I argue this should be a safer alternative to adding extra logic, such as if-else statements and string comparisons, to deal with edge cases where extra newlines or caret characters are present or missing.
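A minimal sketch of that idea, assuming the standard-library traceback.extract_tb() is used to read the innermost frame. The function last_frame_details, the load_fixture helper and the input.csv file name are hypothetical; the PR's actual diff may be structured differently. Only the code, line and method keys come from the description above.

```python
import traceback


def last_frame_details(exc):
    """Return code/line/method for the innermost frame of exc's traceback."""
    frames = traceback.extract_tb(exc.__traceback__)
    if not frames:
        # The exception was never raised, so there is no frame to report
        return {"code": None, "line": None, "method": None}
    last = frames[-1]  # innermost frame, where the exception was raised
    return {
        "code": last.line,    # source text; may be empty if source is unavailable
        "line": last.lineno,  # line number within the file
        "method": last.name,  # name of the function that raised
    }


def load_fixture(path):
    # Hypothetical helper that fails the way a missing fixture might
    raise FileNotFoundError(path)


try:
    load_fixture("tests/fixtures/my_model/input.csv")
except Exception as exc:
    details = last_frame_details(exc)

print(details["method"], details["line"])
```

Reading structured frame attributes avoids any dependence on how many lines the formatted traceback has or on whether caret markers appear in it.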
The approach does not affect any function API either, and the content of the exc_info dictionary is filled as I believe it should be.
Last but not least, with the code changes I suggest, the user can now freely move the csv fixture into subdirectories or out of them: whenever partial parsing runs and fails because a fixture node no longer exists, the exception traceback is properly parsed and full parsing kicks in to ensure a working manifest.json is created as part of the unit test execution.
Checklist
Additional Context
* Tested with the postgresql adapter.
* make test and pre-commit run were successfully executed.
* my_model.sql definition I used:
* unit_test.yml definition I used: