Adding new __repr__ for pyspark StructField such that the error logs explicitly show metadata differences #77
Description
Helps address #76.
Added a new class, StructFieldPrettyPrint, that gives a better representation of the StructField type by showing the name, data type, nullability, and metadata. Currently pyspark's __repr__ method (docs) only returns the name, data type, and nullability. This is not ideal when users want to compare all the attributes, including metadata, since the metadata won't show up in the error message.
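To illustrate the problem and the idea behind the fix, here is a minimal sketch. It is not the code from this PR: to stay runnable without a Spark installation it uses a hypothetical stand-in class (StandInStructField) that only mimics the attributes of pyspark's StructField, and the exact output format of the overriding __repr__ is an assumption.

```python
from typing import Any, Dict, Optional


class StandInStructField:
    """Hypothetical stand-in for pyspark.sql.types.StructField (illustration only)."""

    def __init__(
        self,
        name: str,
        dataType: str,
        nullable: bool = True,
        metadata: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.name = name
        self.dataType = dataType
        self.nullable = nullable
        self.metadata = metadata or {}

    def __repr__(self) -> str:
        # Mimics a repr that leaves the metadata out.
        return f"StructField({self.name!r}, {self.dataType}, {self.nullable})"


class StructFieldPrettyPrint(StandInStructField):
    """Overrides __repr__ so that the metadata is part of the representation."""

    def __repr__(self) -> str:
        # Assumed format: the same fields as the parent, plus the metadata dict.
        return (
            f"StructField({self.name!r}, {self.dataType}, "
            f"{self.nullable}, {self.metadata!r})"
        )


# Two fields that differ only in metadata.
plain_a = StandInStructField("age", "IntegerType()", True, {"unit": "years"})
plain_b = StandInStructField("age", "IntegerType()", True, {"unit": "months"})

pretty_a = StructFieldPrettyPrint("age", "IntegerType()", True, {"unit": "years"})
pretty_b = StructFieldPrettyPrint("age", "IntegerType()", True, {"unit": "months"})

# The parent repr hides the difference; the overriding repr exposes it.
print(repr(plain_a) == repr(plain_b))
print(repr(pretty_a) == repr(pretty_b))
print(pretty_a)
```

With real pyspark, the same idea would presumably be applied by subclassing pyspark.sql.types.StructField instead of the stand-in, so that a schema mismatch message built from repr() also shows the metadata.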
The new __repr__ in StructFieldPrettyPrint overrides the pyspark StructField's __repr__ method with something more explicit that also shows the metadata.
Type of change
How has this been tested?
pytest tests/