-
Notifications
You must be signed in to change notification settings - Fork 237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support code coverage report with single version jar [skip ci] #4030
Conversation
mkdir -p target/jacoco_classes/ | ||
FILE=$(ls dist/target/rapids-4-spark_2.12-*.jar | grep -v test | xargs readlink -f) | ||
pushd target/jacoco_classes/ | ||
jar xf $FILE com org rapids spark3xx-common "spark${JACOCO_SPARK_VER:-301}/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know enough about jacoco, but with the layout being different and code being under spark3xx-common, jacoco is still able to understand that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, there're some warnings like following
[WARN] Execution data for class org/apache/spark/sql/rapids/GpuBitwiseNot does not match.
[WARN] Execution data for class com/nvidia/spark/rapids/shims/v2/ShimDataSourceRDD does not match.
[WARN] Execution data for class org/apache/spark/sql/rapids/GpuStringRepeat does not match.
...
Main report was still be generated properly for com.nvidia.spark.rapids
, org.apache.spark.sql.rapids.execution
, com.nvidia.spark.rapids.shuffle
etc. Please refer to sample coverage report http://(deleted) for more details. Any suggestions to fix above warnings are welcome
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Execution data for class ..." typically means that the checksum for the .class that the data was collected for didn't match the .class provided when generating the report. It would be nice for us to dig in a bit and understand if we have an actual mismatch, or if something else is happening, like the path is getting included in that checksum somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more testing found only udf
related class reported mismatch. All others can be resolved with clean environment build (i.e, spark-premerge-build.sh mvn_verify
). Is there any special handling for udf
?
New coverage report is updated from http://(deleted)
...
[INFO] Loading execution data file /home/alexzhang/dev/github.com/nvidia/spark-rapids/./udf-compiler/target/spark301/jacoco.exec.
[INFO] Loading execution data file /home/alexzhang/dev/github.com/nvidia/spark-rapids/./sql-plugin/target/spark301/jacoco.exec.
[INFO] Loading execution data file /home/alexzhang/dev/github.com/nvidia/spark-rapids/./integration_tests/target/jacoco.exec.
[INFO] Loading execution data file /home/alexzhang/dev/github.com/nvidia/spark-rapids/./tests/target/spark301/jacoco.exec.
[INFO] Loading execution data file /home/alexzhang/dev/github.com/nvidia/spark-rapids/./shims/target/spark301/jacoco.exec.
[INFO] Loading execution data file /home/alexzhang/dev/github.com/nvidia/spark-rapids/./shims/spark301/target/jacoco.exec.
[WARN] Some classes do not match with execution data.
[WARN] For report generation the same class files must be used as at runtime.
[WARN] Execution data for class com/nvidia/spark/udf/Instruction does not match.
[WARN] Execution data for class com/nvidia/spark/udf/Instruction$ does not match.
[WARN] Execution data for class com/nvidia/spark/udf/GpuScalaUDFLogical does not match.
[WARN] Execution data for class com/nvidia/spark/udf/State$ does not match.
[WARN] Execution data for class com/nvidia/spark/udf/CFG$ does not match.
[WARN] Execution data for class com/nvidia/spark/udf/Repr$LocalDateTime$ does not match.
[WARN] Execution data for class com/nvidia/spark/udf/Repr$DateTimeFormatter does not match.
[WARN] Execution data for class com/nvidia/spark/udf/Repr$StringBuilder$ does not match.
[WARN] Execution data for class com/nvidia/spark/udf/Repr$DateTimeFormatter$ does not match.
[WARN] Execution data for class com/nvidia/spark/udf/CatalystExpressionBuilder$ does not match.
[WARN] Execution data for class com/nvidia/spark/udf/CatalystExpressionBuilder does not match.
[WARN] Execution data for class com/nvidia/spark/udf/BB does not match.
[WARN] Execution data for class com/nvidia/spark/udf/CFG does not match.
[WARN] Execution data for class com/nvidia/spark/udf/Repr$ArrayBuffer does not match.
[WARN] Execution data for class com/nvidia/spark/udf/Repr$ClassTag does not match.
[WARN] Execution data for class com/nvidia/spark/udf/Repr$StringBuilder does not match.
[WARN] Execution data for class com/nvidia/spark/udf/Repr$ArrayBuffer$ does not match.
[WARN] Execution data for class com/nvidia/spark/udf/LambdaReflection does not match.
[WARN] Execution data for class com/nvidia/spark/udf/State does not match.
[WARN] Execution data for class com/nvidia/spark/udf/LogicalPlanRules$$anonfun$replacePartialFunc$1 does not match.
[WARN] Execution data for class com/nvidia/spark/udf/Repr$LocalDateTime does not match.
[WARN] Execution data for class com/nvidia/spark/udf/LambdaReflection$ does not match.
[INFO] Analyzing 3964 classes.
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have any reports from before it was shut off to compare to? Unfortunately the blossom-jenkins premerge jobs have expired.
for udf classes I don't know any packaging differences there, is the output here saying tests reference com/nvidia/spark/udf/Instruction but can't find that in jar? Something in the classloader part must be happening different there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is fine with me, but I would like to have a follow on issue to move the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jlowe #3932 is supposed to migrate tests under the sql-plugin sub-project. That is going to mess with code coverage because one will be shaded and the other will not be. So now we need to decide if we need to collect code coverage for unit tests or not. I am inclined to say that we skip it for now unless someone else has a better solution on how to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, let's not worry about coverage for the unshaded code in unit tests for now. Almost all of our tests are integration tests, and we can tackle coverage for unit tests in a followup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GaryShen2008 then what we have now is fine, and if/when someone starts to complain about the issues described above we can turn off code coverage for the unit tests.
and re-enable code coverage report from premerge job Signed-off-by: Alex Zhang <alex4zhang@gmail.com>
Extract the class files from rapids-4-spark-udf jar to match the Jacoco.exec report Because they'll be modified in aggregator's shade Signed-off-by: gashen <gashen@nvidia.com>
for Jacoco report by using rapids-4-spark-udf jar Signed-off-by: Alex Zhang <alex4zhang@gmail.com>
Signed-off-by: Alex Zhang <alex4zhang@gmail.com>
build |
and re-enable code coverage report from premerge job
Signed-off-by: Alex Zhang alex4zhang@gmail.com
Test passed on test PR pxLi#226