Describe the bug
The profiling tool's path filter excludes file names containing a "." character. As a result, lz4 event log files produced with the config spark.eventLog.compress=true, which is standard for production deployments, are skipped. The remainder of the logic relies on the SHS event log reader, which supports lz4 (default), snappy, zstd, and lzf.
Steps/Code to reproduce bug
Run some integration tests e.g.:
This will generate compressed event logs under integration_tests/target/run_dir/eventlog_gw0
Then run:
The tool silently quits. I think we also need an actionable log message for legitimately empty input directories.
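To illustrate the kind of actionable message meant here, below is a minimal sketch. The object name `EmptyInputCheck`, the function `emptyInputMessage`, and the message wording are all hypothetical and not taken from the tool; the sketch only separates a truly empty directory from one where every file was rejected by the filter.

```scala
// Hypothetical helper, not part of the profiling tool.
object EmptyInputCheck {
  /**
   * Returns a warning when no event logs were accepted from `dir`,
   * or None when at least one file passed the filter.
   *
   * @param total    number of files found in the directory
   * @param accepted number of files that passed the event log filter
   */
  def emptyInputMessage(dir: String, total: Int, accepted: Int): Option[String] = {
    if (accepted > 0) {
      None
    } else if (total == 0) {
      Some(s"No files found in '$dir'; nothing to profile.")
    } else {
      Some(s"Found $total file(s) in '$dir' but none were recognized as event logs; " +
        "check the file name filter.")
    }
  }
}
```

A caller could log the returned message at warning level before exiting, so the user can tell an empty directory apart from a filter mismatch.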
Expected behavior
The tool should generate reports for all SHS-supported compression codecs. I verified that it works if you modify the filters at
https://github.com/NVIDIA/spark-rapids/blame/3e303c0dae26d1a8ef01fa271ee5f0f43759e865/tools/src/main/scala/com/nvidia/spark/rapids/tool/profiling/ProfileMain.scala#L73
and
https://github.com/NVIDIA/spark-rapids/blame/3e303c0dae26d1a8ef01fa271ee5f0f43759e865/tools/src/main/scala/com/nvidia/spark/rapids/tool/profiling/ProfileMain.scala#L91
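As a rough sketch of what a more permissive filter could look like, the snippet below accepts file names with no extension as well as names ending in one of the codec extensions listed above. It does not reproduce the actual code in ProfileMain.scala; the object name `EventLogFilter` and the function `isEventLog` are hypothetical, and the codec set is taken from this issue's description.

```scala
import java.util.Locale

// Hypothetical filter, not the code in ProfileMain.scala.
object EventLogFilter {
  // Codec short names this issue lists as supported by the SHS event log reader.
  val supportedCodecs: Set[String] = Set("lz4", "snappy", "zstd", "lzf")

  /**
   * Accepts uncompressed logs (no "." in the name) and logs whose final
   * extension is a supported codec. Anything else is rejected.
   */
  def isEventLog(fileName: String): Boolean = {
    val dot = fileName.lastIndexOf('.')
    if (dot < 0) {
      true
    } else {
      val extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT)
      supportedCodecs.contains(extension)
    }
  }
}
```

With this sketch, a name such as app-123.lz4 passes while app-123.crc does not. Names ending in .inprogress are also rejected here; whether in-progress logs should be profiled is a separate decision.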
Environment details (please complete the following information)
local dev
Additional context
N/A