Holoscan Flow Benchmarking Logging Updates #555

Merged
merged 5 commits into from
Nov 12, 2024

@tbirdso tbirdso commented Oct 29, 2024

Changes:

  • Update Holoscan Flow Benchmarking benchmark.py to use standard Python logging module
  • Increase missing log file severity from warning to error
  • Fix paths for log file existence check to consider specified log directory

Log files may be missing if the underlying ./run launch command fails to run the application, such as in the event of a bad app configuration or run environment.
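
The path fix described above can be sketched roughly as follows. This is a minimal illustration and not the code from `benchmark.py`: the function name `split_by_existence` is hypothetical, and the file names only mirror the pattern seen in the sample output later in this conversation.

```python
import os
import tempfile


def split_by_existence(log_dir, filenames):
    """Return (available, missing) full paths for files expected under log_dir."""
    available, missing = [], []
    for name in filenames:
        # Join with the log directory first. Checking the bare file name
        # would only look in the current working directory.
        path = os.path.join(log_dir, name)
        if os.path.isfile(path):
            available.append(path)
        else:
            missing.append(path)
    return available, missing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as log_dir:
        # Create one of the two expected files (hypothetical names).
        open(os.path.join(log_dir, "logger_greedy_1_1.log"), "w").close()
        available, missing = split_by_existence(
            log_dir, ["logger_greedy_1_1.log", "logger_greedy_1_2.log"]
        )
        print("available:", available)
        print("missing:", missing)
```

Returning full paths keeps the later log messages unambiguous about which directory was searched.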

@sohamm17 sohamm17 left a comment

@tbirdso Great improvements! I have left a few suggestions which are necessary, in my opinion.

Also, could you kindly show some sample output from the Python logging module as it is used here?

Changes:
- Adopts standard `logging` module to enhance granular control and
  redirection of benchmark script logging
- Updates missing log files from warning to error severity

Signed-off-by: Tom Birdsong <tbirdsong@nvidia.com>
Refactors code and fixes an issue where the log file parent directory
was not included in the paths checked for existence, so only the
current working directory was searched for log files.

NVBug 4933187

Signed-off-by: Tom Birdsong <tbirdsong@nvidia.com>
@tbirdso tbirdso force-pushed the test-benchmarking-log-failures branch from 157d8ee to 89dfefd Compare November 8, 2024 06:22
Changes:
- Add `--level` option to control logging verbosity and use `logger`
  object throughout
- Refactor logging so that each output filepath is printed exactly once.
  Available files are printed with INFO verbosity, missing files are
  printed with ERROR verbosity.
- Exit with error if log files are missing

Sample log printout:
```
2024-11-08 05:23:47,323 INFO: Log file directory: /workspace/holohub/log_directory_20241108-052346
2024-11-08 05:23:47,324 INFO: Log files are available:
2024-11-08 05:23:47,324 ERROR: Log files are missing: /workspace/holohub/log_directory_20241108-052346/logger_greedy_1_1.log
2024-11-08 05:23:47,324 ERROR: Some log files are missing. Please check the log directory.
```

Signed-off-by: Tom Birdsong <tbirdsong@nvidia.com>
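
A rough sketch of the reporting behavior this commit describes: each path is logged once, available files at INFO and missing files at ERROR, and the caller gets a non-zero exit code when anything is missing. The logger name `benchmark_sketch`, the function name `report_log_files`, and the format string are assumptions chosen to resemble the sample printout. They are not taken from `benchmark.py`.

```python
import logging

logger = logging.getLogger("benchmark_sketch")  # hypothetical logger name


def report_log_files(available, missing):
    """Log every path exactly once and return a process exit code."""
    logger.info("Log files are available: %s", ", ".join(available))
    if missing:
        logger.error("Log files are missing: %s", ", ".join(missing))
        logger.error("Some log files are missing. Please check the log directory.")
        return 1
    return 0


# Assumed format; it yields lines shaped like "<timestamp> LEVEL: message".
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s: %(message)s")
exit_code = report_log_files([], ["log_directory/logger_greedy_1_1.log"])
print("exit code:", exit_code)
```

A script would typically pass the returned value to `sys.exit()` so that callers such as CI jobs can detect the failure.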
Updates `run` script to propagate errors from `run launch` as the `run`
script exit code.

Allows Holoscan Flow Benchmarking tools to detect when a HoloHub
application process exits abnormally.

Signed-off-by: Tom Birdsong <tbirdsong@nvidia.com>
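
On the benchmarking side, detecting an abnormal exit could look roughly like the sketch below. It assumes the launched command reports failure through its exit code, which is what the `run` script change enables. `run_and_check` and the logger name are hypothetical, and the child commands are stand-ins for `./run launch ...`.

```python
import logging
import subprocess
import sys

logger = logging.getLogger("benchmark_sketch")  # hypothetical logger name


def run_and_check(cmd):
    """Run cmd, log a non-zero exit code as an error, and return the exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        logger.error('Command "%s" exited with code %d', " ".join(cmd), result.returncode)
    return result.returncode


if __name__ == "__main__":
    logging.basicConfig(format="%(asctime)s %(levelname)s: %(message)s")
    # Stand-in child process that fails on purpose with exit code 3.
    code = run_and_check([sys.executable, "-c", "import sys; sys.exit(3)"])
    print("returned:", code)
```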
Changes:
- Adds logging handler to log detailed (DEBUG) output to a
  "benchmark.log" file in the given run log directory. Adds a stream
  handler to preserve control over CLI output verbosity with the
  "--level" CLI argument.
- Logs stdout and stderr from a failing process to aid error investigation.
  Includes thread ID.

Signed-off-by: Tom Birdsong <tbirdsong@nvidia.com>
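
The two-handler arrangement described in this commit can be approximated as below: a file handler that always records DEBUG output in `benchmark.log`, and a console handler whose threshold comes from `--level`. Treat it as a sketch under assumptions: the helper names, the logger name, the `--level` choices and default, and both format strings (including `%(thread)d` for the thread ID) are illustrative guesses and not the values used in `benchmark.py`.

```python
import argparse
import logging
import os


def parse_level(argv):
    """Map a --level string such as 'error' to a logging level constant."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--level",
        default="info",  # assumed default
        choices=["debug", "info", "warning", "error", "critical"],
    )
    args = parser.parse_args(argv)
    return getattr(logging, args.level.upper())


def configure_logging(log_dir, console_level):
    """Attach a DEBUG file handler and a console handler with its own level."""
    logger = logging.getLogger("benchmark_sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)  # the handlers do the filtering
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    file_handler = logging.FileHandler(os.path.join(log_dir, "benchmark.log"))
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s [thread %(thread)d] %(levelname)s: %(message)s")
    )

    console_handler = logging.StreamHandler()
    console_handler.setLevel(console_level)
    console_handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s: %(message)s"))

    logger.addHandler(file_handler)
    logger.addHandler(console_handler)
    return logger
```

With `--level error`, a `logger.debug(...)` call still lands in `benchmark.log` but stays off the console, because each handler applies its own threshold.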
@tbirdso tbirdso force-pushed the test-benchmarking-log-failures branch from 89dfefd to 06e815f Compare November 8, 2024 06:30
tbirdso commented Nov 8, 2024

Added quality-of-life updates:

  • Cleaned up output file existence log messages so that it is clear whether files are present or missing, and updated script behavior to exit with error if output files are missing
  • Added --level CLI argument to control console output verbosity
  • Added file handler with logging.DEBUG severity to capture verbose output in the specified log directory (benchmark.log)
  • Log stdout and stderr from each returned concurrent instance with DEBUG verbosity to aid log file failure investigations
  • Updated HoloHub run script to propagate exit code from application command so that benchmarking can detect process failures (@jjomier please review)

Failed run example (headless SSH):

```
$ python ./benchmarks/holoscan_flow_benchmarking/benchmark.py -a endoscopy_tool_tracking  --sched greedy -i2
2024-11-08 06:34:25,079 INFO: Run 1 started for greedy scheduler.
2024-11-08 06:34:25,375 ERROR: Command "./run launch endoscopy_tool_tracking cpp" exited with code 134
2024-11-08 06:34:25,398 ERROR: Command "./run launch endoscopy_tool_tracking cpp" exited with code 134
2024-11-08 06:34:25,398 INFO: Run 1 completed for greedy scheduler.
2024-11-08 06:34:26,398 INFO: ****************************************************************
2024-11-08 06:34:26,398 INFO: Evaluation completed.
2024-11-08 06:34:26,398 INFO: ****************************************************************
2024-11-08 06:34:26,399 INFO: Log file directory: /workspace/holohub/log_directory_20241108-063425
2024-11-08 06:34:26,399 INFO: Log files are available: 
2024-11-08 06:34:26,399 ERROR: Log files are missing: /workspace/holohub/log_directory_20241108-063425/logger_greedy_1_1.log, /workspace/holohub/log_directory_20241108-063425/logger_greedy_1_2.log
2024-11-08 06:34:26,399 ERROR: Some log files are missing. Please check the log directory.
```

Corresponding benchmark.log with stdout, stderr output:
benchmark.20241108-063425.log

Successful run example, with an additional run and GPU utilization monitoring enabled for demonstration:

```
$ xvfb-run -a python ./benchmarks/holoscan_flow_benchmarking/benchmark.py -a endoscopy_tool_tracking  --sched greedy -i2 -r2 -u
2024-11-08 06:35:18,303 INFO: Run 1 started for greedy scheduler.
2024-11-08 06:35:18,304 INFO: Monitoring GPU utilization in a separate thread
2024-11-08 06:35:22,322 INFO: Run 1 completed for greedy scheduler.
2024-11-08 06:35:23,323 INFO: Run 2 started for greedy scheduler.
2024-11-08 06:35:23,324 INFO: Monitoring GPU utilization in a separate thread
2024-11-08 06:35:27,333 INFO: Run 2 completed for greedy scheduler.
2024-11-08 06:35:28,334 INFO: ****************************************************************
2024-11-08 06:35:28,335 INFO: Evaluation completed.
2024-11-08 06:35:28,335 INFO: ****************************************************************
2024-11-08 06:35:28,335 INFO: Log file directory: /workspace/holohub/log_directory_20241108-063518
2024-11-08 06:35:28,335 INFO: Log files are available: /workspace/holohub/log_directory_20241108-063518/logger_greedy_1_1.log, /workspace/holohub/log_directory_20241108-063518/logger_greedy_1_2.log, /workspace/holohub/log_directory_20241108-063518/logger_greedy_2_1.log, /workspace/holohub/log_directory_20241108-063518/logger_greedy_2_2.log
2024-11-08 06:35:28,335 INFO: Gpu files are available: /workspace/holohub/log_directory_20241108-063518/gpu_utilization_greedy_1.csv, /workspace/holohub/log_directory_20241108-063518/gpu_utilization_greedy_2.csv
```

Corresponding benchmark.log with stdout, stderr output:
benchmark.20241108-063518.log

Failed run with console logging limited to errors only:

```
$ python ./benchmarks/holoscan_flow_benchmarking/benchmark.py -a endoscopy_tool_tracking  --sched greedy --level error
2024-11-08 06:40:55,442 ERROR: Command "./run launch endoscopy_tool_tracking cpp" exited with code 134
2024-11-08 06:40:56,452 ERROR: Log files are missing: /workspace/holohub/log_directory_20241108-064055/logger_greedy_1_1.log
2024-11-08 06:40:56,452 ERROR: Some log files are missing. Please check the log directory.
```

These updates aim to improve the benchmark.py CLI experience and make it easier to trace flow tracking runtime errors across runs.

@sohamm17 sohamm17 left a comment

LGTM

@tbirdso tbirdso merged commit e248dd4 into nvidia-holoscan:main Nov 12, 2024
3 checks passed