O5 Improve telemetry and visualization #11

Open
tremblerz opened this issue Jul 12, 2024 · 3 comments

Comments

@tremblerz
Contributor

Currently we do not have a good way to record logs and visualize the network data. Therefore, we need to improve our log_utils library.

  1. As a starting step, we need to document everything that is currently being logged.
  2. We need to have a network visualization library that shows how often and when nodes talk to each other.
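
A minimal sketch of the data such a visualization could be built on, assuming each message is recorded as a (round, sender, receiver) event. `CommEvent`, `edge_counts`, and `rounds_by_edge` are hypothetical names for illustration and are not part of log_utils.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CommEvent:
    """One message between two nodes (hypothetical record type)."""
    round_idx: int
    sender: int
    receiver: int


def edge_counts(events):
    """How often each (sender, receiver) pair communicated."""
    return Counter((e.sender, e.receiver) for e in events)


def rounds_by_edge(events):
    """When each pair communicated: sorted round indices per edge."""
    rounds = defaultdict(list)
    for e in events:
        rounds[(e.sender, e.receiver)].append(e.round_idx)
    return {edge: sorted(r) for edge, r in rounds.items()}


# Example events; node 0 stands in for the server here (an assumption).
events = [
    CommEvent(0, 1, 0),
    CommEvent(0, 2, 0),
    CommEvent(1, 1, 0),
    CommEvent(1, 1, 2),
]
print(edge_counts(events))
print(rounds_by_edge(events))
```

The counts could serve as edge weights in a graph plot, and the per-edge round lists as a timeline of when each pair talked.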
@nikitaparate
Contributor

Summary of Changes:

  1. Documentation:

    • Created docs/Logging.md file to document all logging activities.
  2. TensorBoard Logging:

    • Modified the code to log client metrics (loss and accuracy) to TensorBoard.
    • Logs are stored in the following directories:
      • ./expt_dump/<experiment_name>/logs/client_<client_index>/
      • ./expt_dump/<experiment_name>/logs/server/
  3. Summary Logging:

    • Added a summary log method in src/utils/log_utils.py to log client and server details.
    • Summary logs are saved in:
      • ./expt_dump/<experiment_name>/logs/client_<client_index>/summary.txt
      • ./expt_dump/<experiment_name>/logs/server/summary.txt
  • Client metrics are now visualised on TensorBoard, allowing for real-time monitoring of training progress.
  • Both client and server details are recorded in their respective summary.txt files to provide a comprehensive view of each run.
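
The directory layout and summary files described above could look roughly like this. It is a sketch only: `log_dir` and `write_summary` are hypothetical helpers, not the actual method added to src/utils/log_utils.py, and the TensorBoard writer is left out.

```python
import tempfile
from pathlib import Path


def log_dir(root, experiment_name, client_index=None):
    """Return the log directory for the server (client_index=None) or a client."""
    node = "server" if client_index is None else f"client_{client_index}"
    return Path(root) / experiment_name / "logs" / node


def write_summary(root, experiment_name, details, client_index=None):
    """Append key/value details to summary.txt in the node's log directory."""
    directory = log_dir(root, experiment_name, client_index)
    directory.mkdir(parents=True, exist_ok=True)
    summary_path = directory / "summary.txt"
    with summary_path.open("a", encoding="utf-8") as f:
        for key, value in details.items():
            f.write(f"{key}: {value}\n")
    return summary_path


if __name__ == "__main__":
    # A temporary directory stands in for ./expt_dump in this demo.
    with tempfile.TemporaryDirectory() as root:
        path = write_summary(root, "demo", {"loss": 0.5, "accuracy": 0.9}, client_index=0)
        print(path.read_text(encoding="utf-8"))
```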

Please review the changes and let me know whether I can open a pull request for these improvements.

@tremblerz tremblerz added this to the O5 milestone Jul 18, 2024
@tremblerz
Contributor Author

This sounds like a great start! Please go ahead with the PR.

@nikitaparate
Contributor

nikitaparate commented Jul 19, 2024

Also, when results for an experiment already exist, the ./expt_dump/<experiment_name>/logs/client_<client_index>/ folders are created before the user is prompted for input ('r' or 'e'), so they get deleted once the user responds.
To solve this, I added a signal so that the client code runs only after the user enters 'r' when existing experiment results are present; on a first run, the folders are created without any user input.
Can I include this change in the same PR as the previous improvements?
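
The ordering issue can be illustrated with a single-process simplification. This is not the signal-based fix described above, only a sketch of the same idea: resolve the prompt first, create the client folders afterwards. `prepare_experiment` is a hypothetical helper, and the meaning of 'r' (replace existing results) and 'e' (exit) is an assumption the thread does not spell out.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_experiment(exp_dir, num_clients, ask=input):
    """Resolve any existing results first, then create client log folders.

    Returns True when the experiment should proceed.
    """
    exp_dir = Path(exp_dir)
    if exp_dir.exists():
        # Assumption: 'r' replaces the old results, anything else aborts.
        choice = ask("Results exist. Enter 'r' or 'e': ").strip().lower()
        if choice != "r":
            return False  # leave the old results untouched
        shutil.rmtree(exp_dir)  # old results go away before new folders exist
    # Folders are created only now, so the cleanup above cannot delete them.
    for i in range(num_clients):
        (exp_dir / "logs" / f"client_{i}").mkdir(parents=True)
    return True


if __name__ == "__main__":
    demo_dir = Path(tempfile.mkdtemp()) / "demo_experiment"
    print(prepare_experiment(demo_dir, 2))                     # first run, no prompt
    print(prepare_experiment(demo_dir, 2, ask=lambda _: "r"))  # rerun, answer 'r'
```

Passing `ask` as a parameter keeps the prompt testable without real keyboard input.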


2 participants