Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Logging Statements for Performance Analysis #137

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

TeachMeTW
Copy link
Contributor

@TeachMeTW TeachMeTW commented Oct 2, 2024

Description

Introduces logging statements to various functions primarily located in utils.py and data page callbacks -- more to be added. The logging format has been standardized to provide clarity and consistency, structured as follows:

Format:
[Component] - [Function/Callback Name] - [Stage] - [Details]

Example:
Utils - Trajectories Stage 1 - Execution Time: 150ms

The primary goal of these logging enhancements is to facilitate the identification of performance bottlenecks and to enable the export of timing data for further analysis.

Changes Made

  • Added logging statements to key functions in utils.py.
  • Implemented logging in relevant data page callbacks.
  • Each log entry includes:
    • The component (e.g., Utils or Callback)
    • The specific function or callback name
    • A stage identifier
    • Relevant details, such as execution time

Rationale

By incorporating these logging statements, we aim to:

  • Gain better visibility into the execution time of critical functions.
  • Identify performance bottlenecks within the application.
  • Facilitate data collection for performance analysis.

Testing

  • Verified that logging statements are triggered at the appropriate stages in the application.
  • Confirmed that logs are formatted correctly and contain the expected information.

Next Steps:

  • Add as stats to emission

@TeachMeTW TeachMeTW changed the title Timing Add Logging Statements for Performance Analysis Oct 2, 2024
Copy link
Contributor

@JGreenlee JGreenlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shankari Any suggestions for cleaner ways to instrument the execution of specific sections of code?

My first thought was to create a decorator which records and prints execution time of a function. But this would only work on a per-function basis, and @TeachMeTW thinks that a more granular, per-"Stage" measurement will be necessary.

@TeachMeTW is also wondering if "Time for : x seconds" is adequate or if each of these log statements should include an ISO string.
I thought that logging already included timestamps – is it just a matter of configuring it to show them?

@shankari
Copy link
Contributor

shankari commented Oct 2, 2024

Why are we only adding the times as logs? We should put them in as stats (e.g. add_dashboard_stats, similar to add_server_stats so that we can analyze them without having to download and parse logs. Downloading logs from the production environment is particularly challenging. I am fairly sure that is what we discussed at yesterday's meeting.

@JGreenlee the canonical way to instrument specific sections of code in python is to use a Timer in a with block.
You can see how i have used it in the server

        with ect.Timer() as uit:
            logging.info("*" * 10 + "UUID %s: updating incoming user inputs" % uuid + "*" * 10)
            print(str(arrow.now()) + "*" * 10 + "UUID %s: updating incoming user inputs" % uuid + "*" * 10)
            eaum.match_incoming_user_inputs(uuid)

        esds.store_pipeline_time(uuid, ecwp.PipelineStages.USER_INPUT_MATCH_INCOMING.name,
                                 time.time(), uct.elapsed)

I would suggest using the same Timer object.

@TeachMeTW
Copy link
Contributor Author

@shankari I wanted to clarify; would I be using

def store_stats_entry(user_id, metadata_key, name, ts, reading):
  data = {
    "name": name,
    "ts": ts,
    "reading": reading
  }
  new_entry = ecwe.Entry.create_entry(user_id, metadata_key, data)
  return esta.TimeSeries.get_time_series(user_id).insert(new_entry)

also I had a question with the server implementation. The with ect.Timer() as uit is not used. UCT is used. Is this a typo? uct was defined 2 statements above the with statement you provided

@shankari
Copy link
Contributor

shankari commented Oct 2, 2024

@TeachMeTW

  1. I am suggesting that you create a new store_dashboard_time function similar to store_pipeline_time function, which will use store_stats_entry internally
  2. ah I think that is an error in my code. Thanks for catching it. Please fix it now!

This also means that all the USER_INPUT_MATCH_INCOMING times are incorrect.

@TeachMeTW
Copy link
Contributor Author

@shankari Sounds good, I will implement both 1) and 2) on e-mission/e-mission-server#986

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants