
Temporary Logging #141

Merged 2 commits on Oct 20, 2024

Conversation

TeachMeTW (Contributor)

Description

This pull request introduces a temporary logging setup to enhance debugging and monitor function execution times until the logging mechanism can be fully integrated into the emission server.

Changes

  • Configured the logging system to capture all log levels with detailed formatting.
  • Implemented a log_execution_time decorator to log the start and end times of function executions (a simplified sketch follows).
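
For reference, a minimal sketch of the decorator (simplified and untested; see the diff for the actual implementation):

    import logging
    import time
    from functools import wraps

    def log_execution_time(func):
        """Log when a function starts, when it finishes, and how long it took."""
        @wraps(func)
        def wrapper(*args, **kwargs):
            logging.info(f"Starting '{func.__name__}'")
            start = time.time()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.time() - start
                logging.info(f"Finished '{func.__name__}' in {elapsed:.2f} seconds")
        return wrapper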

shankari (Contributor) left a comment

@TeachMeTW

In addition to the decorator here, you can also instrument sections of code. These functions are small, but for bigger functions with multiple steps, you can do something like: https://github.com/e-mission/op-admin-dashboard/pull/141/files#diff-44562706f07395bde42fb1c105fe16e41bbba72b174847e62d3b8f257212717aR204

    (start_date, end_date) = iso_to_date_only(start_date, end_date)
    with ect.Timer() as ctt:
        if not df.empty and has_permission('overview_trips_trend'):
            trend_df = compute_trips_trend(df, date_col = "trip_start_time_str")
    esds.store_dashboard_time("compute_trips_trend", time.time(), ctt.elapsed)
    with ect.Timer() as gbt:
        fig = generate_barplot(trend_df, x = 'date', y = 'count', title = f"Trips trend ({start_date} to {end_date})")
    esds.store_dashboard_time("generate_barplot", time.time(), gbt.elapsed)

Again, these are examples; I don't think this function is complex enough to warrant this level of fine-grained instrumentation. But it is an example for more complex functions, such as the ones in db_utils 😄

@JGreenlee for visibility

To clarify, the suggestions below are intended to be guidelines to show you what I expect. I have not tested them, and I don't promise that they will work without any changes.

(6 review comments on pages/home.py, outdated and resolved)
shankari (Contributor) left a comment

Please:

  • Indicate "Testing done"
  • Clean up imports

Are there not any functions (e.g. in db_utils) that would benefit from instrumenting sub-parts of the function?

I can anticipate the benefits of being able to distinguish between (for example) the DB read time and the processing time, so we can see if (for example) converting a list from strings to UUID objects is really the bottleneck.
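
For example (untested, and the variable names here are made up for illustration), a db_utils function could time the DB read and the UUID conversion separately:

    import time
    from uuid import UUID  # assumes the ect/esds imports already used in this PR

    with ect.Timer() as db_read_timer:
        entries = list(ts.find_entries(["manual/demographic_survey"]))
    esds.store_dashboard_time("db_utils/db_read", time.time(), db_read_timer.elapsed)

    with ect.Timer() as uuid_convert_timer:
        # user_id_strings stands in for whatever string list the function reads
        uuid_objects = [UUID(s) for s in user_id_strings]
    esds.store_dashboard_time("db_utils/str_to_uuid", time.time(), uuid_convert_timer.elapsed)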

(3 review comments on pages/home.py, outdated and resolved)
TeachMeTW (Contributor, Author)

@shankari Cleaned up imports.

Are there not any functions (e.g. in db_utils) that would benefit from instrumenting sub-parts of the function?

I can anticipate the benefits of being able to distinguish between (for example) the DB read time and the processing time, so we can see if (for example) converting a list from strings to UUID objects is really the bottleneck.

We can do that, but I would have to implement things differently; right now I would say it's 'good enough', but I can work on a more precise function logger. I already track individual functions in the callbacks, so that should narrow things down.

TeachMeTW (Contributor, Author) commented Oct 19, 2024

@shankari

I believe I've found a way to make it more precise; shall I proceed with this? What are your thoughts on the design?

@log_execution_time
def query_demographics():
    """
    Queries and processes demographic survey data, returning a dictionary of DataFrames
    keyed by survey IDs. Each DataFrame contains the processed survey responses.
    
    :return: Dictionary where keys are survey IDs and values are corresponding DataFrames.
    """
    
    logging.debug("Querying the demographics for (no date range)")
    
    # Stage 1: Fetch Aggregate Time Series
    with log_stage_time("Fetch Aggregate Time Series"):
        ts = esta.TimeSeries.get_aggregate_time_series()
    
    # Stage 2: Find Demographic Survey Entries
    with log_stage_time("Find Demographic Survey Entries"):
        entries = ts.find_entries(["manual/demographic_survey"])
        data = list(entries)
    
    # Stage 3: Organize Entries by Survey Key
    with log_stage_time("Organize Entries by Survey Key"):
        available_key = {}
        for entry in data:
            survey_key = list(entry['data']['jsonDocResponse'].keys())[0]
            if survey_key not in available_key:
                available_key[survey_key] = []
            available_key[survey_key].append(entry)

via:

from contextlib import contextmanager

@contextmanager  # required for `with log_stage_time(...)` to work
def log_stage_time(stage_name: str):
    """
    Context manager to log and store the execution time of a specific stage within a function.
    Automatically prefixes the stage name with the parent function's name for clarity.

    :param stage_name: The name of the stage being timed.
    """
    parent_func = current_parent_function.get()
    if parent_func:
        full_stage_name = f"{parent_func} - {stage_name}"
    else:
        full_stage_name = stage_name

    logging.debug(f"Starting stage '{full_stage_name}'")
    logging.info(f"********** Starting stage: {full_stage_name} **********")
    timer = ect.Timer()
    timer.__enter__()
    try:
        yield
    finally:
        timer.__exit__()
        esds.store_dashboard_time(full_stage_name, timer)
        logging.info(f"Finished stage '{full_stage_name}' in {timer.elapsed:.2f} seconds")

It now records, stage by stage, exactly which parts are slow:

{
       "_id" : ObjectId("6713dece1b3e03ae8a444db7"),
       "user_id" : null,
       "metadata" : {
               "key" : "stats/dashboard_time",
               "platform" : "server",
               "write_ts" : 1729355470.9133275,
               "time_zone" : "America/Los_Angeles",
               "write_local_dt" : {
                       "year" : 2024,
                       "month" : 10,
                       "day" : 19,
                       "hour" : 9,
                       "minute" : 31,
                       "second" : 10,
                       "weekday" : 5,
                       "timezone" : "America/Los_Angeles"
               },
               "write_fmt_time" : "2024-10-19T09:31:10.913327-07:00"
       },
       "data" : {
               "name" : "query_demographics",
               "ts" : 1729355470.9132683,
               "reading" : 108.23929199250415
       }
}
{
       "_id" : ObjectId("6713dece1b3e03ae8a444db6"),
       "user_id" : null,
       "metadata" : {
               "key" : "stats/dashboard_time",
               "platform" : "server",
               "write_ts" : 1729355470.9114928,
               "time_zone" : "America/Los_Angeles",
               "write_local_dt" : {
                       "year" : 2024,
                       "month" : 10,
                       "day" : 19,
                       "hour" : 9,
                       "minute" : 31,
                       "second" : 10,
                       "weekday" : 5,
                       "timezone" : "America/Los_Angeles"
               },
               "write_fmt_time" : "2024-10-19T09:31:10.911493-07:00"
       },
       "data" : {
               "name" : "query_demographics - Clean and Modify DataFrames",
               "ts" : 1729355470.9114244,
               "reading" : 15.958874981151894
       }
}
{
       "_id" : ObjectId("6713dece1b3e03ae8a444db5"),
       "user_id" : null,
       "metadata" : {
               "key" : "stats/dashboard_time",
               "platform" : "server",
               "write_ts" : 1729355470.894045,
               "time_zone" : "America/Los_Angeles",
               "write_local_dt" : {
                       "year" : 2024,
                       "month" : 10,
                       "day" : 19,
                       "hour" : 9,
                       "minute" : 31,
                       "second" : 10,
                       "weekday" : 5,
                       "timezone" : "America/Los_Angeles"
               },
               "write_fmt_time" : "2024-10-19T09:31:10.894045-07:00"
       },
       "data" : {
               "name" : "query_demographics - Create DataFrames from Organized Entries",
               "ts" : 1729355470.8939605,
               "reading" : 17.122000019298866
       }
}

Each entry's name comes in the form "main function - Stage".

TeachMeTW (Contributor, Author)

Also detected discrepancies in testing already; for example, with trajectories:

    INFO:root:Finished 'query_trajectories' in 0.02 seconds
    INFO:root:Finished 'query_trajectories' in 51.29 seconds

shankari (Contributor) commented Oct 19, 2024

We can do that, but I would have to implement things differently; right now I would say it's 'good enough', but I can work on a more precise function logger. I already track individual functions in the callbacks, so that should narrow things down.

I am not sure why you need a more precise function logger. You can just use the timer directly for any block of code; it does not need to be a function:

    # Stage 1: Fetch Aggregate Time Series
    with ect.Timer() as dfat:
        ts = esta.TimeSeries.get_aggregate_time_series()
    esds.store_pipeline_time("db_utils/fetch_aggregate_time_series", dfat)
    
    # Stage 2: Find Demographic Survey Entries
    with ect.Timer() as dfde:
        entries = ts.find_entries(["manual/demographic_survey"])
        data = list(entries)
    esds.store_pipeline_time("db_utils/find_demographic_survey_entries", dfde)

Your approach works too, but it seems like a bit of overkill, given that you need to use the timer's internal functions and yield.

I would suggest using the timer directly for now, and we can discuss the stage wrapper later.

Note also that you should not use spaces in the names, to avoid weirdness down the road (e.g. "db_utils/find_demographic_survey_entries" rather than "Find Demographic Survey Entries").

TeachMeTW (Contributor, Author) commented Oct 19, 2024

Sounds good. I feel like it is a bit cleaner, as opposed to having different timer names such as dfat and dfde, and it allows for additional logging/info; but yes, I'll use the current timer and we can discuss the stage wrapper at a later date, once we get an MVP online.

TeachMeTW force-pushed the feature/Temporary-Logging branch from cf0d415 to 10d5955 on October 19, 2024 at 21:16
TeachMeTW (Contributor, Author)

@shankari I did what you suggested. I also added missing docstrings and moved comments around, to help my debugging and to get rid of Plotly errors.

TeachMeTW (Contributor, Author)

The db_utils, home, and data pages should be ready for staging.

shankari (Contributor)

Sounds good. I feel like it is a bit cleaner, as opposed to having different timer names such as dfat and dfde, and it allows for additional logging/info; but yes, I'll use the current timer and we can discuss the stage wrapper at a later date, once we get an MVP online.

Agreed and we can discuss this later - see
e-mission/e-mission-server#986 (comment)

shankari (Contributor) left a comment

One final fix

I am fine with this, with one final fix. Since we want to reuse the dashboard stats for both the admin and public dashboards, let's preface all the names with "admin". This will allow us to be forward compatible and support both admin and public dashboards in the future.
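
For example (illustrative only; the exact separator is up for discussion):

    # instead of storing
    #     esds.store_dashboard_time("db_utils/query_demographics", time.time(), timer.elapsed)
    # store
    esds.store_dashboard_time("admin/db_utils/query_demographics", time.time(), timer.elapsed)
    # so that a future public dashboard can write under a "public/..." prefix
    # into the same stats store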

(3 review comments on pages/data.py, outdated and resolved)
TeachMeTW (Contributor, Author)

@shankari I've modified it

shankari (Contributor) left a comment

LGTM!

Squash merging again as usual given the commit churn. Please account for it properly while pulling the changes to build on them.

shankari (Contributor)

Actually, there isn't a lot of commit churn - I guess @TeachMeTW has started rebasing. But still squash merging so I can fix the commit message.
