
Scalability fixes - Load model once per user #944

Merged: 18 commits merged into e-mission:master on Dec 2, 2023

Conversation

MukuFlash03
Contributor

Implementing code changes for improving scalability as per the requirements in this issue.

The initial approach utilizes a Singleton design pattern concept on instance variables: check whether the model has already been loaded before attempting to load it.

Thus, this should prevent the model from being reloaded if it is already present in the instance variable of the current BuiltinModelStorage instance.

More details present here.
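A minimal sketch of the caching idea described above, using a simplified BuiltinModelStorage skeleton (the method and helper names here are illustrative, not the exact PR implementation):

import logging

class BuiltinModelStorage:
    def __init__(self, user_id):
        self.user_id = user_id
        self.trip_model = None            # cached model; loaded at most once per instance

    def get_current_model(self, key):
        # Singleton-style check: only hit the database if nothing is cached yet
        if self.trip_model is None:
            logging.debug("No cached model for user %s, loading from database" % self.user_id)
            self.trip_model = self._load_model_from_db(key)   # hypothetical DB helper
        else:
            logging.debug("Reusing cached model for user %s" % self.user_id)
        return self.trip_model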

Mahadik, Mukul Chandrakant added 2 commits November 8, 2023 17:14
Implementing code changes for improving scalability as per the requirements in issue #950 in e-mission-docs.

Initial approach involves utilizing a Singleton design pattern concept on instance variables to check whether the model has already been loaded before attempting to load it.

Thus, this should prevent the model from being reloaded if it is already present in the instance variable of the current BuiltinModelStorage instance.
Added import for logging.
Added self parameter to pass reference to current instance object.
Contributor

@shankari shankari left a comment


Needs redesign

emission/core/get_database.py (outdated, resolved)
emission/core/get_database.py (outdated, resolved)
emission/storage/modifiable/builtin_model_storage.py (outdated, resolved)
Mahadik, Mukul Chandrakant and others added 12 commits November 17, 2023 11:04
Changes after refactoring label inference pipeline to load model only once. Will be merging with previous PR branch.
Starting from scratch to implement refactored label pipeline approach
Using a dictionary of lists of predictions returned for each trip instead of a simple list.
This keeps the predictions of each trip separate while grouping all the algorithm-wise predictions for a trip together in one list (see the sketch after this commit list).
Updated function call to predict_labels_with_n() as per latest changes to include trip_list and user_id

Fixed typo in eacilp for trip.get_id()
eamur.predict_labels_with_n() now takes a trip instead of a trip_list.
Passed a trip_list in previous commit but forgot to update parameter name.
Receive a prediction list of (prediction, n) for each trip in trip_list.

Then for each of the predictions in the prediction list, run the assertion tests.

Also, cleaned up some debug print statements from earlier commits / PRs.
Cleaning up previously used debug statements
A couple of pipeline tests were failing - TestLabelInferencePipeline, TestExpectationPipeline.
Failed since they were calling placeholder_predict_n functions in eacili inferrers.
Had to update these to now receive trip_list instead of single trip.
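A hypothetical illustration of the per-trip predictions structure mentioned in the commits above (the dict keys and the algorithm_fn calling convention are assumptions based on the snippets in this PR, not the exact committed code; primary_algorithms, user_id, and trip_list are the pipeline variables):

predictions_per_trip = {}   # trip id -> list of algorithm-wise predictions for that trip
for algorithm_id, algorithm_fn in primary_algorithms.items():
    prediction_list = algorithm_fn(user_id, trip_list)
    for trip, prediction in zip(trip_list, prediction_list):
        predictions_per_trip.setdefault(trip.get_id(), []).append((algorithm_id, prediction))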
@MukuFlash03
Contributor Author

[SOLVED]
Two pipeline tests were failing - TestLabelInferencePipeline, TestExpectationPipeline.

FAIL: testPipelineIntegrity (pipelineTests.TestExpectationPipeline.TestExpectationPipeline)
web-server_1  | ----------------------------------------------------------------------
web-server_1  | Traceback (most recent call last):
web-server_1  |   File "/src/e-mission-server/emission/tests/pipelineTests/TestExpectationPipeline.py", line 76, in testPipelineIntegrity
web-server_1  |     self.assertGreaterEqual(len(self.expected_trips), 1)  # Make sure we've actually loaded trips
web-server_1  | AssertionError: 0 not greater than or equal to 1

Took a look at the logs and saw that some additional functions were called when running the pipeline in these tests.
They failed because they were calling the placeholder_predict_n functions in eacili inferrers.py:

TypeError: placeholder_predictor_3() takes 1 positional argument but 2 were given

So, I added a quick fix to update these functions so that they now receive a trip_list instead of a single trip, and passed in a list of trips.
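A rough sketch of the signature change described above (the (user_id, trip_list) calling convention is inferred from the algorithm_fn(user_id, trip_list) call shown later in this thread, and the returned values are illustrative placeholders, not the real placeholder predictions):

# Before (sketch): predicted for a single trip
# def placeholder_predictor_3(trip):
#     return [{"labels": {"mode_confirm": "walk"}, "p": 1.0}]

# After (sketch): one prediction list per trip in trip_list
def placeholder_predictor_3(user_id, trip_list):
    return [[{"labels": {"mode_confirm": "walk"}, "p": 1.0}] for trip in trip_list]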

emission/analysis/modelling/trip_model/run_model.py (outdated, resolved)
emission/core/get_database.py (outdated, resolved)
emission/storage/modifiable/builtin_model_storage.py (outdated, resolved)
emission/tests/storageTests/TestTimeSeries.py (resolved)
Comment on lines 94 to 97
prediction_list = algorithm_fn(user_id, trip_list)
start_insert_inference_labels_time = time.process_time()
for trip, prediction in zip(trip_list, prediction_list):
    lp = ecwl.Labelprediction()
Contributor


future fix??: my one comment here is that if we standardize on an interface for the algorithm_fn, I believe we can simplify this further.

This can become (pseudocode):

for algorithm_id, algorithm_fn in primary_algorithms.items():
    algo_model = algorithm.load_model()
    for trip in trip_list:
        prediction = algo_model.predict(trip)
        # and then the original code

Back when this code was written, trip_model, with its clean interfaces to load, predict and save, had not yet been implemented. So algorithm_fn was supposed to be an abstraction layer of the multiple algorithms.

But now that we have the abstractions in trip_model, I am not sure that we need two levels of abstraction. I am open to deferring the cleanup until the next round, but I am also open to just removing one level of abstraction right now, since that will result in far fewer changes.

Contributor


Right now, there is a map from algorithm name to algorithm fn, and the algorithm fn (in our case) calls predict_with_n. Instead, the map can go from algorithm name to a dict of model parameters, and then inside the for loop, we can just load the model and continue to predict trip by trip, rather than loading the model within the predict function (which works, but is not pretty).

Alternatively, we can keep the algorithm function, eacili.predict_cluster_confidence_discounting but load the model in there. But I don't see the need for the second level of abstraction if we assume that all our algorithms will follow the L -> P -> S workflow.
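A rough sketch of the map-plus-loop structure being proposed (dict_model_params and load_model are placeholders for whatever parameter dict and loading call we settle on, not existing APIs):

primary_algorithms = {
    ecwl.AlgorithmTypes.CONFIDENCE_DISCOUNTED_CLUSTER: dict_model_params
}

for algorithm_id, model_params in primary_algorithms.items():
    algo_model = load_model(user_id, model_params)   # load once per algorithm
    for trip in trip_list:
        prediction = algo_model.predict(trip)
        # then insert/save the prediction as in the original code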

Contributor


If you are not comfortable with removing the algorithm_fn abstraction, you can simply change the following two lines of code.

def predict_cluster_confidence_discounting(trip, max_confidence=None, first_confidence=None, confidence_multiplier=None):
    # load application config
    model_type = eamtc.get_model_type()
    model_storage = eamtc.get_model_storage()
    ### load model  <------ NEW CODE
    labels, n = eamur.predict_labels_with_n(trip, model) <----- CHANGED CODE
    if n <= 0:  # No model data or trip didn't match a cluster
        logging.debug(f"In predict_cluster_confidence_discounting: n={n}; returning as-is")
        return labels

    confidence_coeff = n_to_confidence_coeff(n, max_confidence, first_confidence, confidence_multiplier)
    logging.debug(f"In predict_cluster_confidence_discounting: n={n}; discounting with coefficient {confidence_coeff}")

    labels = copy.deepcopy(labels)
    for l in labels: l["p"] *= confidence_coeff
    return labels

Contributor Author

@MukuFlash03 MukuFlash03 Nov 30, 2023


I have a few queries that are still confusing to me.
I can go ahead with the minimal change of loading the model in eacili, but I wanted to at least try to understand the scope and effort of the changes required to reduce the abstractions before making the simpler changes.

But now that we have the abstractions in trip_model, I am not sure that we need two levels of abstraction.

  1. By "two levels", I believe you are referring to:
eacili.predict_cluster_confidence_discounting -> eamur.predict_labels_with_n -> model.predict()

I see that trip_model is an abstract class and one child class / concrete implementation is the GreedySimilarityBinning class.
Do we intend to remove the calls to the first two functions in the above chain and directly call only model.predict(), which is in effect a call to GreedySimilarityBinning.predict() depending on the model parameters?


Right now, there is a map that has algorithm name to algorithm fn and then the algorithm fn (in our case) calls predict_with_n. instead, the map can have algorithm name to dict of model parameters...

Alternatively, we can keep the algorithm function, eacili.predict_cluster_confidence_discounting but load the model in there. But I don't see the need for the second level of abstraction if we assume that all our algorithms will follow the L -> P -> S workflow.

  2. I believe I've understood that the same trip model will be loaded for each user irrespective of the algorithm if we move the model loading and model parameters up into eacilp.pipeline.

If this is correct, then yes, using a map of algorithm_name to dict_model_parameters is useful:

primary_algorithms = {
    ecwl.AlgorithmTypes.CONFIDENCE_DISCOUNTED_CLUSTER: dict_model_params
}

# In the for loop:
#   load the model once
for trip in trip_list:
    prediction = model.predict(trip)

But in doing this (in continuation with my 1st question), this will call model.predict(), and I fail to see how our algorithm_name will have a unique implementation if we are removing the call to its respective function.
For instance, on removing the call to eacili.predict_cluster_confidence_discounting, we would no longer be doing the confidence_coeff calculations.
So what would be the significance of the algorithm name in the map in this case?

We also have placeholder_predictor() functions in inferrers.py which are being called in TestLabelInferencePipeline.py through compute_and_save_algorithms() as well.
I believe loading a model and making predictions that way won't be useful for these functions, as they don't load a model and just return predictions.

Contributor


I see that you made the relevant changes - is this still confusing? If so, we can discuss in a 1:1.

emission/analysis/modelling/trip_model/run_model.py (outdated, resolved)
Mahadik, Mukul Chandrakant added 4 commits November 27, 2023 13:38
Fetching the user_id from the trip_list itself.
Added an assertion to ensure that a single, identical user_id is present for all trips.
The assertion fails if multiple user_ids are found for the trip_list.
Removed from Test files and inferrers.py which contained placeholder_predictor functions.
Refactored code to pass in the trip model directly to predict_labels_with_n() in eamur.

Moved the load model step to eacil.inferrers by using load_model() of eamur.

Modified TestRunGreedyModel to use this refactored function.
Failing test due to a missing user_id parameter.
Passed user_id to rectify.
Contributor

@shankari shankari left a comment


I am going to merge this for now, but please note the "future fix" comments and address them in a cleanup pass. You should also look through all the logs that you have added and make sure that you are logging as much detail as possible, so it can actually be useful in understanding the problem.

else:
    predictions, n = model.predict(trip)
    predictions_list.append((predictions, n))
print(f"{arrow.now()} Inside predict_labels_n: Predictions complete for trip_list in time = {time.process_time() - start_predict_time}")
Contributor


nit, future fix: we should log the length of the list
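For example, a sketch of the suggested log change (not the committed code):

print(f"{arrow.now()} Inside predict_labels_n: Predictions complete for {len(trip_list)} trips in time = {time.process_time() - start_predict_time}")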


inferred_trip_list.append(inferred_trip)

if inferred_trip_list:
Contributor


isn't this always going to be true, given that you define inferred_trip_list in line 62?
Are you actually testing against the "no trip" case?
I suspect this doesn't matter, since we just iterate through the trips downstream, but then we can just skip this check altogether given that it will always be true.

Contributor Author


I did test this out, and Python evaluates an empty list as False when used in an if condition.
So it does not enter the if block unless trips are loaded into inferred_trip_list in the for loop above it; once the list becomes non-empty, it evaluates to True.
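A quick standalone illustration of the truthiness check (not the pipeline code):

inferred_trip_list = []
if inferred_trip_list:
    print("has trips")       # not reached: an empty list is falsy
inferred_trip_list.append("inferred_trip")
if inferred_trip_list:
    print("has trips")       # reached once the list is non-empty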

user_id_list = []
for trip in trip_list:
    user_id_list.append(trip['user_id'])
assert user_id_list.count(user_id_list[0]) == len(user_id_list), "Multiple user_ids found for trip_list, expected unique user_id for all trips"
Contributor


nit, future fix: you should add some additional details to the log - what were the unique user IDs, how many is "multiple"? Otherwise, if I see this log, I am not sure how to debug it.
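One possible way to make the failure message more informative, as a sketch of the suggested future fix (not the committed code):

unique_user_ids = set(user_id_list)
assert len(unique_user_ids) == 1, \
    f"Found {len(unique_user_ids)} distinct user_ids {unique_user_ids} across {len(trip_list)} trips; expected a single user_id for all trips"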

@shankari
Contributor

shankari commented Dec 2, 2023

@MukuFlash03 I am squash-merging this to avoid code churn.
Please make sure to handle branch updates accordingly.

@shankari shankari merged commit 1a1b1e6 into e-mission:master Dec 2, 2023
5 checks passed
shankari added a commit that referenced this pull request Jan 30, 2024
Cleanup fixes including log statements for PR #944
humbleOldSage added a commit to humbleOldSage/e-mission-server that referenced this pull request Feb 5, 2024
The changes in this iteration are improvements in the tests for the forest model:

1. Post discussion last week, the regression test was removed (`TestForestModel.py`) since it won't be useful when model performance improves. Rather, the structure of the predictions is checked. This check is merged into TestForestModel.py.

2. After e-mission#944, `predict_labels_with_n` in `run_model.py` expects a list and then iterates over it. The forest model and the rest of the tests were updated accordingly.