
Scalability fixes - Load model once per user #944

Merged: 18 commits merged into e-mission:master on Dec 2, 2023

Conversation

MukuFlash03
Contributor

Implementing code changes for improving scalability as per the requirements in this issue.

The initial approach utilizes a Singleton design pattern concept on instance variables: check whether the model has already been loaded before attempting to load it.

Thus, this should prevent the model from being reloaded if it is already present in the instance variable of the current BuiltinModelStorage instance.

More details present here.
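A minimal sketch of the caching idea described above, using a simplified BuiltinModelStorage skeleton (the method and helper names here are illustrative, not the exact PR implementation):

import logging

class BuiltinModelStorage:
    def __init__(self, user_id):
        self.user_id = user_id
        self.trip_model = None            # cached model; loaded at most once per instance

    def get_current_model(self, key):
        # Singleton-style check: only hit the database if nothing is cached yet
        if self.trip_model is None:
            logging.debug("No cached model for user %s, loading from database" % self.user_id)
            self.trip_model = self._load_model_from_db(key)   # hypothetical DB helper
        else:
            logging.debug("Reusing cached model for user %s" % self.user_id)
        return self.trip_model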

Mahadik, Mukul Chandrakant added 2 commits November 8, 2023 17:14
Implementing code changes for improving scalability as per the requirements in issue #950 in e-mission-docs.

Initial approach involves utilizing a Singleton design pattern concept on instance variables to check whether the model has already been loaded before attempting to load it.

Thus, this should prevent the model from being reloaded if it is already present in the instance variable of the current BuiltinModelStorage instance.
Added import for logging.
Added self parameter to pass reference to current instance object.
Contributor

@shankari shankari left a comment


Needs redesign

emission/core/get_database.py (outdated, resolved)
emission/core/get_database.py (outdated, resolved)
emission/storage/modifiable/builtin_model_storage.py (outdated, resolved)
Mahadik, Mukul Chandrakant and others added 12 commits November 17, 2023 11:04
Changes after refactoring label inference pipeline to load model only once. Will be merging with previous PR branch.
Starting from scratch to implement refactored label pipeline approach
Using a dictionary of lists of predictions returned for each trip instead of a simple list.
This keeps the predictions of each trip separate while grouping all the algorithm-wise predictions for a trip together in one list (see the sketch after this commit list).
Updated function call to predict_labels_with_n() as per latest changes to include trip_list and user_id

Fixed typo in eacilp for trip.get_id()
eamur.predict_labels_with_n() now takes a trip instead of a trip_list.
Passed a trip_list in previous commit but forgot to update parameter name.
Receive a prediction list of (prediction, n) for each trip in trip_list.

Then for each of the predictions in the prediction list, run the assertion tests.

Also, cleaned up some debug print statements from earlier commits / PRs.
Cleaning up previously used debug statements
A couple of pipeline tests were failing - TestLabelInferencePipeline, TestExpectationPipeline.
Failed since they were calling placeholder_predict_n functions in eacili inferrers.
Had to update these to now receive trip_list instead of single trip.
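A hypothetical illustration of the per-trip predictions structure mentioned in the commits above (the dict keys and the algorithm_fn calling convention are assumptions based on the snippets in this PR, not the exact committed code; primary_algorithms, user_id, and trip_list are the pipeline variables):

predictions_per_trip = {}   # trip id -> list of algorithm-wise predictions for that trip
for algorithm_id, algorithm_fn in primary_algorithms.items():
    prediction_list = algorithm_fn(user_id, trip_list)
    for trip, prediction in zip(trip_list, prediction_list):
        predictions_per_trip.setdefault(trip.get_id(), []).append((algorithm_id, prediction))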
@MukuFlash03
Contributor Author

[SOLVED]
Two pipeline tests were failing - TestLabelInferencePipeline, TestExpectationPipeline.

FAIL: testPipelineIntegrity (pipelineTests.TestExpectationPipeline.TestExpectationPipeline)
web-server_1  | ----------------------------------------------------------------------
web-server_1  | Traceback (most recent call last):
web-server_1  |   File "/src/e-mission-server/emission/tests/pipelineTests/TestExpectationPipeline.py", line 76, in testPipelineIntegrity
web-server_1  |     self.assertGreaterEqual(len(self.expected_trips), 1)  # Make sure we've actually loaded trips
web-server_1  | AssertionError: 0 not greater than or equal to 1

Took a look at the logs and saw that some additional functions were called when running the pipeline in these tests.
They failed because they were calling the placeholder_predict_n functions in eacili inferrers.py:

TypeError: placeholder_predictor_3() takes 1 positional argument but 2 were given

So, I added a quick fix to update these functions so that they now receive a trip_list instead of a single trip, and passed in a list of trips.
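A rough sketch of the signature change described above (the (user_id, trip_list) calling convention is inferred from the algorithm_fn(user_id, trip_list) call shown later in this thread, and the returned values are illustrative placeholders, not the real placeholder predictions):

# Before (sketch): predicted for a single trip
# def placeholder_predictor_3(trip):
#     return [{"labels": {"mode_confirm": "walk"}, "p": 1.0}]

# After (sketch): one prediction list per trip in trip_list
def placeholder_predictor_3(user_id, trip_list):
    return [[{"labels": {"mode_confirm": "walk"}, "p": 1.0}] for trip in trip_list]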

emission/analysis/modelling/trip_model/run_model.py (outdated, resolved)
emission/core/get_database.py (outdated, resolved)
emission/storage/modifiable/builtin_model_storage.py (outdated, resolved)
emission/tests/storageTests/TestTimeSeries.py (resolved)
Comment on lines 94 to 97
prediction_list = algorithm_fn(user_id, trip_list)
start_insert_inference_labels_time = time.process_time()
for trip, prediction in zip(trip_list, prediction_list):
    lp = ecwl.Labelprediction()
Contributor


future fix??: my one comment here is that if we standardize on an interface for the algorithm_fn, I believe we can simplify this further.

This can become (pseudocode):

for algorithm_id, algorithm_fn in primary_algorithms.items():
    algo_model = algorithm.load_model()
    for trip in trip_list:
        prediction = algo_model.predict(trip)
        # and then the original code

Back when this code was written, trip_model, with its clean interfaces to load, predict and save, had not yet been implemented. So algorithm_fn was supposed to be an abstraction layer of the multiple algorithms.

But now that we have the abstractions in trip_model, I am not sure that we need two levels of abstraction. I am open to deferring the cleanup until the next round, but I am also open to just removing one level of abstraction right now, since that will result in far fewer changes.

Contributor


Right now, there is a map from algorithm name to algorithm fn, and the algorithm fn (in our case) calls predict_with_n. Instead, the map can go from algorithm name to a dict of model parameters, and then inside the for loop, we can just load the model and continue to predict trip by trip, rather than loading the model within the predict function (which works, but is not pretty).

Alternatively, we can keep the algorithm function, eacili.predict_cluster_confidence_discounting but load the model in there. But I don't see the need for the second level of abstraction if we assume that all our algorithms will follow the L -> P -> S workflow.
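A rough sketch of the map-plus-loop structure being proposed (dict_model_params and load_model are placeholders for whatever parameter dict and loading call we settle on, not existing APIs):

primary_algorithms = {
    ecwl.AlgorithmTypes.CONFIDENCE_DISCOUNTED_CLUSTER: dict_model_params
}

for algorithm_id, model_params in primary_algorithms.items():
    algo_model = load_model(user_id, model_params)   # load once per algorithm
    for trip in trip_list:
        prediction = algo_model.predict(trip)
        # then insert/save the prediction as in the original code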

Contributor


If you are not comfortable with removing the algorithm_fn abstraction, you can simply change the following two lines of code.

def predict_cluster_confidence_discounting(trip, max_confidence=None, first_confidence=None, confidence_multiplier=None):
    # load application config
    model_type = eamtc.get_model_type()
    model_storage = eamtc.get_model_storage()
    ### load model  <------ NEW CODE
    labels, n = eamur.predict_labels_with_n(trip, model) <----- CHANGED CODE
    if n <= 0:  # No model data or trip didn't match a cluster
        logging.debug(f"In predict_cluster_confidence_discounting: n={n}; returning as-is")
        return labels

    confidence_coeff = n_to_confidence_coeff(n, max_confidence, first_confidence, confidence_multiplier)
    logging.debug(f"In predict_cluster_confidence_discounting: n={n}; discounting with coefficient {confidence_coeff}")

    labels = copy.deepcopy(labels)
    for l in labels: l["p"] *= confidence_coeff
    return labels

Contributor Author

@MukuFlash03 MukuFlash03 Nov 30, 2023


I have a few queries that are still confusing to me.
I can go ahead with the minimal change of loading the model in eacili, but I wanted to at least try to understand the scope and effort of the changes required to reduce the abstractions before making the simpler changes.

But now that we have the abstractions in trip_model, I am not sure that we need two levels of abstraction.

  1. By "two levels", I believe you are referring to:
eacili.predict_cluster_confidence_discounting -> eamur.predict_labels_with_n -> model.predict()

I see that trip_model is an abstract class and one child class / concrete implementation is the GreedySimilarityBinning class.
Do we intend to remove the calls to the first two functions in the above chain and directly call only model.predict(), which is in effect a call to GreedySimilarityBinning.predict() depending on the model parameters?


Right now, there is a map that has algorithm name to algorithm fn and then the algorithm fn (in our case) calls predict_with_n. instead, the map can have algorithm name to dict of model parameters...

Alternatively, we can keep the algorithm function, eacili.predict_cluster_confidence_discounting but load the model in there. But I don't see the need for the second level of abstraction if we assume that all our algorithms will follow the L -> P -> S workflow.

  2. I believe I've understood that the same trip model will be loaded for each user irrespective of the algorithm if we move the model loading and model parameters up into eacilp.pipeline.

If this is correct, then yes, using a map of algorithm_name to dict_model_parameters is useful:

primary_algorithms = {
    ecwl.AlgorithmTypes.CONFIDENCE_DISCOUNTED_CLUSTER: dict_model_params
}

# In the for loop:
#   load the model once
for trip in trip_list:
    prediction = model.predict(trip)

But in doing this (in continuation with my 1st question), this will call model.predict(), and I fail to see how our algorithm_name will have a unique implementation if we are removing the call to its respective function.
For instance, on removing the call to eacili.predict_cluster_confidence_discounting, we would no longer be doing the confidence_coeff calculations.
So what would be the significance of the algorithm name in the map in this case?

We also have placeholder_predictor() functions in inferrers.py which are being called in TestLabelInferencePipeline.py through compute_and_save_algorithms() as well.
I believe loading a model and making predictions that way won't be useful for these functions, as they don't load a model and just return predictions.

Contributor


I see that you made the relevant changes - is this still confusing? If so, we can discuss in a 1:1.

emission/analysis/modelling/trip_model/run_model.py (outdated, resolved)
Mahadik, Mukul Chandrakant added 4 commits November 27, 2023 13:38
Fetching the user_id from the trip_list itself.
Added an assertion to ensure that a single, identical user_id is present for all trips.
The assertion fails if multiple user_ids are found for the trip_list.
Removed from Test files and inferrers.py which contained placeholder_predictor functions.
Refactored code to pass in the trip model directly to predict_labels_with_n() in eamur.

Moved the load model step to eacil.inferrers by using load_model() of eamur.

Modified TestRunGreedyModel to use this refactored function.
Failing test due to a missing user_id parameter.
Passed user_id to rectify.
Contributor

@shankari shankari left a comment


I am going to merge this for now, but please note the "future fix" comments and address them in a cleanup pass. You should also look through all the logs that you have added and make sure that you are logging as much detail as possible, so it can actually be useful in understanding the problem.

else:
    predictions, n = model.predict(trip)
    predictions_list.append((predictions, n))
print(f"{arrow.now()} Inside predict_labels_n: Predictions complete for trip_list in time = {time.process_time() - start_predict_time}")
Contributor


nit, future fix: we should log the length of the list
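For example, a sketch of the suggested log change (not the committed code):

print(f"{arrow.now()} Inside predict_labels_n: Predictions complete for {len(trip_list)} trips in time = {time.process_time() - start_predict_time}")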


inferred_trip_list.append(inferred_trip)

if inferred_trip_list:
Contributor


isn't this always going to be true, given that you define inferred_trip_list in line 62?
Are you actually testing against the "no trip" case?
I suspect this doesn't matter, since we just iterate through the trips downstream, but then we can just skip this check altogether given that it will always be true.

Contributor Author


I did test this out, and Python evaluates an empty list as False when used in an if condition.
So it does not enter the if block unless trips are loaded into inferred_trip_list in the for loop above it; once the list becomes non-empty, it evaluates to True.
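A quick standalone illustration of the truthiness check (not the pipeline code):

inferred_trip_list = []
if inferred_trip_list:
    print("has trips")       # not reached: an empty list is falsy
inferred_trip_list.append("inferred_trip")
if inferred_trip_list:
    print("has trips")       # reached once the list is non-empty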

user_id_list = []
for trip in trip_list:
    user_id_list.append(trip['user_id'])
assert user_id_list.count(user_id_list[0]) == len(user_id_list), "Multiple user_ids found for trip_list, expected unique user_id for all trips"
Contributor


nit, future fix: you should add some additional details to the log - what were the unique user IDs, how many is "multiple"? Otherwise, if I see this log, I am not sure how to debug it.
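One possible way to make the failure message more informative, as a sketch of the suggested future fix (not the committed code):

unique_user_ids = set(user_id_list)
assert len(unique_user_ids) == 1, \
    f"Found {len(unique_user_ids)} distinct user_ids {unique_user_ids} across {len(trip_list)} trips; expected a single user_id for all trips"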

@shankari
Contributor

shankari commented Dec 2, 2023

@MukuFlash03 I am squash-merging this to avoid code churn.
Please make sure to handle branch updates accordingly.

@shankari shankari merged commit 1a1b1e6 into e-mission:master Dec 2, 2023
5 checks passed
shankari added a commit that referenced this pull request Jan 30, 2024
Cleanup fixes including log statements for PR #944
humbleOldSage added a commit to humbleOldSage/e-mission-server that referenced this pull request Feb 5, 2024
The changes in this iteration are improvements in the tests for the forest model:

1. Post discussion last week, the regression test was removed (`TestForestModel.py`) since it won't be useful when model performance improves. Rather, the structure of the predictions is checked. This check is merged into TestForestModel.py.

2. After e-mission#944, `predict_labels_with_n` in `run_model.py` expects a list and then iterates over it. The forest model and the rest of the tests were updated accordingly.