
Model Deployment in SageMaker #191

Closed
Gayathri2993 opened this issue Apr 27, 2023 · 22 comments
Labels
bug Something isn't working

Comments

@Gayathri2993

What is the best way to deploy Deepparse in AWS SageMaker? I downloaded the fasttext model directly and stored it in S3, since I do not want to download the model every time I run it. I passed the S3 path as path_to_retrained_model, but the function is not able to read the file from S3; the error says no such file exists. I am quite positive that the file path is correct and that the model exists in S3. Is there something I am missing, or could you please let me know if there is an efficient way to deploy the model? I have millions of records to run this model on.

  1. If I were to retrain the model for another country in the future, I would have to save that model. What is the best way to deploy the model in this use case?
@Gayathri2993 Gayathri2993 added the bug Something isn't working label Apr 27, 2023
@github-actions

Thank you for your interest in improving Deepparse.

@davebulaval
Collaborator

davebulaval commented Apr 27, 2023

Using a URL as a path is not currently a feature, but I can definitely see the value. I will look at possible solutions to integrate such a feature with an S3 bucket.

If you retrain a model, I think using the same S3 bucket approach is interesting, and if the feature is developed, it will be easy for you: all you will need to do is save the model after training. I guess boto3 with a predefined URI pattern would allow you to export to S3 and later update a .txt file with the latest URI, or something like that. (Here is a possible approach; see the sketch below.)
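
For illustration only, a minimal sketch of that pattern, assuming a retrained checkpoint saved locally as retrained_fasttext.ckpt and a bucket named my-bucket (both names are hypothetical):

# Hypothetical boto3 export pattern: upload the retrained checkpoint and record its URI.
import boto3

bucket = "my-bucket"
key = "deepparse/retrained_fasttext.ckpt"

s3 = boto3.client("s3")

# Upload the retrained checkpoint to S3.
s3.upload_file("retrained_fasttext.ckpt", bucket, key)

# Keep a small text file in the bucket pointing to the latest model URI.
latest_uri = f"s3://{bucket}/{key}"
s3.put_object(Bucket=bucket, Key="deepparse/latest_model.txt", Body=latest_uri.encode("utf-8"))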

@davebulaval
Collaborator

davebulaval commented Apr 27, 2023

I found a library that can help handle an AWS S3 bucket. I have implemented a feature and only need to test whether it works properly. I will try to finish it today, but I can't promise it.

@davebulaval
Collaborator

davebulaval commented Apr 27, 2023

You can test the feature using this dev branch:

pip install -U git+https://github.com/GRAAL-Research/deepparse.git@aws_s3_uri.

I also think it would be possible to use CloudPathLib to save a model directly to an S3-like bucket when retraining (see the sketch below).
I will look into it after your feedback on the prototype and when I have more time to test it.
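
For illustration, a hedged sketch of what copying a locally retrained checkpoint into an S3 bucket with cloudpathlib could look like (the bucket and file names are hypothetical):

# Hypothetical example: push a locally retrained checkpoint to an S3 bucket with cloudpathlib.
from cloudpathlib import CloudPath

local_checkpoint = "checkpoints/retrained_fasttext.ckpt"
remote_checkpoint = CloudPath("s3://my-bucket/deepparse/retrained_fasttext.ckpt")

# upload_from copies the local file to the cloud path.
remote_checkpoint.upload_from(local_checkpoint)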

@Gayathri2993
Author

Hi Dave,

Thanks for the prompt response. I tried the feature you developed using CloudPathLib, but I ended up with an error. Could you please help me with this?

Code:
!pip install cloudpathlib
from cloudpathlib import CloudPath
from deepparse.parser import AddressParser

address_parser = AddressParser(model_type="fasttext", device=0, path_to_retrained_model=CloudPath("s3://s3_path/fasttext.ckpt"))

address_parser("350 rue des Lilas Ouest Québec Québec G1L 1B6")

Error:

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/deepparse/parser/address_parser.py:240, in AddressParser.__init__(self, model_type, attention_mechanism, device, rounding, verbose, path_to_retrained_model, cache_dir, offline)
237 seq2seq_kwargs = {} # Empty for default settings
239 if path_to_retrained_model is not None:
--> 240 if "s3://" in path_to_retrained_model:
241 if CloudPath is None:
242 raise ImportError(
243 "cloudpathlib needs to be installed to use a S3-like " "URI as path_to_retrained_model."
244 )

TypeError: argument of type 'S3Path' is not iterable

@davebulaval
Collaborator

I'm on vacation for the next two weeks. I will take a look at it in mid-May.

@pjbull

pjbull commented May 5, 2023

cloudpathlib maintainer here. @Gayathri2993, it may be the case that you need to additionally install the S3 dependencies, so instead of !pip install cloudpathlib it should be !pip install cloudpathlib[s3]

@davebulaval
Collaborator

@pjbull Thanks for the tip. I will take a look at this.

@davebulaval
Collaborator

davebulaval commented May 7, 2023

@Gayathri2993 Try this:

address_parser = AddressParser(model_type="fasttext", device=0, path_to_retrained_model="s3://s3_path/fasttext.ckpt")

The path does not need to be a CloudPath; we handle the conversion to it in the code base. I have added a catch for this kind of behaviour.

@davebulaval
Collaborator

davebulaval commented May 7, 2023

And I also added support to allow the use of a CloudPath directly as the argument instead of a string.

LMK if it works after updating using the branch. I've pushed some modifications.

@Gayathri2993
Author

Thank you both for looking into this issue. I tried two methods to load the model directly from S3. Please find the details of the two methods below.

Method 1:

Code:
!pip install cloudpathlib[s3]
from cloudpathlib import CloudPath
from deepparse.parser import AddressParser

address_parser = AddressParser(model_type="fasttext", device=0, path_to_retrained_model=CloudPath("s3://s3_path/fasttext.ckpt"))

address_parser("350 rue des Lilas Ouest Québec Québec G1L 1B6")

Error:

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/deepparse/parser/address_parser.py:225, in AddressParser.__init__(self, model_type, attention_mechanism, device, rounding, verbose, path_to_retrained_model, cache_dir, offline)
222 seq2seq_kwargs = {} # Empty for default settings
224 if path_to_retrained_model is not None:
--> 225 checkpoint_weights = torch.load(path_to_retrained_model, map_location="cpu")
226 if checkpoint_weights.get("model_type") is None:
227 # Validate if we have the proper metadata, it has at least the parser model type
228 # if no other thing have been modified.
229 raise RuntimeError(
230 "You are not using the proper retrained checkpoint. "
231 "When we retrain an AddressParser, by default, we create a "
(...)
234 "See AddressParser.retrain for more details."
235 )

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/serialization.py:791, in load(f, map_location, pickle_module, weights_only, **pickle_load_args)
788 if 'encoding' not in pickle_load_args.keys():
789 pickle_load_args['encoding'] = 'utf-8'
--> 791 with _open_file_like(f, 'rb') as opened_file:
792 if _is_zipfile(opened_file):
793 # The zipfile reader is going to advance the current file position.
794 # If we want to actually tail call to torch.jit.load, we need to
795 # reset back to the original position.
796 orig_position = opened_file.tell()

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/serialization.py:276, in _open_file_like(name_or_buffer, mode)
274 return _open_buffer_writer(name_or_buffer)
275 elif 'r' in mode:
--> 276 return _open_buffer_reader(name_or_buffer)
277 else:
278 raise RuntimeError(f"Expected 'r' or 'w' in mode but got {mode}")

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/serialization.py:261, in _open_buffer_reader.__init__(self, buffer)
259 def __init__(self, buffer):
260 super().__init__(buffer)
--> 261 _check_seekable(buffer)

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/serialization.py:357, in _check_seekable(f)
355 return True
356 except (io.UnsupportedOperation, AttributeError) as e:
--> 357 raise_err_msg(["seek", "tell"], e)
358 return False

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/serialization.py:350, in _check_seekable.<locals>.raise_err_msg(patterns, e)
346 if p in str(e):
347 msg = (str(e) + ". You can only torch.load from a file that is seekable."
348 + " Please pre-load the data into a buffer like io.BytesIO and"
349 + " try to load from it instead.")
--> 350 raise type(e)(msg)
351 raise e

AttributeError: 'S3Path' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

Method 2: I tried the method below as well, but I got a 'no such file exists' error. I am quite certain that the model exists in S3 and that the file path is correct.

address_parser = AddressParser(model_type="fasttext", device=0, path_to_retrained_model="s3://s3_path/fasttext.ckpt")

Error:
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/serialization.py:791, in load(f, map_location, pickle_module, weights_only, **pickle_load_args)
788 if 'encoding' not in pickle_load_args.keys():
789 pickle_load_args['encoding'] = 'utf-8'
--> 791 with _open_file_like(f, 'rb') as opened_file:
792 if _is_zipfile(opened_file):
793 # The zipfile reader is going to advance the current file position.
794 # If we want to actually tail call to torch.jit.load, we need to
795 # reset back to the original position.
796 orig_position = opened_file.tell()

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/serialization.py:271, in _open_file_like(name_or_buffer, mode)
269 def _open_file_like(name_or_buffer, mode):
270 if _is_path(name_or_buffer):
--> 271 return _open_file(name_or_buffer, mode)
272 else:
273 if 'w' in mode:

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/serialization.py:252, in _open_file.__init__(self, name, mode)
251 def __init__(self, name, mode):
--> 252 super().__init__(open(name, mode))

FileNotFoundError: [Errno 2] No such file or directory: 's3://<s3_path>/fasttext.ckpt'
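
(One interim workaround, following the first traceback's suggestion to load from a local, seekable file: download the checkpoint locally first and pass the local path to AddressParser. This is a sketch only; the S3 URI is a placeholder, and it assumes cloudpathlib[s3] is installed with valid S3 credentials.)

# Workaround sketch: download the checkpoint to a local file, then pass the local path.
from pathlib import Path
from cloudpathlib import CloudPath
from deepparse.parser import AddressParser

local_checkpoint = Path("fasttext.ckpt")
CloudPath("s3://s3_path/fasttext.ckpt").download_to(local_checkpoint)

address_parser = AddressParser(model_type="fasttext", device=0, path_to_retrained_model=str(local_checkpoint))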

@Gayathri2993
Author

I also tried loading the model directly from the library in my SageMaker instance, but I get duplicated results every time I pass multiple addresses. Surprisingly, the same code works as expected in SageMaker Studio Lab (the free ML environment platform). Here's a snippet from the SageMaker instance; can you please let me know why I am getting duplicated results here?
[Screenshot 2023-05-08 at 9:39:20 am: duplicated parsing results in the SageMaker instance]

@davebulaval
Collaborator

I will take a look next week; I have a busy week.

Uhmm, strange behaviour for the duplicate.

Can I get a restricted secret key to the S3 bucket? It will be easier for me to test. You can forward it to me using my email david.beauchemin@ift.ulaval.ca.

@Gayathri2993
Author

Dave, I am not sure I can share a restricted secret key because I am using this S3 bucket for official purposes. Is there any other way to test this?

@davebulaval
Collaborator

@Gayathri2993 No problem, it was more that I didn't want to take the time to set up all that in my personal account.
Right now, I plan to look at it next Thursday since I have other deadlines the following week.

@Gayathri2993
Author

@davebulaval It would be super helpful if you could prioritise the duplicate-records issue whenever you start working on this. Since I have millions of records to run, I would like to run Deepparse directly in AWS SageMaker.

@Gayathri2993
Author

Hi @davebulaval. I just wanted to kindly check in and see if there have been any updates regarding the issues we discussed. I understand that you have other deadlines this week, and I appreciate your time. When it's convenient for you, I would appreciate an update on the progress or any further steps that need to be taken.

@davebulaval
Collaborator

davebulaval commented May 20, 2023

Do the following to see if it works now.

  1. Update to the newest branch version using pip install -U git+https://github.com/GRAAL-Research/deepparse.git@dev.
  2. Test using something like:
uri = "s3://<path>/fasttext.ckpt"

address_parser = AddressParser(model_type="fasttext", path_to_retrained_model=uri)
parse_address = address_parser("350 rue des Lilas Ouest Quebec city Quebec G1L 1B6")

or

from cloudpathlib import CloudPath

uri = CloudPath("s3://deepparse/fasttext.ckpt")

address_parser = AddressParser(model_type="fasttext", path_to_retrained_model=uri)
parse_address = address_parser("350 rue des Lilas Ouest Quebec city Quebec G1L 1B6")

We now support both approaches.

On my side, I was able to use a URI to download a model for parsing, and also to upload a model after retraining to an S3 URI directory (see the hedged sketch below).
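
For illustration, a hedged sketch of that retraining flow, assuming the dev branch accepts an S3 URI as the retrain logging path (the bucket, dataset file, and hyperparameters are placeholders, not a definitive recipe):

# Hypothetical retraining sketch: train on a local pickle dataset and log checkpoints to S3.
from deepparse.dataset_container import PickleDatasetContainer
from deepparse.parser import AddressParser

training_container = PickleDatasetContainer("train_dataset.p")

address_parser = AddressParser(model_type="fasttext", device=0)
address_parser.retrain(
    training_container,
    train_ratio=0.8,
    epochs=5,
    batch_size=32,
    logging_path="s3://my-bucket/deepparse_checkpoints/",
)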

@davebulaval
Collaborator

davebulaval commented May 20, 2023

I also tried loading the model directly from the library in my SageMaker instance, but I get duplicated results every time I pass multiple addresses. Surprisingly, the same code works as expected in SageMaker Studio Lab (the free ML environment platform). Here's a snippet from the SageMaker instance; can you please let me know why I am getting duplicated results here? [Screenshot 2023-05-08 at 9:39:20 am]

For this, I initially had no idea how to investigate the behaviour or its cause.

I found the problem. It is fixed in commit 9943f04 on the branch.

I have merged all the content into dev since it seems to be ready for a release.

@Gayathri2993
Author

@davebulaval Thank you so much for your efforts. Both issues are resolved: the code works in AWS SageMaker, and I am no longer getting duplicate records.

I just had a quick question: do you have any idea of the maximum number of records it can parse in one go?

@davebulaval
Collaborator

Great!

If you use a GPU, it depends on your GPU memory size. From our experimentation with batch processing in Deepparse, the optimal batch size is around 256; beyond that, it can take longer per address due to IO and data preparation. To alleviate that (we have not tested with more workers), you can increase num_workers to something like 4; more is not necessarily better.

If you do not use a GPU, it depends on the number of CPUs and the amount of RAM.

A GPU is the fastest option, of course.
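
For illustration, a minimal batch-parsing sketch with those settings (the address list is a placeholder; batch_size and num_workers are the options discussed above):

# Sketch of batch parsing with a larger batch size and extra dataloader workers.
from deepparse.parser import AddressParser

# Pretrained fasttext model on GPU 0; the addresses below are placeholders.
address_parser = AddressParser(model_type="fasttext", device=0)

addresses = [
    "350 rue des Lilas Ouest Québec Québec G1L 1B6",
    "2325 rue de l'Université Québec Québec G1V 0A6",
]

# Larger batches amortize IO and data preparation; a few dataloader workers help feed the GPU.
parsed_addresses = address_parser(addresses, batch_size=256, num_workers=4)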

@davebulaval
Collaborator

Release in 0.9.7 #195
