Looking for improvement ideas for Poutyne #99

freud14 · 2020-11-27T16:40:00Z

We are looking for ideas to improve Poutyne. So, if you have any, you can provide them by replying to this issue.

Atharva-Phatak · 2021-03-07T14:53:28Z

Adding metric learning models like ArcFace , CosFace etc ? If so I can create a pull request and implement them ?

freud14 · 2021-03-07T15:37:28Z

Hi @Atharva-Phatak, thank you for the suggestion, I think it's a really good idea. However, before you implement this feature, I would like to know more on how you're planning to implement it. For instance, do you plan to have a unified interface which would be used all learned metrics? In which case, what would this interface look like? Do you plan on providing utilities for re-training them or just pre-trained learned metrics?

Atharva-Phatak · 2021-03-08T05:47:30Z

Well the modules like ArcFace , CosFace have different logic(some calculations are different)
I was thinking of implementing them as simple classes which inherit from nn.Module and have their own forward method.

There is no point in developing a unified interface because the parameters and core logic they require are different.

Also arcface and cosface are more like heads which are used on top of pretrained backbones instead of simple linear layers, they are used extremely in Face Recognition.

Let me give you an example.

Class ArcFace(nn.Module):
        def __init__(self , some_params):
             initialize those params
        def forward(self , x , targets):
             core logic of arcface

#Then you can create arcface head like this
arcface = ArcFace(some_params)

#Then you can add this arcface layer/module on top of any model.
res18 = torchvision.models.resnet18(pretrained = True)
res18.head = ArcFace(some_param)

And Viola you can now train your own metric learning model

I hope I was clear enough. I you have further doubts I can definitely clarify further.

freud14 · 2021-03-08T16:56:29Z

Hi,

I don't know if you've seen but in Poutyne we split the network part and the loss function part. Now, as I understand, ArcFace would be used as a loss function. So either the user would have to do manipulations so that the parameters of the loss function be passed to the optimizer or we do it ourselves in Model when the user passes a string for the optimizer. So I'm just wondering on how you see the integration of this with Poutyne. If you can give me an example (maybe start with your current example?) on how you see this integration.

Thank you.

Atharva-Phatak · 2021-03-11T05:34:35Z

Hi,

So the paper mentions it as loss function, but in implementation it is actually a series of steps(called as margin products) which help you obtain the logits and when you get logits you can apply any loss function on those logits(same as classification).

So lets consider the example.

Class ArcFace(nn.Module):
        def __init__(self , some_params):
             initialize those params
        def forward(self , x , targets):
             core logic of arcface,
             
            returns logits
  
#Then you can create arcface head like this
arcface = ArcFace(some_params)

#Then you can add this arcface layer/module on top of any model.
res18 = torchvision.models.resnet18(pretrained = True)
res18.head = ArcFace(some_param)
logits = res18(inputs , targets) #since while training arcface requires the targets(to compute the margins)
loss = loss_fn(logits , targets)

The above steps are same as what poutyne does, when creating the model the user just defines the ArcFace or ArcFaceHead(names can be discussed later) in the model and then loss can be computed in standard way how poutyne does it.

I hope this is helps.

freud14 · 2021-03-14T17:51:32Z

Hi, sorry for the delay. Could you make a complete working/runnable example with Poutyne just so I can see how everything fits together? I can be with a dummy learned metric to make it simpler.

AlekseyFedorovich · 2021-06-17T14:59:30Z

Like in scikit-learn the model should deduce the number of

training samples
classes
features

from training input data. This is an issue also for other similar libraries like skorch

freud14 · 2021-06-17T15:12:15Z

Hi @AlekseyFedorovich, I am not sure of the need to deduce these. Could you elaborate? By the way, we already deduce the number of training samples so that the aggregation of the batch metrics be exact.

AlekseyFedorovich · 2021-06-21T08:06:56Z

@freud14 if you put your predictor in a pipeline with other transformations it is very unhandy to pass the input dimensions to the predictor inside the pipeline (in many cases you don't know that dimensions a-priori like in feature selections based on statistics), furthermore from the "Do not Repeat Yourself" point of view this is redundant since the dimension of the input X can be deduced from the X itself during training

freud14 · 2021-06-21T15:50:36Z

Hi, Poutyne does not create the architecture of the network for you and thus cannot modified your network in that way. However, it might be a good idea to make some kind of class that calls a user-defined function to instantiate the network with a some input size. I've never really used ML pipelines (I suppose à la sklearn pipeline?) but I would be open for a contribution in that way. Let me know if you have any plan of doing so.

AlekseyFedorovich · 2021-06-22T06:43:24Z

When you put your model in production you must have some sort of pipeline because data are never in a form that can feed the predictor directly. Yes I normally use the sklearn Pipeline functionality for that.
I'd love to contribute to an open-source project like this but I lack the skills and the time for that, I'm sorry. I appreciated the request for improvements of this issue and I wanted to bring my point of view.

freud14 · 2021-06-22T16:07:07Z

Alright. Anyway, if you have an idea of what an API for that would look like, let me know,

adnan33 · 2021-06-28T16:00:57Z

Great work with the library. Is there any plan to support TPU training or creating an abstraction to allow swapping devices with minimal changes to the code?

freud14 · 2021-06-28T17:37:18Z

Hi @adnan33, you can already change from CPU to GPU or GPU to another GPU easily. For TPUs, I never really used them (and don't know anyone who does) so I don't really know the technical implications of integrating them into Poutyne. However, I would be really interested in adding the feature to Poutyne if you think you (or anyone else) can make the contribution.

ShawnSchaerer · 2021-07-28T13:04:10Z

I would like to see better integration with MLFlow and the ability to use the high-level API mlflow.

freud14 · 2021-07-28T13:17:24Z

Hi @ShawnSchaerer, if you haven't seen, we already have an MLFlowLogger. Let me know if you think there are missing features in it.

ShawnSchaerer · 2021-07-28T17:11:04Z

Hi @ShawnSchaerer, if you haven't seen, we already have an MLFlowLogger. Let me know if you think there are missing features in it.

It looks like if you use the mlflow client then the mlflow tags are not set and you need to do that manually. If you use the higher level MLflow API directly then the tags are filled in automatically. I am filling them in manually but they are reserved and not sure if that is an issue or not. You could use the MLflow API directly and instead of the lower level Tracking API

davebulaval · 2021-07-29T16:10:22Z

It looks like if you use the mlflow client then the mlflow tags are not set and you need

Do you have a code example? I'm not sure to follow what you mean.

f you use the higher level MLflow API directly then the tags are filled in automatically. not sure what you mean here.

_ I am filling them in manually but they are reserved and not sure if that is an issue or not._ not sure to follow up on that one. If you mean default tags like this it is not an error. Otherwise, need more explication on my side.

Also, at first, the idea of using the MLFlow client class was to be able to save the model directly into the artifacts but I was not able to manage properly the path since MLFlow is broken (I mean by that that I was able to save it locally OR save it on remote but not the two). We chose at the end to remove that part since it was not working.

SauravMaheshkar · 2021-12-25T19:16:46Z

I would like for there to be a Dockerfile accompanying poutyne. I usually run training jobs on VMs using the NVIDIA Container Toolkit. Having a Docker image for this package would be really helpful for experimentation and building on top of poutyne.

More than happy to work on this.

freud14 · 2021-12-28T17:28:51Z

I would like for there to be a Dockerfile accompanying poutyne. I usually run training jobs on VMs using the NVIDIA Container Toolkit. Having a Docker image for this package would be really helpful for experimentation and building on top of poutyne.

More than happy to work on this.

@ShawnSchaerer Good idea! On what image do you want to base it on?

SauravMaheshkar · 2021-12-28T17:40:57Z

I would like for there to be a Dockerfile accompanying poutyne. I usually run training jobs on VMs using the NVIDIA Container Toolkit. Having a Docker image for this package would be really helpful for experimentation and building on top of poutyne.
More than happy to work on this.

@ShawnSchaerer Good idea! On what image do you want to base it on?

I think the base image should be with the latest stable versions of PyTorch, cuda and cudnn. If size is an issue I'm sure we can reduce the size by using multi-staged builds and wheels for python packages. What do you think ?

freud14 · 2021-12-28T17:52:30Z

I would like for there to be a Dockerfile accompanying poutyne. I usually run training jobs on VMs using the NVIDIA Container Toolkit. Having a Docker image for this package would be really helpful for experimentation and building on top of poutyne.
More than happy to work on this.

@ShawnSchaerer Good idea! On what image do you want to base it on?

I think the base image should be with the latest stable versions of PyTorch, cuda and cudnn. If size is an issue I'm sure we can reduce the size by using multi-staged builds and wheels for python packages. What do you think ?

Sounds about right. Is there already one that exists? Because the one on the PyTorch repo uses the nightly build.

SauravMaheshkar · 2021-12-28T17:56:04Z

I would like for there to be a Dockerfile accompanying poutyne. I usually run training jobs on VMs using the NVIDIA Container Toolkit. Having a Docker image for this package would be really helpful for experimentation and building on top of poutyne.
More than happy to work on this.

@ShawnSchaerer Good idea! On what image do you want to base it on?

I think the base image should be with the latest stable versions of PyTorch, cuda and cudnn. If size is an issue I'm sure we can reduce the size by using multi-staged builds and wheels for python packages. What do you think ?

Sounds about right. Is there already one that exists? Because the one on the PyTorch repo uses the nightly build.

How about pytorch/pytorch:1.10.1-cuda11.3-cudnn8-runtime

freud14 · 2021-12-28T19:11:20Z

How about pytorch/pytorch:1.10.1-cuda11.3-cudnn8-runtime

Sounds good. Just open a PR when you're ready.

gbmarc1 · 2022-07-20T20:55:57Z

I was thinking of linking the Experiment object to an Experiment of determined AI framework (https://www.determined.ai/). The idea would be to train the models on determined ai managed resources if you choose the device="detai" or something similar.
I think of investigating this idea in the next days. If it make sense, I will make a PR for that.

freud14 · 2022-07-20T21:33:16Z

Hi @gbmarc1, this looks interesting indeed. If you can make a PR for that, it would be nice. However, we have to take care of not overcomplexifying the code base since this seems to impact all of the Model class code. Let me know of your exact plan of how you think you can integrate this so that we do not loose each other's time. If you wish, we can make a Zoom call about it.

gbmarc1 · 2022-07-21T15:45:12Z

Hi @gbmarc1, this looks interesting indeed. If you can make a PR for that, it would be nice. However, we have to take care of not overcomplexifying the code base since this seems to impact all of the Model class code. Let me know of your exact plan of how you think you can integrate this so that we do not loose each other's time. If you wish, we can make a Zoom call about it.

Good idea. I will first do a little design first, and then we could have that meeting. I will come back to you when I am ready.

davebulaval · 2022-07-21T16:48:00Z

My two cents on that matter

Ideally, I think the design should allow using a more generic interface for cloud computing. I'm pretty sure there are other similar solutions out there. Thus, if the proposed approach only allows the use of determined AI, it complicates the codebase and only binds Poutyne to this particular solution.

I think the best approach would be to keep the two as decoupled as possible. Poutyne does not need a cloud computing solution to run, it only need a "device" whatever type that is (cpu, gpu, or cloud service). Thus, I think we should inject a dependency rather than add dependency in the Poutyne Experiment interface. Something like device=InterfaceObjectForcloudComputing(). I don't know if it is possible due to networking and all.

gbmarc1 · 2022-07-21T17:02:34Z

My two cents on that matter

Ideally, I think the design should allow using a more generic interface for cloud computing. I'm pretty sure there are other similar solutions out there. Thus, if the proposed approach only allows the use of determined AI, it complicates the codebase and only binds Poutyne to this particular solution.

I think the best approach would be to keep the two as decoupled as possible. Poutyne does not need a cloud computing solution to run, it only need a "device" whatever type that is (cpu, gpu, or cloud service). Thus, I think we should inject a dependency rather than add dependency in the Poutyne Experiment interface. Something like device=InterfaceObjectForcloudComputing(). I don't know if it is possible due to networking and all.

@davebulaval do you have another solution for cloud computing that I could refer to help me generalize the design?

gbmarc1 · 2022-07-26T17:59:10Z

@freud14 ,
I am hitting some roadblocks and I think it would be good to brainstorm with someone else before going further. I added you on LinkedIn to organize the zoom meeting when you are available!

redthing1 · 2023-01-30T22:35:51Z

ONNX/general deployment support? Is there a good way

davebulaval · 2023-01-31T00:17:52Z

@redthing1 IMO ONNX is a pain in the ass to use since you need to specify the batch size beforehand.

freud14 · 2023-02-04T21:08:10Z

ONNX/general deployment support? Is there a good way

Thanks for the suggestion. I've never used ONNX so I'm not sure of the pitfalls and how actually easy it is to work with. If you were able to make some pseudocode/code on how you would like to use it with poutyne, I would be happy.

atremblay · 2023-05-04T13:24:35Z

Any interest in supporting closure for the optimizer step? https://pytorch.org/docs/stable/optim.html#optimizer-step-closure

freud14 · 2023-05-06T13:11:07Z

Hi @atremblay , Would be interested yes but that would require quite a bit of refactoring with the current state of the code base. Also, since theses approaches compute multiple times the predictions and the loss, I'm not sure how we would integrate that with the metrics we return. I never actually used these optimizers so not sure what people do in practice.

freud14 pinned this issue Nov 27, 2020

freud14 added enhancement help wanted labels Nov 27, 2020

freud14 changed the title ~~Looking for improvement ideas for Poutyne?~~ Looking for improvement ideas for Poutyne Dec 9, 2020

Looking for improvement ideas for Poutyne #99

Looking for improvement ideas for Poutyne #99

Comments

freud14 commented Nov 27, 2020

Atharva-Phatak commented Mar 7, 2021

freud14 commented Mar 7, 2021

Atharva-Phatak commented Mar 8, 2021 • edited Loading

freud14 commented Mar 8, 2021

Atharva-Phatak commented Mar 11, 2021

freud14 commented Mar 14, 2021

AlekseyFedorovich commented Jun 17, 2021 • edited Loading

freud14 commented Jun 17, 2021

AlekseyFedorovich commented Jun 21, 2021 • edited Loading

freud14 commented Jun 21, 2021

AlekseyFedorovich commented Jun 22, 2021

freud14 commented Jun 22, 2021

adnan33 commented Jun 28, 2021

freud14 commented Jun 28, 2021

ShawnSchaerer commented Jul 28, 2021

freud14 commented Jul 28, 2021

ShawnSchaerer commented Jul 28, 2021

davebulaval commented Jul 29, 2021

SauravMaheshkar commented Dec 25, 2021

freud14 commented Dec 28, 2021

SauravMaheshkar commented Dec 28, 2021

freud14 commented Dec 28, 2021

SauravMaheshkar commented Dec 28, 2021

freud14 commented Dec 28, 2021

gbmarc1 commented Jul 20, 2022

freud14 commented Jul 20, 2022

gbmarc1 commented Jul 21, 2022

davebulaval commented Jul 21, 2022 • edited Loading

My two cents on that matter

gbmarc1 commented Jul 21, 2022

My two cents on that matter

gbmarc1 commented Jul 26, 2022

redthing1 commented Jan 30, 2023

davebulaval commented Jan 31, 2023

freud14 commented Feb 4, 2023

atremblay commented May 4, 2023

freud14 commented May 6, 2023

Atharva-Phatak commented Mar 8, 2021 •

edited

Loading

AlekseyFedorovich commented Jun 17, 2021 •

edited

Loading

AlekseyFedorovich commented Jun 21, 2021 •

edited

Loading

davebulaval commented Jul 21, 2022 •

edited

Loading