Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate most of the logic that doesn't depend on CLI to a separate shared library #295

Merged
merged 17 commits into from
Jan 21, 2022

Conversation

jpinz
Copy link
Member

@jpinz jpinz commented Dec 15, 2021

Right now, a lot of the code logic is tightly coupled between the actual logic itself and the implementation in the CLI tool for the feature.
This PR attempts to decouple that logic into a Shared-lib that contains the logic without depending on the CLI related work. It doesn't have to be accepted as is, or at all ever. I just wanted to create this to point out some things that needed to be fixed in order for CFS to consume it. Primarily we can't consume it in it's current state due to the Common initialization web client setup that is being used, so I switched everything to DI and using the IHttpClientFactory.
I look forward to reading your feedback!

Note: It appears that the Unit Tests use live http clients, and therefore they're all broken right now.

@gfs
Copy link
Contributor

gfs commented Dec 17, 2021

Right now, a lot of the code logic is tightly coupled between the actual logic itself and the implementation in the CLI tool for the feature.

Which tool/tools are you specifically referring to?

This PR attempts to decouple that logic into a Shared-lib that contains the logic without depending on the CLI related work.

Can you elaborate what you mean by this? Find squats lib (and the shared lib) don't have any dependencies on the cli packages.

I gave a stand-alone usage example here:

#277

What is the rationale here for splitting the shared library into two different libraries?

It doesn't have to be accepted as is, or at all ever. I just wanted to create this to point out some things that needed to be fixed in order for CFS to consume it. Primarily we can't consume it in it's current state due to the Common initialization web client setup that is being used, so I switched everything to DI and using the IHttpClientFactory.

Can you elaborate on why this is a problem? Do you need to configure the http client in a specific way? Is there something else the common initialization is doing that you don't like?

I look forward to reading your feedback!

Note: It appears that the Unit Tests use live http clients, and therefore they're all broken right now.

@jpinz
Copy link
Member Author

jpinz commented Dec 17, 2021

Right now, a lot of the code logic is tightly coupled between the actual logic itself and the implementation in the CLI tool for the feature.

Which tool/tools are you specifically referring to?

One of the tools we are hoping to contribute to and consume after the typo squatting work is completed is find-source. Looking into it now I personally find it confusing as to what the FindSourceTool does, and then there's a layer of RepoSearch that seems to just call a method on the project manager itself. I find it slightly over complicated. I think, that for Terrapin's use case, we would probably just directly call the method in the project manager but why doesn't the find source tool just do that itself?
I don't know if the best solution for us to consuming OSS-Gadget's tools is for us to just create a project manager based on the PackageURL we are working with and using utility/helper methods within the manager, as find source appears to be doing, or if the pattern we went with for typo squatting with the explicit typo squatting library class to be called.

This PR attempts to decouple that logic into a Shared-lib that contains the logic without depending on the CLI related work.

Can you elaborate what you mean by this? Find squats lib (and the shared lib) don't have any dependencies on the cli packages.

It seems that there is a dependency on the CommandLineParser library in Shared.csproj which doesn't really make sense for us to be depending on if we aren't using it as a command line utility. It might just be able to be refactored out in a different way than I did.

It doesn't have to be accepted as is, or at all ever. I just wanted to create this to point out some things that needed to be fixed in order for CFS to consume it. Primarily we can't consume it in it's current state due to the Common initialization web client setup that is being used, so I switched everything to DI and using the IHttpClientFactory.

Can you elaborate on why this is a problem? Do you need to configure the http client in a specific way? Is there something else the common initialization is doing that you don't like?

We use an IHttpClientFactory through DI everywhere in our project, so when I tried creating a TerrapinProjectManager based on your comments in #282 we were facing a bunch of issues with how CommonInitialization is utilized, manually opening the sockets and creating a static web client that gets used across the project manager.
Again, all of the work I did might not make sense 😆 but the general idea is that the way web client is consumed across this repo doesn't appear to be compatible with Terrapin from what I know. The whole library seems overly complex to consume. Which makes sense because none of it was written to be a library initially, but if there's any feedback I can provide to make the process easier, I would be more than happy to.

@gfs
Copy link
Contributor

gfs commented Dec 17, 2021

Right now, a lot of the code logic is tightly coupled between the actual logic itself and the implementation in the CLI tool for the feature.

Which tool/tools are you specifically referring to?

One of the tools we are hoping to contribute to and consume after the typo squatting work is completed is find-source. Looking into it now I personally find it confusing as to what the FindSourceTool does, and then there's a layer of RepoSearch that seems to just call a method on the project manager itself. I find it slightly over complicated. I think, that for Terrapin's use case, we would probably just directly call the method in the project manager but why doesn't the find source tool just do that itself?

I don't believe I wrote this so I'm extrapolating - but RepoSearch is also used in the health tool, so I assume it is in shared to deduplicate code.

Reposearch appears to be a wrapper that allows you to get the source url without having to initialize the package manager yourself. I think for your use you'd probably just want to use the appropriate manager directly.

I don't know if the best solution for us to consuming OSS-Gadget's tools is for us to just create a project manager based on the PackageURL we are working with and using utility/helper methods within the manager, as find source appears to be doing, or if the pattern we went with for typo squatting with the explicit typo squatting library class to be called.

I think it depends on which tool you want to use. Many of them are very thin wrappers were it doesn't make sense/there isn't much to factor put into a library. Find source and metadata stand out as two offhand that fit in that bucket.

This PR attempts to decouple that logic into a Shared-lib that contains the logic without depending on the CLI related work.

Can you elaborate what you mean by this? Find squats lib (and the shared lib) don't have any dependencies on the cli packages.

It seems that there is a dependency on the CommandLineParser library in Shared.csproj which doesn't really make sense for us to be depending on if we aren't using it as a command line utility. It might just be able to be refactored out in a different way than I did.

CommandLineParser seems to only be required for the ossgadget class which is the base class for the CLIs to
If that dependency is a big sticking point I could see if just the ossgadget class could be refactored into a cli-shared lib separate from the main shared lib.

It doesn't have to be accepted as is, or at all ever. I just wanted to create this to point out some things that needed to be fixed in order for CFS to consume it. Primarily we can't consume it in it's current state due to the Common initialization web client setup that is being used, so I switched everything to DI and using the IHttpClientFactory.

Can you elaborate on why this is a problem? Do you need to configure the http client in a specific way? Is there something else the common initialization is doing that you don't like?

We use an IHttpClientFactory through DI everywhere in our project, so when I tried creating a TerrapinProjectManager based on your comments in #282 we were facing a bunch of issues with how CommonInitialization is utilized, manually opening the sockets and creating a static web client that gets used across the project manager.

Okay. I'm haven't used httpclientfactory before so I'm not very familiar with the pattern. I can take a look at what you did here and see how we can accommodate the situation where you want to bring your own httpclient - it is more likely that will come as a non breaking change for current consumers that does something equivalent to the current behavior if one is not provided given that is the desired behavior for the primary consumer of the library (the oss gadgets themselves).

Does this mean that exceptions are being thrown? Can you point to the specific line you are, or provide a simple repro repo for this behavior?

I see there is some socket workaround with a comment that it is no longer needed with .net 6. I recently bumped ossgadget so I should be able to remove that now.

Again, all of the work I did might not make sense 😆 but the general idea is that the way web client is consumed across this repo doesn't appear to be compatible with Terrapin from what I know. The whole library seems overly complex to consume. Which makes sense because none of it was written to be a library initially, but if there's any feedback I can provide to make the process easier, I would be more than happy to.

As much as you can share what you want to ultimately accomplish it helps me to ensure you have the appropriate points to hook into.

@gfs
Copy link
Contributor

gfs commented Dec 17, 2021

I'll take a closer look at the PR itself next week.

I am leaning towards using the ihttpclientfactory and removing the static shared httpclient.

Fix no default parametereless constructor for gadgets.
The usings were causing the default static HttpHandler to become disposed prematurely.

Restore ENvironment variable setting to baseprojectmanager.
@gfs
Copy link
Contributor

gfs commented Jan 3, 2022

I pushed some changes to the PR.
I fixed the library project files so the Shared lib can continue to be published on nuget as is.
I fixed the tests so that they now run.
I restored the default behavior of using a shared httpclient as an option.
I restored the removed functionality to set environment variables.
I reverted many of the unnecessary namespace changes.

This isn't ready to merge yet - the default behavior that exists now of using a default httpcllient for the tools is not acceptable - for example, the Cargo Package manager will 403 if you don't provide a user agent string. This additionally means the loss of changing the user agent string with environment variables for those CLI tools. Will need to be resolved before mrging.

@gfs
Copy link
Contributor

gfs commented Jan 5, 2022

@jpinz can you check if the modified libraries here will work for your purposes? The remaining changes would be with the CLIs afaik.

@jpinz
Copy link
Member Author

jpinz commented Jan 6, 2022

@gfs I think this should do the trick! We can create a custom package manager and use our IHttpClientFactory now.

@gfs
Copy link
Contributor

gfs commented Jan 6, 2022

Sounds good. I'll aim to resolve the remaining issues and merge this by mid next week.

@gfs
Copy link
Contributor

gfs commented Jan 18, 2022

I wasn't able to complete this last week. It's still on my todo list but I won't get to it until the end of this week or early next week.

@gfs
Copy link
Contributor

gfs commented Jan 18, 2022

To clarify what is missing:

You can see in the default http client implementation that we used to use specific timeout settings and a user agent string that was configurable via environment settings. I want to maintain that behavior for the default CLIs so that needs to be added.

Update DefaultHttpClientFactory to use the environment variable for the user agent.
After removal of the static reference to httpclient this no longer did anything.
@gfs
Copy link
Contributor

gfs commented Jan 21, 2022

/azp run

@azure-pipelines
Copy link

No pipelines are associated with this pull request.

These constructors call into the IHTTPClientFactory constructor with DefaultHttpClientFactory.
@gfs
Copy link
Contributor

gfs commented Jan 21, 2022

@jpinz I think this is ready to merge. I ran tests manually. Unclear why they are not runing on the PR. Perhaps because its cross repo?

@gfs gfs merged commit a25747d into microsoft:main Jan 21, 2022
@jpinz
Copy link
Member Author

jpinz commented Jan 21, 2022

@gfs looks awesome! Are you pushing the changes to NuGet too?

@gfs
Copy link
Contributor

gfs commented Jan 21, 2022

@gfs looks awesome! Are you pushing the changes to NuGet too?

Yes. Pipeline is already running. Builds are automatically made and published from the main branch.

@gfs
Copy link
Contributor

gfs commented Jan 21, 2022

Pipeline failed. #297 fixes that and a minor bug. Builds should work again after that merge i believe.

@gfs gfs mentioned this pull request Jan 21, 2022
@jpinz
Copy link
Member Author

jpinz commented Jan 24, 2022

Pipeline failed. #297 fixes that and a minor bug. Builds should work again after that merge i believe.

@gfs I'm getting an error when trying to import the library into our project because the shared library hasn't been updated on Nuget.
error: NU1102: Unable to find package Microsoft.CST.OSSGadget.Shared with version (>= 0.1.314)

@gfs
Copy link
Contributor

gfs commented Jan 24, 2022

Looks like a pipeline issue. I see 3 different versions published for the 3 ossgadget libs. I'll fix today.

@jpinz jpinz deleted the jupinzer/shared_lib branch March 9, 2022 22:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants