File dependencies from arbitrary URLs #114

Closed
isaacabraham opened this issue Sep 17, 2014 · 35 comments

@isaacabraham
Contributor

There are many useful code snippets on gists and similar sites that it would be good to have access to.

@forki
Member

forki commented Sep 17, 2014

Yes, I'd like to see a URL source type.

@bartelink
Member

@tpetricek Re #89 (Sorry for the spam :D) is there a way to express a pinned fssnip reference to a particular version (and/or latest)?

forki changed the title from "File dependencies from GitHub gists" to "File dependencies from arbitrary URLs" on Sep 17, 2014
@isaacabraham
Contributor Author

Also I think we should put all downloaded source files in the project under a 'paket' folder. It doesn't have to be nested underneath that, just a flat list, but if I'm referencing files from GitHub they are essentially read-only as far as I'm concerned, and I don't want them clogging up the view of my code files in the project.

@forki
Member

forki commented Sep 18, 2014

Yes, that's a good idea and makes things clearer. But I'm not sure we can do this without copying the files. If we can, then please send an idea or pull request.

@bartelink
Member

The crickets suggest nobody but me thinks it makes any sense to just generate packages out of them...

Why is it important to have 2 of everything downstream from actually getting the file?

Why can't there be an orthogonal process that can take fssnip, gists, commits and 57 other things and unify them as packages? What's the reasoning behind the bifurcation? It's 2 .gitignore entries too for what's fundamentally the same thing - packages?

I'm all for there being different kinds of packages and treating them differently downstream if there is a clear reason to do so, but for now having them as separate concepts and doing lots of special casing throughout that can be avoided just complicates everything.

For me it's entirely conceivable that a lightweight packaging mechanism will/can happen (e.g. TeamCity's built-in package gen stuff), and Paket should be in a good place to consume those rather than having masses of docs as to how and why they are specifically handled in the different contexts such as download, version checking, caching of downloads, storage, installation, uninstallation etc.

@forki
Member

forki commented Sep 18, 2014

I'm thinking about this.
Not sure it will really work. But yes, we could put the stuff in the packages folder and work a bit in this direction.

@isaacabraham
Contributor Author

@bartelink didn't you explain your thoughts on this already (at length) in another issue?

@bartelink
Member

@forki What's the bit that won't work? Why?

@isaacabraham Sure did. And crickets.

But I'm a bit on the dogged side. Ditto re the file naming. Lots of typing. Lots of crickets. I reiterated and the ideas were taken on board and the docs are the better for it - no need to explain in triplicate what everything is for and which is which.

I have mentioned that I'm a chicken here and I understand what that should mean in a voluntary effort, but I'm not a total leech either if you look around. But just because I'm a chicken doesn't mean the pigs have to be ostriches and/or not respond at all. Much less give me the "you've said your piece" as if to imply "man, I'm tired of this debate".

I have produced source packages and supported them in production.

I think you'll also find that my most recent input in this thread is in the context of recent information here and does not significantly regurgitate previous stuff unnecessarily.

I really like the feature. And I appreciate how important it is for it to be light and neat as it has a massive qualitative impact on Paket.

I fully appreciate that merging two separable concepts in a codebase has its costs and is not always a good policy (DRY, Rule of 3, 57 blog posts arguing it either way). And I'm not saying it's an open and shut case here.

But I will be doing editing on the documentation. And there is more of it if the feature is implemented in one way vs another. And I will be here for a long time in the issues, trying to help. And there will be more questions.

@isaacabraham Do you think I need to do PRs with actual code to be allowed to mention a topic more than once or is there something more fundamental?

@isaacabraham
Contributor Author

@bartelink All I'm saying is that repeatedly putting your views across in slightly different ways across multiple issues won't help get a feature adopted, IMHO.

@forki
Member

forki commented Sep 18, 2014

(cat picture)

Calm down everyone. We're still evaluating directions. Not everything can be answered instantly.

@forki
Member

forki commented Sep 18, 2014

What's the bit that won't work? why?

I don't think we can generate a full nuget package on the fly. At least not easily.
what would we need for this?

  • Dependency on Nuget.Core - no way ;-)
  • Info for a nuspec - mhm. I see problems in generating versions from hashes

@bartelink
Member

@isaacabraham Don't know what "getting a feature adopted" means.

Everyone wants to be able to point at files with minimum friction. Nobody wants features to go away. I don't want any less to be achieved.

I'm just trying to influence the impl approach for the benefits it brings re simplicity and composability, but most importantly the impact on the surface of the product of having to say and/or consider "for A it's this, for B it's that".

I believe that having the little things right in Paket as a whole has a big impact on adoption and the attendant network effects - which will be critical over time due to changes in NuGet's end-to-end behavior in the context of AspNetVNext and so on. I also want to personally enjoy doing stuff in the codebase over time other than refactoring.

@bartelink
Member

@forki AFAIK there's no real magic other than a nuspec having to be parseable (by Paket's code), and it and the file going into a zip. I agree that having to pull in a dep on NuGet.Core is a lot to take. Will think about how versioning can make sense (but I think it may be good to work through those considerations, as a nuspec is only the messenger here - ultimately stuff from GitHub and/or fssnip both needs to be able to distil down to something. The fact that Paket will 'own' the package might mean something as simple as a capture of the local date, or the date from the metadata of the download, can be relied on to make it an always-increasing value).

UPDATE: Didn't do research yet; researching how/when WebApi's magic picks up routes [and then doesn't wrt #65] - the sooner we have a no-magic web layer the better... Will research (a lot) later.

UPDATE 2: Asking on SO while I look in NuGet.Core. My eyes! That's one sprawling codebase - seeing all the 'business knowledge' re PCL profiles etc. in C# is horrific. Not refactorable - it'll be the Strangler Pattern to get anywhere with it.

UPDATE 3: Looking in Paket - wow, it's so neat and logical - congrats to all. At first glance, there isn't much sign of the rampant special casing I was presuming (but I've only done a scan pass).

OK, no answers on the SO post for now. I guess I could try other channels such as twitter to ask people directly. Got a SO comment from @bricelam verifying the basic concept [and clarifying that it's a Package, not just a zip].

@forki But I think I can answer your questions:

I don't think we can generate a full nuget package on the fly. At least not easily.

A full package has a nuspec xml that doesn't cause the rest of the system to barf. It is a zip with 2 files: the .nuspec and the file. The structure is the standard one dictated by being a Package (i.e. System.IO.Packaging, which as documented elsewhere is not supported on Mono), i.e. with a manifest etc. just like in Office XML docs.

what would we need for this?
Dependency on Nuget.Core - no way ;-)

Maybe if one could do a file source of just the key 22 files involved :P Looking at the code (and the v2 - v3 split at present) makes it very clear you don't want to be chasing this.

If only there was a magic wrapper service - one shouldn't have package gen integrated into the bowels of Paket any more than have it in the bowels of NuGet.Core :)
@hhariri I wonder, do JetBrains or similar have an open-source lib for package gen - I doubt TC uses NuGet.Core?
@maartenba MyGet might have stuff in that arena?

It would seem to me that if a Paket.synth lib can wrap downloads into NuGet packages, then it can be a separate thing. The win is that if .proj files begin to ref .nupkgs directly, Paket is ready to play along. And if Resharper and/or FSharper wants to walk to where the source came from, there's scope for that metadata to be discovered, whereas if Paket just has a proprietary insertion algorithm which is separate to how it installs NuGet packages, there are just going to be two parallel sets of code to maintain.

(ASIDE: The way the foldering happens when a snippet is inserted into a project should try to align with conventions in http://nikcodes.com/2013/10/23/packaging-source-code-with-nuget/ )

Info for a nuspec - mhm. I see problems in generating versions from hashes

Just put a 1.0.0-downloaded2014013200 or similar per source and one will be no better or worse off than before. The package name can be Url_XXX or GitHub_XXX (probably using a char NuGet doesn't permit but Unix and Windows do, which IIRC might be _).
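
As a rough illustration (hypothetical helper, nothing that exists in Paket today), deriving such a version from the download timestamp could be as small as:

```fsharp
open System

// Hypothetical sketch: derive an always-increasing prerelease version from the
// download timestamp; hour precision keeps the label in the same shape as the
// 1.0.0-downloaded2014013200 example above.
let versionFromDownload (downloadedAt: DateTime) =
    sprintf "1.0.0-downloaded%s" (downloadedAt.ToUniversalTime().ToString("yyyyMMddHH"))

// versionFromDownload DateTime.UtcNow  ->  e.g. "1.0.0-downloaded2014091809"
```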

@hhariri

hhariri commented Sep 19, 2014

TC doesn't integrate it. It reads the feed, displays what's available and allows you to pick what you want (if that's what you're asking...)

@maartenba

MyGet uses nuget.exe to do packaging. We also have some re-packaging going on when pushing to upstream feeds, but we're basically opening the nupkg there as a ZIP file and working with the embedded .nuspec.

Haven't read the entire thread, but in order to be able to distribute source files all you would need is a valid .nuspec in a ZIP file, a Content folder in there containing the files to ship and some Open Packaging Format metadata to tie it all together. Nothing that can't be done with System.IO.Packaging from WindowsBase.
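
For illustration, a minimal sketch of that in F# (hypothetical names, not our actual code, assuming System.IO.Packaging is available) could look like:

```fsharp
open System
open System.IO
open System.IO.Packaging   // from WindowsBase

// Hypothetical sketch: wrap one downloaded source file plus a generated .nuspec
// into an Open Packaging Format zip, i.e. a minimal .nupkg.
let createSourcePackage (packageId: string) (version: string) (sourceFile: string) (nupkgPath: string) =
    let nuspec =
        sprintf """<?xml version="1.0"?>
<package>
  <metadata>
    <id>%s</id>
    <version>%s</version>
    <authors>paket</authors>
    <description>Source file repackaged from a URL</description>
  </metadata>
</package>""" packageId version

    use package = Package.Open(nupkgPath, FileMode.Create)

    // the .nuspec part at the package root
    let nuspecUri = PackUriHelper.CreatePartUri(Uri(sprintf "/%s.nuspec" packageId, UriKind.Relative))
    let nuspecPart = package.CreatePart(nuspecUri, "application/octet-stream")
    do
        use writer = new StreamWriter(nuspecPart.GetStream())
        writer.Write nuspec

    // the source file under content/ so it installs as content
    let contentUri = PackUriHelper.CreatePartUri(Uri("/content/" + Path.GetFileName sourceFile, UriKind.Relative))
    let contentPart = package.CreatePart(contentUri, "application/octet-stream")
    do
        use target = contentPart.GetStream()
        use source = File.OpenRead sourceFile
        source.CopyTo target
```

System.IO.Packaging takes care of the [Content_Types].xml bookkeeping for you; the rest is just the two parts.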

@bartelink
Member

@hhariri Thanks for the response; I was referring to TeamCity generating packages. TBH I haven't actually used the facility so I might be talking complete nonsense.

@maartenba Thanks, interesting. I've asked the question on SO and had a similar response. One constraint here is that Mono is supported, which (I'm pretty sure) rules out using System.IO.Packaging AFAICT - though you have to think there should be something out there, given that the Open Packaging Format is used far more widely than just NuGet.

@hhariri

hhariri commented Sep 19, 2014

Yes. For package generation it uses the nuget that you indicate you want.

@bartelink
Member

as in the .exe? (Looking now, I see - you can point to one in your tree)
... and it generates a nuspec from params you supply? (Looking, seems you're composing args for NuGet pack and you get to point at .nuspec files as opposed to composing them)

@forki Do you see NuGet.exe happening to be in a packages folder for some time (i.e. will/does/should the bootstrapper be reliant on a NuGet.exe for some time anyway)?

@forki
Member

forki commented Sep 19, 2014

(the bootstrapping doesn't need NuGet (no feed and no nuget.exe) at all - NuGet.exe is still in master to generate new packages.)

@hhariri

hhariri commented Sep 19, 2014

Yes

@maartenba

@bartelink Seems Mono has System.IO.Packaging in there as well? https://github.com/mono/mono/tree/master/mcs/class/WindowsBase/System.IO.Packaging

@bartelink
Member

@maartenba Good spot, thanks! Will fix the assertions in the SO question. @forki Any particular reason you didn't see fit to use that in processing the extraction stuff (assuming I'm correct in that assumption)?

@forki
Member

forki commented Sep 19, 2014

What is the question? We are just unzipping the nupkg

@bartelink
Member

@forki Fair question. The question is:

Given that technically a NuGet Package is an Open Packaging Format package, and that the Mono code is designed exactly for that, are there any specific reasons why it would not be used?

I'm thinking that unless the codebase has a specific reason for doing raw zip-level processing, one could consider (in the absence of any other constraints, hence the question) using it instead.

The reasons for doing so would be:

  1. that same lib could manage the Open Package Format metadata aspect of creating a Package
  2. it's what NuGet.Core uses and thus is technically more correct (though I can't think of a concrete reason why that would matter)
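
For concreteness, a minimal sketch (hypothetical, not the existing extraction code) of what reading a package through that lib instead of a raw zip library might look like:

```fsharp
open System.IO
open System.IO.Packaging

// Hypothetical sketch: read a .nupkg via System.IO.Packaging rather than a raw
// zip library, writing each part out under a target folder.
let extractPackage (nupkgPath: string) (targetDir: string) =
    use package = Package.Open(nupkgPath, FileMode.Open, FileAccess.Read)
    for part in package.GetParts() do
        let relativePath = part.Uri.OriginalString.TrimStart('/')
        let targetPath = Path.Combine(targetDir, relativePath)
        Directory.CreateDirectory(Path.GetDirectoryName targetPath) |> ignore
        use source = part.GetStream()
        use target = File.Create targetPath
        source.CopyTo target
```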

@forki
Member

forki commented Sep 19, 2014

I'm open to using other libs if you still get this:


https://twitter.com/PaketManager/status/512960044057382913

@bartelink
Member

Saw that; it's kinda good alright :) The bit that doesn't fit into your tweet is that the paket update time changes from [random() with random retries and reading of powershell output, infinite) to something deterministic too.

Also, when there's no Package restore / nuget.targets crap etc., the build time reduces too...

So, what percentage variance from 1.3s are you prepared to tolerate :P

@forki
Member

forki commented Sep 19, 2014

package update is a whole new story - see http://fsprojects.github.io/Paket/faq.html#When-I-resolve-the-dependencies-from-NuGet-org-it-is-really-slow-Why-is-that

but two things:

  • we cache all the information we get from nuget.org
  • package update is done less often than install.

So, what percentage variance from 1.3s are you prepared to tolerate :P

As I said, we can use a different lib. I actually don't really care, as long as it stays in the same ballpark and is compatible with .NET 4.0 and Mono. TBH I tried multiple zip libs. Most didn't work in our async code.

@bartelink
Member

I don't care about the package update speed. I'll take Paket any day, as NuGet:

  1. doesn't beat it perf wise
  2. doesn't allow me to express my semantic requirements
  3. requires me to interactively manage the process of updating stuff, regardless of whether I'm clicking like a monkey or reading errors from Get-Package | Update-Package and any other such messing.
    With the only downside being that, at this moment on a sunny day, Update-Package can manage the Add-BindingRedirect (if using Paket, you can do a Build followed by Alt-T N O and Add-BindingRedirect)

Agree with optimizing for install speed. I think if we were talking 10s vs 1.3s I'd start looking at leaving the existing unzip regime exactly as it is (with all the benefits that not messing with code has over messing with it)

(Esp given your experiences above) Don't think replacing the zip code should be done unless there's something else that's going to use the alternative.

... Which brings us back full circle to the original question of whether it's viable/desirable :) (And after that who has the time/interest to do the work involved in having non-NuGet downloads yield packages as their output.)

@ilkerde
Contributor

ilkerde commented Sep 19, 2014

Let's get the discussion back to where it all started: the question of how sources other than NuGet are going to be utilized from Paket's perspective.

I'm very clear on this one: let's not (artificially) restrict ourselves to one single source format.

That being said, I do very much see some benefits of (nuget) source packages as @bartelink mentions them. They specifically address the typical issues like pinning versions and recognised metadata.

Nonetheless, I strongly prefer to not limit our possibilities and just use the format which fits well for the source in question. Technically speaking, we don't need any packaging whatsoever if the content we want to depend on is already provided as is. We all know that a URL has nice properties as well (unique, resourceful, negotiable).

We don't do Nuget^2. We do Paket. Let's reinvent things, not replay things.
(My 2 philosophical cents on that topic as a spectator.)

@bartelink
Member

@ilkerde How about if #r had intrinsic support for NuGet packages a la the origin of #124 ?

How about if we solved the package storage and installation concerns once each until we actually need to vary those aspects based on something useful ?

I'm talking about an engineering impl decision more than a fundamental benefit that stuffing things that are not in packages into packages would bring.

I'm not proposing to do away with Urls at all on the download side.

But it would be nice if the .references file didn't make them different.

@bartelink
Member

@ilkerde The above was written on the run - I didn't take a lot of time either to consider your fundamental point or to express the bits I feel you (and likely others) are missing. Will read it in conjunction with your other proposal and will translate it into a more well-laid-out thing like 125 and/or explore overlaps with that.

@bartelink
Member

@ilkerde #123 took some time, sorry!

Firstly, I'll explain myself in detail (sorry @isaacabraham this time there definitely will be redundancy :P)

I'm talking of changing a pipeline of:

  1. Deps syntax Nuget / FsSnip / GitHub / Gist - docs and fine tuned syntax x4
  2. Download strategy x4 (but parallel across all)
  3. Intrinsic versioning for nuget, various plans at various stages of maturity for others
  4. NuGet take cached package and warn + 1-3 rate limiting strategies / fallback strategies when target not reachable
  5. Caching/storage location strategy x 2-4 (+ gitignore management)
  6. ref syntax parse strategy / docs x 2-4
  7. install process docs / troubleshooting / understanding code x 2-4 + gitignore / where it lives in my project strategy
  8. installation of nuget packages can do platform-related switches in natural way / source code probably doesn't care
  9. If the item installed was a NuGet package, can emit ref into csproj / proj.json if asp.net for NuGet but not for anything else
  10. Only if NuGet can FSI, Resharper, FSharper backtrace to where stuff comes from

to

  1. Deps syntax Nuget / FsSnip / GitHub / Gist - docs and fine tuned syntax x4 (same)
  2. Download strategy x4 (but parallel across all) (same)
  3. ~~Intrinsic versioning for nuget, various plans at various stages of maturity for others~~ need to define 3 strategies for deriving a version - download date is a good placeholder
    3A: Generate NuGet package
    • define install location
    • add metadata for downstream use as necessary
    • generate a package id that does not clash with 'real' package names but provides an equivalent syntax to what 6 was doing
  4. ~~NuGet take cached package and warn + 1-3 rate limiting strategies / fallback strategies when target not reachable~~ whatever you were doing for NuGets + any bugfixes/optimizations/refactorings
  5. ~~Caching/storage location strategy x 2-4~~ whatever you were doing for NuGets + any bugfixes/optimizations/refactorings
  6. ~~ref syntax parse strategy / docs x 2-4~~ part of 3A docs re how the spec in 1 maps to a package name
  7. ~~install process docs / troubleshooting / understanding code x 2-4~~ can adjust as desired by changing the strategy for generating the package in 3A, but prob never would
  8. ~~installation of nuget packages can do platform-related switches in natural way / source code probably doesn't care~~ source code could trigger conditionality based on build props but only if necessary
  9. If the item installed was a NuGet package, can emit ref into csproj / proj.json if asp.net for NuGet but not for anything else
  10. Only if NuGet can FSI, Resharper, FSharper backtrace to where stuff comes from

Another thing that has yet to be addressed on the files side is how to manage (accidental or deliberate) edits or renames to files - NuGet source packages have a story for this (it's not a short one, but the last thing we need is to not have that implemented and have no story for the files side either) -> if you have a package, you can start to reason "the project was referencing package A v1 and now it references package A v2, so do a checked delete of helpers/abeta.fs (warning if there have been edits, like NuGet does) and do a checked add of ./Av2.fs.pp (replacing the namespace with the project's default namespace as NuGet does)".
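
To make "checked delete" concrete, a minimal sketch (hypothetical helpers, assuming a content hash is recorded at install time):

```fsharp
open System.IO
open System.Security.Cryptography

// Hypothetical sketch: only remove an installed source file if it still matches
// the hash recorded at install time, warning about local edits otherwise.
let sha256Hex (path: string) =
    use sha = SHA256.Create()
    use stream = File.OpenRead path
    sha.ComputeHash stream |> Array.map (sprintf "%02x") |> String.concat ""

let checkedDelete (path: string) (installedHash: string) =
    if File.Exists path then
        if sha256Hex path = installedHash then File.Delete path
        else printfn "warning: %s was edited locally, leaving it in place" path
```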

I'm not saying 3A needs to be a tool. Or that we should change into a 3A system. Or that there cannot possibly be anything ever other than a NuGet package passing after stage 3. But I do see 1-3A as an interesting and complex beast which

a) we want to be able to rev on
b) we might want to be able to pull out
but most importantly
c) we don't want complected into subsequent stages without a damn good reason

But I am saying that if we view 1-3A as "get the package" and the rest as "consume the package" then we can reason about 2 phases and not have to think about how each set of stuff gets used at each point. In other words, if you were to do integration tests of all the types, how would you reasonably triangulate (Appeal to authority: @ploeh 's Advanced Unit testing course on PluralSight :D) this?

@bartelink
Member

@ilkerde So now I'll respond to your points:-

I'm very clear on this one: let's not (artificially) restrict ourselves to one single source format.

I am definitely not saying any of

a) we need to standardise on NuGet
b) NuGet is the best format
c) No other formats are allowed

My proposal of expressing a file or set of files as a .nupkg happens to work across the known properties of github downloads/gists/fssnips required for the impl of the "consume the package" phase. While it happens to reduce the number of downstream formats from n to 1, it's not permanent (and inside the package, there can easily be metadata that allows one to switch behavior in relevant ways downstream - ideally declaratively, by things like destination folders, platform conditions etc.).
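
As a sketch of the kind of metadata I mean (a hypothetical shape, nothing that exists today):

```fsharp
open System

// Hypothetical sketch: origin metadata a generated package could carry so
// downstream steps (install folders, tooling backtrace) can act on it.
type RemoteFileOrigin =
    { SourceUrl   : string          // e.g. the raw GitHub / gist / fssnip URL
      Commit      : string option   // pin, where the source exposes one
      DownloadedAt: DateTime        // could also feed the generated package version
      TargetFolder: string }        // where the file should land in the project
```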

That being said, I do very much see some benefits of (nuget) source packages as @bartelink mentions them. They specifically address the typical issues like pinning versions and recognised metadata.

Yes, though for an initial impl each of these can be nulled out - esp as, despite my arm-waving, there are no interesting downstream consumption cases yet...

Nonetheless, I strongly prefer to not limit our possibilities and just use the format which fits well for the source in question. Technically speaking, we don't need any packaging whatsoever if the content we want to depend on is already provided as is. We all know that a URL has nice properties as well (unique, resourceful, negotiable).

Yes for the "get the package bit" but these properties of URLs (and the simplicty of passing around a primitive string) are only really useful at the front end.

For the rest of the processing (dealing with network outages, installs, uninstalls, downstream tool integration), there are no actual goals of the project on the table that have any demonstrated need to have specific handling per phase.

There are plenty of things left to do on the source packages side if there are resources available - why do all that stuff in duplicate/triplicate?

We don't do Nuget^2. We do Paket. Let's reinvent things, not replay things.
(My 2 philosophical cents on that topic as a spectator)

Yes, but the overwhelming majority of packages right now are NuGet-based. And there are many thorny complexities that have been distilled into pretty DUs as necessary. But there are still plenty of edge cases. There are still inadequacies in NuGet source package handling (e.g. https://github.com/damianh/LibLog and many more require source file token substitution).

I'm using Paket now for WebApi stuff because of the strength of the workflow it affords. But I need Paket to make it to AspNetVNext land. I have stacks of loose files I want to use. I see libs like https://github.com/bartelink/FunDomain/ as best delivered as source, and having ways to compose simple apps without everything having to turn into a 5000 line ES framework is critical to light, maintainable apps.

But if we ride two horses - being great for NuGet and having lots of polished source file integration with lots of special casing - tracking what NuGet does becomes harder.

Having looked at NuGet.Core as part of the research for this, I wouldn't wish being bug-compatible with and/or "replaying" NuGet on anyone - it's simply not attainable (the NuGet team are earning their money getting to V3 if you ask me)! For me, finding ways of cutting out unnecessary complecting of concerns is critical to the long term desire of sane people to want to work on the code.

And there's still the elephant in the room of AspNetVNext and/or a Desktop CLR and/or language which will work off NuGets directly (and plenty trends pointing to mixes of languages within a repo).

@ploeh

ploeh commented Sep 20, 2014

FWIW, if you want to appeal to authority regarding Integration Testing, a better source is J.B. Rainsberger's talk Integration Tests Are a Scam in which he clearly explains why, mathematically, Integration Testing can never ensure the Basic Correctness of an even moderately complex system.

There are also other concerns than Basic Correctness, such as performance, robustness, thread-safety, and, indeed, integration, and Integration Tests or full System Tests can sometimes better cover those aspects, but since they are harder to write and maintain, one should have fewer of them. This is the motivation behind the Test Pyramid.

@bartelink
Member

@ploeh Thanks - well explained, and that article is always worth a reread. UPDATE: I can't recall watching the video/mp3, so thanks!
@forki Could close this in favor of #154

forki closed this as completed Sep 28, 2014
forki pushed a commit that referenced this issue Oct 29, 2014