
Handle non-local dependencies #329

Closed
thufschmitt opened this issue Mar 18, 2021 · 18 comments

@thufschmitt
Contributor

Is your feature request related to a problem? Please describe.

The only way to load external code in Nickel is by hardcoding its path (relative or absolute) in the Nickel file. This is fine for referring to files inside a project, but it doesn't scale as soon as someone wants to refer to an external Nickel “library”.

Describe the solution you'd like

Have a mechanism to reference “external files” in nickel.

I guess designing the final shape will require a bit of work, but right off the top of my head I can think of several approaches:

  1. Dhall-like semantics, where some URLs can be directly imported
  2. Unix-style (also Jsonnet's) semantics, with a list of search paths (either provided on the CLI with a -I argument, or via a NICKEL_PATH environment variable)
  3. lockfile semantics, with a nickel.lock file (or whatever) that would specify how to map specific inputs to actual files. This lockfile could either be generated by nickel itself (but that seems quite out-of-scope for the language), or could have a public schema so that it could be generated by other tools (Nix, looking at you) or manually.
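To make the third option concrete, here is a rough sketch in Python of what lockfile-based resolution could look like. Everything here is hypothetical: the lockfile schema, the package names, and the resolver are invented for illustration, not an actual Nickel feature.

```python
# Hypothetical sketch of lockfile-based import resolution. The lockfile
# schema and all names below are invented for illustration only.
import json

# A "nickel.lock" could map logical input names to pinned, on-disk sources,
# generated by Nix, another tool, or written by hand.
LOCKFILE = json.loads("""
{
  "mylib":     {"path": "/nix/store/abc123-mylib", "rev": "f00dcafe"},
  "contracts": {"path": "./vendor/contracts",      "rev": "deadbeef"}
}
""")

def resolve(package: str, file: str) -> str:
    """Map a non-local import (logical package name + file) to a concrete path."""
    entry = LOCKFILE.get(package)
    if entry is None:
        raise KeyError(f"package {package!r} not found in lockfile")
    return f"{entry['path']}/{file}"

print(resolve("mylib", "lib.ncl"))  # /nix/store/abc123-mylib/lib.ncl
```

The point of the sketch is that the interpreter itself only needs the lookup; pinning and updating the lockfile can live entirely in external tooling.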
@francois-caddet
Contributor

francois-caddet commented Oct 13, 2021

I think the second one can be tackled easily.

About the first one, do you mean to have some sugar for the imports? Something like:
import "./path/to/some.ncl" would be equivalent to ./path/to/some.ncl?

I suppose the 3rd is out of scope for an MVP. Maybe the 1st one is too.

@thufschmitt
Contributor Author

About the first one, do you mean to have some sugar for the imports? Something like: import "./path/to/some.ncl" would be equivalent to ./path/to/some.ncl?

I had in mind something a bit more involved, allowing for example import "https://some.nickel.file.on/the/internet.ncl". But I’m actually not fond of it as it makes the language notably more complex and creates a whole class of issues.

I suppose that the 3rd is not in the scope for a MVP

It might be, as it can be reasonably simple (not really more complex than the 2nd approach), and it's the kind of thing that might be worth settling early enough to make it standard.

@yannham
Member

yannham commented Oct 27, 2021

Preamble

In the following, PM is used for package manager.

We explored the following possibilities:

  1. A lockfile-based system, Nickel-specific, where a dedicated tool would do all the package management itself (like npm, opam, cargo, etc.)
  2. Lockfile-based, but offloading the pinning, updating, and dependency management in general to another package manager. Nickel would just need to understand lockfiles to know how to resolve non-local imports, but wouldn't have to handle the rest.
  3. Use an even more primitive solution, like include paths. A separate tool could set up the right environment from a lockfile while calling nickel from your preferred package manager.
  4. Allow importing URLs, possibly with SHA hashes for basic pinning/security.

This discussion revolved a lot around what the exact role of Nickel should be. We all agreed that an ideal solution would be a composable, lockfile-based solution. The problem is that package management is hard: it is a deep rabbit hole that requires figuring out a lot of things upfront. It has been re-implemented so many times that it would be sad to do it yet another time. What's more, one constraint is that we want to keep Nickel's closure size reasonable as much as possible (for embedding Nickel in many workflows easily, deploying the WebAssembly playground, etc.).

Although not mentioned explicitly during the call, we implicitly based our assessment on the following criteria:

  • Complexity: How complex is the solution to design, implement, and maintain? Does it dramatically increase Nickel's closure size?
  • Featurefulness/composability: How many practical use-cases can we handle? Can we easily compose packages/repos/directories/modules (whatever the chosen notion of a non-local import unit is)? Can we handle transitive dependencies?
  • Fragmentation/evolvability: Is the chosen solution likely to be superseded by a different one at some point? If yes, is the transition manageable (locally, vs. having to change all dependencies transitively, for example)? Or will it likely cause ecosystem fragmentation?

Option 1: full-fledged package management

Option 1 is complex. It requires a lot of time, design, implementation, and maintenance burden. It could also increase Nickel's closure size, although the package management part would live in a separate tool. It's probably the most featureful and evolvable one, though. However, going for this one would mean not having even a basic mechanism for non-local dependencies for quite some time, until the system is designed and starts to be implemented, which can be prohibitive.

Option 2: lockfile-based resolution, offload package management

Option 2 is appealing, as it is still quite featureful and evolvable while being less complex. A natural choice would be to use Nix/flakes as the PM and have Nickel understand lockfiles directly. We could also define a simpler, PM-agnostic, Nickel-specific lockfile format and a simple tool flake2nickel to convert a flake lockfile to a Nickel one. The issue is that we want to be Windows-compatible, which excludes Nix as the unique solution.

Using a PM-agnostic lockfile format, though, users could use whatever PM they want, such as npm, only requiring an npm2nickel tool. But this is not as simple as it first appears: what about transitive dependencies (NPM packages inside node_modules having their own dependencies)? Would they also need their own local Nickel lockfile? Even so, having several possible PMs would fragment the ecosystem, as it is then not trivial to compose a Nickel package served as a flake with one served as an npm package.

Option 3: include paths

Option 3 is something used by e.g. Jsonnet, or even good old C/C++. We think it doesn't really provide any advantage over 2: it is marginally less complex, but has the disadvantage of making composability even harder, and risks introducing fragmentation if/when we introduce a lockfile-based mechanism at some point.

Option 4: import URLs

Option 4 is used by Dhall. It has the advantage of being simple, with no dependencies or packages to handle, although the impact on the closure size may be noticeable because of the required HTTP(S) stack. It is more bare-bones: you can't really have fine-grained management of dependencies, nor easy updates. Either you pin things via SHAs, but updating is painful, or you don't pin, and things become brittle. We can still imagine an external tool that handles updates, as importing from URLs with SHAs is in some sense a lockfile inlined in the source file.

All in all, it could still make for a nice alternative for simple workflows while waiting for a better solution like option 1 or option 2. However, such a transition is always risky and may cause a lot of friction. One way to lower this risk is to make it clear in the documentation, from the beginning, that this is a transitional solution.

Conclusion

In the end, our feeling is that going for option 4 while calmly figuring out the details of option 2 is a reasonable trade-off.

yannham closed this as completed Nov 2, 2021
yannham reopened this Nov 2, 2021
@aspiwack
Member

aspiwack commented Nov 4, 2021

I think that only option 4 makes sense today. We don't really have an identified need for anything more complex. When we know more about how people use Nickel, we can revisit the question.

@Ericson2314

The Dhall experience seems much worse than Nix's today: it's really unclear what code needs network access. I do not recommend that.

@Profpatsch

Maybe we want to ping @Gabriel439 on this, since they have the most experience with non-local dependencies in a config language.

@Profpatsch

From what I gathered, you really want semantics where file A can import file H1 on host H, which in turn wants to import file H2 on host H, or even file I1 on host I.

So you need both a more complex notion of what “relative” import means, as well as a concept of cross origin security (CORS).

@Gabriella439

Gabriella439 commented Nov 12, 2021

I view the tradeoffs of URL-based imports differently: from my perspective URL imports provide a simpler user experience but they are not necessarily simpler to implement. In fact, they are actually the greatest source of complexity in Dhall and the thing that most Dhall binding authors complain about implementing.

Specifically, URL imports simplify the user experience in the following ways:

  • Purity

    All of the information that you need to reason about the code (including how to fetch dependencies), resides within the code itself. What you see is what you get. For example, if you've ever been bitten by impurities in Nix (e.g. the NIX_PATH) then you will probably understand the importance of this. Having an out-of-band package management process is (in my view) analogous to introducing impurities into your code.

    If I had to pick only one reason for doing "inline" package management within the code this would be the one.

  • Package publication is simpler

    Any web service that can host source code can be used to publish a package (e.g. a gist). Contrast this with a language like Haskell where you need to create a .cabal file and upload the package to a dedicated package repository (Hackage in this case)

  • Package subscription is simpler

    In the simplest case, all you have to do to use an expression is to paste the expression's address in your code directly where you need it. Contrast this with, say, Haskell where you need to add a package to your dependency list (optional: specify a version range), import the package, and reference the imported code

However, URL imports actually complicate the implementation in the following ways:

  • URL imports need to support relative imports correctly

    In other words, if you import https://example.com/A and that contains a relative import of ./B then that relative import needs to resolve to https://example.com/B and not ${PWD}/B

  • You need to disallow remote imports from importing "local" imports that are not relative paths

    Local imports can include absolute paths or environment variable imports.

    Dhall calls this the "referential sanity check" if you want to search for more details on this.

  • You need to support CORS

    … to protect against server-side request forgery

  • You need to support header-based authentication

    This is actually kind of a mess in Dhall. We have two separate ways of doing this since we had to learn over time what use cases users had in mind. I'm not entirely sure we've totally solved the user experience for this.

  • You need to support integrity checks

    … and ideally they should be semantic integrity checks and not textual integrity checks, for the reasons outlined in this post. You don't necessarily have to interpret code before computing the hash like Dhall does (I think that might have been a mistake in retrospect), but it should definitely be a hash of the AST and not a hash of the source code.

  • You need to implement a cache

    … and it should be content-addressable and use the integrity check as the lookup key
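The first two implementation points above (relative resolution and the referential sanity check) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Dhall or Nickel actually implement it; the env: prefix for environment-variable imports is borrowed from Dhall's syntax.

```python
# Sketch of URL-relative import resolution plus the "referential sanity
# check" described above. Illustrative only, not Dhall's or Nickel's code.
from urllib.parse import urljoin, urlparse

def resolve_import(importer: str, target: str) -> str:
    """Resolve `target` relative to the importing module's location."""
    remote = urlparse(importer).scheme in ("http", "https")
    # Referential sanity check: a remote module may import relative paths
    # (resolved against its own URL) or other URLs, but never local
    # absolute paths or environment variables (Dhall-style "env:VAR").
    if remote and (target.startswith("/") or target.startswith("env:")):
        raise PermissionError(
            f"remote module {importer} may not import local {target}")
    # urljoin implements RFC 3986 relative reference resolution, so ./B
    # imported from https://example.com/A resolves against the URL, not $PWD.
    return urljoin(importer, target)

print(resolve_import("https://example.com/A", "./B"))  # https://example.com/B
```

A local importer goes through the same function unrestricted: urljoin also resolves plain filesystem-style paths relative to one another.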

You will probably also want to read Dhall's Safety Guarantees post, which covers the above topics in more detail.

From the perspective of making your language integrate with an existing package manager (e.g. Nix or Bazel), URL imports with integrity checks can play the same role as lockfiles, meaning that you can generate code for an external package manager from URL imports if they all have integrity checks. See, for example, the Dhall integration for Nixpkgs, which explains this in more detail:

The way I like to think of it is that Dhall technically does have a lockfile, but it's intermingled with the source code (in the form of the URL imports with their associated semantic integrity checks).
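The distinction between semantic and textual integrity checks can be illustrated with a toy sketch, using Python's own ast module as a stand-in for a real Nickel or Dhall AST (an analogy only, not the actual mechanism):

```python
# Toy illustration of "semantic" vs. "textual" integrity checks.
# Python's ast module stands in for a real Nickel/Dhall AST here.
import ast
import hashlib

def source_hash(src: str) -> str:
    """Textual check: any byte-level change alters the hash."""
    return hashlib.sha256(src.encode()).hexdigest()

def ast_hash(src: str) -> str:
    """Semantic check: hash a canonical dump of the parse tree, so
    whitespace and comments disappear and only meaningful changes count."""
    return hashlib.sha256(ast.dump(ast.parse(src)).encode()).hexdigest()

a = "x = 1 + 2"
b = "x=1+2  # reformatted"
print(source_hash(a) == source_hash(b))  # False: the text differs
print(ast_hash(a) == ast_hash(b))        # True: the meaning is the same
```

This is why a semantic check survives refactors that only touch formatting, whereas a textual pin breaks on every cosmetic edit upstream.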

@aspiwack
Member

Thanks @Gabriel439!

This makes me think that there is a solution that we haven't been considering yet which is “just use Git submodules”. I'm not terribly fond of Git submodules. But they exist, which makes them quite a bit simpler than what we've been proposing so far.

@Profpatsch

A few more points to consider:

  • Blowup of the dependency tree, because of the dependency on an HTTP client and TLS stack
  • Harder to compile the resulting language “to the web”, because you have to stub out the HTTP stack with e.g. browser fetch requests
  • There are effectively two subsets of the language: one where people can afford to do HTTP requests and one where they can't (for example in a sandbox; for Dhall that means a lot of workarounds to pre-cache things in the Nix sandbox and lots of conceptual overhead)

@yannham
Member

yannham commented Nov 15, 2021

Thanks for the insight, @Gabriel439. It sounds like we underestimated the URL route 😕

This makes me think that there is a solution that we haven't been considering yet which is “just use Git submodules”. I'm not terribly fond of Git submodules. But they exist, which makes them quite a bit simpler than what we've been proposing so far.

I've seldom used git submodules, but indeed that could be a portable solution in the meantime (zero-cost, I guess? There's pretty much nothing to do on the Nickel side). In parallel, we could have some flake template/Nix library that makes it easy to distribute Nickel code as flakes, as it is highly likely that most first adopters are using Nix.

@YorikSar
Member

  • Git submodules don't get us very far; for example, in nickel-nix, where we want to provide Nix flake templates, Nix templates don't play well with submodules.

  • I think we should be very explicit about when we're importing "local" and "non-local" dependency. I propose to use syntax import "some/path.ncl" from "scheme:package".

  • I would really like to not introduce yet another locking+caching mechanism for dependencies. Every language ecosystem in the world has one, and Nickel could leverage that. In the syntax example above I use scheme:package as source definition. Here scheme could point to which ecosystem to use. For example, nix: could use flakes, cargo: could use Cargo, go: could use go modules, and so on for npm:, pip:, cobolget:, etc.
    All of these ecosystems have their own lock files and somehow cache the "package" locally. We'd need to query the relevant tools for the path to the package and then append some/path.ncl to it to import it. Note that all querying would happen via the tools and their output, so no additional dependencies should be required.

    For example, for nix: schema we could support:

    • nix:flake:nixpkgs - the flake:nixpkgs part is passed down to nix flake metadata --json to get .path, pointing to where the flake lives in the local filesystem. This relies on the Nix registry and is not locked in any way. Nix will cache the flake in the local store before returning the result.
    • nix:input:nixpkgs - similarly to the above, looks up the input from the current flake with nix flake metadata --json --inputs-from . nixpkgs. The current flake is defined by the closest flake.nix in the filesystem hierarchy, or however nix flake changes this definition in the future. The input is locked in the relevant flake.lock, and Nix will cache it in the local store.
    • nix:<installable> - for everything else, call nix build --no-link --print-out-paths <installable> to fetch/build the target, and use the result as the base.

    With cargo: schema:

    • cargo:package could trigger a cargo metadata --locked call that will fetch and cache all dependencies of the current crate, then output data for each package, including its cached path. It will be locked in Cargo.lock.

    We could work in a similar way with all other ecosystems:

    • call package manager to cache the dependency
    • query path to the cache
    • use the result as base directory for import
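A minimal sketch of this dispatch in Python (hypothetical: the scheme:package syntax and the exact commands are the proposals from this comment, not implemented behavior). It only builds the command line to run, leaving the actual invocation and JSON parsing to the caller:

```python
# Sketch of dispatching a `scheme:package` source specifier to the
# package-manager query commands proposed above. All hypothetical design.
def resolver_command(source: str) -> list[str]:
    """Return the argv to query the ecosystem's PM for the package's path."""
    scheme, _, rest = source.partition(":")
    if scheme == "nix":
        kind, _, name = rest.partition(":")
        if kind == "flake":
            # Registry lookup; cached in the store, not locked.
            return ["nix", "flake", "metadata", "--json", f"flake:{name}"]
        if kind == "input":
            # Input of the current flake; locked in flake.lock.
            return ["nix", "flake", "metadata", "--json",
                    "--inputs-from", ".", name]
        # nix:<installable>: build it and print the resulting store path.
        return ["nix", "build", "--no-link", "--print-out-paths", rest]
    if scheme == "cargo":
        # Fetches, caches, and reports paths for all crate dependencies.
        return ["cargo", "metadata", "--locked"]
    raise ValueError(f"unsupported scheme {scheme!r}")

print(resolver_command("nix:input:nixpkgs"))
```

The interpreter would run this command (e.g. via subprocess), read the reported path, and use it as the base directory for the import.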

@YorikSar
Member

Forgot to mention: this is all in favour of option 2 above: reuse other package managers.

@yannham
Member

yannham commented Apr 21, 2023

I would really like to not introduce yet another locking+caching mechanism for dependencies.

I wholeheartedly agree.

In the syntax example above I use scheme:package as source definition. Here scheme could point to which ecosystem to use. For example, nix: could use flakes, cargo: could use Cargo, go: could use go modules, and so on for npm:, pip:, cobolget:, etc.

On the other hand, it comes with a risk: fragmentation. Let's say you're a user and you want to use a bunch of different Nickel libraries: it would be very frustrating to have to install nix, cargo, npm, and yarn in order to finally run your small config. I'm not saying this is a no-go, but I just want to bring up this point. Maybe the solution is to bless one package manager by default - totally randomly, nix 🙄 - while still allowing (but making harder) the use of a different one? Something as dead simple as having no scheme default to nix could already go some way.

@YorikSar
Member

On the other hand, it comes with a risk: fragmentation. Let's say you're a user, and you want to use a bunch of different Nickel libraries: that would be very frustrating if you have to install nix, cargo, npm, and yarn in order to finally run your small config.

I agree that it would lead to some fragmentation, but I would expect people who mainly use Nix to have Nix dependencies that provide some Nickel libraries in them (this is the case for nickel-nix); likewise, if some JS library provides some Nickel niceties, it would most likely be published on NPM anyway. I doubt there will be a point where a JS app wants to use a Nickel library provided by a Rust library. Of course, it would be nice to have "pure" Nickel libraries, but I would expect that, with Nickel not being an independent general-purpose language (it's always tied to some project), it might be too early to implement separate packaging infrastructure for it.

I'm not saying this is a no-go, but I just want to bring this point. Maybe the solution is to bless one package manager by default - totally randomly, nix 🙄 - while still allowing (but make them harder) ways of using a different one? Something as dead simple as not putting any scheme would default to nix could already go some way.

Defaulting to Nix has its downsides: not everybody wants to, or can, use Nix. It would be a shame if Nickel didn't work on Windows just because it requires Nix for packaging, for example. Also, it wouldn't be as hidden as one would prefer: locking would only be ensured if you write a proper flake and use nix:input: or smth similar.

@yannham
Member

yannham commented Apr 24, 2023

Defaulting to Nix has its downsides: not everybody wants to or can use Nix. It would be a shame if Nickel wouldn't work on Windows just because it requires Nix for packaging, for example. Also, it wouldn't be as hidden as one would prefer: locking would only be ensured if you write a proper flake, and use nix:input: or smth similar.

Ah, Windows is a very good point (that I tend to forget about 😅). I think my idea of a blessed package manager was to say: please use your own specific package manager for domain-specific libraries, but for pure Nickel libraries, we just made an arbitrary choice for you. Even in the future, I would be sad if we had to re-implement a package manager for the thousandth time, so I guess we mostly align on the idea, but I wanted to offload the default package manager for pure libraries to an existing one. Maybe it should just be something other than Nix, then. One that is light and portable, if possible.

@YorikSar
Member

YorikSar commented May 2, 2023

I'm trying to implement my proposal and I hit a snag: while import is a keyword in Nickel and is parsed specially, it's actually just a builtin function with some extra sauce (imports are resolved ahead of time). That's why currently import "file.ncl" from "smth" is equivalent to importing the file and passing the arguments from and "smth" to the result.

Nix treats import as just a builtin function; there is absolutely nothing special about it, and you can even assign new values to it, or do something nasty like scopedImport { import = throw; } ./file.nix to override it completely. I don't think we want this for Nickel, since we do want to typecheck before evaluation, and that's impossible in such a dynamic case.

What do you think about making import an actual statement? It requires special treatment from Nickel, and it syntactically enforces the "first" argument to be a literal string, so in a way it already is one. Also, I think import <nixpkgs> {} is the prevalent use case for "import as a function" in Nix, and that's not the case for Nickel. I found only a couple of places in the tests where import "smth" args is used. I think (import "smth") args would be clearer for the user, as it separates the place where special rules apply (only a literal string as argument) from the rest of the code. This would also allow us to add from to this statement without making it another reserved identifier.

Other options in the context of this issue would be:

  • Using a different keyword for importing from an external source (i.e. importFrom "nix:input:nickel-nix" "./nix.ncl"). This adds another keyword, which should generally be avoided if possible.
  • Switching the order of the import arguments, like import from "nix:input:nickel-nix" "./nix.ncl". This is just not pretty, and confusing: from might be a variable in scope, but here it's a special marker. Also, we would then have 3 "special" arguments to the import "function".

YorikSar added a commit that referenced this issue May 3, 2023
Currently `import` is treated as a very special function that only
accepts a literal string as its first argument. This has the following
downsides:

* The special handling of `import` is hidden. Users can assume that it's
  just a function while it's a special keyword. Users might expect to
  be able to pass variables to it, while it is only handled before
  typechecking and evaluation.
* We can't extend `import`'s functionality without introducing new
  keywords, which is not backward-compatible.

This change makes `import` another statement like `let` or `fun`,
which means it cannot be confused with a function anymore. It also means
that expressions like `import "foo.ncl" bar` and
`import "foo.ncl" & {..}` are invalid, and the `import` statement needs to be
put in parentheses: `(import "foo.ncl") bar`.

For more context, see discussion in
#329 (comment)
@thufschmitt
Contributor Author

Fixed by #1716
