[discussion] patchShebangs and cross: what tools to use? #33956

Closed
dtzWill opened this issue Jan 16, 2018 · 14 comments
Labels
6.topic: cross-compilation Building packages on a different platform than they will be used on

@dtzWill
Member

dtzWill commented Jan 16, 2018

Intro

patchShebangs rewrites scripts with shebangs to use tools from $PATH. By "resolving" these references to (absolute) nix store paths, the interpreter used can be controlled precisely; for example, the correct version of python will always be used.
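To make the resolution step concrete, here is a minimal sketch of the idea. It is not the nixpkgs implementation: `resolve_shebang` is a hypothetical name, it only handles the plain `#!/path/to/interp [args]` form (no `/usr/bin/env` handling), and it prints the new shebang line instead of editing the file.

```shell
# Hypothetical helper: print the shebang line a script would get if its
# interpreter were looked up by basename on the current PATH.
resolve_shebang() {
    local file=$1 line interp args resolved
    # Read only the first line; tolerate a missing trailing newline.
    IFS= read -r line < "$file" || [ -n "$line" ] || return 1
    case $line in
        '#!'*) ;;
        *) return 1 ;;                        # no shebang, nothing to do
    esac
    line=${line#'#!'}
    line=${line#"${line%%[![:space:]]*}"}     # trim leading whitespace
    interp=${line%%[[:space:]]*}              # first word: interpreter path
    args=${line#"$interp"}                    # remainder, leading space kept
    # Note: command -v can also report shell builtins by bare name.
    resolved=$(command -v "$(basename "$interp")") || return 1
    printf '#!%s%s\n' "$resolved" "$args"
}
```

Inside a Nix build the lookup would land on a store path, because that is what PATH contains there.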

This is great 👍 and very useful. Especially on NixOS where not doing so generally means scripts don't work at all.

Problem

When cross-compiling, there is a bit of a problem: which platform's tools should be used? PATH is misleading because it may contain various build-time dependencies (from buildPlatform) that are inappropriate for use on hostPlatform.

Solution?

Before I wrote this up, it seemed like a challenging problem: how to "guess" whether buildPlatform or hostPlatform tools should be used when rewriting shebangs? Nothing in the script will tell you, so it's anyone's guess how it will be used.

Buuuut I think the answer is actually this: patchShebangs should always patch scripts to use hostPlatform tools, refusing to use tools from buildPlatform (when the two are not the same).

If the script needs to be executed during the build, and nativeBuildInputs are used properly, then the "hostPlatform" the script was patched for will be the current buildPlatform. Hopefully that makes sense.

Feedback on whether it's agreed that "patchShebangs should only patch to host tools" would be appreciated.

Forward

Implementing the Solution

This is a bit tricky, but I haven't thought about it too much. Thoughts/suggestions welcome!

Related

Actually, is there any reason for the outputs of a cross-built derivation to have dependencies on things from buildPackages? I expect checking for this strictly would "break" a lot of things, but perhaps that's for the best since it's unclear those outputs would be useful anyway.

@dezgeg
Contributor

dezgeg commented Jan 16, 2018

It's inevitable that you need two (or more) variants of patchShebangs (or maybe a command-line flag). Consider, e.g.:

    preConfigure = ''
      # assume ./configure has #!/bin/bash
      patchShebangs ./configure
    '';
    postInstall = ''
      # assume $out/bin/foo also has #!/bin/bash
      patchShebangs $out/bin/
    '';

where the two exact same commands (well, modulo filename) need to give a different result.
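One way to picture the "command-line flag" variant is a helper that picks the search path from a mode argument. This is only a sketch: the function name and the `BUILD_TOOLS_PATH` / `HOST_TOOLS_PATH` variables are hypothetical and not something the stdenv of this discussion provides.

```shell
# Hypothetical: choose which search path shebang resolution should use.
# BUILD_TOOLS_PATH and HOST_TOOLS_PATH are illustrative variable names.
shebang_search_path() {
    case ${1:-} in
        --build) printf '%s\n' "$BUILD_TOOLS_PATH" ;;  # scripts run during the build
        --host)  printf '%s\n' "$HOST_TOOLS_PATH" ;;   # scripts installed into outputs
        *)
            echo "usage: shebang_search_path --build|--host" >&2
            return 2
            ;;
    esac
}
```

Under this sketch, the preConfigure call above would ask for `--build` and the postInstall call for `--host`, so the same `#!/bin/bash` ends up pointing at different interpreters.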

Actually, is there any reason for the outputs of a cross-built derivation to have dependencies on things from buildPackages?

.dev outputs when compiling stuff that has plugin APIs and such. E.g. if you look in linux.dev there's a bunch of binaries that need to be run on the build platform when compiling external kernel modules, so they'd depend on build glibc, etc.

@dezgeg dezgeg added the 6.topic: cross-compilation Building packages on a different platform than they will be used on label Jan 16, 2018
@dtzWill
Member Author

dtzWill commented Jan 16, 2018

Agreed on your example; I've seen things like this and found them to be required for cross-compiling various packages.

Even so--it seems that all of the scripts in the output derivations should be "shifted" in the sense that:

build <- host, host <- target, target <-? (presumably stays the same)

That is, since we're supposedly generating outputs to run on "host" then there's no reason for any of the outputs to contain references to build packages.

Regarding your example--I know right now there certainly /are/ lots of references to build packages but that seems like a failure of our input categorization: If we want scripts we can run on buildPlatform they really need to be built by a stdenv where hostPlatform matches our current buildPlatform.
(buildPackages vs nativeBuildInputs change other details but in this regard are the same)


In a way I'm asking what exactly is the purpose of cross-compilation: is it to produce outputs for further cross-compilation? Or is it to produce outputs meant for hostPlatform? Right now (as in your example, among many others) we're doing a blend of both in what appears to be a somewhat ad-hoc manner.

This is problematic if you actually want to use the resulting outputs on the hostPlatform -- both because your closure will likely be gigantic and because references to build packages may exist everywhere.

The only outputs suitable for hostPlatform are things like bootstrapTools which is very very careful about what it does, and maybe fully-static builds of things (haskell perhaps?). Is this an expected "result" of cross-compiling?

If not, then we need to work on improving things. If so, I'm silly and I'm sorry for my misunderstanding :).

@orivej
Contributor

orivej commented Jan 17, 2018

Buuuut I think the answer is actually this: patchShebangs should always patch scripts to use hostPlatform tools, refusing to use tools from buildPlatform (when the two are not the same).

I needed an extended version of patchShebangs (called substitute-paths) to package Blast+, and faced with this issue I have chosen to use build platform tools during the build, and switch them for host platform tools after the install phase. The details are in #33961.

@dtzWill
Member Author

dtzWill commented Jan 17, 2018

I needed an extended version of patchShebangs (called substitute-paths) to package Blast+, and faced with this issue I have chosen to use build platform tools during the build, and switch them for host platform tools after the install phase. The details are in #33961.

awesome, I'll take a look! Thanks!

@dezgeg
Contributor

dezgeg commented Jan 17, 2018

That is, since we're supposedly generating outputs to run on "host" then there's no reason for any of the outputs to contain references to build packages.

Regarding your example--I know right now there certainly /are/ lots of references to build packages but that seems like a failure of our input categorization: If we want scripts we can run on buildPlatform they really need to be built by a stdenv where hostPlatform matches our current buildPlatform.
(buildPackages vs nativeBuildInputs change other details but in this regard are the same)

But all of these are not a failure on nixpkgs side - that's just how the upstream project works.

In a way I'm asking what exactly is the purpose of cross-compilation: is it to produce outputs for further cross-compilation? Or is it to produce outputs meant for hostPlatform? Right now (as in your example, among many others) we're doing a blend of both in what appears to be a somewhat ad-hoc manner.

For the kernel example, you can't get rid of this without massively patching the kernel build system.

I would say it's a per-output property: .dev contains outputs for further cross-compilation and all the rest are outputs meant for hostPlatform.

@jtojnar
Member

jtojnar commented Jan 17, 2018

Could we add /usr/bin/env to the sandbox (NixOS/nix#1205), reducing the need for patching shebangs of build scripts? Currently patchShebangs is used for build scripts in GNOME packages using meson a lot.

We could also add patchShebangsToUseEnv doing s~.*/bin/(.*)~/usr/bin/env \1~ – that should fix the remaining broken build scripts. Though, ideally, we would send patches for this upstream.
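As a rough sketch of what such a rewrite could do to a single shebang line (the helper name is hypothetical, and unlike the substitution above it keys on the interpreter's basename rather than on a literal `/bin/` component):

```shell
# Hypothetical helper: rewrite one shebang line to go through /usr/bin/env.
# Lines without a shebang, and lines already using env, are printed as-is.
shebang_line_to_env() {
    local line=$1 rest interp args
    case $line in
        '#!'*) ;;
        *) printf '%s\n' "$line"; return 0 ;;
    esac
    rest=${line#'#!'}
    rest=${rest#"${rest%%[![:space:]]*}"}     # trim leading whitespace
    interp=${rest%%[[:space:]]*}              # interpreter path
    args=${rest#"$interp"}                    # remaining arguments, if any
    case $interp in
        /usr/bin/env|'') printf '%s\n' "$line" ;;
        *) printf '#!/usr/bin/env %s%s\n' "${interp##*/}" "$args" ;;
    esac
}
```

One caveat: on Linux everything after the interpreter path is passed as a single argument, so a shebang that carries interpreter arguments will generally not behave the same once routed through env.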

patchShebangs could then remain for host platform.

Regarding the target platform, is that a real concern? I doubt there are many packages that generate target specific scripts and they can probably be handled manually.

@Ericson2314
Member

Ericson2314 commented Jan 25, 2018

@dtzWill

Or is it to produce outputs meant for hostPlatform?

It is that and that alone.

A long-term goal is to allow different outputs to effectively live in different bootstrapping stages. One part of this is giving them different allowedRequisites, as we really should enforce that buildPackages stuff doesn't make it to out, bin, lib, etc. Another part is making dev have a depsTargetTargetPropagated dependency on the runtime ones so that if dev itself is a nativeBuildInput, the offsets cancel out and out is a transitive buildInput (pkgsHostTarget array member in practice).
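For a feel of what "buildPackages stuff making it to out" looks like in practice, a crude text scan can already surface it. This is only an approximation written for illustration: the function name is hypothetical, and allowedRequisites is the actual Nix-level mechanism being proposed here, not this script.

```shell
# Hypothetical check: list files under an output directory that mention any
# of the given store paths (e.g. ones known to come from buildPackages).
find_forbidden_refs() {
    local out=$1 ref
    shift
    for ref in "$@"; do
        # -F: fixed string, -l: file names only, -r: recurse into $out
        grep -rlF -- "$ref" "$out" || true
    done | sort -u
}
```

A call such as `find_forbidden_refs "$out" /nix/store/...-bash` (store path elided) would print the scripts whose shebangs still point at the build platform's bash.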

@Ericson2314
Member

Ericson2314 commented Jan 26, 2018

As to the concrete problem with patchShebangs, we should do patchBuildShebangs and patchRunShebangs. It's quite easy to compute what the PATHs should be after the fact. All this will be more maintainable if we are disciplined about the PATH for native builds too.

[On a side note, I want to write an RFC whose implementation is simply reverting 469fd89. That would force the PATH handling I wanted above everywhere.]

@dezgeg
Contributor

dezgeg commented Jan 26, 2018

Or is it to produce outputs meant for hostPlatform?
It is that and that alone.
A long term goal is to kind of allow different outputs to effectively live in different bootstrapping stages.

I don't get this. Again, the kernel build system does things like probing the capabilities of the cross compiler, dynamically generating header files, compiling internal helper programs like fixdep, etc., all of which are needed for compiling external kernel modules against that kernel. The development headers and the result itself simply need to be built at the same time.

@Ericson2314
Member

Ericson2314 commented Jan 26, 2018

@dezgeg Agreed. To be clear, I was not advocating for building headers separately in general.

On a scale from "bad, but easy on upstream" to "good, fighting upstream", we have

  1. Cross compilation as it was before
  2. Cross compilation as it is now
  3. Headers and binaries still built together, but have separate allowedRequisites and leverage depsTargetTargetPropagated and nativeBuildInputs. (what I was talking about)
  4. Headers and binaries built separately. (nice, but quite infeasible now).

The cost of going from 1 to 2, for example, is the LD=$CC needed in many places, but I think the benefits outweigh it.

For 3 the big idea is we arrange the dependencies to match what we'd have if we actually did build the headers separately, but the headers and binaries are in fact separate outputs of the same derivation as today. It's a compromise to try to get at the advantages of 4 while keeping the feasibility of 2.


(Aside) NixOS/nix#1080, btw, is a similar sort of trick. If we

  1. Had the intensional store
  2. Could build against something like Darwin's tbd files
  3. Used a to-be-rewritten run-time-only store-path "real location" in those tbd files

We'd make any tbd- and header-preserving mass rebuild a non-mass rebuild, as if we built headers and binaries separately and also convinced LD to not care if the dynamic libraries were where we said they were.

Ericson2314 pushed a commit to obsidiansystems/nixpkgs that referenced this issue Sep 11, 2018
This hopefully makes patchShebangs respect cross compilation. It
introduces the concept of the HOST_PATH. Nothing is ever executed on
it but instead used as a way to get the proper path using ‘command
-v’. Needs more testing.

/cc @Ericson2314 @dtzWill

Fixes NixOS#33956
Fixes NixOS#21138

(Modified backport of f069423. See
previous commit to understand the differences between this and the
original.)
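The "used for lookup, never executed" idea from the commit message can be sketched as follows. This is an illustration rather than the code from the commit: `lookup_on_host_path` is a hypothetical name, and only `HOST_PATH` and `command -v` are taken from the message above.

```shell
# Sketch: find an interpreter on HOST_PATH without running anything from it.
# The subshell keeps the caller's own PATH untouched.
lookup_on_host_path() {
    ( PATH=$HOST_PATH; command -v "$1" )
}
```

Since `command -v` only reports where a name would be found, host-platform binaries that cannot run on the build machine are never invoked. It can also report shell builtins by bare name, so a real implementation would likely need to guard against that.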
@FRidh FRidh closed this as completed in f069423 Oct 2, 2018
@Ericson2314
Member

Was this commit added to master again? Confused why this issue is suddenly now closed.

@FRidh
Member

FRidh commented Oct 3, 2018

Yes and no. Yes, the original one went into master, and no, the revert is in as well.

@FRidh FRidh reopened this Oct 3, 2018
@Ericson2314
Member

Ah makes sense. GitHub isn't so clever, of course. Thanks

jbgi pushed a commit to input-output-hk/nixpkgs that referenced this issue Jan 15, 2019
angerman pushed a commit to input-output-hk/nixpkgs that referenced this issue Jan 23, 2019
@matthewbauer matthewbauer added this to the 19.09 milestone Jan 27, 2019
@worldofpeace worldofpeace mentioned this issue Apr 1, 2019
10 tasks
@matthewbauer
Member

Addressed in #43833

7 participants