Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plausible, nixos/plausible: Default to listen on localhost. #130297

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

nh2
Copy link
Contributor

@nh2 nh2 commented Jul 15, 2021

Motivation for this change

This is a safer default configuration, changing:

  • the plausible HTTP web server
  • the Erlang Beam VM inter-node RPC port
  • the Erlang EPMD port

to be listening on localhost only.

This makes Plausible have a safe default configuration, like all other
networked services in NixOS

For background discussion, see: #130244

TODO

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • 21.11 Release Notes (or backporting 21.05 Relase notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@nh2 nh2 requested review from happysalada and Ma27 July 15, 2021 18:21
@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Jul 15, 2021
text = ''
#! ${pkgs.elixir}/bin/elixir
{:ok, addr} = :inet.parse_address('${ip}')
IO.puts(inspect(addr))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remaining debug statement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's the thing that prints the Erlang-formatted IP address to stdout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made that clearer with a comment.

name = "ip-to-erlang-address-tuple.exs";
text = ''
#! ${pkgs.elixir}/bin/elixir
{:ok, addr} = :inet.parse_address('${ip}')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a case statement here? Just in case the parse ip fails.
probably something like

case :inet.parse_address('${ip}') do
  {:ok, addr} -> addr
  {:error, message} -> raise message
end

I am not sure the effect of the raise in elixir in a file, does it really return exit code 1? (just something that has to be tested)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a case statement here?

Great point. I've force-pushed to print a better error message in that case.

I am not sure the effect of the raise in elixir in a file, does it really return exit code 1?

raise does exit with code 1, but it also prints a stack trace. I've implemented an explicit stderr print now to print exactly what the user should see:

Invalid IP address 'garbage'; error: :einval
builder for '/nix/store/05317l8jk4iv7z0zxnvbsv1dgqk6hh6a-plausible-vm-args.drv' failed with exit code 1

@happysalada
Copy link
Contributor

I have to say I'm a little bit afraid that after you get the values for where to listen, you then have to put them in a vm.args.eex (I don't remember if that is the exact name of the file). Once you have the file, you either have to override upstream, or merge with upstream. Upstream doesn't seem to use that file at the moment, but they could in the future, so overriding seems a little dangerous to me. Merging is also fine at the moment, but could get complicated if upstream decides to use the file.

How about we stop removing the RELEASE_COOKIE in nixpkgs?
Yes it's true it's bad to have secrets in the nix store. However it's even worse to have a node that is open.
Perhaps we could just provide a warning if the RELEASE_COOKIE environment variable is not populated at runtime?
Something like
No RELEASE_COOKIE detected, plausible will use the one in the nix store. be careful it is world readable. Ignore this at your own risks.

I am happy to go with your current approach, just outlining a possible alternative.

@nh2
Copy link
Contributor Author

nh2 commented Jul 15, 2021

How about we stop removing the RELEASE_COOKIE in nixpkgs?
Yes it's true it's bad to have secrets in the nix store.

@happysalada That will be futile: Generating a password into a publicly available package makes it not-a-secret. Anybody can just read and use it.

It's not just a value that's a password in your nix store, it's a password that's on cache.nixos.org for everybody to see.

The only reasonable use for password-protected Erlang is when the user entirely generates the cookies for their cluster themselves, known to nobody else. Thus, NixOS cannot do it for them.

Perhaps we could just provide a warning

I don't think a warning would be good enough (as seen in the other PR, we didn't even catch a hard runtime "no such function" error there). If we used the COOKIE stuff for default security, it should be an evaluation error. But most people will just want to use Plausible single-machine, instead of on a multi-node Erlang cluster, so I think making it listen to localhost is likely the better default in any case.

I have to say I'm a little bit afraid that after you get the values for where to listen, you then have to put them in a vm.args.eex (I don't remember if that is the exact name of the file). Once you have the file, you either have to override upstream, or merge with upstream. Upstream doesn't seem to use that file at the moment, but they could in the future, so overriding seems a little dangerous to me.

I don't really understand what you're referring to. Are you referring to something that I'm doing here regarding "vm.args.eex"? Or are you saying that there's some thing upstream could do that would break the setting of inet_dist_use_interface? If yes, could you elaborate a bit more on what you mean?

@nh2 nh2 force-pushed the plausible-listen-address branch from a348267 to 8eb78f3 Compare July 15, 2021 23:34
@nh2
Copy link
Contributor Author

nh2 commented Jul 15, 2021

I even made a mistake with shell scripting here:

set -eu -o pipefail
echo "-kernel inet_dist_use_interface \"$(${erlangAddressTupleCommand})\"" > "$out"

Does not work, swallows the exit code of the Elixir script.

This works:

set -eu -o pipefail
IP=$(${erlangAddressTupleCommand})
echo "-kernel inet_dist_use_interface \"$IP\"" > "$out"

bash/shell is just a lost cause. Even the most trivial two-liners are wrong, and nobody spots it, after we've been debugging and reviewing exactly that type of issue for days.

nh2 added a commit to nh2/nixpkgs that referenced this pull request Jul 16, 2021
@nh2 nh2 force-pushed the plausible-listen-address branch from 8eb78f3 to dac1687 Compare July 16, 2021 00:27
@nh2
Copy link
Contributor Author

nh2 commented Jul 16, 2021

Tangential:

bash/shell is just a lost cause. Even the most trivial two-liners are wrong, and nobody spots it, after we've been debugging and reviewing exactly that type of issue for days.

This is what it would like if we had a runPython3Command to use instead of runCommand:

30e746b#diff-77a3dd66631fe818ad92b50d7abcaa3812f98c6f0b738cdcd64b1c3520b32becR296

I personally would like to have something like that in the future.

@happysalada
Copy link
Contributor

On the shell tangeant.
Did you have a look at oil shell?
They try to be compatible with bash but provide you with a better alternative. I was having a look at trying to replace the default bash shell with it. It wouldn't solve all the problems, but people would have an easier way of writing bugless code.

Let me try to clarify what I meant around the vm.args.eex.
Here you are using the RELEASE_VM_ARGS variable to supply the vm settings to starting erlang. In the plausible repo, they also have a rel/vm.args.eex file for that purpose. So essentially you are overriding their usage of it. At the moment it's fine because they don't seem to use that file, however if they decide to in the future, we might run into conflicts. (I don't see them using it soon, just mentioning this).

@nh2
Copy link
Contributor Author

nh2 commented Jul 16, 2021

Did you have a look at oil shell?

I've looked at various shell replacements in the past, and I like some of them, but I think that for general scripting in nixpkgs, Python would be a better fit over those, because there are probaly 100x-1000x more users that already understand Python well vs. these new shell language variants, it is at least as safe, and it's the same language used as in NixOS VM tests (so contributors have to learn less languages).

Here you are using the RELEASE_VM_ARGS variable to supply the vm settings to starting erlang. In the plausible repo, they also have a rel/vm.args.eex file for that purpose. So essentially you are overriding their usage of it. At the moment it's fine because they don't seem to use that file, however if they decide to in the future, we might run into conflicts.

@happysalada Ah, I see. For reference, that's this line:

https://github.com/elixir-lang/elixir/blob/799b4ceb8b674da91ff5b51699450aa295fb720b/lib/mix/lib/mix/tasks/release.init.ex#L103

Yes, Elixir doesn't seem to provide an easy way to merge the package's vm.args file with your own VM config options.

I've a pushed a commit that adds and assertion that will fail if they start putting something into vm.args, so that we'll notice.

# grep to print all lines that aren't comments, empty, or whitespace.
FOUND_VM_ARG=0
grep -v -E '^(#|$|[[:space:]]+)' "${pkgs.plausible}"/releases/*/vm.args || FOUND_VM_ARG=1
if [ "$FOUND_VM_ARG" -eq 0 ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

(( FOUND_VM_ARG = 0 ))

Which to my knowledge is the recommended way to do arithmetic.
I don't have any strong feelings on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right, however I am not doing arithmetic. Just normal exit code handling, for which it is common to use strings. For example $? also works that way.

@happysalada
Copy link
Contributor

I just have one last comment before we merge.
How do you feel about updating to a more recent 'unstable' version of plausible vs the patch?
It seems to me that doing the update now, will make things easier later and we don't need to remember to delete a patch.
I don't have very strong feelings on this, just want to make sure we at least discuss.

@nh2
Copy link
Contributor Author

nh2 commented Jul 16, 2021

How do you feel about updating to a more recent 'unstable' version of plausible vs the patch?

In general for most packages, nixos-unstable focuses on packaging the latest stable released versions (there are some exceptions), as does Debian unstable.

I think that is a good model, as releases are often better tested and run by more people (e.g. because they are present in multiple Linux distributions).

Copy link
Contributor

@happysalada happysalada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to merge.

@nh2 nh2 force-pushed the plausible-listen-address branch from ac4f121 to 2e89c54 Compare July 17, 2021 00:28
@nh2
Copy link
Contributor Author

nh2 commented Jul 17, 2021

I'm also planning to update the NixOS test to check that no non-localhost ports are open with defualt settings.

The IP address on which the server is listening.

When changing listen IPs, also consider
<option>services.plausible.erlang.vmListenAddress</option> and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd strongly prefer to use <xlink linkend="opt-services.plausible.etc"/> here as this makes links clickable :)

Copy link
Contributor Author

@nh2 nh2 Jul 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd strongly prefer to use <xlink linkend="opt-services.plausible.etc"/> here as this makes links clickable :)

You mean <xref I think. Good idea, I forgot about it. Done.

Copy link
Member

@Ma27 Ma27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we want to factor this out to have it reusable for other erlang applications.

"cachex": {:hex, :cachex, "3.3.0", "6f2ebb8f27491fe39121bd207c78badc499214d76c695658b19d6079beeca5c2", [:mix], [{:eternal, "~> 1.2", [hex: :eternal, repo: "hexpm", optional: false]}, {:jumper, "~> 1.0", [hex: :jumper, repo: "hexpm", optional: false]}, {:sleeplocks, "~> 1.1", [hex: :sleeplocks, repo: "hexpm", optional: false]}, {:unsafe, "~> 1.0", [hex: :unsafe, repo: "hexpm", optional: false]}], "hexpm", "d90e5ee1dde14cef33f6b187af4335b88748b72b30c038969176cd4e6ccc31a1"},
"certifi": {:hex, :certifi, "2.6.1", "dbab8e5e155a0763eea978c913ca280a6b544bfa115633fa20249c3d396d9493", [:rebar3], [], "hexpm", "524c97b4991b3849dd5c17a631223896272c6b0af446778ba4675a1dff53bb7e"},
- "clickhouse_ecto": {:git, "https://github.com/plausible/clickhouse_ecto.git", "b30ccc93a4101a25ff0bba92113e18d8a9a8b28e", []},
+ "clickhouse_ecto": {:git, "https://github.com/plausible/clickhouse_ecto.git", "1969f14ecef7c357b2bd8bdc3e566234269de58c", []},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just use substituteInPlace for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like it to fail loudly when the upstream commit changes though.

nh2 added 3 commits July 22, 2021 00:50
This is a safer default configuration, changing:

* the plausible HTTP web server
* the Erlang Beam VM inter-node RPC port
* the Erlang EPMD port

to be listening on localhost only.

This makes Plausible have a safe default configuration, like all other
networked services in NixOS

For background discussion, see: NixOS#130244
This is a safer default configuration, changing

* the Erlang Beam VM inter-node RPC port
* the Erlang EPMD port

to not listen to any sockets at all.
@nh2 nh2 force-pushed the plausible-listen-address branch from 2e89c54 to 1f99f69 Compare July 22, 2021 00:50
@ofborg ofborg bot requested a review from Ma27 July 22, 2021 01:01
@nh2
Copy link
Contributor Author

nh2 commented Jul 22, 2021

I'm wondering if we want to factor this out to have it reusable for other erlang applications.

@Ma27 Yes. But that's better tracked by issue #130244 with its task:

  • Change all Elixir services to listen to localhost by default, like other NixOS services, adding options to set the listen interface.

and it would make sense for somebody who understands one of the other Elixir services to do it (probably not me for the time being, as Plausible is the first Elixir / Erlang service I use on NixOS).

@happysalada
Copy link
Contributor

in terms of elixir services on nixpkgs, there is basically this and pleroma.
@Ma27 do you have an idea of what the factoring would look like?
(I understand how you can do things in common in build functions, but I'm not sure how to factor something for services, if you had an example, it would help)

@happysalada
Copy link
Contributor

Thinking about it some more, I don't think each application should start and set it's epmd port. The epmd dependency is hidden and there is no way to control it.
There is an epmd service on nixpkgs. I think that each elixir package should have a enableEpmd option that will enable the epmd service.
Checking plausible, they already have a vm.eex.args. I think we should add -start_epmd false at the end so that the release doesn't start epmd by itself and epmd concerns are managed outside of it.

@nh2
Copy link
Contributor Author

nh2 commented Jan 13, 2022

@happysalada Regarding

In 9c71958#diff-77a3dd66631fe818ad92b50d7abcaa3812f98c6f0b738cdcd64b1c3520b32becR174 and #143345 (comment) you enabled

services.epmd.enable = true;

for Plausible. But don't you still have to do

I think we should add -start_epmd false at the end so that the release doesn't start epmd by itself

so that there aren't 2 epmds running (and the one Plausible starts might still go zombie) as you commented in the other thread?

@nh2 nh2 mentioned this pull request Jan 13, 2022
12 tasks
@happysalada
Copy link
Contributor

I forgot to update this thread when I tested.

  • IIRC, you can't start with -epmd false, the release doesn't start at all. The details are fuzzy though.
  • if you update, then there is already an epmd started (by the release) that you have to kill. When you start the system from scratch though, the release detects the globally started epmd and doesn't start its own.

let me know if that doesn't answer your comment.

@nh2
Copy link
Contributor Author

nh2 commented Jan 13, 2022

@happysalada The question is: You switched to services.epmd.enable = true;, but apparently did not change flags passed to Plausible. Does that not mean that every time Plausible starts, it still continues to start its own epmd in addition to the one started by the NixOS service?

Or is there some automatic detection in Plausible that checks whether an epmd is already running on the machine?

@SuperSandro2000 SuperSandro2000 added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jan 13, 2022
@happysalada
Copy link
Contributor

The mechanism is inside the beam. When you run a release, it first checks for an already running epmd and if it exists, connects to that. If none can be found, then it will start its own.
This is not only for plausible, but anything running an elixir release.

@happysalada
Copy link
Contributor

This is the default mode for how releases are run. There is another mode that doesn't require epmd, but it's not ad simple as switching the flag. The setup was a little more complex than that, that's why in the end, I didn't choose to change from the default running mode of using epmd (one reason is because upstream is running this way, but another one is that it would require testing to verify that the new mode doesn't break anything).

@nh2
Copy link
Contributor Author

nh2 commented Jan 21, 2022

When you run a release, it first checks for an already running epmd and if it exists, connects to that.

@happysalada Ah, I see.

I think that means that there's still the race condition that the NixOS epmd is off, crashed, or being restarted, or was restarted by a human, at the time that the Plausible one gets started, so I think for production use cases it would be nice to ensure that it won't fall back to starting its own.

Upstream Plausible just merged my patch to be able to specify 127.0.0.1 as the listen host, and also made it the default: plausible/analytics#1190

This means that with the next release, we can remove my patch from this PR.

@happysalada
Copy link
Contributor

Agreed on checking for the race condition. During the time I was running elixir in production, I've never seen this, but it could happen. One way to add the check would be for the plausible service "require"ing the epmd service to start.

I saw that plausible merged your fix. I thought it was lost in the limbs, I'm happy it finally got merged!

@nh2
Copy link
Contributor Author

nh2 commented Jan 23, 2022

One way to add the check would be for the plausible service "require"ing the epmd service to start.

Even a systemd dependency would still be racy, since the system epmd could be temporarily down, e.g. currently being restarted, or not bound to the socket yet, and so on.

The only safe way is configuration: To tell Plausible that it should never try to start its own epmd.

@nh2
Copy link
Contributor Author

nh2 commented Apr 20, 2022

Updated PR description:

Patch was merged upstream in plausible/analytics#1190 / plausible/analytics@1337b46, but as of writing there is no release yet that contains it.

@RaitoBezarius
Copy link
Member

What is the status of this PR, @nh2 — I think it would be an amazing addition, can I help? :)

@softinio
Copy link
Member

@nh2 Are you still actively working on this PR?

I wonder if the scope of this PR will change if we upgrade plausable to a new version?

@RaitoBezarius
Copy link
Member

@nh2 Are you still actively working on this PR?

I wonder if the scope of this PR will change if we upgrade plausable to a new version?

I think you can take over, @nh2 seems inactive for now. :)

@nh2
Copy link
Contributor Author

nh2 commented Nov 10, 2023

@RaitoBezarius @softinio @happysalada I'm now revisiting this PR. I'll rebase it to NixOS 23.05, if I can within the next couple hours.

Let me know if you discovered any new info on this topic.

My plan is this:

@happysalada
Copy link
Contributor

@azahi @Tom-Hubrecht tagging more people who have expressed interest in plausible in the past.

Personally the approach looks good to me, its just a matter of getting enough people to test.

@nh2
Copy link
Contributor Author

nh2 commented Nov 10, 2023

My plan is this:

Closing in favour of simplified PR #266702.

I also add a NixOS test line there that checks that it's listening only on localhost.

@wegank wegank added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.status: merge conflict This PR has merge conflicts with the target branch 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 1-10 10.rebuild-darwin: 1 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants