Proposal: New architecture for v3.0 #88

Closed
olljanat opened this issue Apr 24, 2021 · 44 comments
Labels
architecture enhancement New feature or request help wanted Extra attention is needed

Comments

@olljanat
Member

Why?

BurmillaOS's current architecture has some drawbacks which make cases like:

very complex/impossible.

Additionally, we are stuck on a quite old system-docker version #28

How?

We want to support all known use cases from #6, including support for Kubernetes #47, and avoid the need to exclude any use cases #22

It means that our current slogan is still a valid target with a small correction: The smallest, easiest way to run Docker Containers in production at scale.

Current architecture

Our readme describes the current architecture like this:
"Everything in BurmillaOS is a Docker container. We accomplish this by launching two instances of
Docker. One is what we call the system Docker which runs as the first process. System Docker then launches
a container that runs the user Docker. The user Docker is then the instance that gets primarily
used to create containers. We created this separation because it seemed logical and also
it would really be bad if somebody did docker rm -f $(docker ps -qa) and deleted the entire OS."

How it works

Proposed architecture

On #9 we decided to make Debian the only console option, which made it easier to verify all needed use cases, and I propose that we continue on the same track and start utilizing Debian as the root file system. It means that BurmillaOS can still be a tiny, container-optimized OS by default, but it also means that any application/3rd party tool which is tested with Debian can be easily installed on it.
[Figure: v3_architecture]

It also means that we can improve security by replacing SELinux with AppArmor (which is more common on Debian based distros) and enable it by default.

Additionally, we can also include all the OS level preparations needed by Docker / Kubernetes rootless mode, which will allow users to switch to a maximum-security configuration more easily.

What?

From OS maintenance point of view this proposal means that these repositories:

would be replaced by os-base-new

From the users' point of view, this proposal means more supported use cases and flexibility, but it also means that a direct upgrade from older versions without re-install will not be possible (or at least we need to create some kind of upgrade/migration tool/script for it).

Voting

Please, vote by leaving either 👍 or 👎 for this message so we can more easily see what people think about this proposal.

Also, if you see big issue(s) in this proposal or notice that I have forgotten some very critical use case/point from the BurmillaOS architecture, then please leave a comment below about those.

If you want to propose an alternative architecture, then please open another issue for that discussion.

@olljanat olljanat added enhancement New feature or request help wanted Extra attention is needed question Further information is requested architecture and removed question Further information is requested labels Apr 24, 2021
@msn62

msn62 commented Apr 24, 2021

Burmilla (Rancheros) has for sure its limitations. It is targeted as a safe and very lightweight docker platform. Nothing more nothing less. By moving it to a more full OS it will lose its unique character. An upgraded docker version and uefi boot would already be a great improvement. Just my two cents.

@olljanat
Member Author

Burmilla (Rancheros) has for sure its limitations. It is targeted as a safe and very lightweight docker platform. Nothing more nothing less.

Yes, it is lightweight, but I'm not sure if it can really be considered safe by today's criteria. Sure, we have support for SELinux (at least in theory), but it is not enabled by default, and based on #6 no one has reported having it enabled in their environment. By today's criteria, container environments should use either SELinux or AppArmor, IMO.

By moving it to a more full OS it will lose its unique character.

Partly yes, but we would still be the smallest OS for this purpose and handle whole OS upgrades with a single sudo ros upgrade/etc command. It is also worth mentioning that I was thinking about a solution where upgrades would still do a full replacement of the rootfs, so even if customization became easier, it would still need to be done with cloud-init to get it back after upgrades.

An upgraded docker version and uefi boot would already be a great improvement.

These things aren't in conflict. We can support v1.9.x versions until January 2024, when 4.14.x kernels reach EOL. Those already have access to all the latest Docker versions, and even the default will switch to 20.10.x soon #71.

The v2.0.0 release targets UEFI support, which partly already exists #8 and mostly just needs more people to come on board and help finalize it. Then we can keep those v2.x versions on the current architecture, and only those who are interested in the new capabilities that come with v3.0.x versions need to migrate there.

@msn62

msn62 commented Apr 24, 2021

Thanks for the clarification of the possible roadmap. I got the impression that all attention would be put on the new architecture and that currently planned releases would be stopped. When limited resources are available for further development, it could be a challenge to keep up with multiple branches. Really appreciate the effort of the team to keep it going.

This was referenced Apr 25, 2021
@h8liu

h8liu commented Apr 25, 2021

In general, this looks like a really big change. It is almost like a new OS. Several questions:

How would users using 1.x or 2.x migrate to this? Especially if we use system-docker today, what work / steps need to be done to use this new architecture?

How would things upgrade in the new debian rootfs world? Or more specifically, is the debian rootfs read-only here (like using squashfs)? Or do we still want people to be able to apt get stuff?

The general high level question behind these questions seems to be: what is burmilla's official user interface that applications / users can rely on (and which we should try to keep it working as much as possible)? Seemingly, system-docker was part of the interface, yet this proposal is going to break it. I see this as a betrayal to RancherOS's old users (if there are..).

In the long run, if "supporting 3rd-party Debian applications/tools" is a goal, why would not people just use a custom build of debian / ubuntu rather than using Burmilla in general? Other than "it is ours and we have control", what are the benefits of using Burmilla for other potential users out there?

To me, Rancher/Burmilla OS was positioned as a "container OS", meaning "only supporting running containers", and rightfully not supporting other "non-container" based use cases. Not supporting non-container stuff is a feature, meaning that the system interface is restricted to docker and system-docker. Once a system architect / admin picks RancherOS/Burmilla OS (in the past), he or she does not need to worry about maintaining apt repositories but only docker registries, and if one wants to run Debian, one needs to put it into a container. Honestly, I am already a bit unhappy with the console change from busybox to debian: it expands the default user interface so much, and in an irreversible way (because the busybox one is dropped). We are already seeing users that logs in to the console, starts "apt update & apt install" stuff, and expects us to support that.

I totally understand the amount of work required to move things forward, and why it is attractive to move towards Debian. To be constructive, I think if we want to continue moving to this Debian rootfs world, there needs to be a new system-level interface being proposed (e.g. to support a system admin to write a system-level plugin), and this interface should be narrow enough so that it is hard to misuse, meaning most of the Debian application-support stuff should be locked away in some form, and "supporting 3rd-party Debian applications" for users should be an explicit non-goal.

My 2 cents.

@olljanat
Member Author

@h8liu Thanks. Very good questions and comments.

In general, this looks like a really big change. It is almost like a new OS.

Yes, basically it would be a new OS which just shares the idea of a minimal container OS, and maybe some tooling, with the current one.

How would users using 1.x or 2.x migrate to this? Especially if we use system-docker today, what work / steps need to be done to use this new architecture?

Because of the size of the change, we probably need to first support only new installations on v3.0.x, and after we have got it working at a good enough level, provide a migration tool as part of 3.1.x, etc.

How would things upgrade in the new debian rootfs world?

I think that the k3OS upgrade scripts would work fine here too, as they already made a similar change in the RancherOS -> k3OS move.

Or more specifically, is the debian rootfs read-only here (like using squashfs)? Or do we still want people to be able to apt get stuff?

At least in the first phase it should be writable in a way that lets people use apt-get. For example, I use it a lot to install debug tools like tcpdump, tshark, etc. when needed.

But those should disappear on upgrades in the same way as they now disappear with the sudo ros os upgrade and sudo ros console switch default command combination (which is in our upgrade guide).
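
The upgrade behaviour described here (a full rootfs replacement, with anything that must survive declared in configuration) can be modelled roughly as below. This is only an illustrative sketch: the `Node` class, the `declared_packages` field, the version strings and the package names are all hypothetical and are not part of `ros` or cloud-init.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of a node whose rootfs is fully replaced on upgrade."""
    image_version: str
    base_packages: frozenset                    # shipped inside the OS image
    declared_packages: frozenset = frozenset()  # hypothetical: listed in config
    installed: set = field(default_factory=set)

    def __post_init__(self):
        self._reset_rootfs()

    def _reset_rootfs(self):
        # A fresh rootfs holds only what the image ships, plus whatever
        # the declarative configuration asks for at boot.
        self.installed = set(self.base_packages) | set(self.declared_packages)

    def apt_install(self, package):
        # Ad-hoc change made on the console; it lives only in the current rootfs.
        self.installed.add(package)

    def upgrade(self, new_version, new_base_packages):
        # Full replacement: nothing from the old rootfs is carried over.
        self.image_version = new_version
        self.base_packages = frozenset(new_base_packages)
        self._reset_rootfs()


node = Node("v3.0.0", frozenset({"docker"}), frozenset({"git"}))
node.apt_install("tcpdump")          # manual debug tool
node.upgrade("v3.0.1", {"docker"})
print(sorted(node.installed))        # ['docker', 'git']
```

The manually installed `tcpdump` is gone after the upgrade, while the declared `git` comes back, which is the property the comment above relies on.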

The general high level question behind these questions seems to be: what is burmilla's official user interface that applications / users can rely on (and which we should try to keep it working as much as possible)?

The ros command, which we build from the sources at https://github.com/burmilla/os/tree/master/cmd, is the main user interface and the way OS configuration is handled with cloud-config.

Seemingly, system-docker was part of the interface, yet this proposal is going to break it. I see this as a betrayal to RancherOS's old users (if there are..).

Very valid point. We probably need to create some wrapper script with the system-docker name which people can still use to control system services.

In the long run, if "supporting 3rd-party Debian applications/tools" is a goal, why would not people just use a custom build of debian / ubuntu rather than using Burmilla in general? Other than "it is ours and we have control", what are the benefits of using Burmilla for other potential users out there?

The amount of work needed to optimize Debian/Ubuntu for container workloads is quite big, and I see it as a waste of time if people need to reinvent the wheel on their side by creating custom builds when they could alternatively contribute to the BurmillaOS project. But that of course is only possible if we are able to agree on something which makes sense for a large number of people.

To me, Rancher/Burmilla OS was positioned as a "container OS", meaning "only supporting running containers", and rightfully not supporting other "non-container" based use cases. Not supporting non-container stuff is a feature, meaning that the system interface is restricted to docker and system-docker.

This is mostly balancing between requirements. Originally RancherOS contained only the default console, but they ended up adding the ability to switch consoles already in version 0.5.0, and based on feedback given on #6 most users have ended up switching to a non-default one. I was also forced to do so because my colleagues wanted to be able to install debug tools which are not available on the default console.

Also, I see that there is a growing number of users who want to use BurmillaOS to run Kubernetes and who, for one reason or another, do not want to use k3OS, which is built for exactly that purpose. I hope that this proposal will prompt them to share their thinking.

IMO the target is still that BurmillaOS is used as a "container OS", but the ability to install debug and dev tools, especially on build servers, is quite critical. For example, I develop and build BurmillaOS on BurmillaOS, which is why the ability to install tools like git and vim is quite critical 😏

Once a system architect / admin picks RancherOS/Burmilla OS (in the past), he or she does not need to worry about maintaining apt repositories but only docker registries...

That is still the target, which is why those packages should be updated by sudo ros upgrade

Honestly, I am already a bit unhappy with the console change from busybox to debian: it expands the default user interface so much, and in an irreversible way (because the busybox one is dropped). We are already seeing users that logs in to the console, starts "apt update & apt install" stuff, and expects us to support that.

Unfortunately you weren't part of the discussion when we made that decision on #9. Can you open another issue where we can discuss what options we have? (add some message for users, reintroduce the busybox option, etc.)

I totally understand the amount of work required to move things forward, and why it is attractive to move towards Debian. To be constructive, I think if we want to continue moving to this Debian rootfs world, there needs to be a new system-level interface being proposed (e.g. to support a system admin to write a system-level plugin), and this interface should be narrow enough so that it is hard to misuse, meaning most of the Debian application-support stuff should be locked away in some form, and "supporting 3rd-party Debian applications" for users should be an explicit non-goal.

Yeah, this definitely needs more discussion.

@h8liu

h8liu commented Apr 26, 2021

Thanks for the reply and efforts. I really appreciate the work and the discussion. Ultimately, I think whoever does the hard maintenance work got the right to make the decision, so my words are mainly just suggestions from my own perspectives, and feel free to ignore my rants :)

install debug and dev tools especially to build servers is quite critical.

I totally agree on this, which is exactly why something like system-docker to host consoles (and other system services) is useful. For 1, it allows not only debian, but also other types of consoles, and for 2, it allows non-debug mode (a.k.a production mode) to be done with a much smaller console or even no console. While marking debian console as default and not supporting other consoles is one thing; forcing users to use debian is another thing that is much stronger a limitation.

I think there are two critical features that I would like to see/maintain in the new world:

  1. A well-defined, narrow interface to run system/admin level customizations, where I can have a promise that it won't break in the future. In the past, this was in some sense the system-docker (maybe also the "ros" binary). Maybe in the future, this can be a grpc service that ros exposes and listening on a unix domain socket or something. Ideally, I prefer this not to be "the Debian distro", which interface is quite wide and complicated, and does not really fit into a "container OS" design. That said, the debian rootfs can also be restricted to present small, slim and secure user customization interface in some way, yet I think more design details are needed.
  2. A mode or some way to remove/hide/lock down as much as possible of the parts that are not critical to hosting containers (e.g. stuff to facilitate debugging). IMO, SSH into a production-mode container OS cluster to debug stuff on the privileged console directly should be discouraged anyways, either for stability reasons or security reasons. This mode can be (sort of) achieved by the system-docker, and I am not sure how the "Debian slim RootFS" can deliver that.
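
One way to picture the narrow interface from point 1 is a request handler that accepts only an explicit allowlist of operations and refuses everything else. The sketch below is an assumption-laden illustration that uses JSON messages instead of gRPC; the operation names (`get_version`, `list_services`, `restart_service`), the in-memory service table and the version string are hypothetical and are not an existing `ros` API. A real service would additionally have to listen on a unix domain socket and authenticate callers.

```python
import json

# Hypothetical state standing in for real system services.
SERVICES = {"network": "running", "syslog": "running"}
OS_VERSION = "v3.0.0"  # placeholder value


def _get_version(_params):
    return {"version": OS_VERSION}


def _list_services(_params):
    return {"services": dict(SERVICES)}


def _restart_service(params):
    name = params.get("name")
    if name not in SERVICES:
        raise ValueError(f"unknown service: {name!r}")
    SERVICES[name] = "running"
    return {"restarted": name}


# The whole supported surface is this table; anything else is refused.
ALLOWED_OPERATIONS = {
    "get_version": _get_version,
    "list_services": _list_services,
    "restart_service": _restart_service,
}


def _reply(ok, **fields):
    return json.dumps({"ok": ok, **fields}).encode()


def handle_request(raw: bytes) -> bytes:
    """Decode one JSON request and dispatch it through the allowlist."""
    try:
        request = json.loads(raw)
    except ValueError:
        return _reply(False, error="malformed request")
    if not isinstance(request, dict):
        return _reply(False, error="malformed request")

    op = request.get("op")
    params = request.get("params", {})
    if not isinstance(op, str) or not isinstance(params, dict):
        return _reply(False, error="malformed request")

    handler = ALLOWED_OPERATIONS.get(op)
    if handler is None:
        return _reply(False, error=f"operation not allowed: {op}")
    try:
        return _reply(True, result=handler(params))
    except ValueError as exc:
        return _reply(False, error=str(exc))


print(handle_request(b'{"op": "get_version"}'))
print(handle_request(b'{"op": "apt_install", "params": {"name": "vim"}}'))
```

The design choice being illustrated is that the supported surface is a small table that can be kept stable across releases, rather than the whole of a general-purpose distribution.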

@matthewkrupnik

I get this is to discuss a different direction for this OS potentially, but what brought me (and possibly a number of other people) to RancherOS to begin with was the idea that everything is an ephemeral container. Making a shift to having a base OS breaks that core paradigm. While it can be limiting in some ways, the idea that your console is literally just that with nothing installed directly in it and every single system service is its own container is the selling point of this OS for me.

@tomaswarynyca
Collaborator

My opinion is that currently the system has many limitations and moving to a Debian base optimized for containers would be a giant leap forward because most users do that nowadays. As they say it would break the main scheme of the RancherOS project but it would be a change for the good of the BurmillaOS project.

@matthewkrupnik

matthewkrupnik commented Apr 28, 2021

My opinion is that currently the system has many limitations and moving to a Debian base optimized for containers would be a giant leap forward because most users do that nowadays. As they say it would break the main scheme of the RancherOS project but it would be a change for the good of the BurmillaOS project.

But if you are looking for Debian OS with docker installed, why not just do that? What would differentiate this project from minimal Debian installation with docker installed?

@olljanat
Member Author

It looks like we are not able to reach a common understanding (at least not yet), so we probably need to take a timeout with this proposal and try to figure out an alternative one.

Before doing so, I still want to clarify some of the open questions / misunderstandings here.

BurmillaOS was born as my half-serious hobby project, where my target was to figure out if I could build a whole OS with the latest 4.14.x kernel and 19.03.x Docker versions, to buy me and my colleagues a little more time to figure out which OS we want to migrate our Docker Swarms to. It was forked from RancherOS 1.5.5, and I decided to jump to version 1.9.0 to make sure that we do not conflict with their versioning in case Rancher still decides to release some new versions. BurmillaOS worked fine for that use case, so my colleagues and I agreed to keep using it as long as we are using Docker Swarm and to figure out some alternative OS when we start using Kubernetes.

Because I wanted to give people the opportunity to utilize what I have made here after the company where I work no longer uses BurmillaOS (and I no longer contribute to the project), I did:

What I worry about is that unless we are able to create a roadmap and find enough people who are ready to commit to it, this project cannot survive. When I create roadmap proposals I like to start from the long term (>= 5 years), as short term targets are much easier to figure out after long term targets are decided. The reason I personally would like a Debian rootfs based OS is that it would most likely be able to run both Docker Swarm and Kubernetes, which is why I (maybe) would be able to maintain this project in the long term too.

Thanks for the clarification of the possible roadmap. I got the impression that all attention would be put on the new architecture and that currently planned releases would be stopped. When limited resources are available for further development, it could be a challenge to keep up with multiple branches. Really appreciate the effort of the team to keep it going.

@msn62, this was already explained above, but to be extra clear I want to say that there is no real roadmap yet (only an early draft here) and there is no development team behind BurmillaOS (at least not yet). What we have is me and three individuals, @tomaswarynyca , @ToeiRei and @h8liu, who have made some contributions to this project and to whom I have given burmilla organization owner rights so they are able to keep the project going without me if they decide to do so.

Remember that we are in the very early days of this project. Our first release was only four months ago.

But if you are looking for Debian OS with docker installed, why not just do that? What would differentiate this project from minimal Debian installation with docker installed?

@matthewkrupnik, I don't want to be rough, but it sounds like you really do not understand how much work is needed to create and maintain a production grade custom version of Debian which can be easily scaled to hundreds of servers. It is really a waste of time if people in different organizations end up creating their own versions instead of working together and sharing the same solution.

Thanks for the reply and efforts. I really appreciate the work and the discussion. Ultimately, I think whoever does the hard maintenance work got the right to make the decision,

@h8liu, no, I don't want to make decisions which do not have a wide enough common understanding, but what I said above still applies, so we need to keep discussing until we are able to agree on a roadmap.

PS. This is the first open source project which I'm trying to lead, so it will take a while to figure out a good way to do so.

@tomaswarynyca
Collaborator

I agree totally with what you propose @olljanat this is an improvement to take the docker/kubernetes system to another level by having a stable base, easily upgradable and easily replicable to many machines.

When I create roadmap proposals I like to start from long term (>= 5 years) as short term targets are much easier...

About the course: it seems right to me to be clear from the beginning about where we are going, so that we do not run into problems throughout the project, because it is useless to start this new version if halfway through we start to rethink everything that was achieved.

@matthewkrupnik

@matthewkrupnik, I don't want to be rough, but it sounds like you really do not understand how much work is needed to create and maintain a production grade custom version of Debian which can be easily scaled to hundreds of servers. It is really a waste of time if people in different organizations end up creating their own versions instead of working together and sharing the same solution.

But that is exactly my point. There is already a production grade OS called Debian, maintained by an established team. So what is the value in making a new OS based on it and putting in all this effort to create a custom version of Debian?

What I came here for is a lightweight OS where I can spin up a new node with nothing more than a cloud-init.yml file. The ephemeral part is key. Nothing is persisted anywhere, nothing is installed permanently to the console, etc. I am just hoping that this could be a viable alternative to RancherOS. If the long term plan for this project is something else, then it's not what I am looking for. I just wanted to chime in on the discussion to give a voice to those who just want RancherOS with latest kernel and docker versions.

@h8liu

h8liu commented Apr 29, 2021

Okay, I will speak more then. Long essay coming..

High level direction

Or what is Burmilla OS?

This is the quote from the burmillaos.org website:

BurmillaOS is our reaction of us to the End of RancherOS, which was one of the smallest and easiest ways to run docker as every process including services as udev or even syslog are running in their own containers. As the system is stripped of anything unnecessary to run docker, the resulting system is way smaller than most others of todays operating systems.

In short, Burmilla OS is a container OS, and a container OS's main use is arguably for running stuff in clusters.

I do not see using Debian as a base system for building rootfs to be necessarily good or bad. Yet, I think what burmilla OS should keep and maintain, is the long-term goal to be a great and open container OS, which means that things that are not required to host containers should be aggressively eliminated in the design.

It does not mean that it cannot be something else. For example, we might also want to say that in the future, burmilla OS will become more like:

  • A more general Linux distro, that is Debian compatible but favors running docker in some way.
  • An IoT or embedded board friendly Linux distro, that supports many boards like ARM/MIPS boards, rpi's, ODROID, etc.. suitable for a single node home server. (btw, this is my main usage, but I understand I am like a corner use case.)

But these are conflicting goals in the long run. I think we can only choose one among container, general, and embedded OS, and we should state that explicitly.

For the rest of this post, I will assume "container OS" is the direction, as it seems to be the most natural and expected one.

Other container OS's

There are similar other container-centered Linux OS's:

Amazon's bottlerocket

link
Its core is Amazon Linux.

Google's container-optimized OS

link
Its core is based on ChromeOS, so it is essentially gentoo based.

RedHat's Container Linux (formerly CoreOS)

link
Also based on Gentoo.

Fedora CoreOS

link and more
Free CoreOS branch when RedHat bought CoreOS. This is probably the most related container OS project that is not backed by a big commercial company. Fedora is heavily sponsored by RedHat though.

k3os

link
And of course, Rancher has this one, which is still in its early stage.

There are also some other ones, like Photon from VMware, maybe also Ubuntu Core (?).. Forgive me if I did not include every one of them..

Desired features

So that is the landscape of the market. Now, look at how the marketing pages talk about these Linux distributions, one key feature is to keep it minimum and small. It is more resource efficient; it is easier to maintain; it is more secure as the attack surface is smaller. Additional features are like:

  • super easy to work with k8s (e.g. has kubelet preinstalled) // not yet in burmilla
  • has active-passive dual read-only system partitions for more reliable os upgrades and roll backs // not yet in burmilla
  • supports verified boot and dm-verity // not yet in burmilla (btw, this could be quite challenging for small, custom-built kernels, as you need some one like Microsoft to be willing to sign our releases)
  • auto updates // not yet in burmilla, this needs deeper integration with k8s and etcd.
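
The active-passive partition scheme from the second bullet can be sketched as a small state machine: a new image is written to the inactive slot, booted on trial, and either confirmed or rolled back. This is a simplified illustration, not a description of how any of the OSes listed above implement it; the `SlotManager` class, the slot names and the `max_attempts` parameter are hypothetical.

```python
class SlotManager:
    """Toy A/B (active-passive) upgrade state machine."""

    def __init__(self, version, max_attempts=3):
        self.slots = {"A": version, "B": None}
        self.active = "A"        # slot known to boot correctly
        self.trial = None        # slot being tried after an upgrade, if any
        self.attempts_left = 0
        self.max_attempts = max_attempts

    def _passive(self):
        return "B" if self.active == "A" else "A"

    def stage_upgrade(self, version):
        # Write the new image to the passive slot; the active slot is untouched.
        target = self._passive()
        self.slots[target] = version
        self.trial = target
        self.attempts_left = self.max_attempts

    def select_boot_slot(self):
        # Called by the (imaginary) bootloader on every boot.
        if self.trial is not None:
            if self.attempts_left > 0:
                self.attempts_left -= 1
                return self.trial
            self.trial = None    # trial budget exhausted: roll back
        return self.active

    def mark_boot_successful(self):
        # Called once the trial slot has come up healthy.
        if self.trial is not None:
            self.active = self.trial
            self.trial = None


manager = SlotManager("v1")
manager.stage_upgrade("v2")
print(manager.select_boot_slot())   # B (trial boot of the new image)
manager.mark_boot_successful()
print(manager.active)               # B
```

If `mark_boot_successful` is never called, `select_boot_slot` falls back to the previously active slot once the trial attempts are used up, which is the rollback property the bullet refers to.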

So I think burmilla's long term road plan should be about: how to get these container OS features implemented with acceptable engineering cost?

Note that, easier to run debug tools on the OS is never a goal of any of those OS's, and should not be. So things like "my colleagues want to be able to apt-get their favorite text editor on the console" should not be listed as a requirement at the first place. In fact, it is against the core philosophy of container OS: minimized to only host containers. In the world of container OS's, almost all debugging activities should be done via container orchestrations and RPCs at a higher level over the networks, rather than on the OS's command line console. The OS should work well in production even without the need to access a tty (SSH or serial port).

Design and Implementation

Rancher's system-docker is an interesting and cool approach to "minimized to only host containers": like why not containerize all the system support parts (network, security, auditing, etc.) into containers too.

My understanding is that this approach now becomes an unnecessary hurdle to implement features like "super easy to work with k8s", and also the official docker is not maintaining this kind of weird usage of docker in the long run. And these are the core reasons why we want to replace it. Is that a correct way to view this?

If that is the case, I am not against moving towards Debian. But I think the rootfs has to be locked down in a minimal version, with only stuff that is required to host containers. For example, the core parts of the filesystem should be read-only, and things like apt install should be locked away.

Points in the original proposal

BurmillaOS's current architecture have some drawbacks which makes cases like:
ISV support (like it is mentioned on RancherOS EOL announcement). Good example is Dynatrace which once supported
RancherOS but have discontinued support on later versions.

This should be a non-goal. Software should just be packaged into container images. It is sad that Dynatrace dropped support, but that is not because ISV support is hard in the system-docker model; it is because Burmilla/Rancher OS became less significant to Dynatrace's business.

Full support for Kubernetes #47

This looks good. That said, K8s talks to containerd directly now, and does not need dockerd. So optimized to support k8s and optimized to support docker (as "docker run") are two slightly different goals.

Use diskless servers with root file system mount from iSCSI

Not sure about this one. If boot from network is required, sounds like verified boot should be implemented first? I am also not sure why having a system-docker makes this complicated.

Port to new platforms like MIPS64el #23

I think this requirement should be dropped, unless someone that needs it wants to contribute.. No real cluster runs on MIPS.

Again, I still think moving to Debian might be okay. It just needs more details on the design, and more clarifications on why.

Economics and branding

I think it is also important to think about who is doing the work, and how the work is being motivated (or funded).

Let's be honest, most of the recent work here is mostly to accommodate RancherOS's sudden death. Like "oh, rancheros is dead, but we are still using it. we don't know where to go, and the OS needs security patches.. shoot...". This is a pretty strong motivation for patching jobs, but I fear that it is not strong enough for long-term commitments.

So I think people need to keep this in mind, and understand where things come from. That is, till now, @olljanat 's company is essentially funding most of the work here, so it is only natural that Burmilla OS evolves in a way that fits the needs of @olljanat 's company.

That said, I think Rancher OS had a reputation to be a container OS, and I think it is valuable to maintain that old brand and that general direction to keep the community growing (hopefully), where specializing the OS to fit one (or several) company's special needs is likely to shrink the community. However on the other hand, there are already several container OS's out there that are well funded, so I honestly also doubt if keeping burmilla as a container OS matters that much, as it will be hard for burmilla to compete and survive in the long run..

Anyways, to summarize my points, I am hoping that, Debian or not, burmilla OS can keep its name as a container OS. If @olljanat and @tomaswarynyca want to go towards a more general-use Debian-compatible Linux distro that also favors running docker in some way, maybe it should be branched off into another name and continue the work there, and keep burmilla in maintenance at 2.x until some one here takes over the torch who can continue on the road of container OS?


Again, just my 2 cents of lip service :) Thanks.

@olljanat
Member Author

It does not mean that it cannot be something else. For example, we might also want to say that in the future, burmilla OS will become more like:
* A more general Linux distro, that is Debian compatible but favors running docker in some way.
* An IoT or embedded board friendly Linux distro, that supports many boards like ARM/MIPS boards, rpi's, ODROID, etc.. suitable for a single node home server. (btw, this is my main usage, but I understand I am like a corner use case.)

But these are conflicting goals in the long run. I think we can only choose one among container, general, and embedded OS, and we should state that explicitly.

I think IoT is definitely an important thing to mention and keep on the requirements list, as there are not many good OSes for running containers on IoT devices.

So I think burmilla's long term road plan should be about: how to get these container OS features implemented with acceptable engineering cost?

Yes, that is a very valid target IMO.

In the world of container OS's, almost all debugging activities should be done via container orchestrations and RPCs at a higher level over the networks, rather than on the OS's command line console.

Yes, and that is one big reason why Kubernetes has become so popular, as Docker's APIs just suck in this area...

If that is the case, I am not against moving towards Debian. But I think the rootfs has to be locked down in a minimal version, with only stuff that is required to host containers. For example, the core parts of the filesystem should be read-only, and things like apt install should be locked away.

Yes, I understand that lockdown would be useful in your use cases, where you want to provide an IoT device + software to a customer and make sure that they do not mess with it. But wouldn't it then be enough to have an OEM/lockdown mode as a feature that can be enabled for those use cases? That could also do the other hardening, like disabling SSH, etc...

Full support for Kubernetes #47
This looks good. That said, K8s talks to containerd directly now, and does not need dockerd. So optimized to support k8s and optimized to support docker (as "docker run") are two slightly different goals.

Slightly, yes, but it is also critical to understand that the original version of containerd existed inside Docker, was split out into its own project, and was heavily refactored with Kubernetes in mind. Docker was then modified to use containerd as its backend, and the duplicate code was deprecated/removed (the main reason why there was a big gap between the Docker 19.03 and 20.10 release times).

Then we are entering the challenging part of BurmillaOS's current architecture. Our system-docker is based on Docker 17.06 (which means that the feature code freeze for it was in July 2017).
On top of that, our system-docker is customized in such a way that some features have been removed from it: burmilla/docker@63c882f burmilla/docker@a625173

Docker 17.06 used containerd v0.2.0, so in those days Docker was needed, but nowadays most of the features that we need from system-docker already exist in containerd. More about that in #28

Use diskless servers with root file system mount from iSCSI
Not sure about this one. If boot from network is required, sounds like verified boot should be implemented first? I am also not sure why having a system-docker makes this complicated.

I mean using tools like https://ipxe.org. The challenging part is that init needs to have logic to mount the rootfs before starting any other system services (like system-docker and the containers running on top of it).

Port to new platforms like MIPS64el #23
I think this requirement should be dropped, unless someone that needs it wants to contribute.. No real cluster runs on MIPS.

Like you said, we need well-funded companies joining the project, so let's look a little bit more at what is happening in the East, which we who live in Western countries easily miss:

IMO, we cannot ignore these. At least yet.

Anyways, to summarize my points, I am hoping that, Debian or not, burmilla OS can keep its name as a container OS. If @olljanat and @tomaswarynyca want to go towards a more general-use Debian-compatible Linux distro that also favors running docker in some way, maybe it should be branched off into another name and continue the work there, and keep burmilla in maintenance at 2.x until someone here takes over the torch who can continue on the road of container OS?

That is also an option, but I will first try to figure out if we can create a roadmap which does not force us to split into two projects.

@wonleing maybe you also want to comment on this discussion from your point of view?

@ToeiRei

ToeiRei commented Apr 29, 2021

Let me throw in my 2 cents here (including taxes and duties)

For me, RancherOS was the way to go on a 'docker os' or something to run my containers on without the big 'os' part in the background, making it ideal for IoT and especially my small container VM. Not having to maintain the OS makes it a valuable asset for me to teach people on how to use docker and what to avoid as I can skip right to the point without the 'how to install docker' and all that jazz - even on a Raspberry or any other embedded device that would be able to run this code.

I do not see full blown kubernetes in many small scale lab environments as it's often just way too much trouble for just running their pi-hole or unifi controller on one box doing auto-updates for just having that thing online 24/7. So I see porting it to such devices would be one of those things I'd love to see - although I'm not against having podman onboard or whatever else you want to include. But keep in mind: BurmillaOS for me is a system that does literally take care of itself (for the cat analogy: open the door, let her out to catch mice and let her back in.)

What I would be interested in is, how do people actually do storage. Is iSCSI used? Do they NFS? And please - keep in mind that small home-labs also exist who keep it on a 'not so complex level' out there...

Architecture-wise:

The docker-in-docker concept is something I really love about the system and by moving it to a more full OS it will lose its unique character and especially its easy ways of updating as everything on it is just a container. Surely I am aware of the drawbacks of this concept, but I also see the big plus here.

The debian tools support is something I question here as it may encourage people to tinker with the system - which is something that I see a lot on other software products leading to gaping security problems and support issues. On the other hand we have a debian shell and if you want to have curl on it, you can install it there and limit the damage done and revert it by redeploying the shell container.

Security-wise we should try to be real here. SELinux looks like it's going to do a good job if we get it right. On the other hand we do give access to the docker-sock which somehow feels like a passwordless sudo without logging. Sure, I appreciate the possibility to run portainer or traefik or watchtower - but wouldn't it be wise to have to run them privileged to prevent at least some abuse by broken containers? You know... focusing on those problems we do have and not always the worst case scenario?

My security concerns are mostly "bad containers" and misconfigurations done by a user allowing an easy breach while still having stuff done against the common background noise of the internet...

@wonleing

wonleing commented May 8, 2021

Here's my vote (from commercial distribution point of view):

  1. About IoT and VM support: This is the main market, so we must be optimized for these scenarios. But long-tail drivers don't have to be built in by default; just offer a way to add such support.
  2. K8s/K3s support: This is quite important for bare metal, but doesn't have to be built in by default. We could support it as an optional 'Service'. This is also true for other features or value-adds.
  3. About kernel/rootfs/docker/containerd version: We should provide an easy and stable way for developers to update as they wish (with official or their own code). We don't have to bother with this upstream. This is also true for other optional services.
  4. The main job of upstream: The architecture and build framework. Make it stable and well documented. Enable developers/commercial companies to build custom ISOs to their own taste (services and versions).
  5. About mips/loongarch/sw64 support: leave it to us

@h8liu

h8liu commented May 8, 2021

Here's my vote (from commercial distribution point of view):

@wonleing so what is your vote? are you for or against the new arch proposal?

@h8liu

h8liu commented May 8, 2021

so.. back to the main argument.. I have an idea, which maybe can be a middle ground. how about this:

  • change the system-docker to a slim debian rootfs, but it is a minimized, small partition/image that is mounted read-only by default. which means a user on the console cannot install additional non-docker tools by default. in fact, the apt binary might not even exist in the rootfs..
  • the default rootfs will aggressively trim out unnecessary parts/tools, to minimize the attack surface of the container os runtime.
  • have a flag in cloud-init that optionally mount a writable overlay, mainly for ease of debugging stuff. the overlay can be optionally erased on upgrade.
  • document how to build the rootfs, so that people can add more stuff into it if required. i.e. make people easy to branch and build their own rootfs to swap the default one.
  • extend the ros binary into a daemon that listens on /var/run/burmilla.sock to serve a grpc service (maybe also accept JSON-based requests too?), so that a user container with this UDS mapped in can perform system operations (such as os upgrade, reboot/shutdown, read or change cloud-init, run an arbitrary process/command, etc.)
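
A minimal sketch of the socket idea above, written in Go and assuming plain JSON over HTTP on a Unix domain socket rather than gRPC. The `/v1/op` path, the `opRequest`/`opResponse` types and the `allowedOps` list are hypothetical names for illustration; nothing here is an existing `ros` API. The demo listens on a temporary socket instead of /var/run/burmilla.sock so it can run unprivileged, and it only validates requests without performing any system operation.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

// opRequest and opResponse are hypothetical message shapes.
type opRequest struct {
	Op string `json:"op"`
}

type opResponse struct {
	Accepted bool   `json:"accepted"`
	Message  string `json:"message"`
}

// allowedOps is an illustrative allow-list of system operations.
var allowedOps = map[string]bool{"upgrade": true, "reboot": true, "shutdown": true}

// handleOp only validates the request; it does not touch the system.
func handleOp(req opRequest) opResponse {
	if !allowedOps[req.Op] {
		return opResponse{Accepted: false, Message: "unknown operation: " + req.Op}
	}
	return opResponse{Accepted: true, Message: "queued: " + req.Op}
}

// newHandler exposes handleOp as a JSON endpoint.
func newHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/op", func(w http.ResponseWriter, r *http.Request) {
		var req opRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(handleOp(req))
	})
	return mux
}

func main() {
	// A real daemon would use /var/run/burmilla.sock; a temp dir keeps the demo unprivileged.
	dir, err := os.MkdirTemp("", "burmilla")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "burmilla.sock")

	ln, err := net.Listen("unix", sock)
	if err != nil {
		panic(err)
	}
	defer ln.Close()
	go http.Serve(ln, newHandler())

	// The client dials the socket; the host part of the URL is ignored.
	client := &http.Client{Transport: &http.Transport{
		DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
			return (&net.Dialer{}).DialContext(ctx, "unix", sock)
		},
	}}
	resp, err := client.Post("http://unix/v1/op", "application/json", strings.NewReader(`{"op":"reboot"}`))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out opResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println(out.Accepted, out.Message)
}
```

A user container would get the socket mapped in (for example as a bind mount) and talk to it the same way the client in `main` does.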

the whole architecture is like ubuntu core (but with docker rather than snap) or core os (but with a probably even smaller rootfs), so we aim to be smaller, simpler, better, and hence more secure by default (kind of ambitious goals though :P ).

totally not sure how much work this change will need...

@h8liu

h8liu commented May 8, 2021

and the formal burmilla OS's interface spec would be:

  • cloud-init (and its related kernel cmdline flags)
  • ros command (and all its aliases), for mainly human use and maybe simple scripts
  • API at /var/run/burmilla.sock, for robot use
  • docker command and docker API at /var/run/docker.sock and /var/run/containerd (for k8s).

which we won't break easily (like unless it has a security vulnerability).

other parts (like what files are in the rootfs, what shell it is using, or even if the rootfs is debian or just a busybox-ish thing) are burmilla OS's internal implementations, which would have no backward-compatibility promises.

this also preserves the possibility for other people to swap the rootfs with an even smaller busybox one if they desire (for example, on embedded devices with limited resources)

@olljanat
Member Author

olljanat commented May 8, 2021

so.. back to the main argument.. I have an idea, which maybe can be a middle ground. how about this:

* change the system-docker to a slim debian rootfs, but it is a minimized, small partition/image that is mounted read-only by default. which means a user on the console cannot install additional non-docker tools by default.

Makes sense, but I'm not familiar with read-only file systems, so we need to find some good example to study. However, I know it is common on embedded systems, so some of those would probably be the place to learn from.

* the default rootfs will aggressively trim out unnecessary parts/tools, to minimize the attack surface of the container os runtime.

I just noticed that Alpine Linux offers a ready-made minimal root file system on their download page. The extracted size looks to be only 6MB, and it still contains the apk command, which can be used to include whatever we need. They also seem to have up-to-date versions of Docker and K3s available. So maybe we should actually look at it instead of a Debian rootfs.

* document how to build the rootfs, so that people can add more stuff into it if required. i.e. make people easy to branch and build their own rootfs to swap the default one.

Ultimately, people shouldn't need to build a custom rootfs but should instead be able to add the tools they need to our default one.

What if we allow only cloud-init to install new tools? I mean that the rootfs would be writable during the boot phase but would be remounted as read-only after that. Or, if we go even more extreme, maybe it would be possible to include extra tools only during the installation phase?

Like:

  • Boot from ISO
  • ros install ...
  • Install extra tools
  • Boot to read-only system.

* extend the `ros` binary into a daemon that listens on `/var/run/burmilla.sock` to serve a grpc service (maybe also accept JSON-based requests too?), so that a user container with this UDS mapped in can perform system operations (such as os upgrade, reboot/shutdown, read or change cloud-init, run an arbitrary process/command, etc.)

Makes sense, but we need to find someone who is familiar with grpc services to do that work.

the whole architecture is like ubuntu core (but with docker rather than snap)

I just quickly tested Ubuntu Core on a Raspberry Pi, but I haven't yet found out how OS upgrades are handled on it.

@olljanat
Member Author

I did some study with Alpine Linux and managed to create a new prototype, which I would like you guys to test and give feedback on.

It uses GPT partitions with UEFI: two root partitions, where one is active and the other one can be upgraded (similar to how upgrades are handled on e.g. Mellanox Onyx switches, which actually are Linux servers running on the x86 platform), and a separate /var partition for persistent data.

For demonstration purposes I also made it so that the first root partition contains Docker and the second one K3s, and on top of that there is also an option to boot with overlayfs, which makes any changes to the active root partition non-persistent.

So the partitions look like this:

localhost:~$ sudo blkid | sort
/dev/sda1: UUID="209C-C48F" TYPE="vfat"
/dev/sda2: LABEL="ROOT1" UUID="5cfec59d-1bcc-4c21-903b-8775a3bf9b29" TYPE="ext4"
/dev/sda3: LABEL="ROOT2" UUID="64ccd79d-a9d4-47ee-bbbf-af7972ace147" TYPE="ext4"
/dev/sda4: LABEL="DATA" UUID="dcd347d9-be1f-4b7d-b5fb-07b043a4610c" TYPE="ext4"

and the GRUB menu looks like this:
[image: GRUB menu screenshot]

The disk image is available here.
I tested it with QEMU, but any virtualization platform which supports UEFI should work.

I did the partitioning and GRUB install manually; the common part of the rootfs was made with this script, then it was updated inside the disk image, and the Docker+K3s installation was done with this script.
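
To illustrate the two-root-partition scheme, here is a small Go sketch of how an upgrade tool could pick its target. It assumes the ROOT1/ROOT2 labels from the blkid output above; `inactiveSlot`, `planUpgrade` and `upgradePlan` are hypothetical names, and the sketch does not write images or touch GRUB.

```go
package main

import (
	"errors"
	"fmt"
)

// Partition labels as shown in the prototype's blkid output.
const (
	slotA = "ROOT1"
	slotB = "ROOT2"
)

var errUnknownSlot = errors.New("unknown root slot")

// inactiveSlot returns the root partition that is not currently booted
// and is therefore safe to overwrite during an upgrade.
func inactiveSlot(active string) (string, error) {
	switch active {
	case slotA:
		return slotB, nil
	case slotB:
		return slotA, nil
	default:
		return "", fmt.Errorf("%w: %q", errUnknownSlot, active)
	}
}

// upgradePlan is a hypothetical description of one upgrade step.
type upgradePlan struct {
	WriteTo    string // label that receives the new image
	BootNext   string // label to boot after the upgrade
	FallbackTo string // label left untouched for rollback
}

// planUpgrade targets the inactive slot and keeps the active one as fallback.
func planUpgrade(active string) (upgradePlan, error) {
	target, err := inactiveSlot(active)
	if err != nil {
		return upgradePlan{}, err
	}
	return upgradePlan{WriteTo: target, BootNext: target, FallbackTo: active}, nil
}

func main() {
	plan, err := planUpgrade(slotA)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", plan)
}
```

Because the running root is never written to, a failed upgrade leaves the fallback slot bootable.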

And a couple of comments on these:

4. The main job of upstream: The architecture and build framework. Make it stable and well documented. Enable developers/commercial company build out custom iso with their own taste (services and versions).

Makes sense, but this is also one of the biggest reasons why IMO we need a new architecture, as the current architecture is badly documented, overly complex and has some skeletons in the closet (e.g. heavily customized versions of system-docker and libcompose which are not compatible with upstream).

5. About mips/loongarch/sw64 support: leave it to us

From the actual implementation point of view, for sure, but first we need an OS architecture which allows us to do things the same way for all CPU architectures. For example, currently the ros upgrade feature is disabled on CPU architectures other than amd64, and it just feels very wrong:

os/cmd/control/os.go

Lines 196 to 199 in 4d473e8

func osUpgrade(c *cli.Context) error {
	if runtime.GOARCH != "amd64" {
		log.Fatalf("ros install / upgrade only supported on 'amd64', not '%s'", runtime.GOARCH)
	}
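
For comparison, a hedged sketch of the same gate written as a data-driven check that returns an error instead of exiting. The `upgradeArchs` allow-list and `checkUpgradeArch` are hypothetical; the `arm64` entry is only a placeholder, since which architectures can really support upgrades depends on the v3.0 design.

```go
package main

import (
	"fmt"
	"runtime"
)

// upgradeArchs is a hypothetical allow-list. The arm64 entry is a
// placeholder for illustration, not a statement about current support.
var upgradeArchs = map[string]bool{"amd64": true, "arm64": true}

// checkUpgradeArch reports whether upgrades are enabled for goarch.
// Returning an error lets the caller decide how to fail.
func checkUpgradeArch(goarch string) error {
	if !upgradeArchs[goarch] {
		return fmt.Errorf("ros install / upgrade not supported on '%s'", goarch)
	}
	return nil
}

func main() {
	if err := checkUpgradeArch(runtime.GOARCH); err != nil {
		fmt.Println("upgrade refused:", err)
		return
	}
	fmt.Println("upgrade allowed on", runtime.GOARCH)
}
```

With a check like this, enabling a new port becomes a one-line change to the list once its upgrade path has been verified.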

@pwFoo

pwFoo commented May 22, 2021

I tried today to delete my BurmillaOS... But not like docker rm -f $(docker ps -qa)... ... it was... sudo system-docker rm -f $(sudo system-docker ps -qa) 💯
It was just a -f too much... I tried to remove old system containers after migration from RancherOS to BurmillaOS
After a reboot all works fine again 👍

v3.0 could still be based on docker or a custom linuxkit build? I played with a completely custom linuxkit build based on alpine rootfs, custom init, services based on crun containers.
v3.0 should try to keep the initrd as small as possible. A small init and minimal container runtime to download / update containers without more than just a image pull.
And all services should / must still run inside of containers and updates still should "just work". I never had problems with ros os upgrade...

To reduce image size images and console should move to alpine base!
And upgrades should clean up old versions. Just save the previous version in case a rollback is needed...

"engine" is user-docker with RancherOS / BurmillaOS. Additional engines like k3os or podman, instead of or alongside user-docker, would be nice to have.

@olljanat
Member Author

v3.0 could still be based on docker or a custom linuxkit build? I played with a completely custom linuxkit build based on alpine rootfs, custom init, services based on crun containers.

I didn't have earlier experience with linuxkit, but after a short investigation it looks like a very interesting option. It also seems to run system services inside containers like RancherOS / BurmillaOS does, but it is able to do that with an up-to-date, non-customized version of containerd 🎉 Definitely worth further investigation.

And all services should / must still run inside of containers and updates still should "just work". I never had problems with ros os upgrade...

Linuxkit seems to be totally missing upgrade functionality, so creating the v3.0 architecture based on linuxkit and modifying sudo ros os upgrade to work with it would definitely make sense.

That would simplify the BurmillaOS codebase a lot, allow us to get rid of the problematic system-docker and also allow users to easily build custom versions of ISO/disk images (Linuxkit seems to support these output formats: "aws docker dynamic-vhd gcp iso-bios iso-efi kernel+initrd kernel+iso kernel+squashfs qcow2-bios qcow2-efi raw-bios raw-efi rpi3 tar tar-kernel-initrd vhd vmdk").

"engine" is user-docker with RancherOS / BurmillaOS. Additional engines like k3os or podman, instead of or alongside user-docker, would be nice to have.

Looks doable. Linuxkit already contains a docker example, and it looks like someone has already created an example of how to include k3s in it.

@h8liu @wonleing what do you think about a linuxkit-based solution?

@h8liu

h8liu commented May 25, 2021

Honestly, I probably do not know enough about linuxKit to give a judgement. Nevertheless, I just had a look at it, and it feels to me more like "a framework" rather than "a (kernel+rootfs) in place replacement". It uses (and probably enforces) an entirely different workflow to build, and/or best practice to operate. I fear that it will be a lot of migration work for Burmilla to use it, and not sure about how the new framework would integrate with the "ros" binary.

For example, linuxkit claims that it does not support self-update, where nodes are treated as immutable and all updates should be external orchestrated:
https://github.com/linuxkit/linuxkit/blob/master/docs/security.md#external-updates---trusted-provisioning

So I am not sure how "ros update" would work under linuxkit's model. This particular update model works for most(?) cloud / virtual machine environments, but will be challenging for the embedded scenarios. One would need to implement an A/B side bootloader and swap between two read-only rootfs images? It might need more experiments here.

It does look more suitable to our needs than "Debian" or "alpine" though.


I would like to bring up a slightly higher level topic here.

From a program management perspective, I think we should first reach a consensus about what people want, before diving deep into possible technical solutions and details.

I think we should be honest that we have a critical limitation: we do not have enough engineering resources to maintain an OS like Rancher/Burmilla as a stand-alone / unique Linux distro as it was.

One way to move forward is to dock into another larger OS ecosystem, but I think it is going to be deadly in the long run. Even if we manage to continue the maintenance, it will drag the OS design closer to other ecosystems over time, and I fear that there will be a point where it just becomes a subsystem of another more popular thingy, and people will say "isn't this just X?". And eventually it won't make sense to keep maintaining Burmilla as a separate thing. If that would be our destiny, then maybe it is better to just migrate to the more popular thingy right now..

Another way to move forward is to focus our limited efforts on a maybe narrower project scope, with a target to make Burmilla uniquely great on something, so that it can attract more people in the future, hopefully. I think that is the only way an (ambitious) open source project can survive and thrive in the long run.

Currently, we have people with different requirements:

(a) some people want to have a decently maintained OS that is backward compatible with RancherOS, at least as much as possible, so that their things won't break terribly.
(b) some people want to have a continuously-evolving container OS that runs docker+k8s, and can be used in the cloud, vm clusters, or datacenters.
(c) some people want to have a continuously-evolving container OS that runs docker (maybe also k3s?) and can be used on (mostly stand-alone?) bare-metal embedded/IoT devices.

At least to me, requirement (a) is why Burmilla is born, but it probably has no future. My understanding is that possibilities of (b) and (c) are why we are talking about 3.0 here. (right?)

I might be wrong or ignorant on this, but sometimes I feel that, for people of requirement (b), the more rational option might be just to drop Burmilla/Rancher and migrate to another container OS (like Fedora Core OS). Burmilla OS 1.9 and 2.x should have bought people enough time to do so.. If you were truly using RancherOS as a container OS and only running stateless docker containers on it, (I guess) it should not be too hard to migrate to say CoreOS, at least not significantly harder than migrating to a Burmilla OS 3.0.

For requirement (c), if you have not started using Rancher/Burmilla OS yet, maybe look into the Yocto Project? or balenaOS?

Essentially, if you are driven by a business problem here, and you are not truly stuck with Rancher/Burmilla OS for long, and do not want/need to own a part of Burmilla OS, maybe it is more rational to not bet on Burmilla to have a bright future, and just move on to something else that has better support at this time..

And from my point of view, the current, most secular value of BurmillaOS, as an open source successor of RancherOS, is to satisfy requirement (a).

At least, I got here mostly for (a).

If (a) is kept, I would be curious about (c), but if (a) will be dropped anyways, I doubt Burmilla 3.0 will be the best choice for my use case moving forward, especially given its limited engineering resource available. My migration out of 1.9/2.x to 3.0 won't be pretty anyways, so why not I just migrate to something that is much more popular and better maintained? (I also use Burmilla in the cloud, but it would be pretty easy for me to migrate to another container OS.)

That said, putting my own personal usage aside now, as one of the "owners" of the project (and trying to sound responsible here) I am okay with 3.0 dropping (a). However, then I think we should first have a consensus about: Where are we going? Or, rather than building something new, why not just tell the users to use some other existing solutions out there? How can we make this project uniquely great on something?

For example, is it that @olljanat or @olljanat 's company wants to own an OS for operational independence? Or some other reason? Or, what is Loongson really looking for here? I need some help or more context here to understand the motivation.

If we seriously want this open source project to continue its evolution, and stand out (or just to survive) among other existing solutions out there, I think it is better if we just pick only one between (b) and (c) at a time.

Or, branch this project/group of people to two branches here, one just for (b) and one just for (c).

Specifically, with the cloud world embracing k8s and dropping support for the Docker interface, I fear that a common solution to both (b) and (c) without a lot of custom design and maintenance work will be increasingly challenging in the future, as the k8s ecosystem (led by OCI and the cloud industry) and the Docker/IoT ecosystem (driven by IoT and end-user/desktop Docker usages) will likely evolve into very different scenarios.

In the long term, I feel that focusing on (c) has a slightly better chance to stand out; the cloud world is already very crowded. It is just my hunch though.

However, @olljanat 's main requirement is (b) I think. Ha!

So I am a bit torn here. How can I ask @olljanat , the main contributor here, to focus just on IoT? :)

Therefore, it feels to me that the best way moving forward might be something like this:

  1. Continue maintaining 1.9 and 2.x for some time (1 year?), to satisfy (a), but with a clear sunset schedule. i.e. people can/should start migrating to other bigger distros if possible. (and don't feel sad; this frees people from the maintenance burden, and enables Burmilla people to focus on more exciting opportunities.)
  2. Have one branch led by @olljanat (and @tomaswarynyca ?), to satisfy (b) (and @olljanat 's company's specific needs?). This branch can take the "Burmilla" name if @olljanat prefers to keep it, and can call it 3.0
  3. Have another branch to satisfy (c), maybe called BurmillaEdge(?), led by Loongson folks (?) or I can also shepherd it
  4. The two branches can share stuff whenever possible (e.g. use of linuxKit, common packages used by the ros binary), but do not have to agree with each other on the overall design or roadmaps.

The concrete benefits of this proposal:

  • Break free from the high cost to maintain backward compatibility to the out-of-date RancherOS.
  • People who care about only one branch do not have to learn about the world on the other side.
    • The IoT branch now does not have to support the latest containerd
    • Do not have to maintain kernels/programs to support archs of both worlds. Like the cloud branch does not need to support MIPS necessarily, and the IoT branch does not need to support building AWS/GCP/VMware-optimized images
    • The cloud branch can even drop docker and fully embrace k8s if desired. (and go compete with K8OS?)
    • @olljanat has more freedom to accommodate his/her company's need there (i.e. to support some third-party cluster monitoring tool)
  • As a result, things can move a lot faster. People will be more motivated to push things out.
  • And Burmilla as a project brand has a slightly higher possibility to fly higher~

Thoughts? :)

@pwFoo

pwFoo commented May 25, 2021

Hi,

I really like linuxkit to build custom linux distributions, and if you know how to build docker images and how linux init works, you can do anything with it... Simplified: linuxkit just combines the rootfs files of the "init" images into an initrd rootfs and adds onboot / services to /containers as runc / crun ready rootfs + config bundles, possibly with the need for some overlayfs mounts.
With the original linuxkit init, containerd is used to handle container start and monitoring. With a custom init, a service supervisor (finit, systemd, monit, busybox init, ...) is needed to start / stop / monitor / respawn container services.
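
The start / monitor / respawn role of such a supervisor can be sketched in Go as below. This is only an illustration under assumptions: `service`, `supervise` and `errCrashed` are hypothetical names, and `Run` stands in for whatever would launch a runc / crun bundle in a real init.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errCrashed is a stand-in failure used by the demo service.
var errCrashed = errors.New("crashed")

// service is a hypothetical unit of work. In a real init, Run would
// start a container bundle and block until it exits.
type service struct {
	Name string
	Run  func() error
}

// supervise starts svc and respawns it after each failed exit, allowing
// at most maxRestarts restarts. It returns how many times the service
// was started and the final error (nil after a clean exit).
func supervise(svc service, maxRestarts int, backoff time.Duration) (int, error) {
	starts := 0
	for {
		starts++
		err := svc.Run()
		if err == nil {
			return starts, nil // clean exit: do not respawn
		}
		if starts > maxRestarts {
			return starts, fmt.Errorf("%s: giving up after %d starts: %w", svc.Name, starts, err)
		}
		time.Sleep(backoff)
	}
}

func main() {
	attempts := 0
	flaky := service{Name: "demo", Run: func() error {
		attempts++
		if attempts < 3 {
			return errCrashed // fail the first two starts
		}
		return nil
	}}
	starts, err := supervise(flaky, 5, 10*time.Millisecond)
	fmt.Println(starts, err)
}
```

A restart limit plus a backoff keeps a permanently broken service from spinning the CPU, which matters on small embedded boards.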

Linuxkit isn't designed to update the services, but it's possible with for example overlayfs or other tricks (git clone, pull and unpack docker images to directories, ...). I think there was a RancherOS branch / test which tried to move the base to linuxkit, but I think that failed... Don't know if so or why...
https://de.slideshare.net/mobyproject/using-linuxkit-to-build-custom-rancheros-systems

BurmillaOS 3.x based on linuxkit could be done in different ways.

  1. Use linuxkit just as build tool for BurmillaOS
  2. replace system-docker with linuxkit services (updates?!)
  3. Build a new OS based on linuxkit and linuxkit services

I like to play with linuxkit and minimal linux / container os examples, but maybe it would be better to not move with 3.x to a completely new OS build and maybe

  1. keep BurmillaOS as is! Just try to update (i.e. system-docker), improve features and try to reduce initrd file size and improve performance?

RancherOS / BurmillaOS is really great! I like the way it works (without systemd). I don't know how well you know the ros source code (I'm not a Golang programmer...). Without more knowledge about ros / golang, it feels like ros services try to build something similar to docker swarm services?
So maybe it should be compared and maybe move from some custom parts to swarm services?

@olljanat
Member Author

At least to me, requirement (a) is why Burmilla is born, but it probably has no future. My understanding is that possibilities of (b) and (c) are why we are talking about 3.0 here. (right?)

Yes, (a) is why BurmillaOS was born, and this discussion is about the (b) and (c) requirements. The reason I'm approaching this from a very deep technical perspective is that I'm currently trying to understand whether it is possible, even in theory, to solve both of those with one OS. As far as I can see, the K3s project's original target was to create a lightweight version of K8s, but it actually ended up being a modernized version of it for those who want/need DIY Kubernetes cluster(s) instead of a managed service on a public cloud. So if we can run both Docker and K3s, it can handle both (b) and (c) (at least in theory).

For example, is it that @olljanat or @olljanat 's company wants to own an OS for operational independence? Or some other reason?

To clarify: the company where I work currently uses (a) to run Docker Swarms and plans to do so until our migration to Kubernetes is ready. That is why I will keep maintaining the 1.9.x versions until then (support for 4.14.x kernels ends in January 2024, so that is basically the deadline for it), and that work I can do during work hours.

This v3.0 thinking is what I do in my spare time out of general interest in how others are running/planning to run BurmillaOS, and I'm trying to figure out the best possible way to tackle all of those use cases with one solution (or, first of all, to figure out whether that is possible even in theory or whether we need multiple solutions). The reason I included some of my colleagues' requirements in the initial post of this thread is that if we eventually decide to implement v3.0, I still need to convince them that it is the best possible OS to run K3s too.

So the expected outcome of this discussion, from my perspective, is that we either end up changing the world by implementing something that is currently missing, or we (and especially I) learn how those challenges are already solved in a smarter way on other OSes.

Specifically, with the cloud world embracing k8s and dropping support for the Docker interface, I fear that a common solution to both (b) and (c) without a lot of custom design and maintenance work will be increasingly challenging in the future, as the k8s ecosystem (led by OCI and the cloud industry) and the Docker/IoT ecosystem (driven by IoT and end-user/desktop Docker usages) will likely evolve into very different scenarios.

Yes, with vanilla k8s that probably would be an issue, but based on my study it should not be with k3s, as it is a single-binary implementation with all needed components included.

However, then I think we should first have a consensus about: Where are we going? Or, rather than building something new, why not just tell the users to use some other existing solutions out there?

Yes, that is still a valid option, but are we able to find good enough alternatives for all known use cases so that we can end the whole BurmillaOS project, or do we still need it for some use cases? Should we focus on creating migration tools/instructions for those? Would balenaOS also be able to handle your IoT use case(s)?

I think there was a RancherOS branch / test which tried to move the base to linuxkit, but I think that failed... Don't know if so or why...
https://de.slideshare.net/mobyproject/using-linuxkit-to-build-custom-rancheros-systems

Looks like some draft from those days still exists in our Git repo too https://github.com/burmilla/os/tree/master/scripts/moby 😆

@h8liu

h8liu commented May 25, 2021

Thanks @olljanat .

This v3.0 thinking is something I do in my spare time, out of general interest in how others are running or planning to run BurmillaOS. I'm trying to figure out the best way to tackle all those use cases with one solution (or first to find out whether that is possible even in theory, or whether we need multiple solutions). The reason I included some of my colleagues' requirements in the initial post of this thread is that if we eventually decide to implement v3.0, I still need to convince them that it is the best possible OS to run K3s too.

No offense, but if your main motivation is general personal interest, what happens when you lose that interest? You would need someone to carry on, and maintaining a Linux distro is going to be a lot of work, which is why it needs a long-term purpose, just to survive in the long run. (and it might still be able to meet your interests?)

What I mean is that I see the main challenge of 3.0 here as not technical but economic: we'd better say "yes" to only one thing, and say "no" (or at least "maybe some time in the future") to all the other ones, because we have quite limited hands/funding here.

For example, here is the thinking from my own project (https://homedrive.io): if the 3.0 here wants to satisfy all the various needs (cloud, VM, k3s, embedded, etc.), I am afraid that the project is going to move so slowly that it won't really matter to us. We will just branch off and do our own "alternative 3.0" in some other way that only supports our own needs, and drop the support for everything else. Take k3s for instance: it is cool tech, but we run as a single-node device rather than a cluster, so we really just need docker and have no need for something like k3s, and we will just not bother supporting k3s at all (even though it might be a nice thing to have). And after our own branching out, even if we would like to contribute back to upstream here, we won't be motivated to learn the complexities of other use cases, so someone here has to do the integration, which further slows things down.

Without a clear, focused project scope, if we try to be inclusive but without the matching engineering resources to deliver, it will be shrinking the community rather than growing it. The inertia we got from inheriting RancherOS will die out. I do not think that is what we would really like to get from doing 3.0?

:) I think I have already said enough here. I look forward to listening to the thoughts of other parties here, especially the ones that have longer term needs and plan to provide supports. (Loongson folks, I am looking at you :) What specific values are you looking for here?


per @pwFoo

keep BurmillaOS as is! Just try to update (i.e. system-docker), improve features and try to reduce initrd file size and improve performance?

It is already quite non-trivial, and it is going to be even more challenging in the future.

(As someone who has dived into the existing code a bit,) here is some background: the OS uses many out-of-date (or soon-to-be out-of-date) technologies (e.g. dapper), libraries (e.g. docker-compose), services (e.g. system-docker), and custom code that are no longer well supported. Many of those are legacies from Rancher (the company), and the company has moved on to the world of k8s with its own new agenda. In fact, I would say that the entire world has moved on. For example, Go libraries are moving towards using modules now, and the ros binary is still stuck on vendored, old, and in some cases already deprecated packages organized by trash. Upgrading them is going to be a lot of work, and some might even be close to impossible. It will be non-trivial to just keep Burmilla secure (patching vulnerabilities), not to mention adding new features and/or improving performance.

This is why we are looking for alternative upstreams.

@pwFoo

pwFoo commented May 26, 2021

I like to play with linux / initrd / linuxkit / docker / containers and have also built a minimal, dirty linuxkit + crun containers "OS", but I'm not an OS distribution expert and have no knowledge of Golang / Rust... I'm more familiar with bash / shell scripting.

There are some (production ready?) distributions to migrate to: k3os, CoreOS, ...

So do we need (one more) container Linux distribution which must be production ready as a goal?
Or do you think about a hobby project with a minimal custom container linux "just for fun" for now?

@h8liu

h8liu commented May 26, 2021

So do we need (one more) container Linux distribution which must be production ready as a goal? Or do you think about a hobby project with a minimal custom container linux "just for fun" for now?

My understanding is that most people here are not "just for fun". :) People are gathered here because this is the successor of RancherOS, which was a production OS.

For example, just speaking for myself here, I (or HomeDrive) need a production OS that can run docker applications, and better be backward compatible with RancherOS. Our customers host their personal data on our devices, and run Web services facing the Internet. It is a serious task for us to keep the OS stable and secure.

k3s/k8s is not the thing for us (we do not need k8s). CoreOS might be okay, but migrating existing devices that are currently running BurmillaOS gracefully to ignition/systemd is non-trivial work.

Trivia story: CoreOS was our pick at the beginning, about 2 years ago, and then Red Hat bought CoreOS. At that time, it was unclear whether there would still be an open source version, and we hence moved to RancherOS, which was the only other container OS that DigitalOcean officially supported.

I kind of agree with your point though, that the world does not need us to build one more k8s-ready (or even k8s-only) OS for use in the cloud and in clusters, or at least I do not see that we are in a position to seriously compete in that ecosystem (or maybe we are, and I am just wrong?)

On the other hand, I see that there is this other small sweet spot: an open OS that just runs docker but nothing more, and in today's world, it implies running k8s is a non-goal (because k8s != docker now). There are all those single-node small server usages, where configuring a k8s cluster does not really make sense, and using docker still seems to be a great choice.

This actually was exactly what RancherOS was: a tiny docker operating system.
https://www.docker.com/blog/tiny-docker-operating-systems/

In summary, I am proposing for v3.0:

  • Let's build/keep Burmilla as a great docker-only OS, which can run great even as a single-node server (either on a VM or on on-prem baremetal).
  • Let's drop/deprioritize the attempt to optimize/support k8s (until we have more contributors/funding), and ask these k8s users to just migrate to other k8s/container operating systems. RancherOS was not really designed to run the latest k8s anyways.
  • If there is really still a need for a RancherOS successor but to run k8s (either for production or just for fun), let's do it in another branch/version/variation. In that branch, we can even kill the docker part, and just keep the containerd.

@h8liu

h8liu commented May 26, 2021

(pinging @XiaodongLoong for Loongson's perspectives)

@olljanat
Member Author

No offense, but if your main motivation is general personal interest, what happens when you lose that interest? You would need someone to carry on, and maintaining a Linux distro is going to be a lot of work, which is why it needs a long-term purpose, just to survive in the long run. (and it might still be able to meet your interests?)

Sure, that is a big risk for BurmillaOS's future and one of the reasons why I started this discussion. We need to be able to agree on a long-term architecture and find enough people who are ready to help maintain it.

What I mean is that I see the main challenge of 3.0 here as not technical but economic: we'd better say "yes" to only one thing, and say "no" (or at least "maybe some time in the future") to all the other ones, because we have quite limited hands/funding here.

It is a technical challenge in the sense that we currently have too much custom stuff, which generates too much maintenance work to keep up to date. That is why I think we should share a big part of our OS code/binaries with some existing solution which already has an active community (Debian, Alpine, LinuxKit, etc.).

keep BurmillaOS as is! Just try to update (i.e. system-docker), improve features and try to reduce initrd file size and improve performance?

Updating system-docker and reducing initrd size are conflicting tasks, as system-docker + user docker are the biggest parts of the initrd. The problem of system-docker growing in size during an upgrade has been discussed in #28. And there are no easy places left for performance improvements either.

For example, Go libraries are moving towards using modules now, and the ros binary is still stuck on vendored, old, and in some cases already deprecated packages organized by trash. Upgrading them is going to be a lot of work, and some might even be close to impossible. It will be non-trivial to just keep Burmilla secure (patching vulnerabilities), not to mention adding new features and/or improving performance.

Yes, especially those customized versions of containerd, docker and libcompose are nearly impossible to update, as Rancher skipped it for so many years. I collected all the "easy" parts into PR #92; they are prerequisites for updating the harder parts, but even that means a massive change to our codebase (over 300 files changed) whose implications are unknown.

CoreOS might be okay, but migrating existing devices that are currently running BurmillaOS gracefully to ignition/systemd is non-trivial work.

Trivia story: CoreOS was our pick at the beginning, about 2 years ago, and then Red Hat bought CoreOS. At that time, it was unclear whether there would still be an open source version, and we hence moved to RancherOS, which was the only other container OS that DigitalOcean officially supported.

Also, the original CoreOS reached EOL. Fedora CoreOS still exists as open source, but it favors Podman over Docker, which is why it might also be a bad choice.

As for migrating away from BurmillaOS, or to a totally new v3.0 architecture: it should be possible to create a special migration version which is compatible with old versions, so that it can be installed with sudo ros os upgrade, but which would then load itself into RAM on boot (like booting from ISO works), remove the old BurmillaOS files and install the new OS while still leaving user data untouched.

In summary, I am proposing for v3.0:
* Let's build/keep Burmilla as a great docker-only OS, which can run great even as a single-node server (either on a VM or on on-prem baremetal).

But do you think we should still have system services running as containers? Because if we want that and we want to get rid of the customized system-docker, then we most probably need to use LinuxKit/its tooling at least partly, as it is the only alternative I'm aware of that has that functionality besides RancherOS/BurmillaOS.

If we are OK with using OpenRC for system services, then my Alpine-based dual root system (#88 (comment)) is most probably the best option, as it was made from an IoT usage perspective (but can still run on any bare metal or VM too). It would mean that we get 99% of the binaries from Alpine and just need to update the ros binary to support that configuration. Did you btw test it?

@h8liu

h8liu commented May 26, 2021

Also, the original CoreOS reached EOL. Fedora CoreOS still exists as open source, but it favors Podman over Docker, which is why it might also be a bad choice.

Fedora CoreOS has both: https://docs.fedoraproject.org/en-US/fedora-coreos/faq/#_which_container_runtimes_are_available_on_fedora_coreos

Though you can only use one of them:
https://docs.fedoraproject.org/en-US/fedora-coreos/faq/#_can_i_run_containers_via_docker_and_podman_at_the_same_time

podman is "recommended", but I believe there are enough users from the old CoreOS so that docker will be properly maintained.

But do you think we should still have system services running as containers? Because if we want that and we want to get rid of the customized system-docker, then we most probably need to use LinuxKit/its tooling at least partly, as it is the only alternative I'm aware of that has that functionality besides RancherOS/BurmillaOS.

If we are OK with using OpenRC for system services, then my Alpine-based dual root system (#88 (comment)) is most probably the best option, as it was made from an IoT usage perspective (but can still run on any bare metal or VM too). It would mean that we get 99% of the binaries from Alpine and just need to update the ros binary to support that configuration.

I think either is fine. My dependency on system-docker is honestly not that rigid, and I can code myself out of it.

Between Linuxkit and Alpine, I do prefer Linuxkit, not because Linuxkit is docker-oriented, but just that it feels leaner. But if you prefer Alpine (like if it is less work), I am also fine with it. :)

Did you btw test it?

No..

@h8liu

h8liu commented May 27, 2021

it feels leaner

what I mean here is that LinuxKit makes the system parts immutable, and does not have a rootfs-based general-purpose application package system, which is/was a feature of RancherOS.

If it is going to be docker on alpine, one might just go ahead and run docker on alpine; it is the same debian argument.

also, BurmillaOS/RancherOS does not always use GPT, and in fact it does not make many assumptions on the partitioning at all: as long as there is a device partition called RANCHER_STATE, burmilla will happily install and run on it. we probably should not change that. repartitioning during an upgrade can be quite hard and/or dangerous (data loss or bricking devices).

it is fine and reasonable to borrow some other distro's kernel and rootfs components (as burmilla's internal implementations), but I think burmilla should keep its "interfaces" as much as possible, including the storage partitioning assumptions.
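
For illustration, the RANCHER_STATE assumption above could be checked before an upgrade with something like the following. This is only a sketch: `find_state_device` is a hypothetical helper, not part of ros, and it assumes `blkid`-style output lines such as `/dev/sda2: LABEL="RANCHER_STATE" TYPE="ext4"` on stdin.

```shell
#!/bin/sh
# Hypothetical helper (not part of ros): print the device whose filesystem
# label matches the state label, reading blkid-style lines on stdin.
# Exits non-zero when no such device is listed.
find_state_device() {
    label="${1:-RANCHER_STATE}"
    awk -v want="LABEL=\"$label\"" '
        {
            # Field 1 is "/dev/xxx:", the remaining fields are KEY="value" pairs.
            for (i = 2; i <= NF; i++) {
                if ($i == want) {
                    sub(/:$/, "", $1)
                    print $1
                    found = 1
                    exit
                }
            }
        }
        END { exit(found ? 0 : 1) }'
}
```

On a real device one would pipe `blkid` into it (typically as root), for example `blkid | find_state_device`. An upgrade script could then refuse to continue when the helper exits non-zero instead of attempting any repartitioning.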

@XiaodongLoong

@h8liu IMO, that port to MIPS64LE is very good. I will contribute to the community on this.

@wonleing

@olljanat I took a look at LinuxKit, defined as 'a toolkit for building custom minimal, immutable Linux distributions'. Are you suggesting using LinuxKit instead of RancherOS as the code base? Or just borrowing some features/components for RancherOS?
The 2nd option is fine. I like the idea of "Everything is replaceable and customisable" and "Easy tooling, with easy iteration"
But if you mean the 1st option, then why don't we just fork it or join it?
From a commercial perspective, changing the main architecture frequently is not a good option. We just need to spot the problem (like system-docker) and try our best to fix it. Better projects always keep popping up.

@wonleing

wonleing commented May 27, 2021

Let's build/keep Burmilla as a great docker-only OS, which can run great even as a single-node server (either on a VM or on on-prem baremetal).

agree

Let's drop/deprioritize the attempt to optimize/support k8s (until we have more contributors/funding), and ask these k8s users to just migrate to other k8s/container operating systems. RancherOS was not really designed to run the latest k8s anyways.

K8s shouldn't be a built-in feature. But making the system extensible is a basic policy. We should welcome people who try to develop and test K8s on this system, and we should try to help them.

If there is really still a need for a RancherOS successor but to run k8s (either for production or just for fun), let's do it in another branch/version/variation. In that branch, we can even kill the docker part, and just keep the containerd.

Don't split this project. Leave K8s as a customer extension (don't include it in the main project).

@olljanat
Member Author

@olljanat I took a look at LinuxKit, defined as 'a toolkit for building custom minimal, immutable Linux distributions'. Are you suggesting using LinuxKit instead of RancherOS as the code base? Or just borrowing some features/components for RancherOS?
The 2nd option is fine. I like the idea of "Everything is replaceable and customisable" and "Easy tooling, with easy iteration"

@wonleing I mean using LinuxKit as a toolkit to generate at least the rootfs (linuxkit build -format tar), or the ISO and disk images too if that looks possible. And eventually BurmillaOS should probably be listed on their adopters list.

But if you mean the 1st option, then why don't we just fork it or join it?

IMO maintaining a fork does not make sense, but most probably we need to contribute to LinuxKit in some areas to provide the functionality we need.

From a commercial perspective, changing the main architecture frequently is not a good option.

Sure, this should be a one-time thing, which is why we need to do enough testing with the new v3.0 architecture before deciding whether to release it or not.

Here is an example config which uses nerdctl to provide a system-docker command that can be used to manage system services like we do in BurmillaOS, with a separate user Docker based on the official image.

You can generate an EFI-bootable ISO from it with the command: linuxkit build -docker -format iso-efi burmillaos.yml

kernel:
  image: linuxkit/kernel:5.10.34
  cmdline: "console=tty0 console=ttyS0 console=ttyAMA0 console=ttysclp0"
init:
  - linuxkit/init:78fb57c7da07c4e43c3a37b27755581da087a3b6
  - linuxkit/runc:bf1e0c61fb4678d6428d0aabbd80db5ea24e4d4d
  - linuxkit/containerd:cc02c2af9c928c2faeccbe4edc78bd297ad91866
  - linuxkit/ca-certificates:4df823737c9bf6a9564b736f1a19fd25d60e909a
  - burmilla/os-nerdctl:v0.8.2
onboot:
  - name: sysctl
    image: linuxkit/sysctl:02d2bd74509fd063857ceb4c4f502f09ee4f2e0a
  - name: sysfs
    image: linuxkit/sysfs:3498aa99c90a29439b5a1926f6ffcd75c270372c
  - name: format
    image: linuxkit/format:fdad8c50d594712537f94862dab3d955cbb48fc3
  - name: mount
    image: linuxkit/mount:71c868267a4503f99e84fd7698717a3669d9dfdb
    command: ["/usr/bin/mountie", "/var/lib/docker"]
services:
  - name: getty
    image: linuxkit/getty:ed32c71531f5998aa510847bb07bd847492d4101
    binds.add:
      - /usr/bin/nerdctl:/usr/bin/nerdctl
      - /usr/bin/system-docker:/usr/bin/system-docker
      - /containers/services/docker/lower/usr/local/bin/docker:/usr/bin/docker
    env:
      - INSECURE=true
  - name: rngd
    image: linuxkit/rngd:bdabfe138f05f7d48396d2f435af16f5a6ccaa45
  - name: dhcpcd
    image: linuxkit/dhcpcd:1033f340e2d42f86a60aab70752346f0045ea388
  - name: ntpd
    image: linuxkit/openntpd:66f25a516c7460f5e49195309cf276903741c428
  - name: sshd
    image: linuxkit/sshd:add8c094a9a253870b0a596796628fd4ec220b70
  - name: docker
    image: docker:20.10.6-dind
    capabilities:
     - all
    net: host
    mounts:
     - type: cgroup
       options: ["rw","nosuid","noexec","nodev","relatime"]
    binds:
     - /etc/resolv.conf:/etc/resolv.conf
     - /var/lib/docker:/var/lib/docker
     - /lib/modules:/lib/modules
     - /etc/docker/daemon.json:/etc/docker/daemon.json
     - /run:/run
    command: ["/usr/local/bin/docker-init", "/usr/local/bin/dockerd"]
files:
  - path: var/lib/docker
    directory: true
  - path: etc/docker/daemon.json
    contents: '{}'
  - path: usr/bin/system-docker
    contents: |
      #!/bin/sh
      /usr/bin/nerdctl --namespace services.linuxkit "$@"
    mode: "0555"
  - path: root/.ssh
    directory: true
  - path: etc/hostname
    contents: 'burmilla'
trust:
  org:
    - burmilla
    - linuxkit
    - library
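
The system-docker wrapper in the config above simply forwards its arguments to nerdctl inside the services.linuxkit containerd namespace. A minimal sketch of the same idea as a shell function follows. The `NERDCTL` variable is an assumption added here only so the wrapper can be dry-run without a real nerdctl binary; it is not part of the config.

```shell
#!/bin/sh
# NERDCTL is a hypothetical override (not in the config above), used only
# so that the wrapper can be exercised without a real nerdctl installed.
NERDCTL="${NERDCTL:-/usr/bin/nerdctl}"

# Same behavior as the usr/bin/system-docker file in the config:
# run nerdctl against the namespace where LinuxKit starts its services.
system_docker() {
    "$NERDCTL" --namespace services.linuxkit "$@"
}
```

On a booted image, `system_docker ps -a` would then be expected to list the service containers (getty, sshd, docker, ...), while the plain `docker` command keeps talking to the user Docker started from the docker:20.10.6-dind image.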

@pwFoo

pwFoo commented May 31, 2021

nerdctl is a nice tool! I'll play with it and test some parts.

nerdctl-full package is also big... maybe should be reduced to just manage containers / compose files? build images could be done inside of a container, but buildkit or something else shouldn't increase the initrd file...

For system-docker usage buildkit and maybe just network host and none should be fine? user-docker shouldn't be part of the initrd because it should be pulled at runtime...

@olljanat
Member Author

olljanat commented May 31, 2021

nerdctl-full package is also big... maybe should be reduced to just manage containers / compose files?

We don't need the full package. The minimal one is enough, as LinuxKit already contains all the needed containerd parts and we just need to add the CLI on top. As for initrd size optimizations, it is much easier to just migrate to Zstandard (LinuxKit currently uses gzip for initrd compression), as for example Alpine has done, but it is way too early to think about those. We first need to agree on what the new OS architecture is, or whether we will even implement it.
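
To get a rough feeling for how much gzip vs. Zstandard would matter, something like the following could be run against an uncompressed initrd archive. It is only a sketch: `compare_initrd_compression` is a hypothetical helper, compression level 9 is an arbitrary choice for both tools, and the file name in the usage note is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print "<tool> <compressed bytes>" for each compressor
# that is installed, or "<tool> not-installed" when it is missing.
compare_initrd_compression() {
    file="$1"
    for tool in gzip zstd; do
        if command -v "$tool" >/dev/null 2>&1; then
            # -c writes to stdout, so the input file is left untouched.
            size=$("$tool" -9 -c "$file" | wc -c | tr -d ' ')
            echo "$tool $size"
        else
            echo "$tool not-installed"
        fi
    done
}
```

For example, after unpacking an existing image with `gzip -dc initrd.img > initrd.cpio` (file names assumed), `compare_initrd_compression initrd.cpio` prints one line per tool so the sizes can be compared directly.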

Btw, I also managed to create a proof of concept of an upgrade from an earlier version to this LinuxKit-based architecture. Those who want to test it can upgrade with the command:

sudo ros os upgrade --image burmilla/os:v3.0.0-linuxkit-draft1

Note that the LinuxKit kernel appears to be missing firmware, so that version most probably cannot boot on bare metal.

@tomaswarynyca
Collaborator

Any news here?

@olljanat
Member Author

From my side: I will start my summer vacation tomorrow, so I will not look at this one, or most probably any other BurmillaOS things, for a couple of weeks, but go ahead and continue the discussion in the meantime.

@olljanat
Member Author

Side comment: It looks like the Rancher guys have decided to create RancherOS v2 and migrate Harvester to use it instead of k3OS: harvester/harvester#581 (comment). An initial draft is available at https://github.com/ibuildthecloud/os2

I'm not sure if that is something we can migrate to too, or something we can share code with in the long term, but it is definitely something to keep an eye on.

@olljanat
Member Author

Closing this one, as it is starting to look more certain that there will be a RancherOS v2.

You can see its development/prototyping at https://github.com/rancher/os/commits/v2-test

It is created using https://github.com/rancher-sandbox/cOS-toolkit and is based on openSUSE and systemd.

@olljanat
Member Author

FYI: I created item #119 about a new approach using the same base as RancherOS v2. Please share your thoughts about it there.
