remove console=ttyS0 on metal #567
Comments
I'm not sure that's true. In every enterprise environment I've worked in there were always either dedicated console servers attached to serial ports OR out-of-band management (aka lights-out management) provided by the hardware that gives you access to the serial console. If your server crashes or becomes unresponsive, the output of the serial console is one of the last chances you have to find some clues. I think I'd much prefer to document in our FAQ that this problem can exist and show people how to change it if they are having trouble.
Though I'm definitely interested in other thoughts and opinions here. Should we bring it up during the next community meeting?
Another approach which I've thought about a bit when working on #533 is to go the whole way and remove all the |
Right; I guess this boils down to: is the "default" having a LOM attached or not? One or the other case will need to configure things.
Hmm...how is "active" detected for serial consoles? It sounds like the desired behavior in a LOM scenario is that even if a VGA card is present, output still goes to the serial console too.
I'm not sure if that's possible in the generic case since different LOM systems could be using different port numbers, IIUC?
Here's what the docs say:
So yeah, I guess it'd just be COM1, which wouldn't help if LOM is in COM2. Hmm, was testing the fallback behaviour now with |
Yeah. It's definitely a question of sane defaults. To me we would need to answer questions like:
So we have to dig into how many people expect and depend on the current default of the serial console getting kernel message output, versus how many people experience slow-booting systems because we default to outputting on the serial console. In which of these cases is it OK to document the current behavior and the workaround, versus changing the default?
Yeah. I speak for only one person, but my preferred LOM scenario is to have serial output even if a VGA is present. I actually much prefer serial console to VGA in any cases where I'm in a non-graphical environment (copy/paste anyone?).
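To make the "serial output even if a VGA is present" case concrete: per the kernel's serial-console documentation, when several `console=` arguments are given, kernel messages go to each listed console and the last one is used for `/dev/console`. A minimal sketch of that rule follows. The function name `console_targets` is hypothetical, and the sketch does not model the kernel's limit of one console per device type.

```python
from typing import List, Optional, Tuple


def console_targets(cmdline: str) -> Tuple[List[str], Optional[str]]:
    """Return (all console devices, device used for /dev/console).

    Illustration only: a plain whitespace split of the command line,
    not how the kernel itself parses its arguments.
    """
    devices = []
    for arg in cmdline.split():
        if arg.startswith("console="):
            # Strip the key and any ",options" suffix such as ",115200n8".
            device = arg[len("console="):].split(",", 1)[0]
            devices.append(device)
    # The last console= argument on the command line wins /dev/console.
    primary = devices[-1] if devices else None
    return devices, primary


if __name__ == "__main__":
    print(console_targets("ro console=tty0 console=ttyS0,115200n8"))
```

With `console=tty0 console=ttyS0,115200n8`, both the VGA console and the first serial port receive kernel messages, and the serial port owns `/dev/console`.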
One thing I just found out the hard way while playing with a FCOS install that failed in the initramfs is that on this Lenovo T590 laptop (that doesn't have a serial device), systemd will fail to start emergency mode because it's trying to access the non-existent serial console or something. It works to mount the |
Sounds like we need to delete the entry if there's no serial port attached, to avoid such a problem...
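One way to picture "delete the entry if there's no serial port attached" is sketched below. It assumes the text of `/proc/tty/driver/serial` (usually readable only by root) is available and that a port reported as `uart:unknown` is absent. That heuristic applies to the common 8250 driver and is an assumption here, not something this thread settled on. Both function names are hypothetical.

```python
import re
from typing import Set

# Matches lines such as "0: uart:16550A port:000003F8 irq:4 tx:0 rx:0".
_PORT_LINE = re.compile(r"^(\d+): uart:(\S+)")


def present_serial_ports(proc_serial_text: str) -> Set[str]:
    """Return ttyS names whose UART type is reported as something other than 'unknown'."""
    ports = set()
    for line in proc_serial_text.splitlines():
        match = _PORT_LINE.match(line.strip())
        if match and match.group(2) != "unknown":
            ports.add("ttyS" + match.group(1))
    return ports


def drop_absent_serial_consoles(cmdline: str, present: Set[str]) -> str:
    """Remove console=ttySN arguments whose port is not in `present`."""
    kept = []
    for arg in cmdline.split():
        if arg.startswith("console=ttyS"):
            device = arg[len("console="):].split(",", 1)[0]
            if device not in present:
                continue  # no such port on this machine: drop the argument
        kept.append(arg)
    return " ".join(kept)


if __name__ == "__main__":
    sample = (
        "serinfo:1.0 driver revision:\n"
        "0: uart:unknown port:000003F8 irq:4\n"
        "1: uart:16550A port:000002F8 irq:3 tx:0 rx:0\n"
    )
    ports = present_serial_ports(sample)
    print(drop_absent_serial_consoles("ro console=tty0 console=ttyS0,115200n8", ports))
```

As noted elsewhere in the thread, automatic handling like this can also make behavior harder to predict, so it would need to remain overridable.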
Yeah, I was bit by this as well when installing FCOS on my local server. I think it's because
One thing I mentioned was the possibility of matching the state of Clearly that doesn't cover all use cases since you could be installing differently than how you plan to run it, but it seems like a better default heuristic overall (that you of course should be able to override). OTOH, magic handling like this can also make things more confusing. |
One thing: you cannot assume console=ttyS0 is the correct serial console. It also uses the default value 9600n8 (= slow). Depending on the server, the correct value can be, for example, console=ttyS1 or console=ttyS0,115200, or there could be no serial console at all. It's better to leave it up to the user to specify that, or to forward what they already specified, using first boot as jlebon wrote. https://www.kernel.org/doc/html/latest/admin-guide/serial-console.html
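To illustrate the point about defaults, here is a small sketch that splits a `console=` argument into device and serial options, following the `BBBBPNF` format (speed, parity, bits, flow control) from the kernel documentation linked above and falling back to 9600n8 when no options are given. The names `SerialConsole` and `parse_console` are hypothetical and exist only for this example.

```python
import re
from typing import NamedTuple


class SerialConsole(NamedTuple):
    device: str
    baud: int
    parity: str  # "n", "o" or "e"
    bits: int
    flow: str    # "r" for RTS flow control, "" for none


# BBBBPNF: speed, optional parity, optional bit count, optional flow control.
_OPTIONS = re.compile(r"^(\d+)([noe])?(\d)?(r)?$")


def parse_console(karg: str) -> SerialConsole:
    """Parse e.g. 'console=ttyS0,115200n8'; missing options mean 9600n8."""
    key, _, value = karg.partition("=")
    if key != "console" or not value:
        raise ValueError(f"not a console argument: {karg!r}")
    device, _, options = value.partition(",")
    if not options:
        return SerialConsole(device, 9600, "n", 8, "")
    match = _OPTIONS.match(options)
    if match is None:
        raise ValueError(f"unrecognized serial options: {options!r}")
    baud, parity, bits, flow = match.groups()
    return SerialConsole(device, int(baud), parity or "n", int(bits or 8), flow or "")


if __name__ == "__main__":
    for arg in ("console=ttyS0", "console=ttyS0,115200", "console=ttyS1,115200n8"):
        print(arg, "->", parse_console(arg))
```

A bare `console=ttyS0` therefore means 9600 baud, which is why a hardcoded default without a speed can slow down boot on hardware that writes every message to the port.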
As far as default speed value goes, our current default is |
We discussed this in the meeting today. While we didn't get unanimous agreement the general consensus was:
The platform-specific defaults on other platforms play into #110, I believe.
I hit this issue where ttyS0 was locking up my system, while ttyS1 worked just fine. Any chance we can drop it first and fix per-platform defaults in a follow-up release? Thanks.
Yeah, we still need to action this. Note for the time being, you can modify the kargs at install time using |
Yes, this was also part of a big downstream OpenShift/RHCOS customer issue. One middle ground here might be to remove the |
That doesn't really help the UX issue though. The main situation where people need to interact with the console is when Ignition has failed.
Right, I said remove it after Ignition has completed successfully. Or am I misunderstanding you? EDIT: to clarify e.g. in the OpenShift case this would be part of the MCO firstboot which runs only after Ignition has completed. Or, conceptually with the new Ignition kargs bits, it could be an Ignition fragment with |
A bunch of console links: One thing that came up there is that still today the kernel doesn't distinguish well between "a driver hit an unexpected timeout" and "the machine is about to die". Related to that, one recommendation in this customer case was "set |
Yeah, I'm arguing that doesn't go far enough. When users hit an Ignition failure and try to debug it, they discover that output is being sent to a port they don't use, don't know about, and may not even have on their machine. Per #567 (comment), we should just drop |
OK. For users who want to keep the serial (LOM users?) and are booting from the ISO, it seems to me they would need to use I find it really hard to understand the "blast radius" of this across the platforms. I guess the main concern here would be metal and probably vSphere. |
My 2 cents: In our case (Installer QE), due to our external provider configuration, we need to change the kernel arguments from "console=ttyS0,115200n8" to "console=ttyS1,115200n8" in order to redirect the console output properly. Having the default value in place prevents us from using a simple MachineConfig (MC) solution like the following:
To work around this, we need to push a custom Ignition hook via iPXE that changes the kernel arguments in the following manner:
OR, when using TPM encryption, we use live iPXE mode to boot the console hook in memory to avoid tampering with the storage:
Here is an example of the console hook, if needed:
In summary, if we remove the default "console=ttyS0,115200n8", we can avoid using Ignition hooks and simply push master/worker MCs that add the kernel argument, right? That sounds like a good plan to me. Best Regards.
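The karg change described in this comment (drop the ttyS0 default, add ttyS1) can be pictured as a delete-then-append edit on the kernel command line. The sketch below is plain string manipulation under that assumption. It is not the coreos-installer, Ignition, or MachineConfig implementation, and `edit_kargs` is a hypothetical helper.

```python
from typing import Iterable


def edit_kargs(cmdline: str, delete: Iterable[str] = (), append: Iterable[str] = ()) -> str:
    """Delete exact-match arguments, then append new ones that are not already present.

    Splits on whitespace only, so quoted arguments containing spaces
    are not handled; this is an illustration, not a full parser.
    """
    doomed = set(delete)
    kept = [arg for arg in cmdline.split() if arg not in doomed]
    for arg in append:
        if arg not in kept:
            kept.append(arg)
    return " ".join(kept)


if __name__ == "__main__":
    before = "ro console=tty0 console=ttyS0,115200n8 quiet"
    after = edit_kargs(
        before,
        delete=["console=ttyS0,115200n8"],
        append=["console=ttyS1,115200n8"],
    )
    print(after)
```

If the image shipped without `console=ttyS0,115200n8`, the delete step would be unnecessary and only the append would remain, which is the simplification being asked for here.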
@pamoedom
So maybe you could simply add a MachineConfig to create a getty for ttyS1, and you would get a serial console (maybe not owning /dev/console) on ttyS1? I agree it would not solve the slow boot problem.
OK please ignore my comment above: according to https://www.kernel.org/doc/Documentation/admin-guide/serial-console.rst
Another use case where the presence of ttyS0 is problematic is when you hook up a serial console over IPMI, which defaults to ttyS1 on many systems.
Re-read the meeting notes for when this was discussed. There seems to be a lot of hesitation because the proportion of systems relying on this versus those being hindered by it isn't clear. Should we have someone (or a few folks) look at what the most prevalent setups are? E.g. let's look at Dell, HP, Intel, Cisco, etc. and see what their LOM systems expect.
To clarify, I guess this is something you probably have to do for most OSes if you're expecting ttyS1? (The adding ttyS1, not the removing ttyS0 part.)
coreos-installer 0.16.0 includes support for |
The change has been announced to coreos-status. Please see the announcement message for details. The transition schedule is:
After reviewing all the communication I didn't see a reason for vSphere not to be included in the What made me discover this work was an issue I am trying to investigate for OCP 4.12 on vSphere: |
The fix for this went into |
@jcpowermac Thanks for the report. coreos/fedora-coreos-config#2038 enables secondary serial console on VMware, and also on OpenStack and VirtualBox.
The FCOS platform docs for VMware/VirtualBox include instructions on connecting to the serial console to get a console log, and some users make use of this functionality: coreos/fedora-coreos-tracker#567 (comment) Re-enable secondary serial consoles on those platforms, so the serial console gets as much information as possible without interfering with the graphical console.
The fix for this went into |
The fix for this went into |
Drop `console=ttyS0` argument for metal images/installer. `console=ttyS0` causes lot of issues with bare metal hardware when trying to use a physical serial port. Ref: * https://bugzilla.redhat.com/show_bug.cgi?id=1839923 * https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763601;msg=17 * https://www.kernel.org/doc/html/latest/admin-guide/serial-console.html * coreos/fedora-coreos-tracker#567 Signed-off-by: Noel Georgi <git@frezbo.dev>
Drop `console=ttyS0` argument for metal images/installer. `console=ttyS0` causes lot of issues with bare metal hardware when trying to use a physical serial port. Ref: * https://bugzilla.redhat.com/show_bug.cgi?id=1839923 * https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763601;msg=17 * https://www.kernel.org/doc/html/latest/admin-guide/serial-console.html * coreos/fedora-coreos-tracker#567 Fixes: siderolabs#8695 Fixes: siderolabs#8657 Fixes: siderolabs#8127 Signed-off-by: Noel Georgi <git@frezbo.dev>
Moving this from https://bugzilla.redhat.com/show_bug.cgi?id=1839923
I think we should likely remove the `console=ttyS0` we're injecting on bare metal by default. On real hardware we don't expect anything connected to the serial ports by default, and attempting to write to them can greatly slow down the boot. A web search for "linux console serial slow" turns up things like https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763601;msg=17
Today for the Live ISO we are already doing this, but the bare metal image still enables it by default. One can work around this by using the new installer options to remove kargs at least.
Anyone who wants to turn it on can of course.