"Faults" view should show all Terminating pods #2738

Closed
akatch opened this issue Jun 13, 2024 · 1 comment · Fixed by #2935

akatch commented Jun 13, 2024




Describe the bug
Enabling the "Toggle Faults" view shows some Terminating pods, but not all. This view should display all Terminating pods (and indeed all pods not in a Running and Ready state), but it is unclear why some Terminating pods appear in it and others do not. I did some brief digging in the code, and it is not entirely clear how k9s determines which pods are considered faulty. It's possible that only some Terminating pods meet its criteria.

Further investigation shows that some Terminating pods with Events such as Node Not Ready (which I would absolutely 100% expect to show up in Faults) do not show up in the Fault view. This is the case in the attached screenshots below.

To Reproduce
Steps to reproduce the behavior:

  1. View all pods for a namespace (:pods [namespace]) where many pods are Terminating.
  2. Enable Faults view (ctrl+z by default).
  3. All pods not in a Running/Ready state should appear in this view, but not all do. In particular, not all Terminating pods show up.

Expected behavior
All pods not in a Running/Ready state should appear when Faults view is enabled.

Screenshots
I have had to heavily sanitize these but hopefully they help demonstrate the issue.

A view of all pods, in particular many that are Terminating

The same namespace captured moments later in Fault view. No Terminating pods are seen.

Versions

  • OS: macOS 14.5 (Sonoma)
  • K9s: 0.32.4
  • K8s: 1.20.15, 1.24.13
gomesdigital added a commit to gomesdigital/k9s that referenced this issue Oct 27, 2024
@gomesdigital
Contributor

This happens because a Terminating pod's containers are not necessarily unready, and container readiness is what k9s uses to classify a pod as faulty.

You can see that here:

k9s/internal/render/pod.go

Lines 168 to 177 in be1ec87

func (p Pod) diagnose(phase string, cr, ct int) error {
	if phase == Completed {
		return nil
	}
	if cr != ct || ct == 0 {
		return fmt.Errorf("container ready check failed: %d of %d", cr, ct)
	}
	return nil
}

When an error is returned, this is propagated to the filterToast() func:

func (t *TableData) filterToast() *RowEvents {

In my experience, we had an EC2 instance fail, which took services down. The pod phases were Terminating, but the pods did not show in the Faults view because their container statuses were still Ready. So I agree with @akatch. Most of the time the container-ready/container-total check works, but there are cases where it doesn't apply. In addition, showing all Terminating pods would help reveal pods that are stuck in Terminating. We've dealt with that problem extensively and had to search for those pods manually because they are filtered out of the Faults view.
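
To make the gap concrete, here is a minimal Go sketch. It restates the readiness check quoted above as a free function and compares it with a hypothetical variant that also flags Terminating pods. The `Terminating` constant and `diagnoseWithTerminating` are names assumed for this illustration only. This is not the change merged in #2935, and it does not use k9s's actual types.

```go
package main

import "fmt"

// Phase labels for this sketch. The string values are assumptions chosen to
// match what the pod view displays; they are not taken from the k9s source.
const (
	Completed   = "Completed"
	Terminating = "Terminating"
)

// diagnose restates the quoted check: a pod is faulty only when its
// ready-container count (cr) differs from its total (ct), or ct is zero.
func diagnose(phase string, cr, ct int) error {
	if phase == Completed {
		return nil
	}
	if cr != ct || ct == 0 {
		return fmt.Errorf("container ready check failed: %d of %d", cr, ct)
	}
	return nil
}

// diagnoseWithTerminating is a hypothetical variant that reports every
// Terminating pod as faulty, regardless of container readiness.
func diagnoseWithTerminating(phase string, cr, ct int) error {
	if phase == Terminating {
		return fmt.Errorf("pod is terminating: %d of %d containers ready", cr, ct)
	}
	return diagnose(phase, cr, ct)
}

func main() {
	// A Terminating pod whose containers still report Ready, as in the
	// node-failure case described above.
	fmt.Println("quoted check flags pod:", diagnose(Terminating, 2, 2) != nil)
	fmt.Println("variant flags pod:", diagnoseWithTerminating(Terminating, 2, 2) != nil)
}
```

With the quoted check, a Terminating pod with 2 of 2 containers ready returns no error and so would be filtered out of the Faults view. The variant returns an error for it while leaving the Completed and readiness cases unchanged.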

gomesdigital added a commit to gomesdigital/k9s that referenced this issue Oct 31, 2024
tmeijn pushed a commit to tmeijn/dotfiles that referenced this issue Nov 19, 2024
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [derailed/k9s](https://github.com/derailed/k9s) | patch | `v0.32.5` -> `v0.32.7` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>derailed/k9s (derailed/k9s)</summary>

### [`v0.32.7`](https://github.com/derailed/k9s/releases/tag/v0.32.7)

[Compare Source](derailed/k9s@v0.32.6...v0.32.7)

### Release v0.32.7
#### Notes

Thank you to all that contributed with flushing out issues and enhancements for K9s!
I'll try to mark some of these issues as fixed. But if you don't mind grab the latest rev
and see if we're happier with some of the fixes!
If you've filed an issue please help me verify and close.

Your support, kindness and awesome suggestions to make K9s better are, as ever, very much noted and appreciated!
Also big thanks to all that have allocated their own time to help others on both slack and on this repo!!

As you may know, K9s is not pimped out by corps with deep pockets, thus if you feel K9s is helping your Kubernetes journey,
please consider joining our [sponsorship program](https://github.com/sponsors/derailed) and/or make some noise on social! [@kitesurfer](https://twitter.com/kitesurfer)

On Slack? Please join us [K9slackers](https://join.slack.com/t/k9sers/shared_invite/enQtOTA5MDEyNzI5MTU0LWQ1ZGI3MzliYzZhZWEyNzYxYzA3NjE0YTk1YmFmNzViZjIyNzhkZGI0MmJjYzhlNjdlMGJhYzE2ZGU1NjkyNTM)

#### Maintenance Release!

***

#### Videos Are In The Can!

Please dial [K9s Channel](https://www.youtube.com/channel/UC897uwPygni4QIjkPCpgjmw) for upcoming content...

-   [K9s v0.31.0 Configs+Sneak peek](https://youtu.be/X3444KfjguE)
-   [K9s v0.30.0 Sneak peek](https://youtu.be/mVBc1XneRJ4)
-   [Vulnerability Scans](https://youtu.be/ULkl0MsaidU)

***

#### Resolved Issues

-   [#2970](derailed/k9s#2970) Ctrl-z on events view causes runtime error in v0.32.6
-   [#2969](derailed/k9s#2969) When using impersonation user information and permissions not preserved when switching context
-   [#2966](derailed/k9s#2966) Go to the Contexts page and filter, contexts that are matched will be filtered ou
-   [#2962](derailed/k9s#2962) Small colour/filtering related bug
-   [#2961](derailed/k9s#2961) Drain node with the -disable-eviction
-   [#2958](derailed/k9s#2958) Restart count in container view associated with the wrong container
-   [#2945](derailed/k9s#2945) Could we add ServiceAccount Column in v1/POD view

***

#### Contributed MRs

Please be sure to give `Big Thanks!` and `ATTA Girls/Boys!` to all the fine contributors for making K9s better for all of us!!

-   [#2968](derailed/k9s#2968) Update go version to 1.23.X in README
-   [#2964](derailed/k9s#2964) feat(dao,used-by-cmd): check imagePullSecrets as well
-   [#2960](derailed/k9s#2960) Put log levels in order in cmd help

***

<img src="https://raw.githubusercontent.com/derailed/k9s/master/assets/imhotep_logo.png" width="32" height="auto"/> © 2024 Imhotep Software LLC. All materials licensed under [Apache v2.0](http://www.apache.org/licenses/LICENSE-2.0)

### [`v0.32.6`](https://github.com/derailed/k9s/releases/tag/v0.32.6)

[Compare Source](derailed/k9s@v0.32.5...v0.32.6)

### Release v0.32.6
#### Resolved Issues

-   [#2947](derailed/k9s#2947) CTRL+Z causes k9s to crash
-   [#2938](derailed/k9s#2938) Critical Vulnerability CVE-2024-41110 in v26.0.1 of docker included in k9s
-   [#2929](derailed/k9s#2929) conflicting plugins shortcuts
-   [#2896](derailed/k9s#2896) Add a plugin to disable/enable a keda ScaledObject
-   [#2811](derailed/k9s#2811) Dockerfile build step fails due to misaligned Go versions (1.21.5 vs 1.22.0)
-   [#2767](derailed/k9s#2767) Manually triggered jobs don't get automatically cleaned up
-   [#2761](derailed/k9s#2761) Enable "jump to owner" for more kinds
-   [#2754](derailed/k9s#2754) Plugins not loaded/shown in UI
-   [#2747](derailed/k9s#2747) Combining context and namespace switching only works sporadically (e.g. ":pod foo-ns @ctx-dev")
-   [#2746](derailed/k9s#2746) k9s does not display "\[::]" string in its logs
-   [#2738](derailed/k9s#2738) "Faults" view should show all Terminating pods

***

#### Contributed MRs

Please be sure to give `Big Thanks!` and `ATTA Girls/Boys!` to all the fine contributors for making K9s better for all of us!!

-   [#2937](derailed/k9s#2937) Adding Argo Rollouts plugin version for PowerShell
-   [#2935](derailed/k9s#2935) fix: show all terminating pods in Faults view ([#2738](derailed/k9s#2738))
-   [#2933](derailed/k9s#2933) chore: broken url in build-status tag in the readme.md
-   [#2932](derailed/k9s#2932) fix: add kubeconfig if k9s is launched with --kubeconfig
-   [#2930](derailed/k9s#2930) fixed conflicting plugin shortcuts, and added 2 new plugins
-   [#2927](derailed/k9s#2927) Fix "Mark Range": reduce maximum namespaces in favorites, fix shadowing of ctrl+space
-   [#2926](derailed/k9s#2926) chore(plugins,remove-finalizers): make sure the resources api group is respected
-   [#2921](derailed/k9s#2921) feat: Add plugins for kubectl node-shell
-   [#2920](derailed/k9s#2920) eat: added StartupProbes status (S) to the PROBES column in the container render
-   [#2914](derailed/k9s#2914) Adding eks-node-viewer plugin
-   [#2898](derailed/k9s#2898) Add argocd plugin to community plugins
-   [#2896](derailed/k9s#2896) feat(2896): Add toggle keda plugin
-   [#2890](derailed/k9s#2890) Update README.md
-   [#2881](derailed/k9s#2881) Fix Mark-Range command: ensure that NS Favorite doesn't exceed the limit
-   [#2861](derailed/k9s#2861) chore: fix function name
-   [#2856](derailed/k9s#2856) fix internal/render/hpa.go merge issue
-   [#2848](derailed/k9s#2848) Include sidecar containers requests and limits
-   [#2844](derailed/k9s#2844) Update README GO Version Required
-   [#2830](derailed/k9s#2830) update tview to fix log escaping problem completely
-   [#2822](derailed/k9s#2822) Adding HolmesGPT plugin
-   [#2821](derailed/k9s#2821) Add a spark-operator plugin
-   [#2817](derailed/k9s#2817) Add comment about Escape keybinding
-   [#2812](derailed/k9s#2812) fix: align build image Go version with go.mod
-   [#2795](derailed/k9s#2795) add new plugin current-ctx-terminal
-   [#2791](derailed/k9s#2791) Add leading space to Kubernetes context suggestions
-   [#2789](derailed/k9s#2789) Create kubectl-get-in-shell.yaml
-   [#2788](derailed/k9s#2788) Update README.md plugin format
-   [#2787](derailed/k9s#2787) Update helm-purge.yaml
-   [#2786](derailed/k9s#2786) Update README.md with plugin dangerous field
-   [#2780](derailed/k9s#2780) install copyright file into correct location
-   [#2775](derailed/k9s#2775) fix freebsd build failure
-   [#2772](derailed/k9s#2772) proper handle OwnerReference for manually created job
-   [#2771](derailed/k9s#2771) feat: add duplik8s plugin
-   [#2770](derailed/k9s#2770) feat: allow plugins block in plugin files
-   [#2765](derailed/k9s#2765) fix: Shellin -> ShellIn
-   [#2763](derailed/k9s#2763) enable "jump to owner" for more kinds
-   [#2755](derailed/k9s#2755) Loki plugin
-   [#2751](derailed/k9s#2751) container logs should be escaped when printed
-   [#2750](derailed/k9s#2750) fix: should switching ctx before ns

***


</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).