display component uptime #1223
Comments
/assign
@lucklove @AstroProfundis PTAL
That's a good idea, but what if the user didn't deploy Prometheus? Can we handle that case?
Sorry, I didn't make myself clear: we are not using Prometheus, but each component's own metric API. By the way, querying Prometheus for the latest uptime would not be correct, because it may return stale data, namely the last sample scraped before the process died.
I've checked the metrics that the components return, but I didn't find any metric that records the
Sorry for the spelling error, is
Nice, I deployed a cluster and checked the metric. It seems PD and TiDB return this metric, but TiKV doesn't.
I've checked the components under cluster version v4.0.4 and dm nightly. The components below have the
These components don't contain the
So most of the components implement it, and I think the metric API is more convenient and useful for normal usage. We can implement it through the metric API first and, if the metric is not implemented or the service is down, fall back to ssh-systemctl. However, once all the services are down, every query degrades to ssh-systemctl, which will affect the query time, so
The version I checked is v4.0.0, so I think there must be compatibility issues... Not every version implements this.
It's implemented in #1231
Feature Request
Is your feature request related to a problem? Please describe:
Describe the feature you'd like:
I want to display the uptime of each component in the output of
tiup {cluster|dm} display xxx
Describe alternatives you've considered:
Some components' Prometheus metric API returns the process_start_time_seconds metric, which we can use directly as the process's start timestamp, computing uptime as time.Now() - start_timestamp. Those components are listed below:
These components don't contain the process_start_time_seconds metric:
For the components that do not include this metric, especially TiFlash, we can wait until the product side adds it before supporting it here. During this transition, we can use ssh then ps to see how long the process has been alive.
Teachability, Documentation, Adoption, Migration Strategy: