Add Stackdriver Exporter for Opencensus #492
Quick thought - should there be a Gopkg.toml entry, so we are locking the new dependencies to a specific version?
Build Failed 😱 Build Id: a74102d1-452b-4bac-af4d-e0ba8d4d3efb Build Logs
|
Looks like a lock timeout while the other test was running. Restarting.
Build Succeeded 👏 Build Id: ea45dd45-9423-4c4d-a4fe-c1581b0f1001 The following development artifacts have been built, and will exist for the next 30 days:
To install this version:
|
a2251e8
to
edbb084
Compare
Build Succeeded 👏 Build Id: 0b08881e-09ad-4af9-b3d8-a8dcf96d37fb The following development artifacts have been built, and will exist for the next 30 days:
To install this version:
|
edbb084
to
032861c
Compare
Build Succeeded 👏 Build Id: a9116df4-f101-442f-af74-d4ea497c8cbf The following development artifacts have been built, and will exist for the next 30 days:
To install this version:
|
032861c
to
f5ae7e5
Compare
Added Gopkg.toml.
Build Failed 😱 Build Id: 80d47cbf-ca61-4267-85b3-9fba8bf9f1c2 Build Logs
|
f5ae7e5
to
84fcd80
Compare
Build Succeeded 👏 Build Id: 030b3523-9b51-4e0e-bd3f-f10cd9e452d7 The following development artifacts have been built, and will exist for the next 30 days:
To install this version:
|
I think we should document how the exporter finds application credentials. Your cluster seems to have them by default, but people could have deactivated them or may lack the right role for sending metrics.
See https://cloud.google.com/docs/authentication/production and https://cloud.google.com/monitoring/access-control
Adding a Project ID to the configuration YAML could also be useful.
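As a rough illustration of the lookup being discussed: Application Default Credentials are resolved in a fixed order, starting with the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, then the credentials file written by the gcloud CLI, then the metadata server of the environment (for example GCE or GKE). The sketch below only reports which source would likely be tried first. `credentialSource` is a hypothetical helper, not part of Agones or the exporter, and the file path shown is the Linux/macOS location.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// credentialSource is a hypothetical helper that mirrors the documented
// Application Default Credentials search order. It only inspects the
// environment; it never loads or validates a credential.
func credentialSource(getenv func(string) string, fileExists func(string) bool) string {
	// 1. An explicitly configured service account key file wins.
	if p := getenv("GOOGLE_APPLICATION_CREDENTIALS"); p != "" {
		return "env:" + p
	}
	// 2. Credentials written by `gcloud auth application-default login`
	//    (Linux/macOS location).
	if home := getenv("HOME"); home != "" {
		p := filepath.Join(home, ".config", "gcloud", "application_default_credentials.json")
		if fileExists(p) {
			return "gcloud:" + p
		}
	}
	// 3. Otherwise the attached service account is read from the
	//    metadata server, which is the usual case on a GKE node.
	return "metadata-server"
}

func main() {
	exists := func(p string) bool {
		_, err := os.Stat(p)
		return err == nil
	}
	fmt.Println(credentialSource(os.Getenv, exists))
}
```

On a default GKE cluster the third branch is the one that applies, which is why metrics can work without extra setup until the node's scopes or roles are changed.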
84fcd80
to
9f24f34
Compare
Build Succeeded 👏 Build Id: 1f3e244d-70dc-405e-8fcf-6b2b910fc41d The following development artifacts have been built, and will exist for the next 30 days:
To install this version:
|
docs/metrics.md
Outdated
|
||
With this configuration only Stackdriver exporter would be used instead of Prometheus exporter. | ||
|
||
Run the following command to install Agones to a cluster with an updated configuration: |
Nitpick: I don't think we need to give the command; just ask the reader to reinstall using Helm.
docs/metrics.md
Outdated
|
||
Run the following command to install Agones to a cluster with an updated configuration: | ||
``` | ||
make install |
Let's not put make commands in our user documentation if we can avoid it. Let's have explicit commands in our documentation instead.
docs/metrics.md
Outdated
@@ -164,6 +168,45 @@ Open a web browser to [http://127.0.0.1:3000](http://127.0.0.1:3000), you should | |||
|
|||
> Makefile targets `make grafana-portforward`,`make kind-grafana-portforward` and `make minikube-grafana-portforward`. | |||
|
|||
### Stackdriver installation |
This will need to get moved over to the new website docs 😢 sorry - someone was going to get bit by this.
docs/metrics.md
Outdated
@@ -50,7 +51,10 @@ Finally include that `ServiceMonitor` in your [Prometheus instance CRD](https:// | |||
|
|||
### Stackdriver | |||
|
|||
We don't yet support the [OpenCensus Stackdriver exporter](https://opencensus.io/exporters/supported-exporters/go/stackdriver/) but you can still use the Prometheus Stackdriver integration by following these [instructions](https://cloud.google.com/monitoring/kubernetes-engine/prometheus). | |||
We support the [OpenCensus Stackdriver exporter](https://opencensus.io/exporters/supported-exporters/go/stackdriver/). | |||
In order to configure it you should enable Stackdriver Monitoring API in Google Cloud Console. |
Let's link to the API here. Make life easy for people
docs/metrics.md
Outdated
We support the [OpenCensus Stackdriver exporter](https://opencensus.io/exporters/supported-exporters/go/stackdriver/). | ||
In order to configure it you should enable Stackdriver Monitoring API in Google Cloud Console. | ||
|
||
Also you can use the Prometheus Stackdriver integration by following these [instructions](https://cloud.google.com/monitoring/kubernetes-engine/prometheus). |
Does it make sense to push Prometheus + Stackdriver at the same time as we are saying to use the Stackdriver integration directly?
Is there any added value in using both layers together? If not, I think we should cut this piece.
The only reason for that is to add Go runtime metrics, such as the goroutine count. However, it is not necessary to configure both at the same time.
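For readers unfamiliar with the Go runtime metrics mentioned here, the sketch below reads two such figures directly from the standard library. It is only an illustration of the kind of data involved; `runtimeSnapshot` and `snapshot` are hypothetical names, and this is not how the Prometheus client or Agones collects them.

```go
package main

import (
	"fmt"
	"runtime"
)

// runtimeSnapshot holds a couple of Go runtime figures of the kind a
// Go process collector typically exposes.
type runtimeSnapshot struct {
	Goroutines int    // number of goroutines that currently exist
	HeapAlloc  uint64 // bytes of allocated heap objects
}

// snapshot reads the current values from the runtime package.
func snapshot() runtimeSnapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return runtimeSnapshot{
		Goroutines: runtime.NumGoroutine(),
		HeapAlloc:  m.HeapAlloc,
	}
}

func main() {
	s := snapshot()
	fmt.Printf("goroutines=%d heap_alloc_bytes=%d\n", s.Goroutines, s.HeapAlloc)
}
```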
We're good for the code. I'd like @markmandel to have a read of the doc.
docs/metrics.md
Outdated
@@ -164,6 +168,45 @@ Open a web browser to [http://127.0.0.1:3000](http://127.0.0.1:3000), you should | |||
|
|||
> Makefile targets `make grafana-portforward`,`make kind-grafana-portforward` and `make minikube-grafana-portforward`. | |||
|
|||
### Stackdriver installation | |||
|
|||
In order to use [Stackdriver monitoring](https://app.google.stackdriver.com) you should enable Stackdriver Monitoring API on Google Cloud Console. You need to grant all necessary permissions to the users (see [Access Control Guide](https://cloud.google.com/monitoring/access-control)). Stackdriver exporter uses a strategy called Application Default Credentials (ADC) to find your application's credentials. Details could be found here [Setting Up Authentication for Server to Server Production Applications](https://cloud.google.com/docs/authentication/production). |
In order to use [Stackdriver monitoring](https://app.google.stackdriver.com) you should enable Stackdriver Monitoring API on Google Cloud Console. You need to grant all necessary permissions to the users (see [Access Control Guide](https://cloud.google.com/monitoring/access-control)). Stackdriver exporter uses a strategy called Application Default Credentials (ADC) to find your application's credentials. Details could be found here [Setting Up Authentication for Server to Server Production Applications](https://cloud.google.com/docs/authentication/production). | |
In order to use [Stackdriver monitoring](https://app.google.stackdriver.com) you should enable Stackdriver Monitoring API on Google Cloud Console. You need to grant all the necessary permissions to the users (see [Access Control Guide](https://cloud.google.com/monitoring/access-control)). Stackdriver exporter uses a strategy called Application Default Credentials (ADC) to find your application's credentials. Details could be found here [Setting Up Authentication for Server to Server Production Applications](https://cloud.google.com/docs/authentication/production). |
While this is interesting for people using Stackdriver without GKE, I think we should first explain how to activate the scope for GKE. See https://cloud.google.com/kubernetes-engine/docs/how-to/monitoring#enabling_stackdriver_monitoring
@markmandel WDYT ?
It is enabled by default but I will add this useful link
docs/metrics.md
Outdated
|
||
![](stackdriver-metrics-dashboard.png) | ||
|
||
Currently there exists only manual way of configuring Stackdriver Dashboard. So it is up to you to set Alignment Period (minimal is 1 minute), group by parameter where you choose Metric label, filter parameter, etc. |
@markmandel should we keep this block ?
I reworded this paragraph a bit.
install/helm/agones/README.md
Outdated
| `agones.metrics.prometheusServiceDiscovery` | Adds annotations for Prometheus ServiceDiscovery (and also Strackdriver) | `true` | | ||
| `agones.metrics.stackdriverEnabled` | (⚠️ **development feature**⚠️) Enables Stackdriver exporter of controller metrics | `false` | | ||
| `agones.metrics.stackdriverProjectID` | (⚠️ **development feature**⚠️) Project ID where current GKE cluster is deployed, if empty string project id from Application Default Credentials would be used. Used for Stackdriver. | `` | |
I would just say that this overrides the default GCP project ID for use with Stackdriver.
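The override behaviour described for `agones.metrics.stackdriverProjectID` can be sketched as a simple fallback: a configured value wins, otherwise the project reported by Application Default Credentials is used. `resolveProjectID` is a hypothetical helper written for illustration and is not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveProjectID is a hypothetical helper. An explicitly configured
// project ID overrides the one discovered from default credentials.
func resolveProjectID(configured, fromCredentials string) (string, error) {
	if configured != "" {
		return configured, nil
	}
	if fromCredentials != "" {
		return fromCredentials, nil
	}
	// Neither source produced a project, so metrics cannot be routed.
	return "", errors.New("no GCP project ID configured or discovered")
}

func main() {
	id, err := resolveProjectID("", "project-from-adc")
	fmt.Println(id, err)
}
```

Returning an error when both sources are empty makes a misconfiguration visible at startup rather than as a failed export later.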
docs/metrics.md
Outdated
Currently there exists only manual way of configuring Stackdriver Dashboard. So it is up to you to set Alignment Period (minimal is 1 minute), group by parameter where you choose Metric label, filter parameter, etc. | ||
|
||
#### Troubleshooting | ||
If you could not see agones metrics please check the controller logs for connection errors. Check also that Stackdriver Monitoring API is enabled and user has all necessary permissions to access Stackdriver Monitoring, configure `stackdriverProjectID` manually, if default is not working. |
If you could not see agones metrics please check the controller logs for connection errors. Check also that Stackdriver Monitoring API is enabled and user has all necessary permissions to access Stackdriver Monitoring, configure `stackdriverProjectID` manually, if default is not working. | |
If you can't see Agones metrics you should have a look at the controller logs for connection errors. Also ensure that your cluster has the necessary credentials to interact with Stackdriver Monitoring. You can configure `stackdriverProjectID` manually, if the automatic discovery is not working. |
docs/metrics.md
Outdated
|
||
#### Troubleshooting | ||
If you could not see agones metrics please check the controller logs for connection errors. Check also that Stackdriver Monitoring API is enabled and user has all necessary permissions to access Stackdriver Monitoring, configure `stackdriverProjectID` manually, if default is not working. | ||
For example if wrong ProjectID is configured you would see such an error, same error will occure with Permissions is not configured in GCP: |
For example if wrong ProjectID is configured you would see such an error, same error will occure with Permissions is not configured in GCP: | |
Permissions problem example from controller logs: |
9f24f34
to
c99e59e
Compare
Build Failed 😱 Build Id: 44f3e914-07b9-4758-bba3-9c65175512bc Build Logs
|
c99e59e
to
5ca1a61
Compare
Build Failed 😱 Build Id: 18258a69-36cf-45fb-ad78-938cc1b6c9a1 Build Logs
|
5ca1a61
to
d488c8e
Compare
Build Succeeded 👏 Build Id: 65085ba3-e714-409a-8d52-6cc1b1003b1d The following development artifacts have been built, and will exist for the next 30 days:
To install this version:
|
Updated the docs and links in order to be compliant with the new website format.
but you can still use the Prometheus Stackdriver integration by following these [instructions](https://cloud.google.com/monitoring/kubernetes-engine/prometheus). | ||
Annotations required by this integration can be activated by setting the `agones.metrics.prometheusServiceDiscovery` | ||
to true (default) via the [helm chart value]({{< relref "../Installation/helm.md" >}}). | ||
We support the [OpenCensus Stackdriver exporter](https://opencensus.io/exporters/supported-exporters/go/stackdriver/). |
You get to play with our new documentation tools! We have "shortcodes" which you can wrap content in, to hide and show previous versions. See: https://agones.dev/site/docs/contribute/ for details.
TL;DR -
The old version should be wrapped in:
{{% feature expiryVersion="0.8.0" %}}
We don't yet support the OpenCensus Stackdriver exporter...
{{\% /feature %}}
And the new documentation should be wrapped in:
{{% feature publishVersion="0.8.0" %}}
We support the OpenCensus Stackdriver exporter...
{{\% /feature %}}
The tooling will show/hide each section appropriately as we release new versions.
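The show/hide rule described above can be sketched as a version comparison: content with `publishVersion` appears once the site version reaches it, and content with `expiryVersion` disappears from that version onward. This is only a sketch of the rule as stated in this thread; `compareVersions` and `visible` are hypothetical helpers, not the site's actual shortcode implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// compareVersions compares dotted numeric versions such as "0.8.0".
// It returns -1, 0 or 1. Missing or non-numeric parts count as 0.
func compareVersions(a, b string) int {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		var x, y int
		if i < len(as) {
			x, _ = strconv.Atoi(as[i])
		}
		if i < len(bs) {
			y, _ = strconv.Atoi(bs[i])
		}
		if x < y {
			return -1
		}
		if x > y {
			return 1
		}
	}
	return 0
}

// visible reports whether a wrapped section should be shown for the
// current version. An empty publishVersion or expiryVersion means the
// corresponding limit is not set.
func visible(current, publishVersion, expiryVersion string) bool {
	if publishVersion != "" && compareVersions(current, publishVersion) < 0 {
		return false // not published yet
	}
	if expiryVersion != "" && compareVersions(current, expiryVersion) >= 0 {
		return false // expired from this version onward
	}
	return true
}

func main() {
	fmt.Println(visible("0.7.0", "", "0.8.0")) // old text, still shown
	fmt.Println(visible("0.8.0", "0.8.0", "")) // new text, now shown
}
```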
@@ -174,6 +176,48 @@ Open a web browser to [http://127.0.0.1:3000](http://127.0.0.1:3000), you should | |||
|
|||
> Makefile targets `make grafana-portforward`,`make kind-grafana-portforward` and `make minikube-grafana-portforward`. | |||
|
|||
### Stackdriver installation | |||
|
|||
In order to use [Stackdriver monitoring](https://app.google.stackdriver.com) you should enable Stackdriver Monitoring API on Google Cloud Console. You need to grant all the necessary permissions to the users (see [Access Control Guide](https://cloud.google.com/monitoring/access-control)). Stackdriver exporter uses a strategy called Application Default Credentials (ADC) to find your application's credentials. Details could be found here [Setting Up Authentication for Server to Server Production Applications](https://cloud.google.com/docs/authentication/production). |
In order to use [Stackdriver monitoring](https://app.google.stackdriver.com) you should enable Stackdriver Monitoring API on Google Cloud Console. You need to grant all the necessary permissions to the users (see [Access Control Guide](https://cloud.google.com/monitoring/access-control)). Stackdriver exporter uses a strategy called Application Default Credentials (ADC) to find your application's credentials. Details could be found here [Setting Up Authentication for Server to Server Production Applications](https://cloud.google.com/docs/authentication/production). | |
In order to use [Stackdriver monitoring](https://app.google.stackdriver.com) you should [enable Stackdriver Monitoring API](https://cloud.google.com/monitoring/api/enable-api) on Google Cloud Console. You need to grant all the necessary permissions to the users (see [Access Control Guide](https://cloud.google.com/monitoring/access-control)). Stackdriver exporter uses a strategy called Application Default Credentials (ADC) to find your application's credentials. Details could be found here [Setting Up Authentication for Server to Server Production Applications](https://cloud.google.com/docs/authentication/production). |
|
||
Note that Stackdriver monitoring is enabled by default on GKE clusters, however you can follow this [guide](https://cloud.google.com/kubernetes-engine/docs/how-to/monitoring#enabling_stackdriver_monitoring) if it was disabled on your GKE cluster. | ||
|
||
Default metrics exporter is Prometheus. In order to change it to Stackdriver update following helm parameters in {{< ghlink href="/install/helm/agones/values.yaml" branch="master" >}}values file{{< /ghlink >}}: |
@Kuqd is this the right way to tell people to do this? Or should we be directing people to the command line arguments?
Helm users should know how to set their values. I would give a hint for that rather than for modifying our own default values file.
@markmandel I need to update this, you are right
@@ -91,8 +91,10 @@ The following tables lists the configurable parameters of the Agones chart and t | |||
| `agones.rbacEnabled` | Creates RBAC resources. Must be set for any cluster configured with RBAC | `true` | | |||
| `agones.crds.install` | Install the CRDs with this chart. Useful to disable if you want to subchart (since crd-install hook is broken), so you can copy the CRDs into your own chart. | `true` | | |||
| `agones.crds.cleanupOnDelete` | Run the pre-delete hook to delete all GameServers and their backing Pods when deleting the helm chart, so that all CRDs can be removed on chart deletion | `true` | | |||
| `agones.metrics.enabled` | Enables controller metrics on port `8080` and path `/metrics` | `true` | | |||
| `agones.metrics.prometheusEnabled` | Enables controller metrics on port `8080` and path `/metrics` | `true` | |
Let's leave the old parts in here. We may need to move them to a new table and do the same as we did above with the feature shortcode, then take the new values and move them to the table below that is already wrapped in the feature shortcode for the next version 👍
d488c8e
to
bb09feb
Compare
Build Succeeded 👏 Build Id: d2281d09-86f7-41e7-89da-7d7b41c64dc1 The following development artifacts have been built, and will exist for the next 30 days:
To install this version:
|
bb09feb
to
a96c1db
Compare
Build Succeeded 👏 Build Id: 01d7fb86-f30f-4e1e-bc25-c3de38aef34c The following development artifacts have been built, and will exist for the next 30 days:
To install this version:
|
Added the feature shortcode and helm upgrade instructions. To test with all three exporter options enabled, you can run this helm command with `agones.image.tag` set:
helm upgrade --install --wait --set agones.metrics.stackdriverEnabled=true \
--set agones.metrics.prometheusEnabled=true --set agones.metrics.prometheusServiceDiscovery=true \
agones /go/src/agones.dev/agones/install/helm/agones/ --set agones.image.tag=0.8.0-a96c1db
Small docs change from me - otherwise, this looks good to go.
@@ -60,7 +60,7 @@ or to hide a section from 0.8.0 onward: | |||
|
|||
```markdown | |||
{{\% feature expiryVersion="0.8.0" %}} | |||
This is my special content that she be hidden <= 0.8.0 | |||
This is my special content that will be hidden >= 0.8.0 |
Good catch 👍
|
||
Default metrics exporter is Prometheus. In order to change it to Stackdriver upgrade Agones release using helm with next three chart parameters changed: | ||
``` | ||
helm upgrade --install --wait --set agones.metrics.stackdriverEnabled=true --set agones.metrics.prometheusEnabled=false --set agones.metrics.prometheusServiceDiscovery=false agones ../install/helm/agones/ |
helm upgrade --install --wait --set agones.metrics.stackdriverEnabled=true --set agones.metrics.prometheusEnabled=false --set agones.metrics.prometheusServiceDiscovery=false agones ../install/helm/agones/ | |
helm upgrade --install --wait --set agones.metrics.stackdriverEnabled=true --set agones.metrics.prometheusEnabled=false --set agones.metrics.prometheusServiceDiscovery=false agones/agones my-release-name |
@Kuqd to confirm, but I think that is better?
Yes, but you need agones before the release name. Also, does that work without adding our repo? I'm wondering whether Helm Hub exposes our chart in the default repository.
So:
helm upgrade --install --wait --set agones.metrics.stackdriverEnabled=true --set agones.metrics.prometheusEnabled=false --set agones.metrics.prometheusServiceDiscovery=false agones/agones my-release-name
That's better?
Also, Helm Hub doesn't add anything to the default repo. It lists the installation instructions as:
helm repo add agones https://agones.dev/chart/stable
|
||
Note that Stackdriver monitoring is enabled by default on GKE clusters, however you can follow this [guide](https://cloud.google.com/kubernetes-engine/docs/how-to/monitoring#enabling_stackdriver_monitoring) if it was disabled on your GKE cluster. | ||
|
||
Default metrics exporter is Prometheus. In order to change it to Stackdriver upgrade Agones release using helm with next three chart parameters changed: |
Default metrics exporter is Prometheus. In order to change it to Stackdriver upgrade Agones release using helm with next three chart parameters changed: | |
Default metrics exporter is Prometheus. If you are using the [Helm installation]({{< ref "/docs/Installation/helm.md" >}}), you can install or upgrade Agones to use Stackdriver, using the following chart parameters: |
Add OpenCensus Stackdriver exporter functionality for Agones metrics. Add new docs on how to set up a Stackdriver dashboard and configure permissions. Add a helm config variable and change the reporting period to 1 minute if the Stackdriver exporter is enabled.
a96c1db
to
e8475d4
Compare
Build Succeeded 👏 Build Id: e3c45479-966b-4395-9b09-b3d40675a30b The following development artifacts have been built, and will exist for the next 30 days:
To install this version:
|
LGTM!
@Kuqd will merge once you have removed your request for changes.
LGTM
Add Stackdriver exporter functionality to the OpenCensus metrics collection system. Add the agones.metrics.stackdriver helm config variable and change the reporting period to 1 minute.
Now we can use Stackdriver or Prometheus to collect and monitor Agones stats.
Resulting metrics have an agones/ prefix. For #144