[processor/resourcedetection]: Add support for k8s.cluster.name where possible #26794
Comments
I wonder if this would be better suited for the resourcedetectionprocessor where we already have some GKE and EKS specific functionality |
The resource detection processor is the best place for this. The k8s cluster name is actually already included for GKE, but not for EKS. I think it's a good idea. |
Pinging code owners for processor/resourcedetection: @Aneurysm9 @dashpole. See Adding Labels via Comments if you do not have permissions to add labels yourself. |
I'll work to add this. |
Re: EKS cluster name: the linked example library gets the cluster name from a configmap. The EC2 instances that run as EKS workers should have tags such as |
Thanks for calling this out @jinja2. I was planning on re-using opentelemetry-go-contrib's method of detecting the cluster name, but it looks like it uses the same methodology you're referring to here. There's also an open issue that echoes your comment. (There's a comment here pointing out the same CloudWatch dependency problem.) I'll try these other options, thanks again! |
I ran into this today. I need a way to detect the cluster name using a generic config running in multiple clouds (GKE, AKS, EKS). It looks like EKS is actively being worked on. Does AKS require a contributor? |
@jsirianni It'd be great to add that as well, you're more than welcome to work on it if you'd like! |
Great, I will give it a shot 👍 |
@crobert-1 I have spent some time playing with the Azure metadata service used by resource detection here https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/internal/metadataproviders/azure/metadata.go. They do not expose the cluster name in a useful way. I am not sure we can detect it easily without using the Azure SDK (and probably requiring some form of authentication). I did notice that the DataDog exporter attempts to detect the cluster name, but the solution is not very robust. https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/datadogexporter/internal/hostmetadata/internal/azure/provider.go#L36. When you create an AKS cluster, the default behavior is for Azure to create a resource group for the node pool, with its name derived from the cluster's resource group, name and location. It usually looks like this:
But it could easily be something like This is not a great solution because resource name and cluster name can contain underscores. The user can also specify their cluster's node pool's resource group name, instead of using the default option. Do you have an opinion on how we should proceed? |
I agree that name parsing doesn't seem like a very good option here. I'm not very familiar with AKS and Azure's go SDK, so I don't have any good direction there either. I think it comes down to what the user wants. If they just want something to identify the cluster with the understanding it could be wrong, then parsing may not be too bad here. We'd have to make sure it was clear to users that it may be wrong if we went this route. It took a while to put together the proper EKS methods and libraries to use to get to the cluster name, so it might be hidden somewhere in the AKS documentation, I'm not sure. |
I think it is acceptable to best effort detect the cluster name the way the exporter does it (bail if more than 3 underscores) as long as it is documented as such. From AKS docs, I see they have a reserved node label |
Good call out on the label. My test cluster gives me the following where my cluster name is
The label's value looks similar to what we get from the metadata service. I will play around with different resource group and cluster names to see how it behaves. If it is always the same as the metadata service, perhaps we can just use the value of
|
I tested and this is what I have observed. The node label "kubernetes.azure.com/cluster" matches the value for This value differs from the actual name of the cluster, but it might be enough to uniquely identify the cluster without over-complicating the implementation. My cluster with name "stage" was deployed to resource group "devops" and used the default generated infrastructure resource group. It ends up with the following value:
My second cluster,
If I try to deploy a third cluster with name "test" but with infrastructure group
Additionally, Azure prefers that users do not override the default infrastructure resource group. In summary, the IMDS endpoint already in use provides this exact value, and I think we can rely on it to uniquely identify the cluster. @crobert-1 @jinja2 Unless there is an objection, I am going to move forward with a pull request using the value of ResourceGroupName to set |
To clarify, if the nodepool is created in a different resource group, the Is the label value |
I may have worded it poorly, it is not "node pool resource group", it is "Infrastructure Resource group" and it is set when creating the cluster. It is also not usable by more than one cluster. Azure does not allow you to define the resource group that the node pool will belong to, so all node pools should fall under the same infrastructure resource group. To test this, I have deployed a new cluster and attached a second node pool. I was unable to define an alternative resource group. When checking the node labels, all nodes in the cluster match. service_azure@Azure:~$ kubectl get node -o yaml | grep kubernetes.azure.com/cluster:
kubernetes.azure.com/cluster: MC_devops_test_northcentralus
kubernetes.azure.com/cluster: MC_devops_test_northcentralus
kubernetes.azure.com/cluster: MC_devops_test_northcentralus |
Thank you for trying out the different combinations. To summarize my understanding of AKS cluster and resource group - There are 2 resourceGroups in play in an AKS cluster. One is the resource group in which AKS k8s service is created, and second is the resource group in which VMs, network, etc. infra resources are created. This 2nd group name is what we see when querying a VM's metadata endpoint. By default, this 2nd group is named as I assume we want the |
Hi @jinja2 Sorry for the delayed response, I have a draft PR here that I will open against this repo if we think it is satisfactory.
I think your assessment is accurate. When querying the IMDS endpoint, the resource group name in the response is sufficient to identify the cluster. If we can extract the cluster name, that is ideal, however, falling back onto the full group name is fine because it is unique within the azure account. I tried to create multiple AKS clusters (in different resource groups) using the same name for the infrastructure resource group, and Azure prevented me from proceeding. It shows an error saying that the name is already in use. |
…nvironment (#28649) **Description:** This enhancement detects the k8s cluster name in EKS. The solution uses EC2 instance tags to determine the cluster name, which means it will only work on EC2 (as noted in documentation updates). Resolves #26794 --------- Co-authored-by: bryan-aguilar <46550959+bryan-aguilar@users.noreply.github.com>
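As a rough illustration of the tag-based approach described in this commit, here is a minimal sketch in Go. It is not the code from #28649. It assumes the instance tags have already been fetched into a map, and it assumes the conventional `kubernetes.io/cluster/<name>` tag key, which is not spelled out in this thread. The function name `clusterNameFromTags` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// clusterTagPrefix is an assumption: the conventional tag key prefix found on
// EC2 instances that act as EKS workers. It is not quoted in the thread.
const clusterTagPrefix = "kubernetes.io/cluster/"

// clusterNameFromTags is a hypothetical helper. It scans the tag keys for the
// cluster prefix and returns the suffix as the cluster name. It reports false
// when no tag matches or when more than one matches, because the name would
// be ambiguous in that case.
func clusterNameFromTags(tags map[string]string) (string, bool) {
	var matches []string
	for key := range tags {
		if !strings.HasPrefix(key, clusterTagPrefix) {
			continue
		}
		if name := strings.TrimPrefix(key, clusterTagPrefix); name != "" {
			matches = append(matches, name)
		}
	}
	if len(matches) != 1 {
		return "", false
	}
	return matches[0], true
}

func main() {
	// Example tags, made up for illustration.
	tags := map[string]string{
		"Name":                          "worker-1",
		"kubernetes.io/cluster/staging": "owned",
	}
	if name, ok := clusterNameFromTags(tags); ok {
		fmt.Println("k8s.cluster.name =", name)
	} else {
		fmt.Println("cluster name not detected")
	}
}
```

Because the name comes from EC2 instance tags, a sketch like this has nothing to inspect outside EC2, which matches the limitation noted in the commit message.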
EKS is done. AKS is still in progress #29328 |
… name (#29328) **Description:** Added best effort support for detecting Azure Kubernetes Service cluster name: `k8s.cluster.name`. The cluster name can be extracted from the cluster's "resource group name" which is retrieved using existing functionality. The `parseClusterName` function has comments explaining the limitations. **Link to tracking Issue:** #26794 **Testing:** I added unit tests for each scenario, and have tested against live AKS clusters that fit each scenario. I am happy to spin these up if anyone has any questions. Added `k8s.cluster.name` to the list of AKS resource attributes.
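To make the best-effort parsing discussed above concrete, here is a minimal sketch in Go. It is not the `parseClusterName` function from #29328. The name `parseAKSClusterName` is hypothetical, and the sketch assumes the default infrastructure resource group format `MC_<resource group>_<cluster name>_<location>`, as seen in the `MC_devops_test_northcentralus` value earlier in the thread. When the name has more than three underscores, or does not follow the default format, it falls back to the full resource group name, as suggested in the discussion.

```go
package main

import (
	"fmt"
	"strings"
)

// parseAKSClusterName is a hypothetical best-effort parser. It expects the
// default infrastructure resource group name MC_<group>_<cluster>_<location>.
// Resource group and cluster names may themselves contain underscores, so any
// name that does not split into exactly four parts is treated as ambiguous
// and returned unchanged. The full name is still unique within the account.
func parseAKSClusterName(resourceGroup string) string {
	parts := strings.Split(resourceGroup, "_")
	if len(parts) != 4 {
		return resourceGroup
	}
	// Assumption: the prefix is compared case-insensitively, in case the
	// metadata service reports the group name with different casing.
	if !strings.EqualFold(parts[0], "MC") || parts[2] == "" {
		return resourceGroup
	}
	return parts[2]
}

func main() {
	for _, rg := range []string{
		"MC_devops_test_northcentralus",  // default format
		"MC_my_group_cluster_eastus",     // extra underscore: ambiguous
		"custom-infrastructure-group",    // user-defined name
	} {
		fmt.Printf("%s -> %s\n", rg, parseAKSClusterName(rg))
	}
}
```

Falling back to the whole resource group name trades a tidy value for one that is never wrong about which cluster it identifies, which is the compromise the thread settled on.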
Closing as |
Component(s)
processor/k8sattributes
Is your feature request related to a problem? Please describe.
When using the `k8sattributes` processor, I shouldn't need to declare any attributes manually. I still need to manually specify `k8s.cluster.name`, as the processor doesn't support it yet.
There is no standard API to get the cluster name, but GKE and EKS make it easy to detect. We should support this in the processor and provide `k8s.cluster.name` where possible.
Describe the solution you'd like
For GKE, see: https://github.com/open-telemetry/opentelemetry-js-contrib/blob/deb9aa441dc7d2b0fd5ec11b41c934a1e93134fd/detectors/node/opentelemetry-resource-detector-gcp/src/detectors/GcpDetector.ts#L81 (it gets it from `curl http://metadata/computeMetadata/v1/instance/attributes/cluster-name -H "Metadata-Flavor: Google"`).
For EKS, see: https://github.com/open-telemetry/opentelemetry-js-contrib/blob/deb9aa441dc7d2b0fd5ec11b41c934a1e93134fd/detectors/node/opentelemetry-resource-detector-aws/src/detectors/AwsEksDetector.ts#L86 (it gets it from the cert name on `kubernetes.default.svc`, iiuc).
Describe alternatives you've considered
No response
Additional context
No response