Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StatefulSet as a deployment option for Elastic Agent #7357

Merged
merged 14 commits into from
Dec 19, 2023

Conversation

pkoutsovasilis
Copy link
Contributor

@pkoutsovasilis pkoutsovasilis commented Dec 1, 2023

This PR enriches the possible types of deploying agent by adding support for statefulSet. A use-case, that this PR adds support for, is deploying ksm with sharding (requires statefulset) with agent containers that extract only kube-state metrics, e.g. as being done here. Having this feature will also benefit this one elastic/elastic-agent#3847.

@pkoutsovasilis pkoutsovasilis added >feature Adds or discusses adding a feature to the product elastic-agent For tasks related to Elastic Agent support labels Dec 1, 2023
@pkoutsovasilis pkoutsovasilis changed the title feat: Aadd support for statefulset agent type feat: Add support for statefulset agent type Dec 1, 2023
@pkoutsovasilis pkoutsovasilis force-pushed the pkoutsovasilis/agent_statefulset branch from db9ab00 to f2bcb0b Compare December 1, 2023 10:56
@barkbay barkbay self-requested a review December 13, 2023 09:09
Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A use-case, that this PR adds support for, is deploying ksm with sharding (requires statefulset) with agent containers that extract only kube-state metrics,

Do you think it would be possible to add an example in config/recipes/elastic-agent?

Also I believe GetPodTemplate() is not tested, but it's not from your PR. It could be interesting to check somewhere that the expected "native" resource (daemonset, deployment or statefulet) is created using the expected template.

test/e2e/test/agent/builder.go Outdated Show resolved Hide resolved
pkg/controller/common/statefulset/reconcile.go Outdated Show resolved Hide resolved
pkg/controller/common/statefulset/reconcile.go Outdated Show resolved Hide resolved
@pkoutsovasilis
Copy link
Contributor Author

A use-case, that this PR adds support for, is deploying ksm with sharding (requires statefulset) with agent containers that extract only kube-state metrics,

Do you think it would be possible to add an example in config/recipes/elastic-agent?

Also I believe GetPodTemplate() is not tested, but it's not from your PR. It could be interesting to check somewhere that the expected "native" resource (daemonset, deployment or statefulet) is created using the expected template.

@barkbay I added a recipe for ksm-sharding specifically that utilises the introduced statefulset type of the agent. Would you like me to try to tackle GetPodTemplate() in this PR?

barkbay
barkbay previously approved these changes Dec 18, 2023
Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pkg/apis/agent/v1alpha1/validations.go Show resolved Hide resolved
pkg/apis/agent/v1alpha1/agent_types.go Show resolved Hide resolved
@barkbay barkbay dismissed their stale review December 19, 2023 07:29

Sample does not seem to be working as expected.

Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a test using the sample and Agent is stuck in a red state:

k get es,agent,sts,pods                  
NAME                                                       HEALTH   NODES   VERSION   PHASE   AGE
elasticsearch.elasticsearch.k8s.elastic.co/elasticsearch   green    3       8.10.4    Ready   118s

NAME                                       HEALTH   AVAILABLE   EXPECTED   VERSION   AGE
agent.agent.k8s.elastic.co/elastic-agent   red                  1          8.10.4    119s

NAME                                        READY   AGE
statefulset.apps/elastic-agent-agent        1/1     116s
statefulset.apps/elasticsearch-es-default   3/3     117s

NAME                             READY   STATUS    RESTARTS   AGE
pod/elastic-agent-agent-0        2/2     Running   0          116s
pod/elasticsearch-es-default-0   1/1     Running   0          116s
pod/elasticsearch-es-default-1   1/1     Running   0          116s
pod/elasticsearch-es-default-2   1/1     Running   0          116s
pod/kibana-kb-7d57f7cf77-qvkxj   1/1     Running   0          116s

I have the following error warning in Agent logs (k logs pod/elastic-agent-agent-0 -c agent):

{
    "log.level": "warn",
    "@timestamp": "2023-12-19T07:30:40.848Z",
    "message": "Cannot index event publisher.Event ... (status=400): {\"type\":\"document_parsing_exception\",\"reason\":\"[1:2047] failed to parse: data stream timestamp field [@timestamp] is missing\",\"caused_by\":{\"type\":\"illegal_argument_exception\",\"reason\":\"data stream timestamp field [@timestamp] is missing\"}}, dropping event!",
    "component": {
        "binary": "filebeat",
        "dataset": "elastic_agent.filebeat",
        "id": "filestream-monitoring",
        "type": "filestream"
    },
    "log": {
        "source": "filestream-monitoring"
    },
    "log.origin": {
        "file.line": 446,
        "file.name": "elasticsearch/client.go"
    },
    "service.name": "filebeat",
    "ecs.version": "1.6.0",
    "log.logger": "elasticsearch"
}

Also ecs.version seems to be duplicated in the original document.

@barkbay
Copy link
Contributor

barkbay commented Dec 19, 2023

I did a test using the sample and Agent is stuck in a red state:

Actually Agent became green once I restarted the ECK Pod:

NAME                                       HEALTH   AVAILABLE   EXPECTED   VERSION   AGE
agent.agent.k8s.elastic.co/elastic-agent   green    1           1          8.10.4    15m

I think we may have a bug in Agent where reconciliations don't always happen when they should (in case of state change for example) 😕 .

Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re LGTM

@naemono naemono changed the title feat: Add support for statefulset agent type Add StatefulSet as a deployment option for Elastic Agent Dec 19, 2023
@naemono naemono merged commit 4477544 into main Dec 19, 2023
6 checks passed
@naemono naemono deleted the pkoutsovasilis/agent_statefulset branch December 19, 2023 16:08
@barkbay barkbay mentioned this pull request Dec 20, 2023
robbavey pushed a commit to robbavey/cloud-on-k8s that referenced this pull request Dec 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
elastic-agent For tasks related to Elastic Agent support >feature Adds or discusses adding a feature to the product v2.11.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants