Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moving apmpackage to Kibana #124342

Closed
simitt opened this issue Feb 2, 2022 · 10 comments
Closed

Moving apmpackage to Kibana #124342

simitt opened this issue Feb 2, 2022 · 10 comments
Labels
discuss Team:APM All issues that need APM UI Team support Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@simitt
Copy link
Contributor

simitt commented Feb 2, 2022

Kibana >= 8.1 bundles the apmpackage. There is a manual process in place how to keep the package updated. Before investing into automation to update the package and to push the package to the package registry, let's reconsider if the package development should still live in the apm-server repo. This has led to a duplication in validation logic and default values in the past. #123483 summarizes current pain points.

The apmpackage contains definitions for

  • input variables, including type and default values (relevant for starting and configuring the server, but actively used and managed in the custom policy editor in the Kibana code base)
    If the package lived in Kibana, then the future workflow could be that the server team creates an issue for the UI team to add new config options. Such a task would include updating the config options in the package and the UI code, both part of Kibana. This sounds more efficient to me, then updating the config options in the server and then requesting alignment with it in Kibana. It would also make testing easier, if the UI team wasn't depending on the server team implementing the changes first.

  • data stream definitions (relevant for the server and APM UI; owned & maintained by the server team)
    Adding new fields to data streams is driven by the server team, when new fields are supported. The server team could either directly be responsible for opening the relevant PRs in Kibana, or could request this from the UI team. Either way, it might be beneficial to have the data stream definitions in the Kibana code base, so that the UI team can make informed decisions for performant queries without having to check APM Server code.

  • ILM policies (owned & maintained by the server team)
    No clear advantage in having them maintained in the Kibana code; but the policies rarely change, so not a lot of overhead expected either.

  • ingest pipelines (owned & maintained by the server team)
    No clear advantage in having them maintained in the Kibana code; but the policies rarely change, so not a lot of overhead expected either.

The server team has some code in place for creating fields definitions out of the server code. If moving to Kibana, we need to think about how to ensure the server and package templates are kept in sync.

@elastic/apm-ui and @elastic/apm-server please raise arguments and concerns for maintaining the package in the server vs. kibana repository.

@simitt simitt added the discuss label Feb 2, 2022
@botelastic botelastic bot added the needs-team Issues missing a team label label Feb 2, 2022
@joshdover joshdover added Team:APM All issues that need APM UI Team support Team:Fleet Team label for Observability Data Collection Fleet team labels Feb 2, 2022
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Feb 2, 2022
@axw
Copy link
Member

axw commented Feb 3, 2022

I'm a pretty strong -1 on moving out of apm-server.

There's a strong coupling between apm-server code, ingest pipeline, and field definitions; and a strong coupling between integration package vars and apm-server config. Splitting the code from package will slow our velocity when introducing new features/fields, when moving logic out of APM Server and into an ingest pipeline, and when updating config. To run a system test we would need to go through a multi-step process: make the changes in Kibana, waiting for a new snapshot, and then implement the system test. Made a mistake? Go back to step one. Currently we can make these kinds of changes, and functionally test them, atomically.

This has led to a duplication in validation logic and default values in the past. #123483 summarizes current pain points.

I'm struggling to see how having the package in the Kibana repo would address the issues. Do you already have an idea of how it would help? I have some thoughts and questions which I'll leave on the other issue.

input variables, including type and default values (relevant for starting and configuring the server, but actively used and managed in the custom policy editor in the Kibana code base)
If the package lived in Kibana, then the future workflow could be that the server team creates an issue for the UI team to add new config options. Such a task would include updating the config options in the package and the UI code, both part of Kibana. This sounds more efficient to me, then updating the config options in the server and then requesting alignment with it in Kibana. It would also make testing easier, if the UI team wasn't depending on the server team implementing the changes first.

How often are we anticipating that we'll modify vars? We've been focused on this recently as the integration has been brought up from nothing, but now that all config is in place I wouldn't expect to have very frequent changes.

Re testing: bear in mind also that there's a coupling between package vars and the apm-server config structure, i.e. the handlebars template. We would make it a little more efficient in the UI, but a little less efficient in the server.

data stream definitions (relevant for the server and APM UI; owned & maintained by the server team)
Adding new fields to data streams is driven by the server team, when new fields are supported. The server team could either directly be responsible for opening the relevant PRs in Kibana, or could request this from the UI team. Either way, it might be beneficial to have the data stream definitions in the Kibana code base, so that the UI team can make informed decisions for performant queries without having to check APM Server code.

Can you elaborate on what the possible benefits would be to having data stream definitions in the Kibana code base, as opposed to the apm-server codebase, for guiding query performance? I don't see how it makes a difference.

ILM policies (owned & maintained by the server team)
No clear advantage in having them maintained in the Kibana code; but the policies rarely change, so not a lot of overhead expected either.

Agreed on this one, doesn't matter so let's ignore ILM policies.

ingest pipelines (owned & maintained by the server team)
No clear advantage in having them maintained in the Kibana code; but the policies rarely change, so not a lot of overhead expected either.

We haven't historically updated the ingest pipelines frequently, but now that we have per-data stream ingest pipelines I expect the rate to pick up a little bit, particularly as part of elastic/apm-server#3565. We may make changes to the pipeline and code in conjunction, such that the final documents remain identical but the intermediate documents sent to the ingest pipeline are different, e.g. by moving some logic to an ingest processor. That would be more difficult to coordinate when they're in different repositories.

@simitt
Copy link
Contributor Author

simitt commented Feb 8, 2022

Thanks for the insightful response Andrew. After some more conversations offline, I agree that it is not the right step to move the ampackage from the server to Kibana. While it would reduce some dependencies and pain for APM UI developers, it would come at a huge cost for existing development setup in apm-server. Overall the package ties closer into apm-server code, as the apm-server is the writing component. If there is a discrepancy, it would lead to corrupted or lost data. Discrepancies between APM UI and ES setup leads to read issues, that can be fixed by upgrading to a fixed version, without corrupting data.

Nevertheless we should keep having the conversation of how to improve the dev experience for APM UI developers when working on the custom policy editor or other features related to Fleet and the Integration.

Please raise your opinions if you are not convinced that we should keep the apmpackage in the server and move forward with better automation around syncing and releasing the package for improved speed and testability and less errors.

@ogupte
Copy link
Contributor

ogupte commented Feb 28, 2022

From APM UI perspective it would be best to always align package versions with stack versions, including all prerelease versions and BCs. Since the package release is such a manual process today, it's currently unsustainable, but it would be ideal if we can always assume that any policy that comes through the custom APM integration settings editor is of the same version to avoid having to support multiple versions in one page.

cc @joshdover

@joshdover
Copy link
Contributor

Nevertheless we should keep having the conversation of how to improve the dev experience for APM UI developers when working on the custom policy editor or other features related to Fleet and the Integration.

I believe this would be best solved by #70582 which we are making progress on. This would allow UI developers to build the apmpackage from the apm-server repo locally and test it directly in Kibana via upload.

I agree on closing this issue and we will explore better automation that does not require package source to be included in the Kibana repo. In the meantime, the Fleet UI team will likely need to maintain a manual release checklist for updating these packages.

@axw
Copy link
Member

axw commented Jun 27, 2023

Reopening this for discussion.

The current package bundling process is a constant source of pain. It is too complex, with too many things that can go wrong.

I think most of the points I made in #124342 (comment) are still valid, but I'm thinking the benefits of having the package source directly in the Kibana tree may outweigh the downsides now.

@axw axw reopened this Jun 27, 2023
@axw
Copy link
Member

axw commented Jun 28, 2023

If we are going to move the package out of the apm-server repo, then I think we should strongly consider removing the data stream definitions from the package, and instead ship them with Elasticsearch, as the Universal Profiling team has done: https://github.com/elastic/elasticsearch/tree/main/x-pack/plugin/core/src/main/resources/org/elasticsearch/xpack/profiler

With that approach, the index templates would always be available, and we could remove any possibility of apm-server indexing data before things are set up, and auto-creating plain old indices with broken mappings: elastic/elasticsearch#97032 (comment).

@dgieselaar
Copy link
Member

@axw YES!

@simitt
Copy link
Contributor Author

simitt commented Jun 28, 2023

we should strongly consider removing the data stream definitions from the package, and instead ship them with Elasticsearch, as the Universal Profiling team has done

💯

@axw
Copy link
Member

axw commented Sep 14, 2023

I think we can close this under the assumption that elastic/elasticsearch#97546 will be the solution. If that turns out not to be the case, then we can come back and reopen this.

@axw axw closed this as not planned Won't fix, can't repro, duplicate, stale Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Team:APM All issues that need APM UI Team support Team:Fleet Team label for Observability Data Collection Fleet team
Projects
None yet
Development

No branches or pull requests

6 participants