Capture VCS information #147
Comments
What's the difference between |
To answer my own question (I think): |
Thanks for pointing that out @axw, just added descriptions for each field. |
GitHub just happens to have very similar repo and web URLs; some other platforms have quite different URLs. |
I think we need to consider carefully if this should be under |
happy to see this move forward btw @graphaelli. Thanks for getting it started. |
The nesting proposed is driven partially by multiple repository support, whether those are multiple first-party repositories or repositories from dependencies. What this proposal is after is a way to specify a default code repository and eventually add an override code repository. This means that for a given filename and line number, code would resolve the repo using:
1. stack frame
2. service
3. Kibana mapping
The ref would have to be specified with the frame and otherwise fall back to the service for that information. We can dig into details later, but I expect if repo is provided at a more specific level, ref must also be provided. Initially these 3 frames would map to code as follows:
{
  "service": {
    "environment": "production",
    "name": "some-node-service",
    "vcs": {
      "revision": "abcd1234"
    }
  },
  "span": {
    "stacktrace": [
      {
        "abs_path": "/app/server.js",
        "filename": "server.js",
        "library_frame": false,
        "line": {
          "number": 101
        }
      },
      {
        "abs_path": "/app/vendor/handle.js",
        "filename": "handle.js",
        "library_frame": false,
        "line": {
          "number": 67
        },
        "vcs": {
          "repo": "handle.git",
          "revision": "xxx"
        }
      },
      {
        "abs_path": "/app/node_modules/express/lib/router/layer.js",
        "filename": "node_modules/express/lib/router/layer.js",
        "library_frame": true,
        "line": {
          "number": 95
        },
        "vcs": {
          "revision": "4.17.1"
        }
      }
    ]
  }
}
Knowing the final frame is a Open to ideas / counter proposals / etc. The goal is to open issues as soon as possible for apm-server to intake and persist this information and agents to provide it. |
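For illustration, the resolution order proposed above (stack frame, then service, then Kibana mapping) could be sketched roughly as follows. This is not code from apm-server or any agent: the function name `resolve_vcs`, the shape of `kibana_mapping`, and the choice to raise an error when a frame supplies a repo without a revision are all assumptions made for the sketch. How library frames are resolved is not covered here.

```python
from typing import Optional


def resolve_vcs(frame: dict, service: dict, kibana_mapping: Optional[dict] = None) -> dict:
    """Pick repo and revision for one stack frame (hypothetical helper)."""
    frame_vcs = frame.get("vcs", {})
    service_vcs = service.get("vcs", {})

    # Assumption: a repo given at the frame level must come with its own revision.
    if "repo" in frame_vcs and "revision" not in frame_vcs:
        raise ValueError("a frame-level repo must come with a revision")

    resolved = {}
    # Repo: most specific source first (frame, service, then the UI-side mapping).
    for source in (frame_vcs, service_vcs, kibana_mapping or {}):
        if "repo" in source:
            resolved["repo"] = source["repo"]
            break
    # Revision: frame first, otherwise fall back to the service.
    for source in (frame_vcs, service_vcs):
        if "revision" in source:
            resolved["revision"] = source["revision"]
            break
    return resolved
```

With the payload above, the first frame would take its revision from `service.vcs` and its repo from whatever the mapping supplies, while the second frame would use its own `handle.git` / `xxx` pair.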
@graphaelli that makes sense to me, but I'm a little concerned about increasing the size of stack traces at the API level, given that this is now adding to every frame from a third-party repo, which in some apps/languages may be many. It's already expensive as it is for some agents. Also, if we add profile events it may be useful to separate the mapping so we can reuse existing profiling formats. Maybe we could go with optionally adding vcs info to stack frames for now, and later introduce a mapping in the metadata based on module prefixes? If we were to do that, would it still make sense to have vcs nested under service? |
Same here, that's why I originally proposed we save that for later (vcs under stack frame), at the expense of not supporting multiple repositories / library frames.
Agreed. This makes me rethink the proposed resolution chain for which repo to use. Perhaps the Kibana mapping should always win if present so that older events that include repo data could still be used, in case the repo is moved after the event occurs. I think that's a discussion for another issue.
I'm less focused on the repo mapping than on the revision. The benefit of allowing repo via intake is that agents can automatically report it (in some cases, eg . |
I was thinking that the mapping in the intake metadata would include both the repo and the commit, mapping from module name -> {repo, commit}. This assumes a single version/commit of each dependency though, and some languages do support multiple, so it wouldn't always be sufficient. I'm still left wondering if |
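As a rough sketch of the module name -> {repo, commit} idea, a longest-prefix lookup might look like the following. The function name `lookup_vcs`, the `/` separator, and the example module names are hypothetical; nothing here reflects an agreed intake format, and it keeps the single-version-per-dependency limitation mentioned above.

```python
from typing import Optional


def lookup_vcs(module: str, mapping: dict, sep: str = "/") -> Optional[dict]:
    """Return the entry whose key is the longest prefix of `module`, if any."""
    best = None
    for prefix in mapping:
        # Match whole path segments only, so "example.com/app" does not
        # accidentally match "example.com/application".
        if module == prefix or module.startswith(prefix + sep):
            if best is None or len(prefix) > len(best):
                best = prefix
    return mapping[best] if best is not None else None


# Hypothetical metadata: module prefix -> {repo, commit}
metadata_vcs = {
    "example.com/org/app": {"repo": "https://example.com/org/app.git", "commit": "abcd1234"},
    "example.com/org/app/vendor/handle": {"repo": "handle.git", "commit": "xxx"},
}
```

Several prefixes can point at the same repo, which is how multiple modules from one repository would be covered.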
Discussed with @axw, mostly around ways to communicate the vcs information from agents efficiently while still enabling the UI to reliably map the code for a given frame. Expanding on the comment above, the idea is to send vcs info via event metadata and perform the mapping from each frame to repo & revision later, perhaps in apm-server. In that case, it may be that there is nothing special about mapping first-party vs dependency code. It's unclear whether that works for all languages / agents. I'm not sure file prefixes are reliable either, as that information is probably lost in some languages. For discussion's sake, this is what metadata might look like if vcs were included only there instead of also in stack frames:
It does look like a |
For the lack of better alternatives, the Java agent sets the package name as the |
Thanks for raising @felixbarny. This is a concern we've discussed a bit but haven't explored deeply. Curious to hear from other @elastic/apm-agent-devs on feasibility. |
Is that problematic? We can map multiple modules to the same repo.
I think this will generally be pretty complicated to set up for Go devs, particularly as more people move towards using Go Modules where vcs info isn't necessarily locally available. If it's helpful I can try to describe what I think would be a more ideal approach for Go, but I don't want to distract the conversation. |
To enable deeper code integration, a way to identify the exact version of a service/file/dependency would be very useful. #130 introduces some dependency-specific concepts. The proposal here is to capture version control information for all first-party/non-dependency code, though a single, reusable definition of fields probably makes sense. elastic/ecs#560 may be useful as well.
Note that service.version may also contain useful information but isn't purpose built for integration purposes - it commonly contains user-defined values like v7, 2019-09-16-001, etc.
I propose we capture the following fields under vcs:
repository: https://github.com/elastic/apm.git
revision: f009776789e91de1909ca59d8415df64c8db16bb
type: git
url: https://github.com/elastic/apm
That group would be captured under service initially, and considered for inclusion in each stacktrace.frame later. The result would be service.vcs.revision, etc.
Agents would collect this information automatically if possible, eg if a .git directory is present, and otherwise provide a mechanism for users to specify vcs information.
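A minimal sketch of what automatic collection might look like when a .git directory is present, with user-specified values taking precedence. The helper names (`find_git_dir`, `read_git_revision`, `collect_vcs`) and the `user_config` parameter are hypothetical, not part of any agent. The sketch only reads `HEAD` and loose ref files; packed refs, worktrees where `.git` is a file, and detection of the repository URL are left out.

```python
from pathlib import Path
from typing import Optional


def find_git_dir(start: Path) -> Optional[Path]:
    """Walk up from `start` until a `.git` directory is found."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / ".git"
        if candidate.is_dir():
            return candidate
    return None


def read_git_revision(git_dir: Path) -> Optional[str]:
    """Read the checked-out commit from HEAD, following one symbolic ref."""
    head_file = git_dir / "HEAD"
    if not head_file.is_file():
        return None
    head = head_file.read_text().strip()
    if not head.startswith("ref: "):
        return head  # detached HEAD: the file holds the commit hash itself
    ref_file = git_dir / head[len("ref: "):]
    # Packed refs are not handled in this sketch.
    return ref_file.read_text().strip() if ref_file.is_file() else None


def collect_vcs(start: Path, user_config: Optional[dict] = None) -> dict:
    """Merge auto-detected values with user-specified ones (user values win)."""
    vcs = {}
    git_dir = find_git_dir(start)
    if git_dir is not None:
        vcs["type"] = "git"
        revision = read_git_revision(git_dir)
        if revision:
            vcs["revision"] = revision
    vcs.update(user_config or {})
    return vcs
```

An agent could call something like `collect_vcs(Path.cwd(), {"repository": "https://github.com/elastic/apm.git"})` at startup and report the result under service.vcs.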