-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dev container metadata in image labels #18
Comments
Embedding devcontainer metadata in the generated images as labels seems like a really valuable feature 👍. A couple quick comments/questions:
|
@jkeech There may be a maximum (I need to check), but it's large. From what I've seen, things like Buildpacks add pretty giant json structures as labels - far bigger than what a devcontainer.json would include. The reason for considering separate properties is simplify the process of hand editing and additions into a Dockerfile when you just have one or two. Format wise, json is quote heavy, which means lots of escaping. We could support both I suppose. e.g.
Yeah, we need to use a domain we own (or more accurately, the spec owns), but agreed that the |
If the limit is not too high, it seems cool to me to store the entire partial devcontainer.json as a label, and be able to "reverse engineer" the devcontainer components of an image without too much manual copy/pasting. Doing both also makes sense to me for flexibility. |
This repository has a concrete example of where this could be useful. It uses Buildpacks to generate dev container images w/config. https://github.com/chuxel/devpacks The image build is performed in a way that is not easily to update (since creating a Buildpack is not easy), and does not have an out-of-box way to generate secondary outputs like devcontainer.json. Allowing information to be on labels on the image results in a completely independent workflow being able to participate in devcontainer creation. It also covers #2 and illustrates setting container capabilities for Go (specifically ptrace). To @jkeech's point, I just used the label |
GoalRecord feature metadata in prebuilt images, such that, the image and the built-in features can be used with a devcontainer.json (image-, Dockerfile- or Docker Compose-based) that does not repeat the features or feature metadata. Other tools should be able to record the same metadata without necessarily using features themselves. Feature metadata relevant for using the feature when it is already part of the image: ProposalWe investigated adding a devcontainer.json to the image as a label on the image. This devcontainer.json would contain the merged feature metadata (same merge logic as when starting a dev container, e.g.,
To avoid these hurdles we propose to add metadata to the image with the following structure using one entry per feature:
To simplify adding this metadata for other tools, we also support having a single top-level object with the same properties omitting the We already have the merge logic for merging the metadata from multiple features originating from a devcontainer.json and can use the same logic for also merging this new metadata from the container image. Notes
|
@chrmarti Beyond recoding feature metadata, the goal was also to enable alternate build systems like Buildpacks or Nixpacks or even Docker Compose itself without the ability to output a devcontainer.json file to contribute dev container metadata. This is also important and should not be required to use features under the hood (and in some cases can't use them). Are you thinking the feature metadata is in addition to the main proposal? Or would you express the metadata in this situation like a feature even though no such feature exists? If the idea would be to express the metadata as a feature, we should probably clarify how this should be expressed - in particular how we would think about the "id" in this situation.
@chrmarti FYI - Raised #2 a while back for this bit. Agree 100% we should do it as a part of this. |
This is an important part. Even when using the {
"build": {
"dockerfile": "Dockerfile"
},
"features": {
"node": "latest"
},
"postStartCommand": "npm install",
"customizations": {
"vscode": {
"extensions": [ "my-publisher.myextension" ]
}
}, The resulting image metadata should have all of the feature contributions that need to get applied at runtime as well as the lifecycle hooks like Instead of cherry-picking which properties from devcontainer.json and from features would need to get applied at runtime and carried forward in the image metadata, it might be better to include the full contents. That way we don't have to continually update the list of properties that apply after the image is built as new properties are added to the spec. |
Yeah, postCreateCommand was used in the Buildpacks prototype I did. It is partly why I raised #60. It also assumed |
The difficulty with adding an entire devcontainer.json is that the correct merge logic for a property depends on its semantic. E.g., when merging string properties of the same name I would generally guess that the last one wins, but with To simplify use of the metadata by other tools we support a single object and an array of the same type of object (updating
E.g.: [
{
"postStartCommand": "npm install",
"customizations": {
"vscode": {
"extensions": [
"my-publisher.myextension"
]
}
}
},
{
"feature": "devcontainers/features/node@1.2",
"customizations": {
"vscode": {
"extensions": [
"dbaeumer.vscode-eslint"
]
}
}
},
// ...
] |
@chrmarti @alexdima as we just chatted, this could work if we also did #2 so that we can encourage people not to use things like "runArgs" in the pre-build scenario given we know how common these are. We can update all of our samples as well. (In some ways it's tech-debt to back-port what we already did for features - the reasons for the properties is the same.) Totally good with the array idea and merging later. For the list of properties available in the label though, looking at devcontainer.json, I'd propose we support all General and Lifecycle properties. In particular, these will matter if you pre-build your image that is not in the list above:
These are a bit debatable, but I think customers would expect if you built from a devcontainer.json, they would be applied:
These are not applicable to Dev Container Features (#60), but that's because they are inputs to constructing the image. Here we are dealing with a fully constructed image that either needs to container dev container metadata, or was created as a build from a devcontainer.json file. Or put another way, it may be easier to just say we don't support Scenario Specific Properties. These are really the ones that would differ between orchestrator rather than being about the contents of the image itself. @jkeech made a pretty good point there as I think about it. |
We have to decide on how to merge each property. Going by the above list:
Does it make sense, e.g., for |
@chrmarti The nice thing is there's really three states for booleans, true, false, and unset. With that in mind, I think it's safe to say that "last set value wins" for all booleans. For things that are features, that would also mirror how they are applied, correct? Or are some of these unions? The only reason you'd explicitly set one of these is if you wanted to override the default behavior because it has to be different, which I could see doing in devcontainer.json. Unset is just ignored in that evaluation. An alternative would be to just have devcontainer.json able to override but that results in less predictable behaviors. Agree with the commands and entrypoint being exceptions though. In retrospect, we likely should have supported an array of commands rather than just one since that's often how it gets used (separated by &&). I'd also propose that the default behavior for objects under customizations is straight forward unless the tool overrides it so we can handle things that are not in the schema. Specifically, string/boolean are last one in wins if set, arrays are union and objects you go into individual properties under them and apply these same rules. I think we can do otherPortsAttributes and portsAttribures the same way given how they are set up. Makes it easy to document as well - we can describe default rules and then the few exceptions. |
After some offline discussion, I think it's fine to start with only embedding allow-listed properties from the devcontainer.json and features into the image label. If this ends up being a maintenance burden as new properties are added to the spec, we can always choose to expand the scope to be the full devcontainer and feature metadata, which wouldn't be a breaking change if we choose to do that in the future. |
Another +1 thought on the idea of using an array to persist the metadata here is that we could also add support for the same label existing on an individual container (in addition to the image). That would allow orchestrators to drop in dev container metadata as well in addition to placing it on images. Certain properties wouldn't work like this since they affect how the container is configured, but anything that would fit into the "attach" scenario (#20) could be added this way as well. That would be super powerful. |
Sounds good.
When using an array to collect the metadata, there is no need to merge |
@chrmarti Yeah agreed. I was more mentioning how the merge would happen once the labels were processed.
I think The command has an argument to pass all of the different parts separately, so I think that can include image config, but even in these simple examples, you can see the strong need for returning a normalized, processed config. |
👍 I think the merging semantics should be part of the spec and the reference implementation in the CLI. If it's not, then different implementors could have different behavior due to it being undefined, leading to different results at runtime. |
Actually, this needs clarification: We need to make sure features work independently of their order, e.g., if a feature sets |
I agree. I was thinking specifically about the "customizations": {
"codespaces": {
"repositories": {
"chrmarti/example": {
"permissions": {
"contents": "read",
"workflows": "write"
}
}
}
}
} and "customizations": {
"codespaces": {
"repositories": {
"chrmarti/example": {
"permissions": {
"contents": "write",
"workflows": "read"
}
}
}
}
} you might want to merge the permissions both as |
@chrmarti Yeah I think whatever logic we put in place should be emitted by the CLI. I think documenting a default behavior and then mentioning that tools can override this behavior is a good starting point. To your point, the default on booleans should be "or". For the CLI, generally this indicates that we may need a way for implementors to contribute their own merge semantics where needed so they can alter default behaviors in the CLI. The CLI can also emit all details like it does now for features if an implementing client wants to do something on its own. But, for merged view, we could create a function interface that for a "customizations" name and a given property takes in two values to perform a merge. If no function for a given customization is available, it drops to the default logic. This would be something that could be contributed in addition to a schema update, and we could document that fact That keeps things simple for the majority, but allows for exceptions, rather than forcing complexity on everyone unequivocally. I can see conceptually that there could be customizations that are also used by more than one implementor, so this also keeps logic consistent for those scenarios. |
For image metadata and/or feature metadata? The previous suggestion seemed to be that the "last one set wins" in the image metadata and "
The CLI is currently the "API". If we were to make it extensible we would either have to accept other (small) CLIs as extensions or start having a JavaScript (or so) extension API. We have carefully avoided any need for either so far as that would introduce more complexity. I don't see a pressing need to change that and suggest we first see how far we get without. |
On merging booleans: Maybe |
Updated proposal (I have removed the previously included discussion of alternatives for clarity): GoalRecord feature metadata in prebuilt images, such that, the image and the built-in features can be used with a devcontainer.json (image-, Dockerfile- or Docker Compose-based) that does not repeat the features or feature metadata. Other tools should be able to record the same metadata without necessarily using features themselves. Feature metadata relevant for using the feature when it is already part of the image: ProposalWe propose to add metadata to the image with the following structure using one entry per feature and devcontainer.json (see table below for the full list):
To simplify adding this metadata for other tools, we also support having a single top-level object with the same properties. The metadata is added to the image as a Merge LogicTo apply the metadata together with a user's devcontainer.json at runtime the following merge logic by property is used:
Notes
|
Sounds good to me, only the following comments:
|
It might be best to keep |
@edgonmsft @chrmarti Yeah, any value that is not set should remain at its default across the board. The same is true for booleans, so there's really three states: true, false, unset. This mirrors how devcontainer.json itself is processed. |
@Chuxel Do you agree that features should not be able to change |
@chrmarti Image labels should be able to do it for sure given you can do this from devcontainer.json, but Features is an interesting question given the default is Since the user may extend an image that already has the value set, we'll still need to support processing more than one value. However, given that, adding it into Features later would be pretty trivial I'd assume - so omitting for now is probably the right thing to do until we get feedback otherwise. Unrelated: One other thing we may want to document is how variables will be applied - I'm assuming these would generally just appear as the variables and processed at the end when the container is coming up (rather than when the label is applied). Is that how you were thinking about it as well? |
I agree, that seems to make the most sense. |
Posted the proposal with the discussed clarifications in PR #95 for review. |
Hello. What is the reason why the label is
|
Unlike some developer centric formats, a non-goal for devcontainer.json is to become yet another multi-container orchestrator format. Instead, its goal is to enable development in containers regardless of how they are orchestrated. With this in mind, allowing image labels be used to set devcontainer.json properties could have significant advantages. In particular, there is a problem with images and devcontainer.json files needing to be paired but are deployed and managed separately, which can lead to disconnects. This is important in scenarios like container runtime settings required to enable debugging (e.g. cap_add: SYS_PTRACE), but applies broadly to any settings the images was purpose built to support.
In particular:
This second point also makes it easier to use the metadata when some other mechanism is used to build the container images - whether an orchestrator or another non-Docker CLI like
pack
for Buildpacks.The json-based nature of devcontainer.json would main this fairly straight forward. To simplify just reusing json blobs in automation scenarios, we can also support a whole structure being added:
For hand editing, common and less complex properties could be referenced directly.
More complex
any
type properties could then be encoded json.In the common case, these labels would be automatically added by the dev container CLI to the image when it is pre-built. (
devcontainer build ...
), but manual entry would enable these additions to be embedded in orchestrator formats as well.Any tool that supports the dev container spec would then look for these image labels - regardless of whether they are on a pre-built image or one built by an orchestrator. The reference implementation would then illustrate how to make this happen.
We should be able make this work with any general, lifecycle, or tool specific property.
Dev container features
The dev container CLI enables pre-building images that include the proposed dev container features concept. However, there's a couple things here to consider. First, we can include an image label that indicates whether the build step is already done for the image. This avoids accidentally re-firing of the same feature multiple times based on reused configuration.
Second, we can render out the devcontainer.json properties that tie to the feature in the resulting image. e.g., extensions would include those that were indicated by the feature. Properties like
capAdd
we've talked about as higher level properties, so these could be handled the same way.Processing then is always: build (optional), read labels with metadata, run. You can use devcontainer.json as you use it today, but the labels could also come from somewhere else.
Beyond providing processing consistency, this avoids a possible issue where the referenced feature in devcontainer.json updates, but the pre-built image does not. Problems here could be extremely difficult to track down when they happen otherwise.
Multi-container
Finally - this should help with the [one-to-many problem at least tactically or when there is not a specific integration to an orchestrator format. For pre-building, each image can have a separate devcontainer.json or Dockerfile with the labels - with the dev container CLI enabling features. These can then be referenced in an orchestrator directly. Even when the dev container CLI is not used, at least a subset of properties could be included in the orchestrator format.
Net-net, the resulting image should have labels on it that explains how it should be set up. Orchestrator formats can influence processing where they support adding properties. In all cases, the dev container CLI will inspect the image and make the needed adjustments.
The text was updated successfully, but these errors were encountered: