Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First draft on multi-version support #2253

Closed
wants to merge 2 commits into from

Conversation

EmileTrotignon
Copy link
Collaborator

This a rough draft for multi-version support for ocamlformat.
It use first class modules to pick the specified version of ocamlformat.

There are two main limitations with the design :

  • First of all, the configuration type is shared across all version. To fix this would probably require modular explicit.
  • Second, the configuration cmdliner terms are also shared across all version. This is probably easier to fix, it just requires some hacks (we need to fetch the version number before parsing any other config).

I would really like comments on the overall design, I do not think this is ready for merging at all.
If the design of the patch looks good to everyone, then a lot of cleaning up and some testing would be required.

@gpetiot
Copy link
Collaborator

gpetiot commented Feb 20, 2023

I will have a look soon.

I'm also interested in having the opinion of @emillon on this design (if you have the bandwidth ofc!), or someone else also having experience on the versioning of OCaml tooling.

@emillon
Copy link
Collaborator

emillon commented Feb 21, 2023

Sure! But it's probably not very useful if I comment on the implementation. Do you have a description of the problem you're solving and what the approach looks like?

@EmileTrotignon
Copy link
Collaborator Author

EmileTrotignon commented Feb 21, 2023

@emillon The problem we are trying to solve is the following :
ocamlformat changes the formatting in each version. This is unavoidable, as even if we do not wish to change anything, some bugs are going to be found.

Some projects don't want to have their formatting change all the time. They use a version=... option, that makes ocamlformat fail if the version is not correct.
This can be really annoying if you want to works on two projects at the same time, but the two project don't use the same version of ocamlformat.
The proposed solution is to use first-class functors to basically vendor all future versions of ocamlformat.

In the lib folder we have three new folders :

  • latest that contains the latest version of ocamlformat
  • ocamlformat that contains a new library that picks the correct version using first class functors.
  • common that contains module common to all version (there is also conf, but it should probably be merged with common)

When a newer version is released, we should copy the latest folder to a folder named version_xxx and start working on the new version in latest.

This approach may seem heavy-handed, and in its current form it is, but I think we get to choose how heavy or light this is going to be by deciding which modules go into the common folder. Going lighter by putting more into common makes it more likely that differences in formatting creeps in, but I think that is a trade-off we can't avoid.

Another advantage to this approach is that it does not care about how each version is implemented, which is a good thing as big changes are planned (switch to PPrint)

Currently the parser is not versionned, but it should probably be as the AST is being changed, and this would cause incompatibility later on.

@emillon
Copy link
Collaborator

emillon commented Feb 21, 2023

OK, thanks for the explanation. I think that it's a great problem to solve, that has been there with ocamlformat since I've known about it. Thanks for working on that.

Here are some remarks about this approach, which I'll leave up to the ocamlformat dev team to evaluate (though I'm happy to expand on some points).

The problem shares some constraints with how dune works, so some of what we've learned there might apply.

We want dune to have excellent backward compatibility properties and remain a no-brainer dependency, so it's important that opam upgrade dune is never something painful for users. To achieve that, we have the notion of project, which is a set of metadata that applies to a set of source files. In particular, a project is built against a particular version of the dune language, and future versions of the dune executable should interpret dune files with the same semantics as the version that introduced it (ex: dune 3.7 should interpret (lang dune 1.0) projects like dune 1.0.0 did). One technical point is that it's important to be have a forwards-compatible way of declaring the version you're using. In dune, this means that the first file of the project file should match (lang dune %d.%d), but the parser should not try to parse the rest until it's sure it's a version that uses s-expressions. Opam recently made a similar change and now mandates that opam-version is the first line in the file. This allows changing the syntax in the future without confusing old versions.

To do so, dune records the version of the project it is building, and compares that version in various places to determine whether it should do the "old" or the "new" behavior. This sounds bad but in practice it is not a big issue in the context of dune since new behaviors are often added in relation to a new field. So we just tag that the field is added in version X and we don't have anything else to do.

Overall, I think that this is working well. Some things I like and were not completely obvious at start:

  • it is possible to get better error messages just by upgrading the dune executable. For example, future versions of dune know which version of dune lang added which construct. Error messages and the CLI in general can be improved independently of new features.
  • it is useful to be able to fix issues in old versions of the language without having to version the fix. Recently we found that the <= operator had been parsed as < since forever. Instead of patching N versions we just have to make a single change. (Fix version check for 4.14.0 new syntax #2240 is similar in the context of ocamlformat)
  • more generally, this is caused by the fact that users can upgrade and are not stuck on old versions. we don't have an explicit support policy, but in practice we only support one or two minor branches (3.6.x and 3.7.x) and to a lesser extent the previous major branch (2.9.x) (in the sense that these are the only branches we would consider backporting fixes to)

As for the approach you're suggesting, the concerns I would have are:

  • binary size. the binary is about 30MB now, supporting a handful of versions would make that grow to 100MB+. There's probably some sharing in the libs that can alleviate that, but making N copies is not going to be cheap in disk space or link time.
  • there might be some issues with third party dependencies: if version A depends on lib L.1, version B depends on lib L.2, you can't easily have a binary that links to L.1 and L.2.
  • it'd be useful to have a policy for versions you support (say, only 5 language versions, or compatible with 2 years of releases, something like that).
  • in terms of prototyping, I think it's more convincing with a backtest: instead of putting in place the architecture for the current release and the releases from now on (which will require solving the problems as they go), it would be more representative to create a version that knows about, say, 0.18 to 0.24. There sure will be difficulties in doing so, but you have all of them on hand to solve; in contrast to commiting to a scheme where there will be difficulties you can't know about today.

Happy to expand on some points if you want more info.

@EmileTrotignon
Copy link
Collaborator Author

Thank for the feedback @emillon

Sometime caught my attention in particular :

it is useful to be able to fix issues in old versions of the language without having to version the fix. Recently we found ocaml/dune#6928. Instead of patching N versions we just have to make a single change. (#2240 is similar in the context of ocamlformat)

In the context of ocamlformat, when bugs affect formatting we would want to keep them in the old versions. This is not the case of all bugs, a bug that causes a crash would need fixing everywhere. Its true that my design is good for the first case but not the second.

Regarding size, the folder that would duplicated is 21M including build artifacts, and the .a file itself is 2M.
I am pretty sure they can be made smaller by moving a lot of stuff in the common folder.

@gpetiot gpetiot removed the CLA Signed label Apr 6, 2023
@nojb
Copy link
Contributor

nojb commented Nov 3, 2023

Just out of curiosity, there is another approach that could be considered: version the ocamlformat executable itself. So you could have ocamlformat-0.26.1 and also ocamlformat-0.24.0 installed at the same time. Each version would live in its own branch of the repository and critical fixes would have to be backported, but otherwise developers would only work on the latest version. The ocamlformat (no version) executable could be a shim that dispatched to the right versioned executable based on the version option. This would allow workspaces that contain projects that use different versions of ocamlformat to be formatted together.

@EmileTrotignon
Copy link
Collaborator Author

This is a better approach for multiversion support : #2535

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants