Updating a project fails when Jinja extension with breaking change is used #1170

Open
sisp opened this issue May 31, 2023 · 9 comments
Labels: bug, triage (Trying to make sure if this is valid or not)

Comments

@sisp
Member

sisp commented May 31, 2023

Describe the problem

Consider the following situation:

  1. A template uses v1 of a Jinja extension.
  2. A project is generated using this template.
  3. A breaking change happens in the Jinja extension which gets released as v2.
  4. The template adapts to v2 of the Jinja extension and gets released.
  5. The project is updated using the new version of the template. This may not be possible.

So what's the problem? Copier's update algorithm re-generates a fresh project using the old version of the template (in this case, the one that uses v1 of the Jinja extension) and also generates a fresh project using the new version of the template (in this case, the one that uses v2 of the Jinja extension). But it is only possible to install one version of a package at a time in a virtual env, either v1 or v2 of the Jinja extension. So when v1 is installed, re-generating a fresh project using the old version of the template works but generating a fresh project using the new version of the template fails. And when v2 is installed, generating a fresh project using the new version of the template works but re-generating a fresh project using the old version of the template fails.
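The conflict can be illustrated without Copier or Jinja at all. The following is a minimal sketch, assuming a toy `render` function and two filter tables (`EXT_V1`, `EXT_V2`) that stand in for the installed extension. None of these names exist in Copier. They only model the fact that one environment holds exactly one version of the extension.

```python
# Toy model: the "installed extension" is a table of filter names.
# Only one table can be active per environment, like one package version.
EXT_V1 = {"to_slug": lambda s: s.replace(" ", "-")}
EXT_V2 = {"slugify": lambda s: s.replace(" ", "-")}  # renamed filter (breaking)

# Each template version names the filter it expects (hypothetical structure).
TEMPLATE_V1 = ("to_slug", "hello world v1")
TEMPLATE_V2 = ("slugify", "hello world v2")


def render(template, installed_filters):
    """Apply the filter named by the template, failing if it is missing."""
    filter_name, text = template
    if filter_name not in installed_filters:
        raise LookupError(f"No filter named {filter_name!r}.")
    return installed_filters[filter_name](text)


def can_update(installed_filters):
    """An update needs BOTH renders to succeed in the same environment."""
    try:
        render(TEMPLATE_V1, installed_filters)  # replay of the old template
        render(TEMPLATE_V2, installed_filters)  # render of the new template
    except LookupError:
        return False
    return True


print(can_update(EXT_V1))  # False: the new template cannot render
print(can_update(EXT_V2))  # False: the old template cannot be replayed
```

Whichever version is installed, one of the two renders fails, which is the situation the test case below reproduces with a real Jinja extension.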

Template

See the next section for a complete test case.

To Reproduce

Add this test case to a new file, e.g. tests/test_jinja_extension_breaking_change.py, in Copier's test suite:

import sys
from pathlib import Path
from textwrap import dedent

import pytest
from plumbum import local
from plumbum.cmd import git

from copier.main import run_copy, run_update

from .helpers import build_file_tree


def test_update_with_jinja_extension_breaking_change(
    tmp_path_factory: pytest.TempPathFactory, monkeypatch: pytest.MonkeyPatch
) -> None:
    src, dst, ext = map(tmp_path_factory.mktemp, ("src", "dst", "ext"))

    monkeypatch.setattr("sys.path", [str(ext)] + sys.path)

    with local.cwd(ext):
        Path("test_ext").mkdir()
        Path("test_ext", "__init__.py").write_text(
            dedent(
                """\
                from jinja2.ext import Extension

                def to_slug(value):
                    return value.replace(" ", "-")

                class TestExt(Extension):
                    def __init__(self, environment):
                        super().__init__(environment)
                        environment.filters["to_slug"] = to_slug
                """
            )
        )

    with local.cwd(src):
        build_file_tree(
            {
                "copier.yml": (
                    """\
                    _jinja_extensions:
                        - test_ext.TestExt

                    version:
                        type: str
                    """
                ),
                "{{ _copier_conf.answers_file }}.jinja": "{{ _copier_answers | to_yaml }}",
                "version.txt.jinja": "{{ version }}",
                "output.txt.jinja": "{{ ('hello world ' + version) | to_slug }}",
            }
        )
        git("init")
        git("add", ".")
        git("commit", "-m1")
        git("tag", "v1")

    run_copy(str(src), dst, data={"version": "v1"})

    assert "_commit: v1" in (dst / ".copier-answers.yml").read_text()
    assert (dst / "version.txt").read_text() == "v1"
    assert (dst / "output.txt").read_text() == "hello-world-v1"

    with local.cwd(dst):
        git("init")
        git("add", ".")
        git("commit", "-m1")

    with local.cwd(ext / "test_ext"):
        Path("__init__.py").write_text(
            Path("__init__.py").read_text().replace("to_slug", "slugify")
        )

    with local.cwd(src):
        build_file_tree(
            {
                "output.txt.jinja": "{{ ('hello world ' + version) | slugify }}",
            }
        )
        git("add", ".")
        git("commit", "-m2")
        git("tag", "v2")

    run_update(dst, data={"version": "v2"}, overwrite=True)

    assert "_commit: v2" in (dst / ".copier-answers.yml").read_text()
    assert (dst / "version.txt").read_text() == "v2"
    assert (dst / "output.txt").read_text() == "hello-world-v2"

Then, run tests/test_jinja_extension_breaking_change.py and observe that it fails.

Logs

...
>       run_update(dst, data={"version": "v2"}, overwrite=True)

tests/test_jinja_extension_breaking_change.py:87: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
copier/main.py:939: in run_update
    with Worker(dst_path=Path(dst_path), **kwargs) as worker:
copier/main.py:198: in __exit__
    raise value
copier/main.py:940: in run_update
    worker.run_update()
copier/main.py:768: in run_update
    self._apply_update()
copier/main.py:822: in _apply_update
    self.run_copy()
copier/main.py:700: in run_copy
    self._render_folder(src_abspath)
copier/main.py:604: in _render_folder
    self._render_file(file)
copier/main.py:521: in _render_file
    tpl = self.jinja_env.get_template(src_relpath)
.venv/lib/python3.10/site-packages/jinja2/environment.py:1010: in get_template
    return self._load_template(name, globals)
.venv/lib/python3.10/site-packages/jinja2/environment.py:969: in _load_template
    template = self.loader.load(self, name, self.make_globals(globals))
.venv/lib/python3.10/site-packages/jinja2/loaders.py:138: in load
    code = environment.compile(source, name, filename)
.venv/lib/python3.10/site-packages/jinja2/environment.py:768: in compile
    self.handle_exception(source=source_hint)
.venv/lib/python3.10/site-packages/jinja2/environment.py:936: in handle_exception
    raise rewrite_traceback_stack(source=source)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   jinja2.exceptions.TemplateAssertionError: No filter named 'slugify'.

/tmp/copier.vcs.clone.6a_6fzf8/output.txt.jinja:1: TemplateAssertionError

Expected behavior

The update process should succeed.

Screenshots/screencasts/logs

No response

Operating system

Linux

Operating system distribution and version

Ubuntu 22.04

Copier version

ff5aa05

Python version

CPython 3.10

Installation method

poetry+git

Additional context

I've come to think that re-generating a fresh project using the old version of the template is bad. Instead, the fresh project using the old version of the template should be saved and retrieved during the update process. I haven't figured out a concrete solution yet, but I wonder whether custom Git references might be useful (along the lines of https://iterative.ai/blog/experiment-refs). Perhaps custom Git refs could point to hidden commits, each of which contains a fresh project based on a specific version of a template. Then, those custom Git refs would be pushed as well and Copier's update algorithm could retrieve the fresh project from there instead of re-generating the project. But I haven't tried this out yet nor have I got experience with using custom Git refs. If anybody has a good idea, I'm all ears. 🙏

In fact, the problem isn't limited to Jinja extensions, though that is the most likely case. A breaking change in the Jinja library itself, for example, might also lead to this problem.

@sisp added the bug and triage (Trying to make sure if this is valid or not) labels May 31, 2023
@yajo
Member

yajo commented Jun 1, 2023

Hello! Yes, indeed, this problem has existed forever. The lack of reproducibility inherent in many tools leads to this behavior.

I have been experiencing this same problem for years. The 2020 "prettiermaggeddon" comes to mind: prettier/prettier#9459. The problem was caused by the way pre-commit installed dependencies (no surprise) and the way prettier didn't care about it.

Currently available on master, there's a new command: copier recopy. It reapplies the template using the same source and answers, but ignoring history.

So, if you end up in this situation, just recopy, solve conflicts, and move on. Luckily, it doesn't happen often. It's documented on master too: https://copier.readthedocs.io/en/latest/updating/#when-the-update-gets-broken-because-while-replaying-old-copy

You can also provide an officially supported reproducible build of copier along with your template. See https://github.com/orgs/copier-org/discussions/1130#discussioncomment-5890503.

@yajo yajo closed this as completed Jun 1, 2023
@sisp
Member Author

sisp commented Jun 1, 2023

The problem isn't related to lack of reproducibility in installing tools but to the way the update algorithm works. As Copier needs to re-generate a fresh project from the old template and generate a fresh project from the new template, two (fully deterministic) environments would be needed, but even with Nix, Docker or whatever, only one (fully deterministic) environment (typically the one for the new template) is available. For this, Copier would have to call Nix or Docker within the update process for generating those fresh projects. I don't think that's feasible. And without this, when the old and new version of a template rely on different versions of the same dependency with incompatible API, then it is simply impossible to perform an update. This is unrelated to deterministic installations.

But if the update algorithm only had to generate a fresh project from the new template while a snapshot of a fresh project from the old template were retrieved from somewhere, this particular problem would be solved. I don't think copier recopy can, e.g., remove files that existed in the old template but don't exist in the new template. So updating is superior to re-copying. For this reason, I think this issue shouldn't be considered solved.

@yajo
Member

yajo commented Jun 1, 2023

Yes, indeed. Sorry for closing so fast! I didn't mean the issue is invalid.

I just meant that you can hit the same basic problem (unable to replay) when the render process triggers some impure installation. But indeed you're right and this case can be really disconnected from that other one.

It's also true that recopy isn't a replacement for update. What I meant is that it's a solution. Or a valid workaround. At the end of the day, I don't find myself hitting this problem very often. When that happens, a recopy just solves the problem and lets me continue updating from there onward. I actually found it when updating old templates with the current master, as it includes lots of breaking changes; and recopy helped a lot.

If we stored a replay cache in hidden git refs as you suggest, we wouldn't need to replay (unless there's no such cache) and the problem would disappear (and copier would be faster on updates). However my point is: do we really need all that complexity, when a simple recopy unblocks the situation, and the problem is not extremely common?

Let me reopen while we settle this. 😉

@yajo yajo reopened this Jun 1, 2023
@sisp
Member Author

sisp commented Jun 18, 2023

Another case in which updating a project fails because the update algorithm replays the old template version is when Copier contains a breaking change such that the new template version works with the new Copier version but the old template version is incompatible. For instance, this might be a breaking change in copier.yml such that the new Copier version is incompatible with copier.yml of the old template version.

yajo added a commit that referenced this issue Jun 29, 2023
Docs were a bit repetitive and unclear.

Targets #1170. It could even fix it?
@yajo
Member

yajo commented Jun 29, 2023

Well, this problem is inherent to the design and feature set of Copier, so I think it's time to settle whether it's just a known issue we can live with or there's something doable.

Possible solutions:

  1. Already explained: document (docs: clarify how to use recopy to recover from broken updates #1218) that, if copier cannot replay for whatever reason, you should recopy and solve problems manually.

  2. Make copier build the environment where it's running. I think it's obvious that we're not gonna do that. There are many ways to install Copier, and doing this would turn it into a chicken-and-egg self-bootstrapping package manager. That's a path I don't want to go down, for obvious reasons. You install Copier and then it runs, not the other way around.

  3. Record the path to the executable used in the last play. This would need a specific environment to work with. For example, if you install Copier with pipx and I install it with nix, neither of us could find the other's executable path. And if we did, only you would be able to replay my render, not the other way around. The only environment where a path is a guaranteed reproducible environment is nix, and still you need the source of that path. Thus it doesn't seem like the best option.

  4. Allow the template to specify the path to a reproducible executable. For example, if my template includes an output that produces a reproducible executable like I showcased in https://github.com/orgs/copier-org/discussions/1130#discussioncomment-5890503, I can tell copier how to reproduce the URL to that executable (it should be based on the git commit, of course, which is the only reproducibility source I have) and do the replay running that. In practical terms, this would only be useful for nix or docker executions.

  5. Store a cache of replays on local disk. That would reduce the chance of getting a broken replay and speed things up. On CI, you can still use caches or artifacts to pass that cache across jobs.

  6. Store that cache on some git ref in the subproject origin repo. It seems to me like it would create a lot of unwanted git garbage. Maybe you can specify a repo for caches, which may or may not be the same as the origin.

To be clear, I'm not keen to implement any of these solutions because, for me, number 1 just works and is good enough. Solutions 2 and 3 are clearly no-go. Among the others, which do you prefer?
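For the local disk cache in option 5, one open question is how a cached replay would be identified. The sketch below is only an illustration, assuming the cache key is derived from the template URL, the template commit and the answers. The function names and the cache layout are hypothetical and not part of Copier.

```python
import hashlib
import json
from pathlib import Path


def replay_cache_key(template_url, commit, answers):
    """Derive a stable key from the inputs assumed to determine a fresh render."""
    payload = json.dumps(
        {"template": template_url, "commit": commit, "answers": answers},
        sort_keys=True,  # make the key independent of answer ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_replay_dir(cache_root, template_url, commit, answers):
    """Return the cached render directory, or None on a cache miss."""
    path = Path(cache_root) / replay_cache_key(template_url, commit, answers)
    return path if path.is_dir() else None
```

A key like this does not capture the Copier version or the installed extensions, so it only helps if a render for a given commit and set of answers is treated as immutable once stored.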

@pawamoy
Contributor

pawamoy commented Jun 29, 2023

5 personally, but it's not perfect either. Caches get deleted. Such a feature is a helper at best; sometimes users will have to fall back to recopy. So I wonder if it's worth it.

@sisp
Member Author

sisp commented Jun 29, 2023

6, probably unsurprisingly because I brought it up. I think we can make it optional, so that users who want to benefit from the reliable updating without replay need to push the Git ref. For those who don't want to do that, or for those who do but forgot to do it last time and lost the Git ref, Copier can fall back to replaying with the old template. This way, it's backwards compatible and less intrusive.
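The optional behavior described here amounts to "use the stored snapshot if it exists, otherwise replay as today". A minimal sketch of that control flow, assuming two hypothetical zero-argument callables (`load_snapshot` and `replay_old_template`) that are not existing Copier functions:

```python
def obtain_old_render(load_snapshot, replay_old_template):
    """Prefer a stored snapshot; fall back to replaying the old template.

    Both arguments are hypothetical callables:
    - load_snapshot() returns the stored render, or None if it is missing
      (never pushed, or lost).
    - replay_old_template() regenerates the render, as the update does today.
    """
    snapshot = load_snapshot()
    if snapshot is not None:
        return snapshot, "snapshot"
    return replay_old_template(), "replay"


# Usage: a missing ref falls back to the replay path.
render, source = obtain_old_render(lambda: None, lambda: {"output.txt": "hello-world-v1"})
print(source)  # replay
```

Because the fallback is the existing behavior, projects that never push the ref would see no change.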

Since I seem to be feeling this pain the most, I volunteer to work on a PR. Sounds okay, @yajo and @pawamoy?

@yajo
Member

yajo commented Jun 29, 2023

OK. I guess then that the reference to that cache (the remote, branch, tag, commit and/or whatever is needed) should be stored alongside the copier answers in something like _replay_cache: git@tro:lo/lo.git@asdfasdfasdfasdf or similar, right? That way the next update can look up the last cache reproducibly.

@sisp
Member Author

sisp commented Jun 29, 2023

I think Copier could use a custom refs namespace (like DVC), e.g. refs/copier, and therein simply use the value from the _commit field in the answers file as the ref name. But I haven't tried this out yet, just an idea based on my current understanding of what DVC is doing. Then, we wouldn't need an entry in the answers file for the replay cache.
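As a rough illustration of the naming idea, the sketch below builds the ref name from the `_commit` value and assembles the plain Git commands that would store, push and fetch such a ref. It is untested against Copier and makes several assumptions: the helper names are hypothetical, `snapshot_sha` is assumed to be the hash of an already created commit holding the fresh render, and the `_commit` value is assumed to be a valid Git ref name component. The commands are only constructed, not executed.

```python
def copier_ref(commit):
    """Map the `_commit` value from the answers file to a custom ref name."""
    return f"refs/copier/{commit}"


def store_cmd(commit, snapshot_sha):
    """Point the custom ref at the (hidden) snapshot commit."""
    return ["git", "update-ref", copier_ref(commit), snapshot_sha]


def push_cmd(commit, remote="origin"):
    """Push the custom ref explicitly; such refs are not pushed by default."""
    ref = copier_ref(commit)
    return ["git", "push", remote, f"{ref}:{ref}"]


def fetch_cmd(commit, remote="origin"):
    """Fetch the custom ref so a later update can read the snapshot."""
    ref = copier_ref(commit)
    return ["git", "fetch", remote, f"{ref}:{ref}"]


print(" ".join(push_cmd("v1")))  # git push origin refs/copier/v1:refs/copier/v1
```

With this layout nothing extra needs to be recorded in the answers file, since the ref name is recomputed from `_commit` on each update.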

yajo added a commit that referenced this issue Jul 8, 2023
Docs were a bit repetitive and unclear.

Targets #1170. It could even fix it?

Co-authored-by: Timothée Mazzucotelli <pawamoy@pm.me>
Co-authored-by: Sigurd Spieckermann <2206639+sisp@users.noreply.github.com>
@sisp sisp self-assigned this Jul 13, 2023
Projects
None yet
Development

No branches or pull requests

3 participants