Devise a robust, transactional system for "publishing" repository versions #1236

Open
horazont opened this issue Nov 5, 2022 · 2 comments
Labels
Editor Tooling Issue relates to process/tooling

Comments

@horazont
Contributor

horazont commented Nov 5, 2022

Summary

Many actions the editor performs involve sending or processing diffs (that is, the changes between the state before and the state after an editor action). For instance, when updating a XEP, the editor will generally send an update email which informs the community about the changes.

There is tooling which analyses the current state of a xeps repository checkout and dumps it in a machine-readable format (make build/xeplist.xml). There are also various tools which act on this state or on the difference between two such states (tools/archive.py, tools/send-updates.py).

The challenge in the context of automation is to know the "old" and the "new" state to base the comparison on.
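
To make the "old" versus "new" comparison concrete, here is a minimal sketch. It assumes each state has already been reduced to a mapping from XEP number to version string; that shape is an assumption for illustration only, since the actual xeplist.xml schema is not described in this issue. `changed_xeps` is a hypothetical helper, and the sample numbers and versions are made up.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory form of one repository state: XEP number -> version.
State = Dict[str, str]


def changed_xeps(old: State, new: State) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {xep: (old_version, new_version)} for every XEP whose version
    differs between the two states. old_version is None for a new XEP."""
    changes: Dict[str, Tuple[Optional[str], str]] = {}
    for xep, new_version in new.items():
        old_version = old.get(xep)
        if old_version != new_version:
            changes[xep] = (old_version, new_version)
    return changes


if __name__ == "__main__":
    # Made-up sample data: one updated XEP and one newly added XEP.
    published = {"1234": "1.0.0"}
    unpublished = {"1234": "1.1.0", "5678": "0.1.0"}
    print(changed_xeps(published, unpublished))
```

The hard part described above is not this comparison itself but deciding reliably which state counts as `published`.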

Concept

Conceptually, this requires that the XEP tooling knows about changes which have been formally "published". "Published" changes have been announced to the mailing list and archived into the attic.

Note that "Published" is not equivalent to changes being in the main branch of the git repository.

Requirements

We don't yet know what such a system should look like, so instead of concrete steps to implement it, here are the requirements:

  • MUST allow running downstream tools on each transaction, passing them the "published" and the "unpublished" state, each in the form of a xeplist.xml file.

    The goal here is to use this system to run tools/archive.py and tools/send-updates.py, and potential future tools.

  • MUST support a dry-run mode

    During a dry-run, changes MUST NOT be marked as published, and downstream tools should be informed in some to-be-defined way that they are in a dry-run situation (preferably via an environment variable).

  • MUST NOT mark changes as published if any of the dependent tools fail to run

    Example: If the archive.py tool crashes on a diff between commits A and B, B MUST be processed again in the future and MUST NOT be marked as "seen" by the system.

  • MUST NOT batch multiple changes to the same document into the same transaction

    Example: Between the previously published state of the repository and the next run of the tool, XEP-1234 gets updated to 1.0.0 and then to 1.1.0. The tooling MUST handle the transitions to 1.0.0 and 1.1.0 separately.

  • SHOULD NOT send duplicate emails for the same revision

    This may happen depending on how the previous requirements are satisfied, but SHOULD be avoided if possible.

  • SHOULD ignore any changes not covered by xeplist.xml
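
A rough sketch of how a single transaction could honour the dry-run and failure requirements above. Everything here is an assumption for illustration: the function name `run_transaction`, the `XEP_DRY_RUN` environment variable, the `mark_published` callback, and the convention of passing the two xeplist.xml paths as trailing positional arguments (the real command lines of tools/archive.py and tools/send-updates.py are not specified in this issue).

```python
import os
import subprocess
from typing import Callable, Sequence


def run_transaction(
    old_xeplist: str,
    new_xeplist: str,
    tools: Sequence[Sequence[str]],
    mark_published: Callable[[], None],
    dry_run: bool = False,
) -> bool:
    """Run every tool on one (published, unpublished) pair of xeplist files.

    The state is marked as published only if all tools exit with status 0
    and this is not a dry run. Returns True if all tools succeeded.
    """
    env = dict(os.environ)
    if dry_run:
        # Hypothetical variable name; the issue leaves the mechanism open.
        env["XEP_DRY_RUN"] = "1"
    for argv in tools:
        # Assumed calling convention: old and new xeplist paths come last.
        result = subprocess.run([*argv, old_xeplist, new_xeplist], env=env)
        if result.returncode != 0:
            # Leave the published state untouched so this change is retried.
            return False
    if not dry_run:
        mark_published()
    return True
```

To satisfy the "MUST NOT batch" requirement, a caller would invoke this once per document transition (for example once for 1.0.0 and once for 1.1.0) rather than once for the whole range. Note that a tool which succeeded before a later tool failed will run again on retry, which is exactly how duplicate emails could arise.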

Additional Notes

  • It should be evaluated whether, if the tool proposed in #1238 ("Create tool to facilitate automatic creation of git tags for new XEP releases") is reliable, we can use a list of "seen" or "published" git tags to keep the state. That would be transparent, neat, and easy to implement.
  • Another way to track "published" would be a suitably protected branch/head in some git repository. The commit at which that head points would be the last published one. This is pretty git-native, which is nice. (But it doesn't natively address the "MUST NOT batch multiple changes" requirement.)
    In this model, a simple transactional system would:
    1. Check out the "published" branch
    2. For each commit in linear order on the main branch (be careful with merges!):
      1. Pull that commit onto the published branch (this must be a fast-forward operation)
      2. Run the tools
      3. On failure, roll back to the previous commit and break out of the loop
    3. Push the published branch state: because we roll back one step on failure, this is always safe
    4. Report any errors $somewhere
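
The numbered steps above could be sketched roughly as follows. This is an illustration under assumptions, not a finished tool: the branch names `published` and `main`, the remote name `origin`, and the `run_tools` callback (which would build the two xeplist.xml files and run the downstream tools) are all hypothetical. Using `git rev-list --first-parent` is one possible way of being careful with merges; other strategies may fit better.

```python
import subprocess
from typing import Callable, Optional


def git(*args: str) -> str:
    """Run a git command in the current directory and return its stdout."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout.strip()


def publish_pending(
    run_tools: Callable[[str, str], bool],
    run_git: Callable[..., str] = git,
    published: str = "published",  # assumed branch name
    main: str = "main",            # assumed branch name
) -> Optional[str]:
    """Advance the published branch one commit at a time.

    Returns the first commit whose tools failed, or None if all succeeded.
    """
    run_git("checkout", published)                        # step 1
    pending = run_git(
        "rev-list", "--reverse", "--first-parent", f"{published}..{main}"
    ).split()
    failed = None
    for commit in pending:                                # step 2
        previous = run_git("rev-parse", "HEAD")
        run_git("merge", "--ff-only", commit)             # step 2.1
        if not run_tools(previous, commit):               # step 2.2
            run_git("reset", "--hard", previous)          # step 2.3
            failed = commit
            break
    run_git("push", "origin", published)                  # step 3
    return failed                                         # step 4: caller reports
```

As the original note says, this handles one commit per step but does not by itself guarantee that a single commit contains only one change per document.
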
@moparisthebest
Contributor

moparisthebest commented Dec 20, 2022

I would like to propose an alternative to this, which would hopefully be much easier to do.

If I understand correctly, we basically want historical versions of XEPs, which is what we currently call the attic. But the current approach is highly brittle, because we are trying to create our own transactional system instead of using a database or git. We already have git, and developers looking at XEPs are already used to the concept of history, so I think what we really want is just a way to view historical XEPs?

Just like how with anything else I'd go to https://github.com/xsf/xeps/blob/master/xep-0368.xml and click "commit" and then "parent" or "history" and then click to look at past versions? Except we'd want these rendered as XEPs right? Why not just write a tool to do exactly this, using git? Possibly even taking gitea/cgit/rgit and just adding a xep-rendering-extension similar to how they render markdown now? Or go through history and using xep2md create a git repo with the same history except XEPs are markdown and then github will render them for free for us?

tl;dr find a way to use existing git viewing tools to solve the attic for us

Do any of those sound acceptable? I understand this slight process change may require Board approval but I can't imagine they'd have a problem with it.

@horazont
Contributor Author

While this helps with viewing, it doesn't help with automating the email sending process, does it?
