Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add meaningful names to AnnotatedString annotations to make their usage understandable #55249

Open
aplavin opened this issue Jul 25, 2024 · 30 comments · May be fixed by #55741
Open

Add meaningful names to AnnotatedString annotations to make their usage understandable #55249

aplavin opened this issue Jul 25, 2024 · 30 comments · May be fixed by #55741
Labels
strings "Strings!"

Comments

@aplavin
Copy link
Contributor

aplavin commented Jul 25, 2024

This is IMO quite unintuitive and confusing to read:

julia> s = Base.AnnotatedString("this is an example annotated string", [(1:18, :A => 111), (12:28, :B => 222), (18:35, :C => 333)])

julia> Base.annotations(s)[2][1]
12:28

julia> Base.annotations(s)[2][2][1]  # or Base.annotations(s)[2][2].first
:B

julia> Base.annotations(s)[2][2][2]  # or Base.annotations(s)[2][2].second
222

Would be much nicer and self-explanatory to use meaningful names instead of numbers:

julia> Base.annotations(s)[2].region
12:28

julia> Base.annotations(s)[2].label
:B

julia> Base.annotations(s)[2].value
222
@SarthakPaandey
Copy link

i want to work on this issue.

@tecosaur
Copy link
Contributor

I see the point here, and I think a NamedTuple{(:region, :label, :value)} could make sense, but this would be a breaking change somewhat late in the development cycle.

@aplavin
Copy link
Contributor Author

aplavin commented Jul 25, 2024

but this would be a breaking change

Before 1.11 release is the only time to make such "breaking" changes (not really breaking because feature wasn't released). Otherwise Julia and all users are stuck with these decisions "forever".

somewhat late in the development cycle

Not sure what exactly you mean here... Now appears to be a common reasonable time to try using 1.11 in realistic scenarios, given that it's becoming more stable and regressions are getting fixed. In comparison, nightly is much less stable, so trying 1.12 features now and giving feedback based on actual experience is more challenging, only for the most dedicated users.

@tecosaur
Copy link
Contributor

Before 1.11 release is the only time to make such "breaking" changes (not really breaking because feature wasn't released)

Right, I should clarify that this is breaking with Julia v1.11-rc1, not a currently released Julia. However, my understanding of the alpha, beta, and rc stages is that non-bugfix changes are more appropriate for the alpha and maybe beta releases.

On that note, with

somewhat late in the development cycle

I specifically mean this in the sense that this feature has been part of Julia 1.11 for a while now, and we're now just before the release of 1.11.

In this sense "let's change an aspect of the API of a new 1.11 feature" is rather late in the development cycle.

I do sympathise with only just trying a new feature now and having thoughts, but it places a fair bit of time pressure on me to make such changes and inconvenience other people involved in the release process if we decide this change should be part of 1.11.

I think this could be a nice API tweak, it complicates construction but makes the intent in the values more obvious. Not 100% sold though, so I'd be interested in hearing other thoughts, and also on the matter of whether such a change is even potentially appropriate for an rc-stage release.

@tecosaur
Copy link
Contributor

@fatteneder I'm not sure quite what opinion you're trying to express via 👎, care to elaborate?

@nsajko
Copy link
Contributor

nsajko commented Jul 27, 2024

So what if it's late in the cycle?

@fatteneder
Copy link
Member

fatteneder commented Jul 27, 2024

I dislike that this is quality-of-life improvement might not happen, and that we might be stuck now with this until a hypothetical v2.

Right, I should clarify that this is breaking with Julia v1.11-rc1, not a currently released Julia. However, my understanding of the alpha, beta, and rc stages is that non-bugfix changes are more appropriate for the alpha and maybe beta releases.

It would be nice to have this confirmed from someone in charge.

IMO breaking RC should be ok, after all its the last chance to fix things before you are bound to the backwards compatibility guarantee that definitely applies to full release.

@tecosaur
Copy link
Contributor

I dislike that this is quality-of-life improvement might not happen, and that we might be stuck now with this until a hypothetical v2.

I fully appreciate that, I just don't know how that jibes with the way that release candidates are handled (I don't think we have any docs anywhere on this?).

It also makes annotations a bit more of a pain to write if you're writing a lot of them, but maybe that can be resolved with a constructor?

@aplavin
Copy link
Contributor Author

aplavin commented Jul 27, 2024

It also makes annotations a bit more of a pain to write if you're writing a lot of them

The constructor can support both annotations formats: one convenient for manual writing (is it expected to be common?) and another convenient for any programmatic processing (and returned from annotations()).
The latter should have names to make any processing code readable, while the former can be anything suitable for explicitly listing them in the code, eg the current version.

Of course, if hand-writing annotations in code isn't supposed to be very common, then the corresponding format may not be needed at all.
Extra verbosity doesn't depend on the number of annotations btw:

julia> NamedTuple{(:range,:label,:value)}[
       (1:5, :x, 123),
       (3:6, :y, 456),
      # and more ...
       ]
2-element Vector{NamedTuple{(:range, :label, :value)}}:
 (range = 1:5, label = :x, value = 123)
 (range = 3:6, label = :y, value = 456)

@nsajko
Copy link
Contributor

nsajko commented Jul 28, 2024

Why would this have to be a named tuple instead of a struct?

@tecosaur
Copy link
Contributor

tecosaur commented Jul 28, 2024

Mmm, so I'd say the contenders for the form are:

  • A Tuple{UnitRange{Int}, Pair{Symbol, Any}}, as currently used
  • A NamedTuple{(:range, :label, :value), Tuple{UnitRange{Int}, Symbol, Any}}
  • Defining:
struct Annotation
    range::UnitRange{Int}
    label::Symbol
    value::Any
end

or similar.

I do just wish this came up in #49586 rather than once it's part of a release candidate 🫠.

@aplavin
Copy link
Contributor Author

aplavin commented Jul 28, 2024

Mmm, so I'd say the contenders for the form are:

IMO anything with reasonable names is better than opaque numbers, don't have a preference between 2 and 3.

I do just wish this came up in #49586 rather than once it's part of a release candidate 🫠.

With base Julia, one cannot try out a speicifc new feature – it's all or nothing. There are still known rough edges and regressions, but RC at least tend to have fewer of them, making it feasible to actually try using the upcoming version in real scenarios. I only noticed this issue when actually doing processing/transformations/aggregations of annotations in a string.
Using and evaluating individual new features is much more straightforward when they are packages vs julia base.

@LilithHafner LilithHafner added the strings "Strings!" label Aug 1, 2024
@LilithHafner LilithHafner added this to the 1.11 milestone Aug 1, 2024
@LilithHafner
Copy link
Member

Triage things this is something definitely worth doing. @tecosaur will try to do it within a fortnight and would be happy to receive help.

@JeffBezanson
Copy link
Sponsor Member

To clarify on the release process issue, yes it is ok to break things in a release candidate in a new feature that is not yet in a final release.

@JeffBezanson
Copy link
Sponsor Member

A good approach is to define const Annotation = @NamedTuple{range::UnitRange{Int}, label::Symbol, value::Any}, so it has the functionality of a NamedTuple but the value::Any declaration is there to avoid dealing with heterogeneous named tuple types. Type alias printing can handle it too.

@tecosaur
Copy link
Contributor

I'm still looking for the time to make this change. In the meantime, I'll just ask if anyone has thoughts on what other functions that have the label-value pair without the range (e.g. AnnotatedChar, StyledStrings.eachregion) should return.

Using @NamedTuple{label::Symbol, value::Any} would be the obvious choice, but I may as well see what people's thoughts are.

@LilithHafner
Copy link
Member

I agree that @NamedTuple{label::Symbol, value::Any} is good. I can't think of anything better.

@JeffBezanson
Copy link
Sponsor Member

Well, here we are, the last item on the 1.11 milestone. This is a good example of how Base is not a good venue for designing novel, complex APIs. Instead of having both AnnotatedString and julia 1.11, we have neither.

@tecosaur
Copy link
Contributor

tecosaur commented Sep 7, 2024

I think more than anything else this illustrates the problem with:

  • Changes asked for late in a release cycle
  • When there is a lone person who is responsible/actively working on the subsystem in question
  • And said person is just about to enter a very busy period

As alluded to in my last message, and stated on Slack, over the past ~month I simply haven't had the time for this. The good news is that this busy period started winding down a few days ago, and so I can actually work on this now.

@tecosaur
Copy link
Contributor

tecosaur commented Sep 8, 2024

Update: I think I have this implemented now, along with the required changes to StyledStrings and JuliaSyntaxHighlighting. I now need to do a few sanity checks and update the tests, then I'll put a PR up.

@tecosaur tecosaur linked a pull request Sep 11, 2024 that will close this issue
@JeffBezanson
Copy link
Sponsor Member

  • Changes asked for late in a release cycle

But the whole point is that new APIs need time to try out and revise the design. That inherently makes it harder to work within Base. Is it workable to mark the new AnnotatedString APIs as experimental? I don't have a good handle on whether that would be realistic. Of course it wouldn't necessarily stop people from using it, but might be better than nothing.

@aplavin
Copy link
Contributor Author

aplavin commented Sep 11, 2024

I was even a bit surprised this went into Base without trying out the interface in a package first... #49586 (comment)
But now – nice if at least readily apparent and reported api improvements get in before the release :)

@oschulz
Copy link
Contributor

oschulz commented Sep 12, 2024

Is it workable to mark the new AnnotatedString APIs as experimental

That approach worked quite well with the multithreading API, I think?

@KristofferC
Copy link
Sponsor Member

KristofferC commented Sep 13, 2024

It should be moved out of Base (into StyledStrings or its own stdlib). join and * will have to be figured out later. This + JuliaLang/StyledStrings.jl#61 + all other places where these invalidations pop up shows over and over together with the non-generic _is_annotated checks in generic methods shows that this split and design is not right.

@KristofferC
Copy link
Sponsor Member

As alluded to in my last message, and stated on Slack, over the past ~month I simply haven't had the time for this.

I think this highlights the danger in bringing in someone's more or less personal project into Base / stdlibs and all of a sudden it is on the critical path for releases. Especially, if that project is kind of new and "fun" and not old and boring.

@KristofferC
Copy link
Sponsor Member

Anyway, I don't understand why this is on the milestone when the existing API would still work if using NamedTuples instead of tuples so adding that will not break anything. So I will take this off the milestone.

@KristofferC KristofferC removed this from the 1.11 milestone Sep 19, 2024
@LilithHafner
Copy link
Member

From the linked PR

the type of annotations is to be changed from a Tuple{UnitRange{Int}, Pair{Symbol, Any}} to a NamedTuple{(:region, :label, :value), Tuple{UnitRange{Int}, Symbol, Any}}.

It's not just tuple -> named tuple, it's nested tuple-pair to flat named tuple. This changes (region, label => value) to (;region, label, value) and breaks access in the form annotation[2][1] and annotation[2].first.

@LilithHafner LilithHafner added this to the 1.11 milestone Sep 19, 2024
@KristofferC
Copy link
Sponsor Member

KristofferC commented Sep 19, 2024

Ok, then we declare the whole thing experimental. This has been on the milestone for two months for some new color highlighting thingy. Insanity.

@KristofferC
Copy link
Sponsor Member

Removing from milestone since StyledStrings + AnnotatedString will be considered experimental

@KristofferC KristofferC removed this from the 1.11 milestone Sep 19, 2024
@KristofferC
Copy link
Sponsor Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
strings "Strings!"
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants