Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for alternatives to package identifiers #896

Merged

Conversation

Pfoerd
Copy link
Contributor

@Pfoerd Pfoerd commented Jun 22, 2022

Adds the ability to define alternatives such as (a|b) in package identifiers. This allows to support architectures that for some reasons need a namespace for some components by using different packages even though they logically belong to the same architectural entity like a layer, domain, etc. This is especially useful if your code needs to satisfy some "overarching" structural requirements that come from outside your actual project, e.g. from an enterprise architecture.

  • pack.[a.c|b*].d matches pack.a.c.d or pack.bar.d, but neither pack.a.d nor pack.b.c.d
  • ..service.(a|b*).. matches a.service.bar.more and group 1 would be bar
  • Nested alternatives like (?:..application.(*).*..)|(?:..domain.api|logic.(*)..) are not allowed to keep the syntax of package identifiers simple and readable.

Resolves: #662

@Pfoerd Pfoerd force-pushed the fix/662/add_alternatives_to_package_matching branch from 35cc0e6 to a301da2 Compare July 8, 2022 14:05
@Pfoerd
Copy link
Contributor Author

Pfoerd commented Jul 8, 2022

@codecholeric maybe this could be a good improvement for stable release 1.0 too 😉 ?

@codecholeric
Copy link
Collaborator

I'll try to find time to review this weekend 😉

@codecholeric
Copy link
Collaborator

Okay, I managed to look into it 🙂 Thanks a lot for your support! I really like the approach of just extending the syntax a little bit without going overboard, but still making it way more powerful with this one little feature 😃

I added some minor changes, because I think the check for nested groups was a little too weak. In particular, it would only find the nesting, if the brackets were directly following each other. But I could also do something like ..[in.(the).middle|other].. and could bypass the check by that.

Looking into it for a while the following things came clear to me:

  1. Nesting capturing groups into non-capturing groups would be really complicated to implement, because for ..I.think.(a)|and.(b).are.group.one there would be two capturing groups, even though by alternation they could technically both be group 1. But this behavior would make it very complicated to do the correct package capturing
  2. Nesting noncapturing alternations into other noncapturing alternations would complicate the transformation to regex a lot as well
  3. While at the moment not explicitly checked nested capturing groups also make no sense for package matching
  4. Nesting noncapturing groups into capturing groups might allow some ugly complicated stuff to write, but it's easy to implement and doesn't cause any trouble.

Based on that I tried to adjust it to forbid cases 1-3 and to make checking this easier also toplevel alternations. I also added a check that forbids noncapturing groups without alternations, because that just doesn't make sense 🤷‍♂️

I also added tests that mirror the problem in #685 and I think with this change it would now be possible to solve #685 just with the package identifier syntax.

Let me know what you think! Does the change make sense to you?

@Pfoerd
Copy link
Contributor Author

Pfoerd commented Jul 11, 2022

Thanks for your effort and for adding proper checks for the nested groups, you're totally right that they were too weak!

I'm a bit torn about the support for nesting noncapturing groups into capturing groups... I fully agree that this is a "low hanging fruit" from an implementation pov and can solve #685.
On the other hand it makes the API of package matching more complicated and a bit ambiguous... From a user's pov one might ask "why can I do this kind of nested alternation but not the others?"... this introduces a certain inconsistency right?

I'm wondering whether it could be better to only add simple top-level alternations without nesting options to the package identifier syntax and extend the APIs to add a full regex based package matching approach for complex usecases like the one in #685 (like suggested here). What do you think?

@codecholeric
Copy link
Collaborator

Yeah, you might be right. Maybe I got pulled a little bit too much into "this is easy to implement and gives more power" versus "this is consistent and easy to understand". Let's forbid all nesting then and tackle the rest in a more powerful extended API 👍
Fun fact, to solve the case in #685 you don't need it anyway 😉 Because the tests I wrote to cover these examples don't need nesting within the capturing groups... At least if I have understood it correctly 🤷‍♂️

@codecholeric codecholeric force-pushed the fix/662/add_alternatives_to_package_matching branch 2 times, most recently from 5105185 to da4b7e0 Compare July 11, 2022 17:29
This change will allow defining alternations within `()` and `[]`. I.e. it is now possible to match both `some.a.pkg` and `some.b.pkg` with a single pattern `some.[a|b].pkg` or capture it via `some.(a|b).pkg`. The demand for this comes from users dealing with legacy package structures, or where the package structure is dictated from the outside (e.g. via enterprise architecture). They might sometimes need to create the slices from different parts of the package tree, e.g. `..[application|domain].(*)`. While this is already possible using the `SliceAssignment` API this is not very straight forward and also demands quite some boilerplate.

We apply the following constraints to the new API to keep it simple to implement and easy and consistent for the user:
1) No nested groups via `[]` or `()`
  1.1) `()` within `()` would just be confusing and there does not seem to be any valid use case for it
  1.2) `()` within `[]` would be ambiguous and hard to implement (e.g. within `[first.(*)|second.(*)]` the second `(*)` would be capturing group 2 within the regex, even though group 1 and 2 are mutually exclusive)
  1.3) `[]` within `[]` would make the implementation a lot more complex, since we would now need to match the correct brackets of nested groups when creating the regex
  1.4) `[]` within `()` would be easy to implement, but make the API inconsistent ("why can I nest it like this, but not the other way around?")
2) No toplevel alternations. This goes together with 1), since it would make the implementation more complex to prevent nested capturing groups

Issue: TNG#662
Signed-off-by: e.solutions <17569373+Pfoerd@users.noreply.github.com>
on-behalf-of: @e-solutions-GmbH <info@esolutions.de>
Signed-off-by: Peter Gafert <peter.gafert@tngtech.com>
@codecholeric codecholeric force-pushed the fix/662/add_alternatives_to_package_matching branch from da4b7e0 to a51b918 Compare July 11, 2022 17:56
@codecholeric codecholeric merged commit 275a4e2 into TNG:main Jul 12, 2022
@Pfoerd
Copy link
Contributor Author

Pfoerd commented Jul 12, 2022

Thank you for your support again :) . Btw do you have any timeline when 1.0 will be released?

@codecholeric
Copy link
Collaborator

I'm thinking any day 😰 All the real blockers I had are gone now, only #912 I still want to merge... But it will be an rc1 first (yeah, I'm a coward 😂)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature proposal: Provide "extended" package matching support
2 participants