V4, drop greenery.fsm, overhauled API #67

Merged: qntm merged 21 commits into main from v4 on Nov 8, 2022

Conversation

qntm
Owner

@qntm qntm commented Nov 8, 2022

This is a complete overhaul of greenery.

  • greenery.fsm has been eliminated from the public API - this package is intended for manipulating regular expressions, and FSM-related capability is secondary to that, an implementation detail. If the FSM module is considered useful enough to warrant re-exposing, I may factor it out as its own package. For now, though, it's internal.
  • greenery.lego no longer exists. Instead the package's API consists only of parse(string) which returns a Pattern object, the methods on those Pattern objects, and some bits and pieces noted in the documentation. This fixes #23 ("Rename the lego module to something more sensible and less trademarked").
  • Class names are all now UpperCase format. No clue why this wasn't the case before... (EDIT: because Python's own builtin classes like frozenset are lower case for some stupid reason, does Python ignore its own style guidelines?)
  • What used to be the lego module has been extracted out into separate pieces: parse.py handles parsing and is its own module, similarly bound.py, multiplier.py and charclass.py. However, due to the circularity, I wasn't able to split out mult.py, conc.py and pattern.py - these still have to be lumped together in what is now rxelems.py. I might try to unscramble this sometime but it may be intractable. I was able to split out most of their dedicated unit tests, though.
  • Internally, "lego pieces" now have a much more intelligible hierarchy. It is no longer possible to, for example, try to concatenate a Pattern and a Mult.
  • We no longer use hasattr anywhere, although we do still have to use isinstance and type in a few places. I consider this an unavoidable result of Mult's multiplicand needing to potentially be either a Charclass or a sub-Pattern. I'm disinclined to sink any more work into trying to "resolve" this. Fixes #24 ("Eliminate reliance on hasattr in lego module").
  • Scrapped all remaining use of self.__dict__ (if any). Because all the classes here are immutable, we do still have to use setattr in several places right after object initialisation - I might try to minimise this sometime, but I consider it a low priority. Fixes #56 ("Why are instance attributes initialized with self.__dict__?").
  • In parse, added support for the confusing scenario of negated charclass shorthand inside of square bracket charclasses, like [^\W], [123\D], [\s\S] and so on. Fixes #35 ("Error when parse lego.parse('[\S]') or similar regex"). Might tidy this up a little more later.
  • Scrapped the GNU makefile, I have no clue what this was supposed to be doing.
  • Scrapped module map.txt, ditto.
  • README updates to reflect the new version.
  • Applied a serious amount of linting.
  • As these are all extremely breaking changes, bumped the version to 4.0.0.
  • Added a CHANGELOG.md.
  • Minimum required version of Python is now 3.8.
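
For readers unsure what the negated-shorthand bracket expressions mentioned in the list above are supposed to match, here is a minimal sketch. It uses Python's standard `re` module as a reference point, not greenery itself, and the `matches` helper and the sample characters are made up for illustration.

```python
import re


def matches(pattern, chars):
    """Return the characters from `chars` that the pattern matches in full."""
    return [c for c in chars if re.fullmatch(pattern, c)]


# Hypothetical sample: a letter, an underscore, two digits, a space, a hyphen.
sample = ["a", "_", "1", "5", " ", "-"]

# [^\W] is "not a non-word character", i.e. the same set as \w.
print(matches(r"[^\W]", sample))    # ['a', '_', '1', '5']

# [123\D] is the digits 1, 2, 3 plus every non-digit.
print(matches(r"[123\D]", sample))  # ['a', '_', '1', ' ', '-']

# [\s\S] is whitespace plus non-whitespace, i.e. every character.
print(matches(r"[\s\S]", sample))   # ['a', '_', '1', '5', ' ', '-']
```

The awkward part for a parser is that the shorthand contributes a complemented set to the enclosing bracket expression, which may itself be negated.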

@qntm qntm self-assigned this Nov 8, 2022
@qntm
Owner Author

qntm commented Nov 8, 2022

Aside: what the heck is up with Python's module exports behaviour? There's seriously no way at all to prevent everything from being exported, even if the importer didn't specifically import it? This is completely bananas... JavaScript's system is far superior...

@qntm qntm merged commit 390289b into main Nov 8, 2022
@qntm qntm deleted the v4 branch November 8, 2022 17:28
@rwe
Contributor

rwe commented Nov 23, 2022

> Aside: what the heck is up with Python's module exports behaviour? There's seriously no way at all to prevent everything from being exported, even if the importer didn't specifically import it?

I agree with your sentiment :)

In Python, a wildcard import (from module import *) picks up everything by default, except in these cases:

  • It does not, by default, pick up names leading with an underscore. So, _some_helper is not bound by a wildcard import of the module defining it.
  • If you specify __all__ = ("Foo", "Bar") in a given module, only those names are bound by a wildcard import.

The default behaviour is kinda similar to each module declaring something like:

__all__ = [name for name in globals().keys() if not name.startswith('_')]

But that includes everything you've imported, too.
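
A minimal sketch of the behaviour described above, for anyone who wants to see it run. The module names `demo_default` and `demo_explicit` and both helper functions are hypothetical; the modules are built in memory only so that the example fits in a single file.

```python
import sys
import types


def make_module(name, source):
    """Create an in-memory module from source text and register it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


def star_import(name):
    """Return the names that `from <name> import *` binds."""
    namespace = {}
    exec(f"from {name} import *", namespace)
    namespace.pop("__builtins__", None)  # added by exec, not by the import
    return sorted(namespace)


# No __all__: every name without a leading underscore is picked up,
# including the imported `os` module.
make_module("demo_default", """
import os
def public(): pass
def _private(): pass
""")

# With __all__: only the listed names are picked up.
make_module("demo_explicit", """
import os
def public(): pass
def other(): pass
__all__ = ("public",)
""")

print(star_import("demo_default"))   # ['os', 'public']
print(star_import("demo_explicit"))  # ['public']
```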

Usually, you want to define __all__ = ("…"). My habit is typically to do that for every module/file:

class Pattern:
  ...

__all__ = (
  'Pattern',
  'parse_pattern',
  'ParseError',
  # …
)

Importantly, without unusual temporary-scoping jiggery-pokery, it's not straightforward to prevent someone manually importing something. Which is a nightmare for interface maintenance, unfortunately.
