Reorganize the re module sources #91308
I proposed it several years ago on the Python-Dev mailing list, and the change was approved in general. The reorganization was deferred because there were several known bugs in the RE engine (fixes for which could potentially be backported) and there were unmerged patches waiting for review. Now the patch for atomic groups has been merged and the bugs have been fixed (thanks to Ma Lin). Both the C code and the Python code for the re module are spread across several files in the Modules and Lib directories. This makes it difficult to work with all the related files because they are intermixed with source files of other modules. The following changes are planned:
|
Could the sre_parse and sre_constants modules be kept with public names (i.e. without the leading underscore) but within the re namespace? I use them to tokenize and then syntax highlight regular expressions. I did a quick search and found a few other users of the modules:
The whole modules don't necessarily need exposing, but certainly sre_parse.parse, sre_parse.parse_template, and the opcodes from sre_constants would be the most useful. [1] https://github.com/twisted/pydoctor/blob/c86273dffade5455890570142c8b7b068f5dffd1/pydoctor/epydoc/markup/_pyval_repr.py#L776 |
Please don't merge too close to the 3.11 beta1 release date; I'll submit PRs after this is merged. |
It turns out that pip uses sre_constants in its copy of pyparsing. The problem is already fixed upstream in pyparsing and should soon be fixed in pip. We still need to keep sre_constants and maybe other sre_* modules, but deprecate them.
It is a good idea that will minimize breakage in the short term. You can write "from re import sre_parse", and it would work in old and new versions because sre_parse and sre_compile were imported in the re module. This trick does not work with sre_constants; you still need try/except. But code that depends on these modules is fragile and can be broken in other ways.
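A try/except fallback of the kind mentioned here could look roughly like the sketch below. It assumes the private `re._parser` submodule that this reorganization introduces, and falls back to the old top-level `sre_parse` name on older versions. Both are undocumented internals, so the sketch is fragile in exactly the way described above; `top_level_ops` is a hypothetical helper used only for illustration.

```python
try:
    # Python 3.11+: the parser lives inside the re package (private name).
    from re import _parser as sre_parse
except ImportError:
    # Older versions: fall back to the undocumented top-level module.
    import sre_parse


def top_level_ops(pattern):
    """Return the names of the top-level opcodes in a parsed pattern."""
    # parse() returns a sequence of (opcode, argument) pairs.
    return [op.name for op, _arg in sre_parse.parse(pattern)]


ops = top_level_ops("ab")
```

Trying the new name first avoids triggering the DeprecationWarning that importing `sre_parse` raises on 3.11 and later.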
I am going to implement step 2 only after merging your changes for bpo-23689. |
sre_constants, sre_compile and sre_parse are not tested and are not documented. I don't currently consider them public API. If someone has a good reason to use them, IMO we must clearly define which exact API is needed, and properly document and test it. If we expose something, I don't think that the API would be exposed as re.sre_xxx.xxx, but as re.xxx. I suggest hiding the sre_xxx submodules by adding an underscore to their names. Moreover, the "sre_" prefix is now redundant. I suggest renaming:
|
I don't mind reorganizing this, but I would insist that we keep code using old undocumented things (like the sre_* modules) working for several releases, using the standard deprecation approach. |
Modules with old names are kept (deprecated). The questions are:
|
|
$ ls Lib/re/
_compiler.py _constants.py __init__.py _parser.py
Thanks, that's a nice enhancement! Serhiy: Would you mind explicitly documenting the 3 deprecated modules in What's New in Python 3.11? |
Is the "import _locale" still used in re/__init__.py? I cannot see any reference to it in the code, and test_re still passes if it's removed. The last reference to the _locale module was removed in 2017 by commit 898ff03.
diff --git a/Lib/re/__init__.py b/Lib/re/__init__.py
index c47a2650e3..b887722bbb 100644
--- a/Lib/re/__init__.py
+++ b/Lib/re/__init__.py
@@ -124,10 +124,6 @@
import enum
from . import _compiler, _parser
import functools
-try:
- import _locale
-except ImportError:
- _locale = None
# public symbols |
It's funny to still see mentions of "experimental stuff" in Python 3.11 (2022), when this "experimental stuff" has been there for 20 years. Maybe it's time to consider that re.template() and re.Scanner are no longer experimental? Maybe change their status to alpha or beta? :-D
|
In |
would it be possible to expose I'm currently using that for my text editor: https://github.com/asottile/babi/blob/d37d7d698d560aef7c6a0d1ec0668672e039bd9a/babi/screen.py#L501 |
It is true.
First we need to find the original discussions for these features (it may not be easy) and decide whether we want to finish them or remove them.
It is step 2.
Maybe, in some form. Currently you can precompile a pattern, but for a replacement string you rely on an LRU cache. It is slower, and limited by the fixed size of the cache. I think it would be worth adding a function for compiling a replacement string. sub() etc. should accept both a string and a precompiled template object. It is a separate issue. |
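To illustrate the asymmetry described above, here is a small sketch using only the public API. The pattern can be compiled once and reused, while the replacement can only be passed as a plain string (parsed and cached internally) or as a callable. The names `swap_with_template` and `swap_with_callable` are hypothetical.

```python
import re

# The pattern is compiled once and reused for every call.
pattern = re.compile(r"(\d+)-(\d+)")


def swap_with_template(text):
    # The replacement template is a plain string; there is no public
    # function to precompile it, so re handles it internally on each call.
    return pattern.sub(r"\2-\1", text)


def swap_with_callable(text):
    # A callable replacement bypasses template parsing entirely.
    return pattern.sub(lambda m: f"{m[2]}-{m[1]}", text)


result = swap_with_template("10-20")
```

A callable is the only way today to avoid relying on the internal cache for the replacement side.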
See also bpo-40259: "re.Scanner groups". |
The re.template() function and the re.TEMPLATE flag are not documented and not tested. The re.Scanner class is not documented but has a test_scanner() test in test_re. |
There are two very different classes with similar names: _sre.SRE_Scanner and re.Scanner. The former is used to implement the Pattern.finditer() method, but it could be used in other cases. The latter is an experimental implementation of a generalized lexer using the former class. Both are undocumented. It is difficult to document Pattern.scanner() and _sre.SRE_Scanner because the class name contains an implementation-specific prefix, and without it, it would conflict with re.Scanner. But let's leave it all to a separate issue. The original discussion about TEMPLATE was lost. Initially it only affected repetition operators, but now using them with TEMPLATE is an error. |
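For readers who have not met either class, the sketch below shows how the two are typically used. Both are undocumented, so their behavior is an implementation detail rather than a guaranteed API; the token names "INT" and "NAME" are made up for the example.

```python
import re

# re.Scanner: a small lexer built from (pattern, action) pairs.
# An action of None means "match and discard".
lexer = re.Scanner([
    (r"\d+", lambda scanner, tok: ("INT", int(tok))),
    (r"[a-z]+", lambda scanner, tok: ("NAME", tok)),
    (r"\s+", None),
])
# scan() returns the collected tokens and the unmatched remainder.
tokens, remainder = lexer.scan("abc 42 ?")

# Pattern.scanner(): a stateful cursor over one string, the object
# that the first class is built on.
cursor = re.compile(r"\d+").scanner("a1 b22")
numbers = []
while (m := cursor.search()) is not None:
    numbers.append(m.group())
```

The first object tokenizes with a lexicon of rules, while the second just remembers a position between successive match() or search() calls.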
Match.regs is an undocumented attribute; it seems it has existed since 1991. Line 2871 in ff2cf1d
|
For reference, I also implemented .regs in the regex module for compatibility, but I've never used it myself. I had to do some investigating to find out what it did! It returns a tuple of the spans of the groups. Perhaps I might have used it if it didn't have such a cryptic name and/or was documented. |
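As a quick illustration of "a tuple of the spans of the groups", the sketch below compares the undocumented `regs` attribute with the documented `Match.span()` method. The pattern and input are arbitrary examples.

```python
import re

m = re.match(r"(\w+)@(\w+)(\.\w+)?", "user@host")

# Undocumented: one (start, end) pair per group, group 0 first.
regs = m.regs

# Documented equivalent built from Match.span().
spans = tuple(m.span(i) for i in range(m.re.groups + 1))
```

A group that did not participate in the match, like the optional third group here, shows up as `(-1, -1)` in both forms.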
It was kept for compatibility with the pre-SRE implementation of the re module. It was an implementation detail in the original Python code, but I am sure that some code still uses it. If we are going to remove it, it needs to be deprecated first. |
The undocumented sre_parse module got deprecated in Python 3.11 which leads to build/test failures on Linux distributions making use of this version, e.g. Ubuntu 23.04. See bpo-47152: python/cpython#91308
As part of a cleanup in the Python source code, some of the regex code is using internal resources that were renamed. Ref: python/cpython#91308
As reported by Martin Jansa <Martin.Jansa@gmail.com>: bitbake/lib/bb/cooker.py:16: DeprecationWarning: module 'sre_constants' is deprecated import sre_constants it's deprecated since 3.11 with: python/cpython#91308 The correct replacement for our usage is re.error so use that instead. (Bitbake rev: a4cd5b0b4b355b7b75fb48c61289700e3e908b2a) Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> Signed-off-by: Steve Sakoman <steve@sakoman.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
As reported by Martin Jansa <Martin.Jansa@gmail.com>: bitbake/lib/bb/cooker.py:16: DeprecationWarning: module 'sre_constants' is deprecated import sre_constants it's deprecated since 3.11 with: python/cpython#91308 The correct replacement for our usage is re.error so use that instead. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> Signed-off-by: Steve Sakoman <steve@sakoman.com>
Source: poky MR: 125004 Type: Integration Disposition: Merged from poky ChangeID: 86e2430d3f40433f978667f15ab6d20d0663e56d Description: As reported by Martin Jansa <Martin.Jansa@gmail.com>: bitbake/lib/bb/cooker.py:16: DeprecationWarning: module 'sre_constants' is deprecated import sre_constants it's deprecated since 3.11 with: python/cpython#91308 The correct replacement for our usage is re.error so use that instead. (Bitbake rev: a4cd5b0b4b355b7b75fb48c61289700e3e908b2a) Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> Signed-off-by: Steve Sakoman <steve@sakoman.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> Signed-off-by: Jeremy A. Puhlman <jpuhlman@mvista.com>
As reported by Martin Jansa <Martin.Jansa@gmail.com>: bitbake/lib/bb/cooker.py:16: DeprecationWarning: module 'sre_constants' is deprecated import sre_constants it's deprecated since 3.11 with: python/cpython#91308 The correct replacement for our usage is re.error so use that instead. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
- `sre_constants` is deprecated, the import statement was updated accordingly ### Description The `re` module sources are being reorganized, see more information: python/cpython#91308 Small fix accommodates changes for all python versions. Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
* Update openstacksdk from branch 'master' to 6c76a8cf7b962f9da293e04f7443ca4c805d7428 - Merge "Remove usage of deprecated `sre_constants` module" - Remove usage of deprecated `sre_constants` module Remove usage of undocumented `sre_constants` module deprecated in `python3.11` and use `re` module instead. As discussed in python/cpython#91308, `sre_constants` is undocumented and its usage was deprecated starting with `python3.11`, where it causes a deprecation warning. https://docs.python.org/3/whatsnew/3.11.html#modules Importing `sre_constants` for exception handling of invalid regular expressions is not necessary, as the same exception class is exposed through the `re` module. Change-Id: Ifd9cccf504a5493683152178ebef9183f30b7f4c
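The migration these commits describe usually amounts to catching `re.error` instead of `sre_constants.error`. A minimal sketch, with `is_valid_regex` as a hypothetical helper name:

```python
import re


def is_valid_regex(pattern):
    """Return True if the pattern compiles, False otherwise."""
    try:
        re.compile(pattern)
    except re.error:
        # re.error is the public name for the exception that code used to
        # reach through sre_constants.error; no deprecated import is needed.
        return False
    return True


ok = is_valid_regex(r"\d+")
bad = is_valid_regex("(")
```

Because `re.error` is documented, this form works the same way before and after the reorganization.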
* xeger: avoid deprecated sre_parse module in python-3.11+ The undocumented sre_parse module got deprecated in Python 3.11 which leads to build/test failures on Linux distributions making use of this version, e.g. Ubuntu 23.04. See bpo-47152: python/cpython#91308 * tox: enable Python 3.11 --------- Co-authored-by: Brendan McCollam <brendan@mccoll.am>
ref python/cpython#91308 (cherry picked from commit e21159a)
As reported by Martin Jansa <Martin.Jansa@gmail.com>: bitbake/lib/bb/cooker.py:16: DeprecationWarning: module 'sre_constants' is deprecated import sre_constants it's deprecated since 3.11 with: python/cpython#91308 The correct replacement for our usage is re.error so use that instead. (Bitbake rev: 3c0cd401472ffee06d5a93bdba566cb033851fcf) Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>