-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Markup escaping not working as expected #878
Comments
Here's a possible fix: --- markup.py.orig 2020-12-30 06:35:31.000000000 -0800
+++ markup.py 2021-01-04 20:00:50.000000000 -0800
@@ -9,7 +9,7 @@
RE_TAGS = re.compile(
r"""
-(\\\[)|
+(\\[[\\])|
\[([a-z#\/].*?)\]
""",
re.VERBOSE,
@@ -60,12 +60,12 @@
"""
position = 0
for match in RE_TAGS.finditer(markup):
- (escape_open, tag_text) = match.groups()
+ (escaped, tag_text) = match.groups()
start, end = match.span()
if start > position:
yield start, markup[position:start], None
- if escape_open:
- yield start, "[", None
+ if escaped:
+ yield start, escaped[1], None
else:
text, equals, parameters = tag_text.partition("=")
if equals: Basically, it looks for either |
It occurred to me this morning that the escape() function should also be updated. Here's a diff for that: @@ -48,7 +48,7 @@
Returns:
str: Markup with square brackets escaped.
"""
- return markup.replace("[", r"\[")
+ return markup.replace("\\", r"\\").replace("[", r"\[")
Note that the order of the replace calls is important here. I also wonder whether or not backslash escaping should be allowed for ":" to prevent it from doing emoji conversion (with either additional changes to escape() here, or a separate escape function for that). Speaking of separating markup & emoji replacement, it doesn't look like there's a way to do ONLY emoji replacement on a string right now. There are some places where I'd like to be able to do that, but the replace function is in a private module right now. The opposite is supported by calling Text.from_markup() with emoji=False, but there's no way to ask for emoji replacement to be enabled while disabling other markup. |
Thanks for the report and suggested fix. I'll take a look at those soon. re Emoji you can do this: from rich.emoji import Emoji
print(Emoji.replace(":100:")) |
Ah, I missed that. I'll give it a try - thanks! |
Potential fix in this branch. Decided against escaping emoji, it complicates things too much. Emoji can be disabled anyway. |
Unfortunately, this fix doesn't cover all the cases. It handles an escaped bracket and a single escaped backslash, but if you put more than two backslashes, it always matches on the rightmost two of them and leaves the square bracket as always being a markup token. With three backslashes, the first two should turn into a single backslash in the output, but the third backslash should escape the bracket and make it a literal. This continues in even/odd pairs for an arbitrary number of backslashes, and my fix took that into account by letting each pair (of either |
Try the branch now... Note that the escaping is only required for backslashes directly preceding the |
Looks good - thanks! Preserving backslashes as-is elsewhere in the string makes sense to me. I do something similar in some code I wrote which allows users to input raw unicode characters by entering something like |
Fixed in v9.6.2 |
Upgraded here - looks great. Thanks so much! |
Describe the bug
It's a bit of an edge case, but right now it's not possible to output a backslash immediately before a square bracket starting a markup token. While it does properly output a single backslash if you feed in
r'\\'
followed by a markup token, it still disables interpreting the token as markup.To Reproduce
Working cases:
rich.print(r'[bold]text[/]')
prints "text"rich.print(r'\[not a tag]')
prints "[not a tag]"Not working case:
rich.print(r'\\[bold]some text[/]')
gives an error about an unexpected close tag, because it doesn't treat "[bold]" as markup. The expected result here was "\text", a single backslash followed by the text in bold.Adding more backslashes also produces a surprising result. It seems like the rightmost backslash is always consumed to disable markup on the square bracket, but all the previous backslashes are passed through as-is, rather than having every other backslash escape the following one. So, with three backslashes, I was expecting one to be output (consuming the first two of the input) and one to be left over to escape the square bracket and disable markup. WIth four backslashes, I was expecting two backslashes to be output (consuming all four of the input) and the square bracket after that to be treated as markup.
Platform
Rich 9.5.1 on macOS Mojave (10.14.6) with Apple's Terminal.app.
The text was updated successfully, but these errors were encountered: