Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only treat _ as markdown if separated by white space #5685

Merged
merged 4 commits into from
Sep 6, 2023

Conversation

grzesiek2010
Copy link
Member

@grzesiek2010 grzesiek2010 commented Jul 19, 2023

Closes #5508

What has been done to verify that this works as intended?

I've improved automated tests.

Why is this the best possible solution? Were any other approaches considered?

This is what we discussed in the issue.

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

We need to test styling with underscores _ https://docs.getodk.org/form-styling/#emphasis. Other types of styling should rather be intact.

Do we need any specific form for testing your changes? If so, please attach one.

No.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

No.

Before submitting this PR, please make sure you have:

  • run ./gradlew checkAll and confirmed all checks still pass OR confirm CircleCI build passes and run ./gradlew connectedDebugAndroidTest locally.
  • verified that any code or assets from external sources are properly credited in comments and/or in the about file.
  • verified that any new UI elements use theme colors. UI Components Style guidelines

text = text.replaceAll("\\s_([^\\s][^_\n]*)_\\s", " <em>$1</em> ");
text = text.replaceAll("^_([^\\s][^_\n]*)_$", "<em>$1</em>");
text = text.replaceAll("^_([^\\s][^_\n]*)_\\s", "<em>$1</em> ");
text = text.replaceAll("\\s_([^\\s][^_\n]*)_$", " <em>$1</em>");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's possible to have one regex for all those 4 cases I don't know I'm not a master of regex. But even if it's possible I think this way is better because it's clear and such a one regex for all cases would be complicated.

@@ -77,32 +77,40 @@ static String markdownToHtml(String text) {
text, createSpan);

//intermediary replacements keys for special characters, N/B: These symbols are not meant to be interpreted as markdown
text = text.replaceAll("(?s)\\\\#", "&#35;");
Copy link
Member Author

@grzesiek2010 grzesiek2010 Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(?s) in regex is called a dot-all mode.
In the default mode (without (?s)), the dot (.) character in a regex pattern matches any character except a newline (\n). However, when (?s) is used, the dot (.) matches any character, including newline characters.
so it's about using dots but the lines I updated by removing (?s) didn't contain any.

text = text.replaceAll("\\s_([^\\s][^_\n]*)_$", " <em>$1</em>");

// emphasis using asterisk
text = text.replaceAll("\\s\\*([^\\s][^\\*\n]*)\\*\\s", " <em>$1</em> ");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we actually want to keep * the same. There's no practical use of * surrounded by text that I can think of. This is why Github makes the space special behavior only for _:

un*frigging*believable: unfriggingbelievable
un_frigging_believable: un_frigging_believable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I've reverted that change.

@@ -31,15 +31,13 @@ public void markDownToHtmlEscapesBackslash() {
{"_A\\_B\\_C_", "<em>A_B_C</em>"},
{"A_B\\_C", "A_B_C"},
{"A\\_B_C", "A_B_C"},
{"A_B_C", "A<em>B</em>C"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be helpful to keep the existing test cases with the new behavior so it's easier to see what changed. The new cases you added are helpful too so I suggest keeping both.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

@lognaturel lognaturel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this took me forever to get to! Regex take a lot of concentration for me to read.

The behavior for _ looks like what I would expect. I think we should revert the change to *.

@grzesiek2010 grzesiek2010 marked this pull request as ready for review August 25, 2023 10:39
Copy link
Member

@lognaturel lognaturel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet!!

@lognaturel lognaturel changed the title Improve handling markdown italic Only treat _ as markdown if separated by white space Sep 1, 2023
@srujner
Copy link

srujner commented Sep 6, 2023

Tested with Success!

Verified on device with Android 13

Verified cases:

  • Issue Only treat _ as markdown if separated by white space #5508 is no longer reproducing;
  • Checked different styling option in form e.g. white spaces before words, after words, in the middle of sentences;
  • Regression check on other Styling options;
  • Light and Dark mode;
  • Rotate the screen, minimize the app;

@grzesiek2010 grzesiek2010 merged commit 3cc9ea3 into getodk:master Sep 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Only treat _ as markdown if separated by white space
3 participants