Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add regex and case sensitive to FindReplaceMenu #480

Merged
merged 48 commits into from
Oct 10, 2023

Conversation

sun-jiao
Copy link
Contributor

No description provided.

@codecov
Copy link

codecov bot commented Sep 18, 2023

Codecov Report

Attention: 17 lines in your changes are missing coverage. Please review.

Comparison is base (af8d96b) 80.41% compared to head (4574659) 80.29%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #480      +/-   ##
==========================================
- Coverage   80.41%   80.29%   -0.12%     
==========================================
  Files         281      281              
  Lines       11863    11928      +65     
==========================================
+ Hits         9540     9578      +38     
- Misses       2323     2350      +27     
Files Coverage Δ
.../command_shortcut_events/find_replace_command.dart 88.88% <ø> (ø)
...rc/editor/find_replace_menu/find_menu_service.dart 76.47% <100.00%> (+0.96%) ⬆️
.../editor/find_replace_menu/find_replace_widget.dart 88.00% <100.00%> (+3.67%) ⬆️
lib/src/l10n/intl/messages_en.dart 99.12% <100.00%> (+0.01%) ⬆️
lib/src/l10n/intl/messages_zh-CN.dart 99.12% <100.00%> (+0.01%) ⬆️
lib/src/l10n/intl/messages_zh-TW.dart 99.12% <100.00%> (+0.01%) ⬆️
lib/src/l10n/l10n.dart 59.88% <100.00%> (+0.68%) ⬆️
...rc/editor/find_replace_menu/search_service_v3.dart 99.03% <99.03%> (ø)
...src/editor/find_replace_menu/search_algorithm.dart 5.71% <11.11%> (-94.29%) ⬇️

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


@override
Iterable<Match> searchMethod(Pattern pattern, String text) {
if (pattern is String) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sun-jiao @Xazin, do we need to create a Mixture class?
If DartBuiltin works in all cases, we should just use that, since it is faster.
And keep BoyerMoore as an example for the developer if they want to implement their own algorithm.
Also then, we could remove the BoyerMoore implementation from the code and only show it in Docs as an example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My tests showed BoyerMoore to be slightly slower. But auto-generated random strings were used in my tests.

I think BoyerMoore should be more efficient for natural language. Because most languages that use alphabetic scripts have prefixes and suffixes that are shared between different words.

Copy link
Contributor Author

@sun-jiao sun-jiao Sep 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have read some articles or papers on Boyer-Moore style regex search algorithm. But they're not as concise as the original Boyer-Moore.

Rust community also has a similar issue. rust-lang/regex#197

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They have already implemented it: rust-lang/regex#419. I'll take a look tomorrow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a brief look and this still seems to be a plain text search algorithm, not regex. Even though it's in the regex carte.

lib/src/editor/find_replace_menu/search_algorithm.dart Outdated Show resolved Hide resolved
@sun-jiao
Copy link
Contributor Author

sun-jiao commented Sep 19, 2023

I did some test for natural language.
data: https://www.reddit.com/r/datasets/comments/1uyd0t/200000_jeopardy_questions_in_a_json_file/

late final String file;
final bm = BoyerMoore();
timer(String algorithmName, String target,
    Iterable<Match> Function(String target, String text) findFunction) {
  final start1 = DateTime.now().millisecondsSinceEpoch;
  final result = findFunction.call(target, file);
  final length = result.length;
  final end1 = DateTime.now().millisecondsSinceEpoch;
  print('$algorithmName finds $length matches, costs ${end1 - start1} milliseconds.');
}

Future<void> main() async {
  file = await File('data').readAsString();
  final targets = [
    "newspaper editor",
    "school girl",
    "the day after tomorrow",
    "United States",
    "Chinese",
    "native speaker",
    "very interesting",
    "disappeared",
    "a small compartment for holding papers",
    "in this important document"
  ];
  for (String target in targets) {
    print('Searching $target');
    timer('Built-in String', target, (target, text) => target.allMatches(text));
    timer('Built-in RegExp', target, (target, text) => RegExp(RegExp.escape(target)).allMatches(text));
    timer('Boyer-Moore    ', target, (target, text) => bm.searchMethod(target, text));
  }
}

result:

Searching newspaper editor
Built-in String finds 2 matches, costs 30 milliseconds.
Built-in RegExp finds 2 matches, costs 12 milliseconds.
Boyer-Moore     finds 2 matches, costs 25 milliseconds.
Searching school girl
Built-in String finds 1 matches, costs 26 milliseconds.
Built-in RegExp finds 1 matches, costs 15 milliseconds.
Boyer-Moore     finds 1 matches, costs 30 milliseconds.
Searching the day after tomorrow
Built-in String finds 1 matches, costs 29 milliseconds.
Built-in RegExp finds 1 matches, costs 17 milliseconds.
Boyer-Moore     finds 1 matches, costs 11 milliseconds.
Searching United States
Built-in String finds 143 matches, costs 21 milliseconds.
Built-in RegExp finds 143 matches, costs 13 milliseconds.
Boyer-Moore     finds 143 matches, costs 19 milliseconds.
Searching Chinese
Built-in String finds 201 matches, costs 22 milliseconds.
Built-in RegExp finds 201 matches, costs 10 milliseconds.
Boyer-Moore     finds 201 matches, costs 29 milliseconds.
Searching native speaker
Built-in String finds 1 matches, costs 26 milliseconds.
Built-in RegExp finds 1 matches, costs 13 milliseconds.
Boyer-Moore     finds 1 matches, costs 17 milliseconds.
Searching very interesting
Built-in String finds 1 matches, costs 22 milliseconds.
Built-in RegExp finds 1 matches, costs 16 milliseconds.
Boyer-Moore     finds 1 matches, costs 15 milliseconds.
Searching disappeared
Built-in String finds 14 matches, costs 24 milliseconds.
Built-in RegExp finds 14 matches, costs 9 milliseconds.
Boyer-Moore     finds 14 matches, costs 21 milliseconds.
Searching a small compartment for holding papers
Built-in String finds 1 matches, costs 32 milliseconds.
Built-in RegExp finds 1 matches, costs 9 milliseconds.
Boyer-Moore     finds 1 matches, costs 9 milliseconds.
Searching in this important document
Built-in String finds 1 matches, costs 28 milliseconds.
Built-in RegExp finds 1 matches, costs 17 milliseconds.
Boyer-Moore     finds 1 matches, costs 12 milliseconds.

Interesting, dart built-in RegExp.allMatches is faster than String.allMatches even for plain texts.

@sun-jiao
Copy link
Contributor Author

@MayurSMahajan @Xazin Done.

Already converted them to svg files. And I added some test cases for regex replace and case sensitive.

Copy link
Contributor

@MayurSMahajan MayurSMahajan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good to me.

@sun-jiao
Copy link
Contributor Author

Hi, any updates on this?

@Xazin
Copy link
Collaborator

Xazin commented Sep 27, 2023

I will take a look soon 👍

@Xazin
Copy link
Collaborator

Xazin commented Oct 2, 2023

Hey @sun-jiao, sorry about the delay, my colleague was working on some improvements for the find-replace feature, so I was waiting to see how those changes would impact this PR.

Can you rebase on the latest main?

@sun-jiao
Copy link
Contributor Author

sun-jiao commented Oct 4, 2023

Hello, I am traveling during the China National Day holiday. I will update the branch as soon as possible after the trip.

@Xazin
Copy link
Collaborator

Xazin commented Oct 4, 2023

Hello, I am traveling during the China National Day holiday. I will update the branch as soon as possible after the trip.

No worries, hope you enjoy the trip 👌

@sun-jiao
Copy link
Contributor Author

sun-jiao commented Oct 7, 2023

@Xazin Already merged the main branch to this PR.

@LucasXu0
Copy link
Collaborator

Thanks @sun-jiao. LGTM.

@LucasXu0 LucasXu0 merged commit 8bfd779 into AppFlowy-IO:main Oct 10, 2023
8 of 10 checks passed
q200892907 added a commit to q200892907/appflowy-editor that referenced this pull request Oct 12, 2023
* main:
  fix: unable to clear the style by toggling twice (AppFlowy-IO#532)
  fix: pinyin IME on Linux (AppFlowy-IO#531)
  fix: background color issues (AppFlowy-IO#530)
  fix: cursor blink at wrong location when inserting text (AppFlowy-IO#529)
  feat: enable toggling and canceling of the formatting with shortcuts (AppFlowy-IO#528)
  chore: bump version 1.4.4 (AppFlowy-IO#527)
  fix: the selection handles remain on the screen after cutting the text (AppFlowy-IO#526)
  fix: the selection should be clear if header or footer is focusing (AppFlowy-IO#525)
  feat: customize desktop toolbar style (AppFlowy-IO#519)
  feat: add regex and case sensitive to `FindReplaceMenu` (AppFlowy-IO#480)
  feat: export JSON in mobile and update mobile_example.json (AppFlowy-IO#515)
  feat: support customizing error block (AppFlowy-IO#524)
  fix: impossible to click on any sub-items if the top level item is off-screen (AppFlowy-IO#522)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants