Lucene search #11542
Co-authored-by: Christoph <siedlerkiller@gmail.com>
# Conflicts:
#   CHANGELOG.md
#   src/main/java/org/jabref/gui/LibraryTab.java
#   src/main/java/org/jabref/gui/StateManager.java
#   src/main/java/org/jabref/gui/openoffice/OpenOfficePanel.java
This reverts commit 536ecfa.
# Conflicts:
#   CHANGELOG.md
#   src/jmh/java/org/jabref/benchmarks/Benchmarks.java
#   src/main/java/org/jabref/gui/JabRefFrame.java
#   src/main/java/org/jabref/gui/LibraryTab.java
#   src/main/java/org/jabref/gui/entryeditor/EntryEditor.java
#   src/main/java/org/jabref/gui/entryeditor/fileannotationtab/FulltextSearchResultsTab.java
#   src/main/java/org/jabref/gui/externalfiles/ExternalFilesEntryLinker.java
#   src/main/java/org/jabref/gui/externalfiles/ImportHandler.java
#   src/main/java/org/jabref/gui/groups/GroupDialogView.java
#   src/main/java/org/jabref/gui/groups/GroupsPreferences.java
#   src/main/java/org/jabref/gui/maintable/MainTable.java
#   src/main/java/org/jabref/gui/maintable/MainTableColumnFactory.java
#   src/main/java/org/jabref/gui/maintable/columns/FileColumn.java
#   src/main/java/org/jabref/gui/preview/PreviewPanel.java
#   src/main/java/org/jabref/gui/search/GlobalSearchBar.java
#   src/main/java/org/jabref/gui/search/RebuildFulltextSearchIndexAction.java
#   src/main/java/org/jabref/gui/search/SearchResultsTableDataModel.java
#   src/main/java/org/jabref/logic/pdf/search/indexing/IndexingTaskManager.java
#   src/main/java/org/jabref/model/database/BibDatabaseContext.java
#   src/main/java/org/jabref/model/pdf/search/SearchFieldConstants.java
#   src/main/java/org/jabref/model/search/rules/SearchRules.java
#   src/main/java/org/jabref/preferences/JabRefPreferences.java
#   src/main/java/org/jabref/preferences/SearchPreferences.java
#   src/test/java/org/jabref/gui/groups/GroupTreeViewModelTest.java
# Conflicts:
#   CHANGELOG.md
#   src/main/java/org/jabref/model/search/rules/ContainsBasedSearchRule.java
#   src/main/java/org/jabref/model/search/rules/GrammarBasedSearchRule.java
#   src/main/java/org/jabref/model/search/rules/RegexBasedSearchRule.java
# Conflicts:
#   CHANGELOG.md
#   src/main/java/org/jabref/model/search/rules/ContainsBasedSearchRule.java
#   src/main/java/org/jabref/model/search/rules/GrammarBasedSearchRule.java
#   src/main/java/org/jabref/model/search/rules/RegexBasedSearchRule.java
When closing JabRef, only ask users to wait for the linked files indexer to finish. The bib fields indexer is recalculated on startup, so it doesn't need to be completed before shutdown.
The build for this PR is no longer available. Please visit https://builds.jabref.org/main/ for the latest build.
@@ -58,6 +59,7 @@ public class StateManager {
private final OptionalObjectProperty<LibraryTab> activeTab = OptionalObjectProperty.empty();
private final ObservableList<BibEntry> selectedEntries = FXCollections.observableArrayList();
private final ObservableMap<String, ObservableList<GroupTreeNode>> selectedGroups = FXCollections.observableHashMap();
private final ObservableMap<String, LuceneManager> luceneManagers = FXCollections.observableHashMap();
For later discussion: This may be a hint that LuceneManager should be named differently, perhaps "SearchIndex" or something similar. Usually there is one manager for the entire app, not one manager per file...
for (String resultTextHtml : searchResult.getAnnotationsResultStringsHtml()) {
content.getChildren().addAll(TooltipTextUtil.createTextsFromHtml(resultTextHtml.replace("</b> <b>", " ")));
content.getChildren().addAll(new Text(System.lineSeparator()), lineSeparator(0.8), createPageLink(linkedFile, searchResult.getPageNumber()));
stateManager.activeSearchQuery(SearchType.NORMAL_SEARCH).get().ifPresent(searchQuery -> {
It looks very odd to put that into a lambda expression, and it makes stack traces much longer. A simple if check may be enough, or better, a fail-fast strategy: if (!activeSearchQuery.isPresent()) { return; }
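To make the suggestion concrete, here is a minimal sketch contrasting the two styles. The class and method names (FailFastSketch, renderWithLambda, renderFailFast) and the use of plain strings in place of the search query and results are assumptions for illustration only, not code from this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class FailFastSketch {

    // Style used in the diff: the whole body is nested inside a lambda.
    static List<String> renderWithLambda(Optional<String> activeQuery, List<String> results) {
        List<String> out = new ArrayList<>();
        activeQuery.ifPresent(query -> {
            for (String result : results) {
                out.add(query + " -> " + result);
            }
        });
        return out;
    }

    // Fail-fast style from the review: return early when no query is active,
    // so the main logic stays at the top level of the method.
    static List<String> renderFailFast(Optional<String> activeQuery, List<String> results) {
        List<String> out = new ArrayList<>();
        if (!activeQuery.isPresent()) {
            return out;
        }
        String query = activeQuery.get();
        for (String result : results) {
            out.add(query + " -> " + result);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> results = List.of("page 1", "page 2");
        System.out.println(renderFailFast(Optional.of("lucene"), results));
        System.out.println(renderFailFast(Optional.empty(), results));
    }
}
```

Both variants produce the same result; the early return mainly keeps the nesting shallow and avoids the extra lambda frames in stack traces.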
import com.tobiasdiez.easybind.EasyBind;
import com.tobiasdiez.easybind.Subscription;
import org.jspecify.annotations.Nullable;
I don't think that @Siedlerchr will like this...
Yeah... by default everything is nullable in Java
private static final Logger LOGGER = LoggerFactory.getLogger(SearchGroup.class);
private final GroupSearchQuery query;

@ADR(38)
In Java, annotations are limited. We tried to use e-adr wherever possible. Where not, we used Java comments. #research.
🎉
Congratulations, Loay!
Lucene search backend
Follow-up to: #8963, #8206, #11326.
Indexing
BibEntry#hashCode across sessions.
IndexWriter is opened only once at startup and remains open during the runtime.
Analyzing
WhitespaceTokenizer: Suitable for bib fields because it does not escape special characters, preserving LaTeX formatting.
LowerCaseFilter: Converts all text to lowercase.
StopFilter: Removes English stop words.
LatexToUnicodeFoldingFilter: Converts LaTeX-encoded characters to their Unicode equivalents.
ASCIIFoldingFilter: Converts Unicode characters to their ASCII equivalents.
EdgeNGramTokenFilter: Supports "prefix" or "starts with" searches. The NGramTokenizer could be used for "contains" searches, but it is slower during indexing.
EdgeNGramTokenFilter.
EnglishAnalyzer for both indexing and searching. This analyzer converts all strings to lowercase, removes English stop words, and uses PorterStemFilter, which reduces words to their base or root form, known as the "stem". For example, terms like "computer", "compute", "computations", and "computerized" are all reduced to the stem "comput", which yields more relevant search results.
Searching
MultiReader is used to search across both indexes.
SearcherManager manages the IndexSearchers, which are refreshed before each search query so that matches are returned without the need to commit changes.
Search Results
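As an illustration of why the EdgeNGramTokenFilter step listed under Analyzing enables "starts with" searches but not "contains" searches, here is a minimal plain-Java sketch of the idea. It deliberately does not use the Lucene API; the class EdgeNGramSketch, its methods, and the tiny stop-word set are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class EdgeNGramSketch {

    // Tiny illustrative stop-word set (an assumption, not Lucene's actual list).
    private static final Set<String> STOP_WORDS = Set.of("a", "an", "the", "of", "and");

    // Splits on whitespace, lowercases, drops stop words, and emits the
    // leading substrings (edge n-grams) of each remaining token.
    static List<String> analyze(String text, int minGram, int maxGram) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (lower.isEmpty() || STOP_WORDS.contains(lower)) {
                continue;
            }
            int upper = Math.min(maxGram, lower.length());
            for (int len = minGram; len <= upper; len++) {
                terms.add(lower.substring(0, len));
            }
        }
        return terms;
    }

    // A prefix query matches when its lowercased form equals an indexed term.
    static boolean matchesPrefix(List<String> indexedTerms, String query) {
        return indexedTerms.contains(query.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        List<String> terms = analyze("The Lucene Index", 1, 10);
        System.out.println(terms);
        System.out.println(matchesPrefix(terms, "Luc"));  // prefix of "lucene"
        System.out.println(matchesPrefix(terms, "cene")); // inner substring only
    }
}
```

Because only leading substrings are stored, a query such as "cene" finds nothing; covering inner substrings would require full n-grams, which the description notes are slower during indexing.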
Search Groups free-search expression
Caution
Before proceeding, create a backup of your library. This is an alpha release, and the search syntax has changed.
Removed
Screenshots
Closes: #8857
Closes: #11374
Closes: #11378
Closes: #8626
Closes: #11595
Closes: #11246
Closes: #7996
Closes: #8067
Closes: #1975
Mandatory checks
CHANGELOG.md described in a way that is understandable for the average user (if applicable)