Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

investigate searching is closed notebooks #173726

Closed
andreamah opened this issue Feb 7, 2023 · 5 comments · Fixed by #174287
Closed

investigate searching is closed notebooks #173726

andreamah opened this issue Feb 7, 2023 · 5 comments · Fixed by #174287
Assignees
Labels
feature-request Request for new features or functionality insiders-released Patch has been released in VS Code Insiders notebook on-testplan search Search widget and operation issues
Milestone

Comments

@andreamah
Copy link
Contributor

This issue is for tracking research on how we should handle searching in closed notebooks. Functionality for searching opened/rendered notebooks is in #164926.

The main problem is that the notebook must be rendered/opened for the find widget to be used on it.

@andreamah andreamah added search Search widget and operation issues notebook labels Feb 7, 2023
@andreamah andreamah added this to the February 2023 milestone Feb 7, 2023
@andreamah andreamah self-assigned this Feb 7, 2023
@andreamah
Copy link
Contributor Author

andreamah commented Feb 7, 2023

Some possibilities:

1. New extension provider API - extensions provide info on what the search results are for the notebook/editor that they provide, given the search query

Pros:

  • extendable; we can ask the extensions how they'd want to express their info
    • even hex editor wouldn't be able to rely on native webview find (ie: would strings that wrap-around be captured)?
  • the best way to support custom editors

Cons:

  • if we kept what we have for open notebooks, would create 2 ways of searching for notebooks
  • overhead of maintaining extension API
  • extensions would be responsible for processing regexes/etc.
  • extensions would need to do stuff like highlighting, which might be a lot of work
  • extensions would probably want to add UI buttons to toggle in search view

2. Mostly relying on what we already have in core - use Notebook Find widget results to search

Pros

  • we already have the find widget to give us results and highlight the results
  • aligns with what we are already doing with open notebooks

Cons:

  • heavy performance for rendering all notebooks upon search

@andreamah
Copy link
Contributor Author

Discussed options that involved only searching for inputs in closed notebooks, which can be deduced once notebooks are serialized. Tested deserializing performance here:

registerAction2(class NotebookDeserializeTest extends Action2 {

Results:

Repo 1:
9.63MB Processed | Number of Files: 42 | Number of Cells: 1697
Execution Times: 1031ms // 800ms // 913ms // 815ms
Avg: 889.75ms -> 21.2ms per file

Repo 2:
19.42MB Processed | Number of Files: 68 | Number of Cells: 3034
Execution Times: 1677ms // 1435ms // 1463ms // 1779ms 
Avg: 1588.5ms -> 23.36ms per file

Repo 3:
18.46MB Processed | Number of Files: 110 | Number of Cells: 4394
Execution Times: 2218ms // 2137ms // 2248ms // 2101ms
Avg: 2176 -> 19.8ms per file

@andreamah
Copy link
Contributor Author

I have an example implementation in 2abfedc, but we still have high numbers. Mostly because of deserialization time.

2023-05-24 12:54:13.618 [info] query: a
2023-05-24 12:54:13.619 [info] notebook closed-notebook search END | 3575ms | 18.46MB | Number of Cells: 4394
2023-05-24 12:54:13.620 [info] notebook deserialize time | 3362ms | 94.04195804195804% of total time

@andreamah
Copy link
Contributor Author

I have tried to cache notebook data in memory so that subsequent runs with the same set of files is not as slow as the first in cc6cae4.

First search:

2023-05-24 15:41:24.686 [info] query: a
2023-05-24 15:41:24.686 [info] notebook closed-notebook search END | 3853ms
2023-05-24 15:41:24.687 [info] notebook deserialize time | 3603ms | 93.51154944199325% of total time

Second search:

2023-05-24 15:41:31.612 [info] query: b
2023-05-24 15:41:31.612 [info] notebook closed-notebook search END | 201ms
2023-05-24 15:41:31.612 [info] notebook deserialize time | 111ms | 55.223880597014926% of total time

@andreamah
Copy link
Contributor Author

Todo:

  • figure out how to cache this data to improve performance.
  • add support for other notebook types
    • needs to be able to fetch all notebook URIs in the workspace.

@andreamah andreamah modified the milestones: May 2023, June 2023 May 31, 2023
@andreamah andreamah modified the milestones: June 2023, July 2023 Jun 26, 2023
@vscodenpa vscodenpa added the unreleased Patch has not yet been released in VS Code Insiders label Jul 11, 2023
@vscodenpa vscodenpa added insiders-released Patch has been released in VS Code Insiders and removed unreleased Patch has not yet been released in VS Code Insiders labels Jul 12, 2023
@andreamah andreamah added the feature-request Request for new features or functionality label Jul 21, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Aug 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
feature-request Request for new features or functionality insiders-released Patch has been released in VS Code Insiders notebook on-testplan search Search widget and operation issues
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants