Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix workspace extensions filer accidentally reading notebooks #1891

Merged
merged 6 commits into from
Nov 18, 2024

Conversation

shreyas-goenka
Copy link
Contributor

Changes

The workspace extensions filer should not read or stat a notebook called foo if the user calls .Stat(ctx, "foo").

Instead, the filer should return a file not found error. This is because the contract for the workspace extensions filer is to only work for notebooks when the file path / name includes the extension (example: foo.ipynb or foo.sql instead of just foo)

Tests

Integration tests.

//
// To read the content of a notebook called `foo` in the workspace the user
// should use the name with the extension included like `foo.ipynb` or `foo.sql`.
_, err := w.Stat(ctx, name)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: In the read-only mode for the filer, this does not introduce an additional network API call since the stat API calls are cached and likely prefetched.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

I notice that the implementation of the underlying filer (the regular workspace files one) also runs a stat on read. It does this to determine that the path is not a directory and that that triggers other behavior. This could be optimized out as well (not needed for this PR).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, noticed that as well. Switching over to use_qualified_names should take care of that since we'll no longer have the extensions filer at that point.

internal/filer_test.go Show resolved Hide resolved
internal/filer_test.go Show resolved Hide resolved
//
// To read the content of a notebook called `foo` in the workspace the user
// should use the name with the extension included like `foo.ipynb` or `foo.sql`.
_, err := w.Stat(ctx, name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

I notice that the implementation of the underlying filer (the regular workspace files one) also runs a stat on read. It does this to determine that the path is not a directory and that that triggers other behavior. This could be optimized out as well (not needed for this PR).

libs/filer/workspace_files_extensions_client.go Outdated Show resolved Hide resolved
Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, it is possible to eliminate an extra API call by reusing the stat object in the Read and Delete functions. Once these functions establish that the stat'd path is good, they try to call the parent filer with the original argument, and then call stat again on failure.

Copy link

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/cli

Inputs:

  • PR number: 1891
  • Commit SHA: d38060214e70dd6b210f90b7b4c93b5eaffe109e

Checks will be approved automatically on success.

@eng-dev-ecosystem-bot
Copy link
Collaborator

Test Details: go/deco-tests/11895948996

@shreyas-goenka shreyas-goenka added this pull request to the merge queue Nov 18, 2024
Merged via the queue into main with commit 72dde79 Nov 18, 2024
10 checks passed
@shreyas-goenka shreyas-goenka deleted the fix/extensions-client-read branch November 18, 2024 17:32
pietern added a commit that referenced this pull request Nov 20, 2024
**Note:** the `bundle generate` command now uses the `.<resource-type>.yml`
sub-extension for the configuration files it writes. Existing configuration
files that do not use this sub-extension are renamed to include it.

Bundles:
 * Make `TableName` field part of quality monitor schema ([#1903](#1903)).
 * Do not prepend paths starting with ~ or variable reference ([#1905](#1905)).
 * Fix workspace extensions filer accidentally reading notebooks ([#1891](#1891)).
 * Fix template initialization when running on Databricks ([#1912](#1912)).
 * Source-linked deployments for bundles in the workspace ([#1884](#1884)).
 * Added integration test to deploy bundle to /Shared root path ([#1914](#1914)).
 * Update filenames used by bundle generate to use `.<resource-type>.yml` ([#1901](#1901)).

Internal:
 * Extract functionality to detect if the CLI is running on DBR ([#1889](#1889)).
 * Consolidate test helpers for `io/fs` ([#1906](#1906)).
 * Use `fs.FS` interface to read template ([#1910](#1910)).
 * Use `filer.Filer` to write template instantiation ([#1911](#1911)).
github-merge-queue bot pushed a commit that referenced this pull request Nov 20, 2024
**Note:** the `bundle generate` command now uses the
`.<resource-type>.yml`
sub-extension for the configuration files it writes. Existing
configuration
files that do not use this sub-extension are renamed to include it.

Bundles:
* Make `TableName` field part of quality monitor schema
([#1903](#1903)).
* Do not prepend paths starting with ~ or variable reference
([#1905](#1905)).
* Fix workspace extensions filer accidentally reading notebooks
([#1891](#1891)).
* Fix template initialization when running on Databricks
([#1912](#1912)).
* Source-linked deployments for bundles in the workspace
([#1884](#1884)).
* Added integration test to deploy bundle to /Shared root path
([#1914](#1914)).
* Update filenames used by bundle generate to use `.<resource-type>.yml`
([#1901](#1901)).

Internal:
* Extract functionality to detect if the CLI is running on DBR
([#1889](#1889)).
* Consolidate test helpers for `io/fs`
([#1906](#1906)).
* Use `fs.FS` interface to read template
([#1910](#1910)).
* Use `filer.Filer` to write template instantiation
([#1911](#1911)).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants