Windows paths become part of the dbfs filename. #1109

Closed
IMarvinTPA opened this issue Jan 9, 2024 · 4 comments
@IMarvinTPA

Describe the issue

When uploading a file with "databricks fs cp", a Windows path in the source argument becomes part of the destination filename. The expected result is that everything up to and including the last "\" is ignored and only the file name is kept.

Steps to reproduce the behavior

  1. databricks fs cp ".\testfile.ext" "dbfs:/FileStore/testLocation"
  2. databricks fs ls "dbfs:/FileStore/testLocation"
  3. See in the files list a file named "dbfs:/FileStore/testLocation/.\testfile.ext"
  4. databricks fs cp "C:\testfile.ext" "dbfs:/FileStore/testLocation"
  5. databricks fs ls "dbfs:/FileStore/testLocation"
  6. See in the files list a file named "dbfs:/FileStore/testLocation/C:\testfile.ext"

Expected Behavior

Both of these examples should have resulted in
"dbfs:/FileStore/testLocation/testfile.ext"
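
The expected naming rule can be sketched in Go as follows. This is only an illustration of the behavior requested here, not code from the CLI: `baseName` is a hypothetical helper, and it splits on both "/" and "\" explicitly so that it behaves the same on any OS.

```go
package main

import (
	"fmt"
	"strings"
)

// baseName is a hypothetical helper (not part of the Databricks CLI).
// It returns whatever follows the last "/" or "\" in p, independent of
// the OS the program runs on.
func baseName(p string) string {
	// LastIndexAny returns -1 when p contains no separator, so i+1 is 0
	// and the whole string is returned unchanged.
	i := strings.LastIndexAny(p, `/\`)
	return p[i+1:]
}

func main() {
	// Source paths taken from the reproduction steps and the workaround.
	for _, p := range []string{`.\testfile.ext`, `C:\testfile.ext`, `apple/mango.txt`} {
		fmt.Printf("%s -> dbfs:/FileStore/testLocation/%s\n", p, baseName(p))
	}
}
```

With this rule, both `.\testfile.ext` and `C:\testfile.ext` map to the same remote name, `testfile.ext`.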

Actual Behavior

It created
"dbfs:/FileStore/testLocation/.\testfile.ext"
and
"dbfs:/FileStore/testLocation/C:\testfile.ext"
which are just garbage.
(You can delete them fortunately.)

OS and CLI version

Windows, CLI versions 0.209.0 and 0.211.0 (on two different machines).

Is this a regression?

Unknown. I did not try older versions.

@IMarvinTPA IMarvinTPA added the CLI CLI related issues label Jan 9, 2024
@shreyas-goenka shreyas-goenka self-assigned this Jan 10, 2024
@shreyas-goenka
Contributor

shreyas-goenka commented Jan 10, 2024

Thank you for reporting this issue! I have verified that this is indeed an issue with the CLI. Note that we still respect Unix-style file paths, so to use this from Windows you can run:

// For absolute paths
databricks fs cp /foo/bar.txt dbfs:/FileStore/testLocation

// For relative paths
databricks fs cp bar.txt dbfs:/FileStore/testLocation
OR
databricks fs cp apple/mango.txt dbfs:/FileStore/testLocation

However, this UX is neither great nor intuitive. We have taken note of this issue and filed it in our backlog as an important UX improvement.

Please let me know if this workaround does not work for you, and we can consider prioritising a patch for this.

@IMarvinTPA
Author

IMarvinTPA commented Jan 10, 2024

The main problem is that PowerShell autocompletes file names by adding ".\" to the beginning of the file, so it is super easy to mess up.

@shreyas-goenka
Contributor

Yeah, that's fair. I'll prioritise working on it this week and it should be patched in a week or two.

github-merge-queue bot pushed a commit that referenced this issue Jan 11, 2024
## Changes
Copying a local file on Windows to a remote directory in DBFS would fail
if the path was specified as a Windows-style path (as opposed to a
UNIX-style path). This PR fixes that.

Note, UNIX-style paths will continue to work on Windows because
`filepath.Base` respects both `/` and `\` as file separators there. See:
`IsPathSeparator` in https://go.dev/src/os/path_windows.go.
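
As a rough illustration of the note above (a sketch, not the code from this PR): `filepath.Base` splits on the separators of the OS it runs on, so the same call gives different results for a backslash path on Windows and elsewhere. `remoteTarget` is a hypothetical helper name, and using `path.Join` for the remote side is an assumption based on DBFS paths using `/`.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// remoteTarget is a hypothetical helper, not the function changed in the PR.
// It takes the last element of the local path using OS-specific separators
// and appends it to the remote directory using "/".
func remoteTarget(localPath, remoteDir string) string {
	// On Windows, filepath.Base treats both "/" and "\" as separators.
	// On other systems only "/" is a separator, so a backslash is an
	// ordinary character in the file name.
	name := filepath.Base(localPath)
	return path.Join(remoteDir, name)
}

func main() {
	dir := "dbfs:/FileStore/testLocation"
	// Forward slashes are handled the same way on every OS.
	fmt.Println(remoteTarget("apple/mango.txt", dir))
	// The result for a backslash path depends on the OS running this program.
	fmt.Println(remoteTarget(`.\Desktop\foo.txt`, dir))
}
```

The first call yields `dbfs:/FileStore/testLocation/mango.txt` on any OS; only on Windows does the second call reduce to `foo.txt`.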

Fixes issue: #1109.

## Tests
Integration test, plus manual testing:
```
C:\Users\shreyas.goenka>Desktop\cli.exe fs cp .\Desktop\foo.txt dbfs:/Users/shreyas.goenka@databricks.com
.\Desktop\foo.txt -> dbfs:/Users/shreyas.goenka@databricks.com/foo.txt

C:\Users\shreyas.goenka>Desktop\cli.exe fs cat  dbfs:/Users/shreyas.goenka@databricks.com/foo.txt
hello, world
```
@shreyas-goenka
Contributor

Fixed in #1118. This should be out in next week's CLI release!
