Fixes #824 (Temporarily)
Description
PR suggests a temporary fix for issue #824. The issue might lie in the way an internal dependency is parsing the path for a given parquet file on Windows as per my conversation with @nikhilsinhaparseable.
Windows uses `\` as its path separator, which appears as an escaped `\\` when present in a string. Rust is able to reach `manifest.json` by following the path given in `.stream.json` as `..."manifest_list":[{"manifest_path":"G:\\projects\\local_parseable\\parseable\\data\\test/date=2024-07-06/manifest.json",...`. By extension, it should be able to read the parquet files from their paths present in `manifest.json` as `..."files":[{"file_path":"G:\\projects\\local_parseable\\parseable\\data\\test/date=2024-07-06/hour=06/minute=52/DESKTOP-C8E53PR.data.Z6KsJTjB2iryjtr.parquet",...`.
Looking at `partitioned_files`, it is clear that the issue is not with how Rust is treating the path, but probably with how the physical plan is executed by DataFusion.
Excerpts from the logs:
Another strange thing is that, in the error, the path appears to contain both the escaped backslash `\\` and the hex notation of a backslash, `%5C`, which could indicate that the path is getting corrupted somehow.
For now, a temporary fix in the form of string replacement can be used (since Windows works just fine with forward slashes in place of backslashes).
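The string replacement mentioned above could look roughly like the following. This is only a minimal sketch, assuming the file path is available as a plain string before it is handed to the query engine; the helper name `normalize_path_separators` is hypothetical and is not taken from this PR's diff.

```rust
/// Hypothetical helper (not from the PR's diff): replace every Windows
/// backslash separator in `path` with a forward slash.
/// Assumption: the input is the already-deserialized path, so each `\\`
/// seen in the JSON is a single backslash character here.
fn normalize_path_separators(path: &str) -> String {
    // `str::replace` accepts a `char` pattern and returns a new `String`;
    // forward slashes already present in the path are left untouched.
    path.replace('\\', "/")
}
```

Calling it on a mixed-separator path such as `G:\projects\data\test/date=2024-07-06/manifest.json` would yield a path that uses forward slashes only, which Windows accepts.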
This PR also fixes issues in the PowerShell script that is in charge of downloading and setting up Parseable on a Windows machine. Over time, the naming rules for new releases changed, which broke the script. Similar issues exist in the `.sh` script as well, but it relies on some `zsh`-specific commands and thus could not be fixed by me (I would propose changing it to a `bash` script, since macOS supports that too).
This PR has: