Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: cloud_upload(<filename>) table func #2670

Merged
merged 36 commits into from
Feb 21, 2024
Merged

feat: cloud_upload(<filename>) table func #2670

merged 36 commits into from
Feb 21, 2024

Conversation

greyscaled
Copy link
Contributor

@greyscaled greyscaled commented Feb 16, 2024

Summary

When a file is uploaded via GlareDB Cloud, it can be read with cloud_upload(<filename>).

This function can only be called in remote and hybrid contexts; it is not meaningful locally.

The filename must contain a valid extension (foo.csv).

Example

After uploading a file called test.csv to GlareDB Cloud, it can be accessed like:

SELECT * FROM cloud_upload('test.csv');

and joined with other data sources, etc.

Details

MVP File Format Limitations

Presently only:

  • CSV as filename.csv
  • Parquet as `filename.parquet
  • Newline-delimited JSOn as filename.json

are accepted.

The file name must currently include the extension.

The file can only use alphanumeric characters.

Testing

I tested this by building the image and running with a GlareDB Cloud environment.
I think it's tractable to write an integration test for this function, in the same vein
as other integration tests, but I'd prefer to leave this as a follow-up refinement
item.

Resolves: #2614

.github/workflows/ci.yaml Outdated Show resolved Hide resolved
@greyscaled greyscaled changed the title [WIP] glaredb_upload feat: glaredb_upload(<file>) table func Feb 20, 2024
@greyscaled greyscaled changed the title feat: glaredb_upload(<file>) table func feat: glaredb_upload(<filename>) table func Feb 20, 2024
@greyscaled greyscaled self-assigned this Feb 21, 2024
@greyscaled greyscaled marked this pull request as ready for review February 21, 2024 00:05
@tychoish
Copy link
Contributor

this all seems good, and I'd be fine merging it.

  • we could probably write tests for this using mino or the mock gcs fixture, but it'd be gnarly. Shouldn't do that today.
  • before we merge we should definitely go around and make sure the name is good.
  • want to make sure that it errors correctly, when running in pure local or embedded.

@greyscaled
Copy link
Contributor Author

want to make sure that it errors correctly, when running in pure local

It does and we can easily add that test, I see no harm in it.

@tychoish
Copy link
Contributor

It does and we can easily add that test, I see no harm in it.

I was moving too quick mostly meant: "a really good error message" and not, say ("credentials not found") or something else.

.github/workflows/ci.yaml Outdated Show resolved Hide resolved
@greyscaled greyscaled changed the title feat: glaredb_upload(<filename>) table func feat: cloud_upload(<filename>) table func Feb 21, 2024
@greyscaled greyscaled merged commit ca2ead4 into main Feb 21, 2024
25 checks passed
@greyscaled greyscaled deleted the grey/read_upload branch February 21, 2024 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feat: read_uploads table func
3 participants