This repository has been archived by the owner on May 17, 2024. It is now read-only.
I actually started working on a DuckDB driver not long ago and might have something ready next week. But the second part of this:
"Whilst just being able to reference csv files in data-diff might be another option, doing this via duckDB would allow you to perform some basic transformations on the way; such as renaming fields, selecting a reduced range etc"
might deserve a separate issue, since it could be generalized to all drivers, no?
DuckDB is an in-process database. You typically create it as a session, then discard it once you're done (though that is not the only way to use it).
https://duckdb.org
It's awesome for a few reasons that apply to data-diff. Namely, you can directly query raw CSV/TXT/Parquet files as though they were tables (e.g. select posting_date, count(*) as r_count from '/Users/me/data.csv' group by posting_date).
We use this ability to load PROD vs. UAT files from our system and compare their output. Being able to pass this across to data-diff would be incredible.
Whilst just being able to reference CSV files in data-diff might be another option, doing this via DuckDB would also let you perform some basic transformations on the way, such as renaming fields or selecting a reduced range.