Order of bulk data csv fields makes it difficult (impossible?) to import #4241
I reported this. Since I couldn't reorder the 6MM columns in the CSV file, I reordered the columns in CREATE TABLE to match the order of the columns in the CSV (see below). The alternative solution would be to re-order the columns in the script that creates the bulk data files to match the order in the DDL: courtlistener/scripts/make_bulk_data.sh Line 45 in 8c7f674
Maybe add a COPY FROM script: https://www.postgresql.org/docs/current/sql-copy.html
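A COPY FROM helper along those lines could sidestep the ordering mismatch entirely by building the COPY column list from the CSV's own header. This is only a sketch: the table name, file name, and columns below are invented for illustration, not the actual CourtListener schema.

```shell
#!/bin/sh
# Sketch: generate a \copy statement whose column list mirrors the CSV
# header, so the column order in the DDL no longer matters.
# "search_court" and "courts.csv" are hypothetical examples.
printf 'id,full_name,jurisdiction\n1,Supreme Court,F\n' > courts.csv

# Take the column list straight from the file's first line.
cols=$(head -n 1 courts.csv)

# Emit the load statement; run it later with: psql dbname -f load_courts.sql
echo "\\copy search_court ($cols) FROM 'courts.csv' WITH (FORMAT csv, HEADER);" > load_courts.sql
cat load_courts.sql
```

Because COPY matches CSV fields to the listed columns positionally, deriving the list from the header also handles the case where the CSV contains only a subset of the table's columns.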
Actually, wouldn't
Ok, well that would do it. I followed these instructions: https://www.courtlistener.com/help/api/bulk-data/#formats and I guess I missed the fact that it lists the column names. This could be cleared up by better documentation.
That's generally good, so it sounds like the real issue here is a documentation issue.
Lots of people are using the bulk data, so I'm not sure how this hasn't come up before, but one of our users reports that our CSVs cannot be imported using the schema we provide, because the columns in the CSVs are in a different order than in the schema.
I think we should fix that.
I imagine a related issue will be that the schema has all of our columns while the CSV has a subset.
I'm not sure what the fix for this is. Maybe we need to provide better schemas? Maybe we need to re-order the CSV fields properly?
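If we went the re-ordering route, the transformation is roughly this (a sketch only: the column names and files are made up, and this naive awk approach only works for CSVs without quoted embedded commas, which real bulk files may contain):

```shell
#!/bin/sh
# Sketch: rewrite a CSV so its columns follow a target order
# (here: id first, then full_name). Illustrative data only.
printf 'full_name,id\nSupreme Court,1\n' > in.csv

# Read the header to learn where each named column lives, then print
# the fields in the desired order for every row, header included.
awk -F, 'BEGIN{OFS=","}
  NR==1{for(i=1;i<=NF;i++) pos[$i]=i}
  {print $pos["id"], $pos["full_name"]}' in.csv > out.csv

cat out.csv
# out.csv now reads:
#   id,full_name
#   1,Supreme Court
```

A production version would need a real CSV parser (e.g. Python's csv module) to handle quoting, but the idea is the same: key the reordering off the header names, never off positions.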
The error they get is: