Spreadsheet viewer has trouble displaying large tabular files #20
Comments
Hi, I was going to post about this same problem today when I saw this new issue 🙃 We have the same problem with a not-yet-published dataset, which I cannot share, but I found an example on Harvard Dataverse (89.5 MB, 145 variables, 56200 observations): https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/D1N0GO/3NK9D8 I agree with the proposal that there should be a limit on file size (in bytes or number of observations) that the admin could configure at install time.
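To make the proposal concrete, here is a minimal sketch of such a check in Python. Everything in it is hypothetical: `PreviewLimits`, `max_bytes`, `max_observations`, and `should_offer_preview` are illustrative names, not existing Dataverse or spreadsheet viewer settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PreviewLimits:
    """Hypothetical admin-configured limits; None means 'no limit'."""
    max_bytes: Optional[int] = None
    max_observations: Optional[int] = None


def should_offer_preview(size_bytes: int,
                         observations: Optional[int],
                         limits: PreviewLimits) -> bool:
    """Return True if the file is within every configured limit."""
    if limits.max_bytes is not None and size_bytes > limits.max_bytes:
        return False
    # The observation count may be unknown (e.g. a file that was not ingested).
    if (limits.max_observations is not None
            and observations is not None
            and observations > limits.max_observations):
        return False
    return True


# Example: a 50 MiB limit (an assumed value) would hide the preview
# for the 89.5 MB example file mentioned above.
limits = PreviewLimits(max_bytes=50 * 1024 * 1024)
print(should_offer_preview(int(89.5 * 1024 * 1024), 56200, limits))
```

Allowing either limit to be unset lets an installation restrict by bytes only, by observations only, or by both.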
It's a somewhat longstanding problem so thank you to @jggautier and @claudiodsf for getting the discussion going here. 😄 My first thought is that the next version of Dataverse (5.13 probably) will include a new feature for the external tools framework whereby tools can express "requirements" that they need to operate. Here's an example...
... from this pull request: What's going on here is that the NcML preview tool has a requirement that a certain auxiliary file be present for the eyeball to show up (to offer a preview, that is). Perhaps, like @jggautier suggested with "let installations set a byte size limit specifically for the spreadsheet viewer" each tool could express a size limit, something like this:
The idea would be to simply not show the eyeball for large files. We could get fancier, of course, as suggested above (preview only some rows) and maybe the logic should be in the spreadsheet viewer itself, but I thought I'd at least mention this new "requirements" feature. For now, docs are here (look for "requirements"): http://preview.guides.gdcc.io/en/develop/api/external-tools.html It was added in this PR:
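As a rough illustration of the "preview only some rows" idea, the sketch below reads just the header and the first few rows of a CSV stream and reports whether more rows remain. It is not how the spreadsheet viewer is implemented; `preview_rows` and `max_rows` are hypothetical names, and it uses only Python's standard `csv` module.

```python
import csv
import io
from itertools import islice


def preview_rows(stream, max_rows=100):
    """Read the header and at most max_rows data rows from a CSV stream.

    Returns (header, rows, truncated), where truncated is True if the
    stream contains more data rows than were returned.
    """
    reader = csv.reader(stream)
    header = next(reader, None)
    rows = list(islice(reader, max_rows))
    # Peek one row further to learn whether the preview is partial.
    truncated = next(reader, None) is not None
    return header, rows, truncated


sample = io.StringIO("a,b\n1,2\n3,4\n5,6\n")
header, rows, truncated = preview_rows(sample, max_rows=2)
print(header, rows, truncated)
```

The `truncated` flag would let a viewer tell the user that only part of the file is shown. Note that a row cap alone does not help when a few rows contain very large cells, so a byte limit may still be needed.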
A depositor reported last week that the spreadsheet viewer is having trouble viewing the CSV file they uploaded to the Harvard Dataverse Repository.
Because the file is not published, I can't share it publicly, but the depositor said I could share it privately with any colleagues who want to do more digging. In the meantime, the depositor wrote that they'll add a note in the dataset or file metadata to explain the situation with the file previewer.
The file is 17.4 MB, with 10 columns and 134 rows. The cells in one of the columns contain a lot of text. Once the spreadsheet viewer is able to load the preview, it doesn't display all of the columns right away, and there's no indication that the viewer is still trying to load parts of the file. This made the depositor think that the viewer would never display all of the columns.
Questions
How quickly the viewer can show the entire tabular file depends at least partly on the user's internet speed and/or computer. Are those the two factors?
Recommendations