File size error in readSDMX #137

Open
oharac opened this issue Dec 21, 2017 · 5 comments

@oharac

oharac commented Dec 21, 2017

Using readSDMX() to open the 1996 Canada census files (data from here), I get this error: `Error in readChar(file, file.info(file)$size) : invalid 'nchars' argument`. My guess is that the cause is the size of the Generic_... file: at about 3.2 GB, its size in bytes exceeds R's maximum integer value (2147483647).

That may be a limitation inherent in readChar(), since its nchars argument appears to take an integer.
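A minimal sketch of the suspected mechanism, using base R only. The 3.2e9 figure is an illustrative stand-in for the real file size, and the tiny raw vector replaces the actual file so that nothing large is needed to see the behaviour:

```r
# Hypothetical size in bytes of a ~3.2 GB file. file.info()$size returns
# a double, so the value itself is representable.
size_bytes <- 3.2e9

# Largest value an R integer can hold.
int_max <- .Machine$integer.max

exceeds_limit <- size_bytes > int_max

# Coercing an out-of-range double to integer yields NA (with a warning,
# suppressed here).
size_as_int <- suppressWarnings(as.integer(size_bytes))

# readChar() coerces nchars to integer and rejects an NA count.
# A small raw vector stands in for the file contents.
result <- tryCatch(
  readChar(charToRaw("<a/>"), nchars = size_as_int),
  error = function(e) conditionMessage(e)
)
```

If the guess is right, `result` holds the error message rather than any text, because the character count never reaches readChar() as a valid integer.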

Thanks for any help you can provide!



@eblondel eblondel self-assigned this Dec 21, 2017
@eblondel eblondel added this to the 0.6 milestone Dec 21, 2017
@eblondel
Member

@oharac I need to think about it. This is not a common issue, but it is not the first one involving Statistics Canada data: a 3 GB file is very big, and users have reported problems with these large Canadian data files in other tickets. In the discussion in #125 I heard about the possible existence of a web service for querying Canada's SDMX datasets, but at that time I could not identify such a service. As a consequence, the only way to read Canadian statistics is to download the full dataset (or rather, a full data collection). Although that is all we have, reading these big datasets causes big problems. I will investigate how to deal with this, and possibly find a better reader than readChar...
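One possible direction, sketched with base R only, is to read the file in fixed-size chunks so that no single call is asked for the whole file size. This is not how rsdmx reads files; `read_in_chunks` and its `on_chunk` callback are hypothetical names used for illustration:

```r
# Read a file as raw bytes in fixed-size chunks and hand each chunk to a
# callback. `read_in_chunks` and `on_chunk` are illustrative names only.
read_in_chunks <- function(path, on_chunk, chunk_size = 64 * 1024^2) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  total <- 0  # a double, so it can grow past .Machine$integer.max
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_size)
    if (length(chunk) == 0L) break
    total <- total + length(chunk)
    on_chunk(chunk)
  }
  total
}

# Usage on a small temporary file: count the '<' bytes (0x3c) chunk by
# chunk, with a deliberately tiny chunk size to force several reads.
tmp <- tempfile(fileext = ".xml")
writeLines("<DataSet><Obs/><Obs/></DataSet>", tmp)
expected_size <- file.size(tmp)

n_open <- 0
bytes_read <- read_in_chunks(
  tmp,
  on_chunk = function(chunk) n_open <<- n_open + sum(chunk == as.raw(0x3c)),
  chunk_size = 8
)
unlink(tmp)
```

Chunking only avoids the oversized read request. A single R character string is still limited to 2^31 - 1 bytes, so the chunks would have to be processed incrementally rather than pasted back into one string.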

@eblondel
Member

This is typically the situation where we would need #36 (Simple API for XML, SAX), where the full XML document is not loaded into R but is processed while it stays outside of R's memory...
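To illustrate the event-based idea, here is a minimal sketch that assumes the XML package and its `xmlEventParse()` function. It is not the design proposed in #36; `count_elements` and the sample document are hypothetical, and the sketch is skipped if the XML package is not installed:

```r
# Count elements with a given name using SAX-style callbacks: the parser
# calls startElement() for each opening tag and no document tree is built.
# `count_elements` is an illustrative helper, not part of rsdmx.
count_elements <- function(xml_text, element) {
  n <- 0L
  XML::xmlEventParse(
    xml_text,
    handlers = list(
      startElement = function(name, attrs, ...) {
        if (name == element) n <<- n + 1L
      }
    ),
    asText = TRUE  # parse the string itself; pass a file path without this
  )
  n
}

# Hypothetical miniature dataset, far simpler than a real SDMX message.
sample_xml <- "<DataSet><Obs value='1'/><Obs value='2'/><Obs value='3'/></DataSet>"

n_obs <- if (requireNamespace("XML", quietly = TRUE)) {
  count_elements(sample_xml, "Obs")
} else {
  NA_integer_  # XML package not available, nothing to demonstrate
}
```

Only the running counter is kept in R here, which is the property that matters for multi-gigabyte inputs.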

@oharac
Author

oharac commented Dec 24, 2017

Thanks for looking into it. The package worked great for smaller data collections...

@eblondel eblondel removed the bug label Jan 11, 2018
@eblondel
Member

I removed the bug label: here we are hitting R memory limitations, not a problem in rsdmx itself.

@ghawkins-ott

I might be hitting the same problem, but with a different error message. I am trying to read a 1.3 GB XML file and I get the following message: <simpleError in gsub(BOM, "", content): 'Calloc' could not allocate memory (18446744072278353920 of 4 bytes)>
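As a side note on the number in that message, a short arithmetic check: the reported count equals 2^64 - 1431197696, which is what a negative count of -1431197696 looks like when reinterpreted as an unsigned 64-bit value. That would be consistent with an integer overflow somewhere before the allocation, although the sketch below only verifies the arithmetic and the negative count is an assumption, not something taken from the R sources:

```r
# Element count reported in the error message, kept as text so that it
# is compared digit for digit.
reported <- "18446744072278353920"

# Hypothetical negative count that would produce it.
negative_count <- -1431197696

# Reinterpreting a negative value as unsigned 64-bit adds 2^64.
# Both operands are multiples of 2048, so the double result is exact.
as_unsigned <- 2^64 + negative_count

matches <- identical(sprintf("%.0f", as_unsigned), reported)
```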
