Enhance RDA #751
- add ROBJ and RVector{T} to minimize code duplication
- add RDAIO and RDAContext to store the RDA stream state and props and minimize the number of parameters passed to readxxx()
- rename all funcs that read from stream into readxxx()
- Dict-based dispatch of readxxx() methods
- NA is not NaN in R, but Julia was treating R NaNs as NA for RNumeric
- since NaN != NaN for any NaN value, the `===` operator should be used to detect NA properly
- add NA/NaN tests
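A minimal sketch of the `===` check described above, assuming Julia 1.x syntax. The names `R_NA_FLOAT64` and `is_r_na` are hypothetical and are not taken from this PR. The bit pattern is the one R uses for `NA_real_`: a NaN whose low word carries the payload 1954.

```julia
# R's NA_real_ is a NaN with payload 1954 (0x7a2) in the low 32 bits.
# Hypothetical constant name, chosen for illustration only.
const R_NA_FLOAT64 = reinterpret(Float64, 0x7ff00000000007a2)

# `==` is false for every pair of NaNs, so it cannot single out NA.
# `===` compares Float64 values bit for bit, which can.
is_r_na(x::Float64) = x === R_NA_FLOAT64

@show isnan(R_NA_FLOAT64)            # NA is still a NaN as far as isnan is concerned
@show R_NA_FLOAT64 == R_NA_FLOAT64   # numeric comparison of NaNs
@show is_r_na(R_NA_FLOAT64)          # bitwise comparison against the NA pattern
@show is_r_na(NaN)                   # an ordinary NaN has a different payload
```

Arithmetic on an NA value may change its payload on some platforms, so a check like this is safest when applied to values read straight from the stream.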
- if `fixcolnames` is true (default behaviour), R column names are checked to be valid Julia identifiers and fixed if necessary
- if false, column names are imported as is
- test added
- support passing keyword options to read_rda()
- add support for convertdataframes= option
- add support for fixcolnames= option
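The `fixcolnames=` behaviour described above could be sketched roughly as follows, assuming Julia 1.x syntax. The helpers `fix_colname` and `import_colname` are hypothetical and are not the PR's implementation. The rule used here (ASCII letters, digits and underscores only) is a deliberately conservative assumption, and reserved words are not handled.

```julia
# Hypothetical helper: turn an R column name into a conservative Julia identifier.
function fix_colname(name::AbstractString)
    # Replace anything outside [A-Za-z0-9_] (for example the dot in `col.name`).
    fixed = replace(name, r"[^A-Za-z0-9_]" => "_")
    # An identifier cannot be empty or start with a digit, so add a prefix.
    if isempty(fixed) || isdigit(first(fixed))
        fixed = "x" * fixed
    end
    return Symbol(fixed)
end

# Hypothetical wrapper mirroring the option: fix names by default, or keep them as is.
import_colname(name::AbstractString; fixcolnames::Bool=true) =
    fixcolnames ? fix_colname(name) : Symbol(name)

@show import_colname("col.name")
@show import_colname("col.name"; fixcolnames=false)
@show import_colname("2nd")
```

With `fixcolnames=false` the symbol keeps the dot, so a column imported that way can only be addressed through `Symbol("col.name")` and not through identifier syntax.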
Thanks for doing this.
This was discussed and intentionally left out for all IO methods. The plan is for `df.colname` to be the idiomatic way of specifying a column in the near future (it already works on some experimental AbstractDataFrame types), and `df.col.name`, for example, can't be parsed as desired. (`.` is meaningful syntax in Julia, so using `col.name` in place of a valid identifier is like using `col+name`, which wouldn't be valid in R, for example.) It's trivial to work around in user code if the user insists, but it doesn't belong in any package code for now, though adding it uniformly to all IO methods may be revisited later if the roadmap changes.
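To illustrate the parsing point, here is a small sketch that assumes a Julia 1.x session, where the parser is exposed as `Meta.parse`. It only inspects expressions and makes no assumption about any DataFrames API.

```julia
# `df.col.name` parses as nested field access: (df.col).name
ex = Meta.parse("df.col.name")
@show ex.head       # the head of a field-access expression
@show ex.args[1]    # the inner access, df.col
@show ex.args[2]    # the outer field, name

# The R-side analogy from the comment: `col+name` is a call to `+`, not one name.
plus = Meta.parse("col+name")
@show plus.head
@show plus.args

# A name containing a dot can still be built explicitly as a Symbol in user code.
colsym = Symbol("col.name")
@show colsym
```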
Thanks for submitting this -- it was a major improvement! As a heads up, |
Thanks for merging my PR that fast! And no problems with Regarding |
Those are good points and suggestions. Metadata / pretty printing headers, etc. have gotten some discussion, and there seemed to be agreement that they're important -- just perhaps a little ways off while we wait and see how dot-overloading is implemented in Base, play catch up on basic functionality, and figure out even what |
Various enhancements to RDA import