-
-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
getCRUCLdata: Download and Use CRU CL2.0 Climatology Data in R #96
Comments
Editor checks:
Editor commentsCurrently seeking reviewers. It's a good fit and not overlapping.
Reviewers: @ldecicco-USGS @ivanhanigan |
@sckott, I'm working on getting the test time down. I've managed to reduce it by 1/2 already and keep the code coverage the same. I'll see what I can do further. |
thanks @adamhsparks - thanks for trying! |
reviewers assigned, see #96 (comment) |
Package ReviewPlease check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
DocumentationThe package includes all the following forms of documentation:
Functionality
Final approval (post-review)
Estimated hours spent reviewing: Review CommentsInstallationI'm not sure if this is actually an issue with the package, but when I tried to install via CRAN, there was an error via the Windows binary:
I could install via source just fine. This could be an issue with my CRAN mirror or something else...but I haven't experienced it before. VignetteThe first thing I tried to do was run the first example in the vignette, and was unsuccessful:
I then tried setting a single parameter to TRUE, and still could not get data to return:
ExamplesI had the same problem running the examples from the help files as running the examples from the vignette. TestsSimilarly, the tests via TroubleshootingWhen I went to the URLs to test the service, they worked fine. I use
|
Thanks for your review @ldecicco-USGS ! |
Thanks for the review @ldecicco-USGS. It's interesting. I'm not able to replicate the issue with installation because unfortunately, I don't have a Windows machine to test on, I rely on Winbuilder and Appveyor to alert me to any issues. I checked the zip package from CRAN, if I'm reading the error correctly it says that there was no DESCRIPTION file. However, the zip file I downloaded had one in it. Moving on to the other items, I'm also unable to replicate the issues. I just ran the code that was pasted here that timed out. It worked and the tests pass on my computer, Travis and Appveyor when downloading, so I assume everything is fine. I wonder if there's a network issue that is causing problems? I'll look at using |
@ldecicco-USGS, I've modified the download function to use |
Package Review for getCRUCLdataPlease check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
DocumentationThe package includes all the following forms of documentation:
Functionality
Final approval (post-review)
Estimated hours spent reviewing:
Review CommentsThis is a useful R package that is well written and provides access to an important data collection. Major commentstest 1
|
@ivanhanigan Thank you for a thorough and helpful review. I've started addressing a few of the more minor issues that you've pointed out in the "devel" branch and will continue to work through these suggestions. |
thanks for your review @ivanhanigan ! |
@ivanhanigan and @ldecicco-USGS, HighlightsMajor Changes
Minor Changes
Detailed discussionRegarding the parallelisation. I've intentionally not gone this route due to cross-platform issues and different processors making things difficult with little reward based on experience with another R package. However, I have used Regarding the downloading issues, I've included two new functions that work with local files only and do not rely on R to download the files. Since CRAN guidelines specify that an R package should not write to disk (other than R's Regarding the saving data when downloading, the functions now only download files that do not exist in the current R session. If you download temperature and create a stack, when creating a data frame of temperature will not trigger a new download unless you restart your R session. Regarding the Regarding the 10 minute (second) issues. I've corrected the documentation and added the values both in arcminutes (the original from New et al.) and as a decimal value of degrees. I've updated the paper.md file to be more detailed and addressed other comments regarding functionality and documentation as suggested. Thank you both, @ldecicco-USGS and @ivanhanigan, again for the very constructive and helpful comments. They have helped me make this a much more robust R package that hopefully benefits R users. |
Thanks for the changes @adamhsparks @ivanhanigan @ldecicco-USGS are you two happy with the changes? |
Thanks @adamhsparks for the changes. I am not able to look at this for a
few days, and I cannot check on windows like @ldecicco-USGS reviewed with.
I'll get back to you as soon as I can, probably early next week.
Thanks again Adam.
…On Wed, Mar 1, 2017 at 7:59 AM, Scott Chamberlain ***@***.***> wrote:
Thanks for the changes @adamhsparks <https://github.com/adamhsparks>
@ivanhanigan <https://github.com/ivanhanigan> @ldecicco-USGS
<https://github.com/ldecicco-USGS> are you two happy with the changes?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#96 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABPsp5E4bK1JRyiXwm27Q7s8ltQHhtPjks5rhIqngaJpZM4Lx5w2>
.
|
I'll have a look sometime this week! Hopefully this afternoon if meetings pan out as scheduled. |
Hello @ldecicco-USGS and @ivanhanigan, please don't rush to check these updates. I'm planning more. I just found out about the However, I think that this will provide a much nicer end-user experience. |
thanks for the update @adamhsparks see also https://github.com/ropensci/hoardr which is not quite to cran yet - wraps rappdirs to be a little more user friendly, see eg usage here https://github.com/ropenscilabs/cmipr/blob/master/R/onLoad.R and then you just export and document the caching object https://github.com/ropenscilabs/cmipr/blob/master/R/caching.R created in onload no pressure to try |
@ldecicco-USGS do you have an estimate for how long your review took (for our records) |
I'll be happy to look again when @adamhsparks does his next round of updates (see 2 comments up). I think the first round only took ~1 hour. |
okay, thanks @ldecicco-USGS |
Ok folks. I've had another go at this. I've added the ability to cache files in the users' home filespace as I mentioned and cleaned up a few other minor items while I was at it. Please have another look and let me know of any issues or suggestions. |
Adam.
The new README looks great. I'll try to get to this in the next few days.
…On 15 Mar 2017 5:33 pm, "Adam H. Sparks" ***@***.***> wrote:
Ok folks. I've had another go at this.
I've added the ability to cache files in the users' home filespace as I
mentioned and cleaned up a few other minor items while I was at it.
Please have another look and let me know of any issues or suggestions.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#96 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABPsp4Ku2scAxtzjPPC1alUBzIzcQUUBks5rl4Y6gaJpZM4Lx5w2>
.
|
Same here, hopefully today or tomorrow. |
Hi Adam thanks for the new release of the But thought I'd just check in on something I think is worth discussing re cached files of a one-off download of the data vs repeated downloading on-the-fly. When I looked at the code in the readme, it was obvious `~/Downloads' does not work.
So I checked the function and found downloading is actually to this directory:
It seems to me that this location will not be very clear to users of the program if asked `where is the CRU data you downloaded as a one-off'? It seems like the package assumes people will download the data whenever the function is invoked - 'on-the-fly'. My concern I wanted to raise before proceeding with my review is that data downloading as a 'one-off' might considered more important by the user, and being prompted to choose a place to store the result for further management, versus a 'on-the-fly' data download to a temporary/system location crated for the current session. So I just thought I'd flag this issue now and suggest the following: What about making PS I noticed a duplicated chunk of text in this
All the best, and I will continue the review this week - in between other work responsibilities. Thanks Adam! I |
Hi @ivanhanigan, It seems to work properly for me, could you give me the line numbers in the code that you think are incorrect? Is there a need to rename the functions so that the differences are more clear? PS |
Hi,
Or do you recommend that users store the raw data in the |
@ivanhanigan, it's not intended for the user to move data at all. It's intended if a user has network issues, to use an external FTP program was previously suggested. The the help for
The caching works only with the |
I've tried to revise the README text for clarity. |
I pulled the latest changes from the master branch here and built from that. When I tried the first example in the vignette:
I thought maybe that was because the vignette maybe is dated? So I tried the first example for that function:
So...I need to create some actual path in the dsn argument. I did this:
So, I think I agree with @ivanhanigan that the documentation on what I'm suppose to put in the |
@ldecicco-USGS, This is the first example in the vignette: library(getCRUCLdata)
CRU_data <- get_CRU_df(pre = TRUE,
pre_cv = TRUE,
rd0 = TRUE,
tmp = TRUE,
dtr = TRUE,
reh = TRUE,
tmn = TRUE,
tmx = TRUE,
sunp = TRUE,
frs = TRUE,
wnd = TRUE,
elv = TRUE,
cache = TRUE) The equivalent for library(getCRUCLdata)
CRU_data <- create_CRU_df(pre = TRUE,
pre_cv = TRUE,
rd0 = TRUE,
tmp = TRUE,
dtr = TRUE,
reh = TRUE,
tmn = TRUE,
tmx = TRUE,
sunp = TRUE,
frs = TRUE,
wnd = TRUE,
elv = TRUE,
dsn = "~/Downloads") Where you downloaded the CRU files external to R and put them in "~/Downloads". The function worked properly as designed telling you that you didn't tell it where to find the files to process. |
@ldecicco-USGS, Thanks! |
Hi again,
I have re-read the README section "Creating tidy data frames for use in R"
and it does make a lot of sense thanks.
I appreciate the way the package is trying to accommodate a variety of
different data downloading/storage/networking scenarios.
I think that you've done a great job of making this flexible for that, but
maybe the additional detail that needs to be discussed for
caching/FTP-FileZilla is confusing and could be put in a later section such
as "advanced usage/special circumstances" section and the more visible
sections of README and vignettes just show the simple application of
running the `get_CRU_df()` function.
Then you could move into a discussion about how useless repeatedly
downloading copies of the same data, OR internet connectivity problems
might lead you to want to use FileZilla or caching?
I'll move on to the next components of the documentation tomorrow. I think
that if you don't want to make more changes then the current approach to
caching/downloading is adequate. Up to you mate!
Cheers, Ivan
…On Tue, Mar 28, 2017 at 1:13 PM, Adam H. Sparks ***@***.***> wrote:
@ldecicco-USGS <https://github.com/ldecicco-USGS>,
I've added more description to the documentation, but I still don't
understand the first example that you posted. I don't know where that comes
from?
Thanks!
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#96 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABPsp6YLPeeSJpyws22vmQHbY5KgBvesks5rqGzegaJpZM4Lx5w2>
.
|
OK, I think what was happening with my first example is that I was using the older version of the vignette. I re-built the vignettes and got things working. I think where there could be confusion (there was from me at least...), that might easily be straightened out for the user, is have the very first example in the vignette be how to actually download the data. The vignette starts with "Using getCRUCLdata", which is definitely the cool part of the package, but maybe having a section above that "Getting CRUCL Data" with the explicit instruction downloading for the first time would be helpful. Then, all the examples with Otherwise, looks good! A minor suggestion that might help people like me (who..maybe...have been known to only skim through the instructions 😓 )....I kept trying to run |
@ldecicco-USGS that's a good suggestion, I'm guilty of skimming documentation too. I'll implement that change. I'll reorganise the documentation as you and @ivanhanigan have suggested. You've both been very helpful, thank you. |
Hi folks, I've edited the README and vignette to include a quick start and a more advanced section on caching and using No changes to functionality, only documentation. I've also linted the package and fixed some errors in the documentation with typos, "a a", etc. |
is all taken care of @adamhsparks ? If so, i'll take a final look |
@sckott, I hope that I've addressed @ldecicco-USGS' and @ivanhanigan's comments re: documentation. I'm still trying to change the error message when the DSN doesn't exist or files aren't found. The error stops and works, but the issue I'm having is with getting it to pass the test. :/ |
@sckott, I think we're good. I've tidied it up a bit more here and there. Have a look now when you have time. |
@adamhsparks Looks good to me. approved :) i assume you know the drill from here? |
I think I do. I've added the footer and have transferred it over to rOpenSci. Thanks for your time, suggestion and effort @ivanhanigan, @ldecicco-USGS and @sckott. Once again, the rOpenSci review process really made the package a much better offering. |
thx for your submission! |
Summary
What does this package do? (explain in 50 words or less):
The getCRUCLdata package provides two functions that automate downloading and importing CRU CL2.0 climatology data, facilitates the calculation of minimum temperature and maximum temperature, and formats the data into a tidy data frame or a list of raster stack objects for use in an R session or easily exports to a raster format file for use in a geographic information system (GIS).
Paste the full DESCRIPTION file inside a code block below:
URL for the package (the development repository, not a stylized html page):
https://github.com/adamhsparks/getCRUCLdata
Who is the target audience?
Anyone interested in using climate data or generating raster files from the CRU CL2.0 data that are easy to use in a GIS
Are there other R packages that accomplish the same thing? If so, what is different about yours?
Not that I'm aware of
Requirements
Confirm each of the following by checking the box. This package:
Publication options
paper.md
with a high-level description in the package root or ininst/
.Detail
Does
R CMD check
(ordevtools::check()
) succeed? Paste and describe any errors or warnings:Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:
If this is a resubmission following rejection, please explain the change in circumstances:
If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:
Ivan Hanigan, https://github.com/ivanhanigan
The text was updated successfully, but these errors were encountered: