cde package submission #284

robbriers · 2019-02-25T13:51:51Z

Submitting Author: Name (@robbriers)
Repository: https://github.com/robbriers/cde
Version submitted: 0.3.0
Editor: @sckott
Reviewer 1: @boshek
Reviewer 2: TBD
Archive: TBD
Version accepted: TBD

Paste the full DESCRIPTION file inside a code block below:

Package: cde
Type: Package
Title: Download Water Framework Directive (WFD) data from the Environment Agency Catchment Data Explorer (CDE) website
Version: 0.3.0
Authors@R: person("Rob", "Briers", email = "r.briers@napier.ac.uk",
  role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0341-1203"))
Description: The cde package facilitates searching and download of the
  WFD-related data for all waterbodies within the Environment Agency area.
  The types of data that can be downloaded are: WFD status classification data
  (which can also be plotted), Reasons for Not Achieving Good (RNAG) status,
  objectives set for waterbodies, measures put in place to improve water 
  quality and details of associated protected areas. The site accessed is 
  https://environment.data.gov.uk/catchment-planning/. The data are made 
  available under the Open Government Licence v3.0 
  (https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/).
Depends:
    R (>= 3.0.0)
License: GPL-3
Encoding: UTF-8
LazyData: true
Imports:
   graphics,
   curl,
   data.table
Suggests:
    testthat,
    knitr,
    rmarkdown
URL: https://github.com/robbriers/cde
BugReports: https://github.com/robbriers/cde/issues
RoxygenNote: 6.1.1
VignetteBuilder: knitr

Scope

Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
- data retrieval
- data extraction
- database access
- data munging
- data deposition
- reproducibility
- geospatial data
- text analysis
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
The package facilitates the download, collation and analysis of Water Framework Directive reporting data from the Environment Agency (England) Catchment Data Explorer website.
Who is the target audience and what are scientific applications of this package?
Any user who wishes to query, download and analyse water quality-related data available from the Catchment Data Explorer website.
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
No
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Technical checks

Confirm each of the following by checking the box. This package:

does not violate the Terms of Service of any service it interacts with.
has a CRAN and OSI accepted license.
contains a README with instructions for installing the development version.
includes documentation with examples for all functions.
contains a vignette with examples of its essential functions and uses.
has a test suite.
has continuous integration, including reporting of test coverage using services such as Travis CI, Coveralls and/or CodeCov.

Publication options

Do you intend for this package to go on CRAN?
Do you wish to automatically submit to the Journal of Open Source Software? If so:

JOSS Options

The package has an obvious research application according to JOSS's definition.
- The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
- The package is deposited in a long-term repository with the DOI: https://doi.org/10.5281/zenodo.2577179
- (Do not submit your package separately to JOSS)

Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options

The package is novel and will be of interest to the broad readership of the journal.
The manuscript describing the package is no longer than 3000 words.
You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
(Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
(Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
(Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

I agree to abide by ROpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

The text was updated successfully, but these errors were encountered:

robbriers · 2019-02-26T15:17:55Z

An overnight change to the CDE data has meant that two of the tests fail now - I have just committed a fixed version to the master branch; taken out the fragile tests and also have taken the opportunity to fix some issues highlighted by gp.

sckott · 2019-02-26T17:46:10Z

Thanks for your submission @robbriers - Im your handling editor.

Thanks for the update on tests and such

sckott · 2019-02-27T21:52:28Z

Editor checks:

Fit: The package meets criteria for fit and overlap
Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
License: The package has a CRAN or OSI accepted license
Repository: The repository link resolves correctly
Archive (JOSS only, may be post-review): The repository DOI resolves correctly
Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Thanks for your submission @robbriers !

Here's the output from goodpractice. You've already run good practice, and the below only has a note about test coverage.

── GP cde ───────

It is good practice to

  ✖ write unit tests for all functions, and all package code in general. 89% of code lines are covered by test
    cases.

    R/get_objectives.r:112:NA
    R/get_objectives.r:113:NA
    R/get_objectives.r:128:NA
    R/get_pa.r:56:NA
    R/get_rnag.r:106:NA
    ... and 28 more lines

Seeking reviewers now 🕐

Could you please add the rOpenSci under review badge to your README?

[![](https://badges.ropensci.org/284_status.svg)](https://github.com/ropensci/onboarding/issues/284)

Reviewers:

@boshek due date 2019-04-08
@limnoliver due date 2019-04-17

sckott · 2019-03-22T16:54:25Z

Still looking for the 2nd reviewer ...

sckott · 2019-03-27T15:59:16Z

both reviewers assigned

robbriers · 2019-03-27T16:08:28Z

Great, thanks for the update

boshek · 2019-03-30T02:59:27Z

@sckott I'm anticipating some difficulty meeting the April 1st due date and would request moving this ahead one week to April 08. Possible?

sckott · 2019-03-30T03:11:47Z

@boshek Okay, date changed

boshek · 2019-04-04T21:57:20Z

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

A statement of need clearly stating problems the software is designed to solve and its target audience in README
Installation instructions: for the development version of package and any non-standard dependencies in README
Vignette(s) demonstrating major functionality that runs successfully locally
- I think a little more info on the README about usage would be helpful. Right now you are assuming someone clicks through to the pkgdown page
Function Documentation: for all exported functions in R help
Examples for all exported functions in R Help that run successfully locally
- The get_measures function is not working for me
Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
- CONTRIBUTING and CODE_OF_CONDUCT files needs in the repo

For packages co-submitting to JOSS

The package has an obvious research application according to JOSS's definition

The package contains a paper.md matching JOSS's requirements with:

A short summary describing the high-level functionality of the software

Authors: A list of authors with their affiliations

A statement of need clearly stating problems the software is designed to solve and its target audience.

References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Installation: Installation succeeds as documented.
Functionality: Any functional claims of the software been confirmed.
- Almost. I think, however, the get_measures function simply returns an empty dataframe.
Performance: Any performance claims of the software been confirmed.
Automated tests: Unit tests cover essential functions of the package
and a reasonable range of inputs and conditions. All tests pass on the local machine.
Packaging guidelines: The package conforms to the rOpenSci packaging guidelines
- Examples aren't run.

Final approval (post-review)

The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:
4

Review Comments

I think this is a great package that leverage data access provided by the UK government. It is certainly within rOpenSci' scope for package. The documentation is very good and easy to follow which effectively lower the barrier to entry for a package user. I have some specific comments which are outlined below.

Documentation

On the blog down page, you are linked to the "Getting Started" section even though it is "Get Started". This is a potential source of confusion.

Vignette

I think you could revise this line

and use of this package implies acceptance of these licence conditions

to:

and use of the data accessed by this package implies acceptance of these licence conditions
Consider using remotes for install_github()
Consider using full argument names in the getting started materials.

Code

Consider prefixing every function with cde_*
Consider returning columns as all lower case with underscores instead of decimals
This file: https://github.com/robbriers/cde/blob/master/R/sysdata.rda. Internal data in the package should be included in the data folder.
No LICENSE file. The open data license would license the data used but not the code in the package
Here is a good description of how to do this: http://r-pkgs.had.co.nz/data.html. In addition, that data should be license within the repo (if it ships with package).
Here is an example of that: https://github.com/ropensci/tidyhydat/tree/master/data-raw/HYDAT_internal_data
This chunk of code: https://github.com/robbriers/cde/blob/d0e109a2794a2653b15410926a52d2a1f15469bf/R/utils.r#L37-L51
And others like it could probably benefit from a) some helper functions and b) use of switch.
This worries me: https://github.com/robbriers/cde/blob/d0e109a2794a2653b15410926a52d2a1f15469bf/R/utils.r#L94. Could you instead catch the specific error using tryCatch()?
A helper function called something like cde_data.table or something like that would be good as you are replicating lots of code here. (https://github.com/robbriers/cde/blob/d0e109a2794a2653b15410926a52d2a1f15469bf/R/utils.r#L94-L114)
I think you should consider using data.table's progress bar. When downloads are happening, it is ambiguous to the user if the process is working.
Consider developing print methods (and a class) for cde objects that only print out 10 rows (or just import tibble and use those print methods)
- I think this could also benefit from plot and maybe even summary methods.
Consider letting your examples run so that they are tested with CI

robbriers · 2019-04-05T11:01:17Z

Thanks for the detailed comments - I'll review and start working through.

boshek · 2019-04-05T18:11:08Z

@robbriers Please do reach out if anything is unclear. Happy to back and forth on anything as needed.

robbriers · 2019-04-15T14:14:39Z

@boshek Just a query re license and data. The code is licensed as GPL 3 in the DESCRIPTION file, so I was under the impression that this did not need an explicit LICENSE file to be included?
Re the data, I was intending to keep the data internal to the package, so as per the Hadley Packages site: 'If you want to store parsed data, but not make it available to the user, put it in R/sysdata.rda'. However now I am wondering whether as the internal data were derived from the Open Data source, the licence for this needs to be explicitly stated somewhere?

boshek · 2019-04-15T15:04:39Z

The code is licensed as GPL 3 in the DESCRIPTION file, so I was under the impression that this did not need an explicit LICENSE file to be included

I don't think it is absolutely required but I it is a) a generally accepted practice to include a LICENSE file in the repo and b) provides a path for a non R user (not to mention GitHub) to look at the license. That is, someone wanting to find out how this software is licensed only needs to come to repo and doesn't need to know that one needs to look into the DESCRIPTION file.

Internal data were derived from the Open Data source, the licence for this needs to be explicitly stated somewhere?

It definitely does. My understanding is that regardless of how the data in packaged if you include it you need to identify how it is licensed. I would also encourage you to expose this data. As a prospective user I found it confusing that I couldn't find out more on this data. That's why I suggested not only exposing the data but also providing a data-raw folder. An example I've done is as follows. Here is how I create and provide licensing info for an internal data file

https://github.com/ropensci/tidyhydat/tree/master/data-raw/HYDAT_internal_data

Then here I document the data for context for folks:

https://github.com/ropensci/tidyhydat/blob/e63704360b4cdac9283752659738553e919ff170/R/data.R#L14-L35

Does that help?

robbriers · 2019-04-15T19:33:57Z

Thanks, I have added an explicit licence file.

With the data I would prefer to keep the internal df in the sysdata.rda file, but have added a copy of the data with appropriate licence details to a data-raw folder.

sckott · 2019-04-15T20:37:10Z

👋 @limnoliver reminder that your review is due in a few days

boshek · 2019-04-16T16:37:14Z

@robbriers sounds good. However if that dataframe isn't exposed I think it is worth outlining the workflow on selecting sites. Searching is one option, which you've outlined, but I do see utility in having the site list available for users.

robbriers · 2019-04-17T13:02:25Z

@boshek if I document and expose the data in /data I seem to run into an issue with no visible binding for global variable 'ea_wbids'. I am sure that I have come across this issue before but can't recall/find how to fix it - any advice?

boshek · 2019-04-17T16:02:23Z

What if instead of ea_wbids.r the file describing the data was data.r? Total shot in the dark.

robbriers · 2019-04-18T08:53:09Z

no luck. will keep working on it but progressing with data saved internally for now.

limnoliver · 2019-04-18T12:04:38Z

@sckott apologies for the delay on my review, it will be done today!

boshek · 2019-04-18T16:22:19Z

@robbriers This is a workaround to global variables issue:

.onLoad <- function(libname = find.package("cde"), pkgname = "cde") {
  # CRAN Note avoidance
  if (getRversion() >= "2.15.1")
    utils::globalVariables(
      c("ea_wbids")
    )
  invisible()
}

Just add that bit of code to your zzz.R file.

robbriers · 2019-04-18T17:00:43Z

@boshek I just found this as well! Feels a bit hacky but if that is what it takes. Thanks!

limnoliver · 2019-04-19T19:55:28Z

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

A statement of need clearly stating problems the software is designed to solve and its target audience in README
Installation instructions: for the development version of package and any non-standard dependencies in README
Vignette(s) demonstrating major functionality that runs successfully locally
Function Documentation: for all exported functions in R help
Examples for all exported functions in R Help that run successfully locally
I think the get_measures function is working (did for me below), you've just chosen two examples that don't have any data. Maybe choose one that does? (see my example below)
Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
see @boshek review

For packages co-submitting to JOSS

The package has an obvious research application according to JOSS's definition

The package contains a paper.md matching JOSS's requirements with:

A short summary describing the high-level functionality of the software

Authors: A list of authors with their affiliations

A statement of need clearly stating problems the software is designed to solve and its target audience.

References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Installation: Installation succeeds as documented.
Functionality: Any functional claims of the software been confirmed.
Performance: Any performance claims of the software been confirmed.
Automated tests: Unit tests cover essential functions of the package
and a reasonable range of inputs and conditions. All tests pass on the local machine.
Packaging guidelines: The package conforms to the rOpenSci packaging guidelines
see @boshek review

Final approval (post-review)

The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 6 (though this was my first time reviewing for ropensci, so a lot of this was getting up to speed on the review process. Thanks for all the great documentation!)

Review Comments

Congrats on providing a useful package that allows for easier and reproducible ways to access and use important water data. Most of my comments are for making it easier for someone who is new to the data get up-to-speed more quickly. I don't think this will take too much effort, and could be as simple as linking to some documentation that likely exists on the WFD website. A few comments:

README

Is the documentation (installation instructions/vignettes/examples/demos) clear and sufficient? Does it use the principle of multiple points of entry i.e. takes into account the fact that any piece of documentation may be the first encounter the user has with the package and/or the tool/data it wraps?

I think the documentation could be improved in regards to describing the data source the package is accessing. In the README, it would be useful to have an extra sentence or two that describes the scope of the data - e.g., data collected by X for the purposes of Y within the U.K. etc. I had never heard of the WFD or the Environment Agency, so I had to take one more step to understand where the data are coming from.

Functionality

To test the functions, I followed the vignettes and created my own examples, and I show those below with comments on individual functions.

lakes <- cde::search_names('Lake', 'name')

It might be useful to note in the search_names documentation that the search is case-sensitive. Also, in the Value section of the help file, it would be helpful to the user to see definitions of the columns that are returned.

bolam_dat <- get_status(lakes$WBID[lakes$name %in% 'Bolam Lake'], 'WBID')
names(bolam_dat)

##  [1] "River.Basin.District"           "Management.Catchment"          
##  [3] "Operational.Catchment"          "Waterbody.ID"                  
##  [5] "water.body"                     "water.body.type"               
##  [7] "easting"                        "northing"                      
##  [9] "ngr"                            "Hydromorphological.designation"
## [11] "Year"                           "Status"                        
## [13] "Cycle"                          "Classification.Level"          
## [15] "Classification.Item"            "classification.ID"             
## [17] "Certainty"                      "Confidence"                    
## [19] "Type"                           "Deterioration.type"            
## [21] "Linked.Reasons"                 "Investigation.outcome"

I agree with the other reviewer that it would be nice to have column names standardized and cleaned up to exclude periods.
Here as well, it would be useful to have more definitions in the Values section of the help file for get_status. If you don't feel like it's within the scope of the package to define the columns that are returned, a link to metadata would be useful.

bolam_oc <- get_status(lakes$OC[lakes$name %in% 'Bolam Lake'], 'OC', level = 'Chemical')

I had a hard time understanding "level", and in the get_status help file it says to go to the vignette for more info. I would include a direct link to the vignette section of relevance, or spell out the options in the help file.

plot_status(lakes$OC[lakes$name %in% 'Bolam Lake'], 'OC')

It seems like a common workflow would be to get the status data, and then to plot. To do that easily using the package, the user would use get_status and plot_status. I was expecting plot_status to use the output of get_status, but it instead makes a call to get_status within the function and and downloads the data again, so if you want to both see the data in a data frame and plot it, you have to download it twice.

bolam_oc_rnag <- get_rnag(lakes$OC[lakes$name %in% 'Bolam Lake'], 'OC')

Again, it would be helpful to have more in-depth description of the returned table, or at least a link to a description for how these classifications are made or descriptions of the columns. (same can be said for get_objectives, get_measures, and get_pa)

bolam_oc_meas <- get_measures(lakes$OC[lakes$name %in% 'Bolam Lake'], 'OC')
## No measures data specified - empty dataframe returned

Good warning in the vignette that this often returns no results!

robbriers · 2019-04-22T19:13:42Z

@limnoliver thanks for the review - I will go through the comments and address!

sckott · 2019-04-23T17:00:47Z

thanks for your review @limnoliver !

robbriers · 2019-05-13T08:46:49Z

Ok, took a bit longer than planned, but here we go...
First @boshek review:
get_measures function: There is an issue with the measures data on the CDE website since the last quarterly update - this is being actively worked on by the EA and should be fixed soon. This has been documented in the vignette. but at the moment all calls to this will return an empty dataframe.
Review Comments
I think this is a great package that leverage data access provided by the UK government. It is certainly within rOpenSci' scope for package. The documentation is very good and easy to follow which effectively lower the barrier to entry for a package user. I have some specific comments which are outlined below.

Documentation
On the blog down page, you are linked to the "Getting Started" section even though it is "Get Started". This is a potential source of confusion.

This has been updated on the pkgdown pages to point to "Get Started".

Vignette
I think you could revise this line

and use of this package implies acceptance of these licence conditions

to:

and use of the data accessed by this package implies acceptance of these licence conditions

Revised and expanded to cover data contained within package as well.

Consider using remotes for install_github()

Revised to use remotes

Consider using full argument names in the getting started materials.

All examples now use full argument names.

Code
Consider prefixing every function with cde_*

Kept as they are as I think there is limited scope for confusion with other packages.

Consider returning columns as all lower case with underscores instead of decimals

All column names now lower case with underscores.

This file: https://github.com/robbriers/cde/blob/master/R/sysdata.rda. Internal data in the package should be included in the data folder.

Thanks for your help with this: internal data now included in the data folder and documented in terms of structure and content. Raw csv also available on Github repo in data-raw folder.

No LICENSE file. The open data license would license the data used but not the code in the package
Here is a good description of how to do this: http://r-pkgs.had.co.nz/data.html. In addition, that data should be license within the repo (if it ships with package).
Here is an example of that: https://github.com/ropensci/tidyhydat/tree/master/data-raw/HYDAT_internal_data

LICENSE file added with appropriate details; documentation also updated to highlight licencing of data in several places.

This chunk of code: https://github.com/robbriers/cde/blob/d0e109a2794a2653b15410926a52d2a1f15469bf/R/utils.r#L37-L51
And others like it could probably benefit from a) some helper functions and b) use of switch.

switch is my new best friend! The code highlighted (and several other places) have been broken up to use helper functions and reduce repetition, see https://github.com/robbriers/cde/blob/9fb0be3ac62ce7aa5f9836c7c7c797632bd1e2eb/R/utils.r#L42-L51.

This worries me: https://github.com/robbriers/cde/blob/d0e109a2794a2653b15410926a52d2a1f15469bf/R/utils.r#L94. Could you instead catch the specific error using tryCatch()?

tryCatch() now implemented to deal with the warning generated, see https://github.com/robbriers/cde/blob/9fb0be3ac62ce7aa5f9836c7c7c797632bd1e2eb/R/utils.r#L47-L50.

A helper function called something like cde_data.table or something like that would be good as you are replicating lots of code here. (https://github.com/robbriers/cde/blob/d0e109a2794a2653b15410926a52d2a1f15469bf/R/utils.r#L94-L114)

Code has been broken up into a number of smaller helper functions to reduce replication, see https://github.com/robbriers/cde/blob/9fb0be3ac62ce7aa5f9836c7c7c797632bd1e2eb/R/utils.r#L59-L139

I think you should consider using data.table's progress bar. When downloads are happening, it is ambiguous to the user if the process is working.

Profiling the code has highlighted that it is the download and unzip parts of the process where most of the latency is. curl has been replaced by download.file as the messages given are more informative, and a message also added to highlight continued activity when unzipping is occurring. I have also set showProgress=TRUE for the data.table part, but this was rarely needed on any of the machines that I tested it on.

Consider developing print methods (and a class) for cde objects that only print out 10 rows (or just import tibble and use those print methods)
I think this could also benefit from plot and maybe even summary methods.

Custom print and plot methods associated with a new cde_df class have been implemented for all output from get_... functions. search_names has been retained as a plain df as truncation of values to fit the console width may result in confusion if using the data in the get_... functions.

Consider letting your examples run so that they are tested with CI

All examples now run.

robbriers · 2019-05-13T08:54:25Z

and now @limnoliver ...
Review Comments
Congrats on providing a useful package that allows for easier and reproducible ways to access and use important water data. Most of my comments are for making it easier for someone who is new to the data get up-to-speed more quickly. I don't think this will take too much effort, and could be as simple as linking to some documentation that likely exists on the WFD website. A few comments:

README
Is the documentation (installation instructions/vignettes/examples/demos) clear and sufficient? Does it use the principle of multiple points of entry i.e. takes into account the fact that any piece of documentation may be the first encounter the user has with the package and/or the tool/data it wraps?

I think the documentation could be improved in regards to describing the data source the package is accessing. In the README, it would be useful to have an extra sentence or two that describes the scope of the data - e.g., data collected by X for the purposes of Y within the U.K. etc. I had never heard of the WFD or the Environment Agency, so I had to take one more step to understand where the data are coming from.

A couple of initial paragraphs have been added to the vignette (and README) giving more detail on the context of the WFD and the Environment Agency, plus links to relevant parts of the EA website for more detail.

Functionality
To test the functions, I followed the vignettes and created my own examples, and I show those below with comments on individual functions.

lakes <- cde::search_names('Lake', 'name')

It might be useful to note in the search_names documentation that the search is case-sensitive. Also, in the Value section of the help file, it would be helpful to the user to see definitions of the columns that are returned.

search_names docs now state that this is case-sensitive. The Value section of all functions now links to a reference table, which details the meaning of all of the columns returned, plus links to relevant pages on the CDE website for some sections where more detail is available.

I agree with the other reviewer that it would be nice to have column names standardized and cleaned up to exclude periods.

All column names now standardised as lower case and with underscores.

Here as well, it would be useful to have more definitions in the Values section of the help file for get_status. If you don't feel like it's within the scope of the package to define the columns that are returned, a link to metadata would be useful.

bolam_oc <- get_status(lakes$OC[lakes$name %in% 'Bolam Lake'], 'OC', level = 'Chemical')

See response above - links to full details now provided.

I had a hard time understanding "level", and in the get_status help file it says to go to the vignette for more info. I would include a direct link to the vignette section of relevance, or spell out the options in the help file.

I would really prefer to use the term 'element' but the EA column is named classification_level, so kept this to reduce potential confusion. There is now a list of the different values of 'level' in the help file, plus a link to the page on the CDE website that explains the different levels of elements used within the classification.

plot_status(lakes$OC[lakes$name %in% 'Bolam Lake'], 'OC')

It seems like a common workflow would be to get the status data, and then to plot. To do that easily using the package, the user would use get_status and plot_status. I was expecting plot_status to use the output of get_status, but it instead makes a call to get_status within the function and and downloads the data again, so if you want to both see the data in a data frame and plot it, you have to download it twice.
bolam_oc_rnag <- get_rnag(lakes$OC[lakes$name %in% 'Bolam Lake'], 'OC')

plot methods have now been implemented for all output and detailed in the documentation. The plot_status function has therefore been removed as the output is now covered within the generic 'plot' method.

Again, it would be helpful to have more in-depth description of the returned table, or at least a link to a description for how these classifications are made or descriptions of the columns. (same can be said for get_objectives, get_measures, and get_pa)

See response above - a link to the full reference table is given in each help file.

bolam_oc_meas <- get_measures(lakes$OC[lakes$name %in% 'Bolam Lake'], 'OC')

No measures data specified - empty dataframe returned

Good warning in the vignette that this often returns no results!

There is only limited measures data on the website, however, there is an issue with the measures data on the CDE website since the last quarterly update - this is being actively worked on by the EA and should be fixed soon. This has been documented in the vignette.

sckott · 2019-05-13T17:05:49Z

thanks for your response @robbriers to the reviewers.

@boshek and @limnoliver - are you happy with the maintainer's responses? any further comments?

boshek · 2019-05-13T17:57:28Z

@sckott I say ship it! @robbriers really nice work on the package and the revisions. Really great effort.

limnoliver · 2019-05-14T00:29:46Z

@limnoliver looks great!

sckott · 2019-05-14T20:02:58Z

thanks reviewers! Having a final look ...

sckott · 2019-05-14T20:09:35Z

Approved! Thanks again for your submission @robbriers ! And thanks for your reviews @boshek and @limnoliver 👌

To-dos:

Please transfer the package to ropensci- I've invited you to a team (you need to accept that invitation first, before transferring). You'll be made admin once you transfer
add rOpenSci footer to README
[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)
Change any needed links, such those for CI badges
Travis CI should update to the new location automatically - you may have to update other CI systems manually
it's a good idea to add issue and PR template files to your .github folder, egs here https://github.com/ropensci/dotgithubfiles/

We've started putting together a bookdown with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved. The repo is at https://github.com/ropensci/dev_guide

Are you interested in doing a blog post for our blog https://ropensci.org/blog/ ? either a short-form intro to it (https://ropensci.org/technotes/) or long-form post with more narrative about its development (https://ropensci.org/blog/). If so, we'll have our community manager @stefaniebutland get in touch with you on that

robbriers · 2019-05-15T11:11:49Z

Great thanks! I have transferred it over and updated CI etc. Thanks also to @boshek and @limnoliver for such helpful and constructive reviews - it has been a great process and I have learned a lot en route!

sckott · 2019-05-15T17:31:50Z

great! glad it's been useful. you have admin access now

sckott · 2019-05-15T17:32:07Z

and let Stef now if you want to do a blog post or not

stefaniebutland · 2019-05-15T17:37:38Z

@robbriers Here are the blog post guidelines https://github.com/ropensci/roweb2#contributing-a-blog-post

Happy to answer any questions.

sckott added package 1/editor-checks labels Feb 25, 2019

sckott self-assigned this Feb 25, 2019

sckott added topic:data-access topic:earth-science labels Feb 25, 2019

sckott added 2/seeking-reviewer(s) and removed 1/editor-checks labels Feb 27, 2019

sckott added 3/reviewer(s)-assigned and removed 2/seeking-reviewer(s) labels Mar 27, 2019

sckott added 4/review(s)-in-awaiting-changes and removed 3/reviewer(s)-assigned labels Apr 23, 2019

sckott added 5/awaiting-reviewer(s)-response and removed 4/review(s)-in-awaiting-changes labels May 13, 2019

sckott added 6/approved and removed 5/awaiting-reviewer(s)-response labels May 14, 2019

sckott closed this as completed May 15, 2019

karthik mentioned this issue Jun 26, 2019

JOSS Review: Issues to address ropensci/cde#23

Closed

cde package submission #284

cde package submission #284

Comments

robbriers commented Feb 25, 2019 • edited by sckott Loading

Scope

Technical checks

Publication options

Code of conduct

robbriers commented Feb 26, 2019

sckott commented Feb 26, 2019

sckott commented Feb 27, 2019 • edited Loading

Editor checks:

Editor comments

sckott commented Mar 22, 2019

sckott commented Mar 27, 2019

robbriers commented Mar 27, 2019

boshek commented Mar 30, 2019

sckott commented Mar 30, 2019

boshek commented Apr 4, 2019 • edited Loading

Package Review

Documentation

For packages co-submitting to JOSS

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 4

Review Comments

Documentation

Vignette

Code

robbriers commented Apr 5, 2019

boshek commented Apr 5, 2019

robbriers commented Apr 15, 2019

boshek commented Apr 15, 2019

robbriers commented Apr 15, 2019

sckott commented Apr 15, 2019

boshek commented Apr 16, 2019

robbriers commented Apr 17, 2019

boshek commented Apr 17, 2019

robbriers commented Apr 18, 2019

limnoliver commented Apr 18, 2019

boshek commented Apr 18, 2019

robbriers commented Apr 18, 2019

limnoliver commented Apr 19, 2019

Package Review

Documentation

For packages co-submitting to JOSS

Functionality

Final approval (post-review)

Review Comments

README

Functionality

robbriers commented Apr 22, 2019

sckott commented Apr 23, 2019

robbriers commented May 13, 2019 • edited Loading

robbriers commented May 13, 2019 • edited Loading

No measures data specified - empty dataframe returned

sckott commented May 13, 2019

boshek commented May 13, 2019

limnoliver commented May 14, 2019

sckott commented May 14, 2019

sckott commented May 14, 2019

robbriers commented May 15, 2019

sckott commented May 15, 2019

sckott commented May 15, 2019

stefaniebutland commented May 15, 2019

robbriers commented Feb 25, 2019 •

edited by sckott

Loading

sckott commented Feb 27, 2019 •

edited

Loading

boshek commented Apr 4, 2019 •

edited

Loading

Estimated hours spent reviewing:
4

robbriers commented May 13, 2019 •

edited

Loading

robbriers commented May 13, 2019 •

edited

Loading