Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove C code #47

Merged
merged 37 commits into from
Sep 23, 2022
Merged

Remove C code #47

merged 37 commits into from
Sep 23, 2022

Conversation

dieghernan
Copy link
Member

@dieghernan dieghernan commented Jan 14, 2022

Hi:

Following #46 and related to #45 , this PR removes the C code of the package, substituting it by pure R (base) code. I am aware this is a critical change to the package, so let me share some points of attention on this PR:

  • Version bump to make clear that this is a dev version
  • trimws has been updated to the latest version of the same function on R, see https://github.com/wch/r-source/blob/5ab79ec84040684c74dc9c901fde944fff6e8375/src/library/base/R/strwrap.R#L219-L229
  • Some parameters have been deprecated:
    • src in do_read_bib() is not needed any more
    • read.bib(): header and footer. They were never actually used, see:

      bibtex/R/bibtex.R

      Lines 162 to 170 in 05a401c

      make.citation.list <- function( x, header, footer){
      rval <- list()
      for( i in seq_along(x) ){
      if( !is.null(x[[i]] ) )
      rval <- c( rval, x[[i]] )
      }
      class(rval) <- c( "bibentry" )
      rval
      }
  • A whole set of functions have been added to utils.R. This is actually the replacement of C in R code.
  • An url on README has been updated following the urlcheck package.
  • There is a change on a snapshot. Actually now is correct, before was ommiting a part on the string:
@String{Rnews = "http://CRAN.R-project.org/doc/Rnews/" }
@String{Rnews2001-1 = rnews # "Rnews_2001-1.pdf"}
@String{Rnews2001-2 = rnews # "Rnews_2001-2.pdf"}
@String{Rnews2001-3 = rnews # "Rnews_2001-3.pdf"}

So Rnews2001-3 should be parsed as rnews # "Rnews_2001-3.pdf (http://CRAN.R-project.org/doc/Rnews/Rnews_2001-3.pdf)

Comments welcome

This would also close #42 , fix #16

DESCRIPTION Outdated
@@ -1,5 +1,5 @@
Package: bibtex
Version: 0.4.3
Version: 0.4.3.9000
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump, this can be undone

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go 0.5.0 for the next CRAN release and, then, bump to 1.0.0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

R/bibtex.R Outdated
trimws <- function(x) {
sub("^[[:space:]]+", "", sub("[[:space:]]$", "", x))
# See https://github.com/wch/r-source/blob/trunk/src/library/base/R/strwrap.R
trimws <- function(x, which = c("both", "left", "right"),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated version of the backport

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider a dependency on the {backports} package to automatically get up-to-speed? Though, as it stands now, we only have a dependency on one base package: utils.

That said, we probably should bump the minimum version to at least R 3.6.* as we are not really testing if it works on R 3.0. Something to ponder for later on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I just added the backports, but agree on bumping the minimal R version as well

@@ -196,10 +207,93 @@ findBibFile <- function(package) {
#'
#' @param file file name
#' @param encoding encoding
#' @param srcfile output of \code{\link{srcfile}}
#' @param srcfile Deprecated
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed nor used any more

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably also modify the function signature to use '' and write a quick diagnostic message:

  if (!missing("srcfile"))
    message("`srcfile` argument is deprecated.")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! I also added a couple of tests on this

#' the bib file.
#' @param footer footer of the citation list
#' @param header DEPRECATED.
#' @param footer DEPRECATED
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those were never used

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous note on deprecating arguments.

If a previous argument was never used, perhaps we could also just exclude these arguments altogether in the next version?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I would favor removing those on the next major version (1.0.0) just in case. For v0.5.0 I prefer to leave them, so we can get feedback

@@ -38,4 +38,4 @@ donate it to rOpenSci.
- Get citation information for `bibtex` in R with `citation("bibtex")`
- Please note that this package is released with a [Contributor Code of Conduct](https://ropensci.org/code-of-conduct/). By contributing to this project, you agree to abide by its terms.

[![ropensci](http://ropensci.org/public_images/github_footer.png)](http://ropensci.org)
[![ropensci](https://ropensci.org//public_images/github_footer.png)](https://ropensci.org/)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an update of the url, see urlchecker::url_update()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for running the {urlchecker}!

@@ -281,7 +281,7 @@
pages = {20--23},
month = {September},
url = {http://CRAN.R-project.org/doc/Rnews/},
pdf = {rnewsRnews_2001-3.pdf},
pdf = {http://CRAN.R-project.org/doc/Rnews/Rnews_2001-3.pdf},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now correct, before it just parsed the last part of the String, see https://ctan.javinator9889.com/biblio/bibtex/base/btxdoc.pdf:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching the update!

@@ -1,12 +1,10 @@
test_that("Unbalanced braces", {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we can test fatal errors safely, R does not crash any more

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huzzah!

@dieghernan
Copy link
Member Author

I added some comments on the review tab to make easier to understand some things @coatless

Copy link
Collaborator

@coatless coatless left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impressive work with the regular expressions! Reading through the regex, I'm not sure we'll unintentionally introduce another problem; but, the users will surely let us know.

Outside of that, there are minor cleanups likely needed for long-term maintainability.

R/bibtex.R Outdated
trimws <- function(x) {
sub("^[[:space:]]+", "", sub("[[:space:]]$", "", x))
# See https://github.com/wch/r-source/blob/trunk/src/library/base/R/strwrap.R
trimws <- function(x, which = c("both", "left", "right"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider a dependency on the {backports} package to automatically get up-to-speed? Though, as it stands now, we only have a dependency on one base package: utils.

That said, we probably should bump the minimum version to at least R 3.6.* as we are not really testing if it works on R 3.0. Something to ponder for later on.

@@ -196,10 +207,93 @@ findBibFile <- function(package) {
#'
#' @param file file name
#' @param encoding encoding
#' @param srcfile output of \code{\link{srcfile}}
#' @param srcfile Deprecated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably also modify the function signature to use '' and write a quick diagnostic message:

  if (!missing("srcfile"))
    message("`srcfile` argument is deprecated.")

NULL
},
error = function(e) {
stop("Error when parsing strings of ",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we could indicate which line was problematic? Anything to make the error message more informative would be ideal here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are already doing that, see this reprex. Do you wish to improve it in any way?

# Message on parsing entries
f <- file.path(system.file("bib", "badFormat.bib", package = "bibtex"))
bib <- read.bib(f)
#> ignoring entry 'brown:1963' (line 6) because :
#>  A bibentry of bibtype 'Book' has to specify the field: title
#> ignoring entry 'chambers+cleveland+kleiner+tukey:1983' (line 21) because :
#>  Invalid author/editor format.

# On unbalanced braces we throw an error indicating the problematic line
f2 <- system.file("bib", "unbalanced_braces.bib", package = "bibtex")
bib2 <- read.bib(f2)
#> Error: Unbalanced braces on entry (line 9). Invalid .bib file
#> Error: Invalid bib file
file <- f2

See

@book{brown:1963,
author = "Brown, R. G.",
year = 1963,
publisher = "Prentice-Hall"
}
and
@misc{murdoch:2010,
author = "Duncan Murdoch",
title = {"Parsing {Rd} files unbalanced",
year = 2010,
url = "http://developer.r-project.org/parseRd.pdf"
}

#' the bib file.
#' @param footer footer of the citation list
#' @param header DEPRECATED.
#' @param footer DEPRECATED
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous note on deprecating arguments.

If a previous argument was never used, perhaps we could also just exclude these arguments altogether in the next version?

#'
#' Returns a table with all the string values and their value to be used on the
#' `bib` file.
#'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add an @param describing input? This is for long-term maintenance benefits as the function is internal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added documentation of params and also some examples on each internal function, hopefully that would help maintaining them

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the examples also as tests on test-utils.R

#' Parse a single entry
#'
#' Returns an entry from a .bib file parsed into an R object
#'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some documentation on parameters. (Even though it's an internal function.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, this is now documented and with examples

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the examples are also part of the tests

@@ -281,7 +281,7 @@
pages = {20--23},
month = {September},
url = {http://CRAN.R-project.org/doc/Rnews/},
pdf = {rnewsRnews_2001-3.pdf},
pdf = {http://CRAN.R-project.org/doc/Rnews/Rnews_2001-3.pdf},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching the update!

@@ -1,12 +1,10 @@
test_that("Unbalanced braces", {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huzzah!

DESCRIPTION Outdated
@@ -1,5 +1,5 @@
Package: bibtex
Version: 0.4.3
Version: 0.4.3.9000
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go 0.5.0 for the next CRAN release and, then, bump to 1.0.0.

|magrittr |2.0.1 |NA |* |
|stringi |1.7.6 |NA |* |
|stringr |1.4.0 |NA |* |

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure there are any dependencies w.r.t. stringr / stringi / glue / magrittr?

Copy link
Member Author

@dieghernan dieghernan Jan 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is funny: On CRAN version, bibtex imports stringr (¿?), that imports glue, magritt and stringi, so I guess thats why they appear on the summary

https://cran.r-project.org/web/packages/bibtex/index.html
https://cran.r-project.org/web/packages/stringr/index.html

@dieghernan
Copy link
Member Author

Thanks for this useful review. I would set this as a Draft until I tackle all your comments

@dieghernan dieghernan marked this pull request as draft January 16, 2022 08:32
@dieghernan dieghernan marked this pull request as ready for review January 17, 2022 09:30
@dieghernan
Copy link
Member Author

Hi @coatless , I think I tackled all your comments, ready for review

@dieghernan
Copy link
Member Author

Hi @coatless , just a quick reminder on this

@maelle
Copy link
Member

maelle commented Sep 16, 2022

👋 @coatless @dieghernan, what's the status on this? Thanks both for your work on {bibtex} 🙏

Some snapshots changes due to changes on toBibTex() on later versions of R, not related with the package
@dieghernan
Copy link
Member Author

Hi @maelle

I re-runned checks, examples and snapshots after 8 months and I see no problems, so this is ready on my side. This PR would close #16 and close #42

@maelle
Copy link
Member

maelle commented Sep 16, 2022

Thanks @dieghernan!
@coatless would you be able to review this PR?

@coatless
Copy link
Collaborator

@dieghernan thanks for addressing the remaining issues. Sorry this sat for so long!

@coatless coatless merged commit e6f9720 into ropensci:master Sep 23, 2022
@coatless coatless mentioned this pull request Sep 26, 2022
14 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants