read_csv vs read.csv #710

Closed
jpgannon opened this issue Apr 16, 2021 · 5 comments · Fixed by #736
Labels: type:clarification (Suggest change for make lesson clearer); type:instructor guide (Issue with the instructor guide)

Comments

@jpgannon

When introducing read_csv() I think it could be helpful to mention that read.csv() exists and is different. I often have students confuse the two and then have issues later, especially when they have dates in their data.
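
For anyone skimming this thread, here is a minimal sketch of the kind of difference that trips students up. It assumes the readr package is installed; the file contents and column names are made up for illustration and are not the lesson's data.

```r
library(readr)  # assumption: readr (part of the tidyverse) is installed

# Write a tiny, made-up CSV to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(c("record_id,date,plot id",
             "1,1977-07-16,2",
             "2,1977-07-17,3"), path)

tidy <- suppressMessages(read_csv(path))           # readr
base <- read.csv(path, stringsAsFactors = FALSE)   # base R (utils)

class(tidy$date)  # readr guesses a Date column for ISO 8601 dates
class(base$date)  # base R leaves the dates as character strings
names(tidy)       # column names kept as written, including "plot id"
names(base)       # check.names = TRUE turns "plot id" into "plot.id"
```

Code written against one result (for example, date arithmetic on `date`, or a reference to `plot.id`) can fail or behave differently on the other, which may be where the later problems come from.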

@vanilink

vanilink commented May 9, 2021

I agree that read_csv() and read.csv() may confuse new learners. In fact, this and related issues have been discussed several times before, e.g. #268 #335 #420 #480 #553 #555 #663 #682 (side note: you can search for previous issues by removing the is:open tag from the issues filter in GitHub).
From my understanding, as of #663 the whole lesson switched over to the tidyverse function readr::read_csv(). However, I agree with @jpgannon and others that there is still room for confusion, on both the instructor and the learner side, especially when learners go on to code outside of The Carpentries. This may also be a good opportunity to alert learners who are new to coding that exact spelling is crucial (another opportunity is the aforementioned "View()" vs "view()" debate).
Here is the current wording of the "Starting with data" episode:

Reading the data into R
The file has now been downloaded to the destination you specified, but R has not yet loaded the data from the file into memory. To do this, we can use the read_csv() function from the tidyverse package.

Packages in R are basically sets of additional functions that let you do more stuff. The functions we’ve been using so far, like round(), sqrt(), or c(), come built into R; packages give you access to additional functions. Before you use a package for the first time you need to install it on your machine, and then you should import it in every subsequent R session when you need it.

I suggest adding some wording for clarification and introducing the term "base R", such as:

Reading the data into R
The file has now been downloaded to the destination you specified, but R has not yet loaded the data from the file into memory. To do this, we can use the read_csv() function from the tidyverse package.

Packages in R are basically sets of additional functions that let you do more stuff. The functions we’ve been using so far, like round(), sqrt(), or c(), come built into R; packages give you access to additional functions beyond base R. The corresponding function to read_csv() from the tidyverse package is read.csv() from base R. We don't have time to cover their differences but notice that the exact spelling determines which function is used. Before you use a package for the first time you need to install it on your machine, and then you should import it in every subsequent R session when you need it.
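
If an instructor wants to show on screen that the two spellings really are two different functions, something like the following sketch might help. It assumes readr is installed; the `install.packages()` line is commented out because it only needs to run once per machine.

```r
# install.packages("tidyverse")  # once per machine; readr is part of it
library(readr)                   # once per R session

# Ask R where each function lives: the spelling decides which one you get
environmentName(environment(read.csv))  # "utils", which ships with R
environmentName(environment(read_csv))  # "readr"

# Writing the package name explicitly removes any ambiguity
identical(readr::read_csv, read_csv)    # TRUE
```

Strictly speaking, read.csv() lives in the utils package rather than the base package, but utils ships with R and is attached by default, so "base R" is the usual shorthand.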

So far, this information appears only in the instructor notes:

Clarify the differences between the functions read_csv() (used in this lesson) and read.csv() (used in the previous lesson).
Note: If students end up with 30521 rows for surveys_complete instead of the expected 30463 rows at the end of the chapter, then they have likely used read.csv() and not read_csv() to import the data
When explaining view(), consider mentioning that is a function of the tibble package, and that the base function View() can also be used to view a data frame.

As I understand it, these notes still refer to a previous iteration of the lesson in which read.csv() was used, so they should be updated to reflect that change as well.
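
A likely reason for the differing row counts in the note above (this is an assumption on my part, not something stated in the instructor notes) is that the two functions treat empty text fields differently by default: read_csv() reads them as NA, while read.csv() keeps them as empty strings, so a filter on `!is.na()` drops fewer rows. A minimal sketch with made-up data, assuming readr is installed:

```r
library(readr)  # assumption: readr is installed

# Made-up file with one empty `sex` field and one empty `weight` field
path <- tempfile(fileext = ".csv")
writeLines(c("record_id,sex,weight",
             "1,M,40",
             "2,,48",
             "3,F,"), path)

tidy <- suppressMessages(read_csv(path))           # na = c("", "NA") by default
base <- read.csv(path, stringsAsFactors = FALSE)   # na.strings = "NA" by default

is.na(tidy$sex)  # FALSE  TRUE FALSE: the empty field became NA
is.na(base$sex)  # FALSE FALSE FALSE: the empty field is the string ""

# Keep only rows with no missing values, in the spirit of surveys_complete
nrow(tidy[!is.na(tidy$sex) & !is.na(tidy$weight), ])  # 1
nrow(base[!is.na(base$sex) & !is.na(base$weight), ])  # 2
```

Passing `na.strings = c("", "NA")` to read.csv() should make the two counts agree in this toy example.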

@Teebusch
Contributor

Teebusch commented Jul 5, 2021

Hi @jpgannon @vanilink
I've made a PR with the changes suggested by @vanilink: #736.
Would you kindly have a look and let me know whether you agree with them?

@fmichonneau
Member

Sorry, I merged the PR before seeing this message. @jpgannon and @vanilink, please let us know if you have any feedback on the changes made by @Teebusch.

@jpgannon
Author

jpgannon commented Jul 8, 2021 via email

@vanilink

vanilink commented Jul 8, 2021

Thank you so much, @jpgannon @Teebusch @fmichonneau! What a smooth experience. I'm glad we could improve the clarity of the materials together. No more comments from my side :)
