"Renaming factors" section does not reflect true results of running these commands #728

Closed
rkmeade opened this issue Jun 24, 2021 · 1 comment


rkmeade commented Jun 24, 2021

Hi Maintainers!

A quick comment on the "Renaming factors" section: the commands do not behave in my console the way the episode says they should.

Beginning with the first command, plot(surveys$sex), my console plots the ~1700 missing values as their own column (which is not reflected in the plot in the episode). I believe this is because the missing values are stored as a third category, the empty string "", rather than as NA.

In the next set of commands:
sex <- surveys$sex
levels(sex)

The lesson says the output should be:
[1] "F" "M"

This is what I get:
[1] "" "F" "M"

In the next code block, a new category for missing values is added:
sex <- addNA(sex)
levels(sex)

The lesson says the output should be:
[1] "F" "M" NA

My output now has two levels that both stand for missing values:
[1] "" "F" "M" NA

I believe all downstream errors can be remediated by running this before the initial plot command:
levels(sex)[1] <- NA
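
A minimal sketch of that fix on a small stand-in vector (the values below are made up, not the survey data). Selecting the empty-string level by value rather than by position is an assumption added here, so that the line does nothing when no such level exists:

```r
# Stand-in for surveys$sex as imported by read.csv(): blanks arrive as "".
sex <- factor(c("M", "M", "", "", "F"))
levels(sex)

# Setting a level to NA drops that level and turns its values into real NAs.
levels(sex)[levels(sex) == ""] <- NA
levels(sex)
is.na(sex)
```

After this step the factor should have only the levels "F" and "M", so the later addNA() call adds a single NA level.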

I hope this is helpful!

-- Rachel

Teebusch (Contributor) commented Jul 5, 2021

Hi @rkmeade, thank you for raising this issue. The code from the lesson runs as expected; see the reproducible example below. However, I can replicate your issue by using read.csv() (base R) instead of read_csv() (tidyverse, used in the lesson): read.csv() keeps blank fields in a character column as empty strings, whereas read_csv() reads them as NA by default. This is an easy mistake to make and has been brought up a few times (e.g., #710). We could probably do a better job of preventing it.
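
The difference can be seen without downloading the survey file. The sketch below uses a made-up three-row CSV (`csv_text` is hypothetical, not the lesson data) and base R only. It also shows one possible workaround for anyone who stays with read.csv(), namely passing `na.strings = c("", "NA")`, which mirrors the defaults of read_csv():

```r
# Hypothetical miniature of the survey file; row 2 has a blank sex field.
csv_text <- "record_id,sex\n1,M\n2,\n3,F\n"

# Default read.csv(): the blank field in a character column stays "".
base_default <- read.csv(text = csv_text)
levels(factor(base_default$sex))

# Treat empty strings as missing, as read_csv() does by default.
base_na <- read.csv(text = csv_text, na.strings = c("", "NA"))
levels(factor(base_na$sex))
sum(is.na(base_na$sex))
```

With the default call the factor gains an extra "" level; with `na.strings` set, the blank becomes NA and only "F" and "M" remain as levels.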

Correct output, using read_csv()

## Loading the survey data
# modified slightly, for reprex to work
library(tidyverse)
surveys <- read_csv("https://ndownloader.figshare.com/files/2292169")
#> 
#> -- Column specification --------------------------------------------------------
#> cols(
#>   record_id = col_double(),
#>   month = col_double(),
#>   day = col_double(),
#>   year = col_double(),
#>   plot_id = col_double(),
#>   species_id = col_character(),
#>   sex = col_character(),
#>   hindfoot_length = col_double(),
#>   weight = col_double(),
#>   genus = col_character(),
#>   species = col_character(),
#>   taxa = col_character(),
#>   plot_type = col_character()
#> )

# ...

## Factors
surveys$sex <- factor(surveys$sex)

# ...

### Renaming factors

plot(surveys$sex)

sex <- surveys$sex
levels(sex)
#> [1] "F" "M"
sex <- addNA(sex)
levels(sex)
#> [1] "F" "M" NA
head(sex)
#> [1] M    M    <NA> <NA> <NA> <NA>
#> Levels: F M <NA>
levels(sex)[3] <- "undetermined"
levels(sex)
#> [1] "F"            "M"            "undetermined"
head(sex)
#> [1] M            M            undetermined undetermined undetermined
#> [6] undetermined
#> Levels: F M undetermined
plot(sex)


Unexpected output, using read.csv()

## Loading the survey data
# modified slightly, for reprex to work
library(tidyverse)
surveys <- read.csv("https://ndownloader.figshare.com/files/2292169")

# ...

## Factors
surveys$sex <- factor(surveys$sex)

# ...

### Renaming factors

plot(surveys$sex)

sex <- surveys$sex
levels(sex)
#> [1] ""  "F" "M"
sex <- addNA(sex)
levels(sex)
#> [1] ""  "F" "M" NA
head(sex)
#> [1] M M        
#> Levels:  F M <NA>
levels(sex)[3] <- "undetermined"
levels(sex)
#> [1] ""             "F"            "undetermined"
head(sex)
#> [1] undetermined undetermined                                       
#> [6]             
#> Levels:  F undetermined
plot(sex)

Created on 2021-07-05 by the reprex package (v2.0.0)
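
The second reprex goes wrong at `levels(sex)[3]` because the extra "" level shifts every position by one, so index 3 points at "M". A possible way to make that step less fragile, sketched here on made-up values and not taken from the lesson, is to locate the NA level with is.na() instead of a fixed index:

```r
# Made-up stand-in for a correctly imported column (blanks are already NA).
sex <- factor(c("M", "M", NA, NA, "F"))
sex <- addNA(sex)

# Find the NA level by test rather than by position, then rename it.
levels(sex)[is.na(levels(sex))] <- "undetermined"
levels(sex)
table(sex)
```

Run on data imported with read.csv(), the same line would leave "M" untouched, although the stray "" level would still be present and need separate handling.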

Session info
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 4.0.4 (2021-02-15)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United Kingdom.1252 
#>  ctype    English_United Kingdom.1252 
#>  tz       Europe/Paris                
#>  date     2021-07-05                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version date       lib source        
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.2)
#>  backports     1.2.1   2020-12-09 [1] CRAN (R 4.0.3)
#>  broom         0.7.6   2021-04-05 [1] CRAN (R 4.0.4)
#>  cellranger    1.1.0   2016-07-27 [1] CRAN (R 4.0.2)
#>  cli           2.5.0   2021-04-26 [1] CRAN (R 4.0.4)
#>  colorspace    2.0-1   2021-05-04 [1] CRAN (R 4.0.5)
#>  crayon        1.4.1   2021-02-08 [1] CRAN (R 4.0.4)
#>  curl          4.3.1   2021-04-30 [1] CRAN (R 4.0.5)
#>  DBI           1.1.1   2021-01-15 [1] CRAN (R 4.0.3)
#>  dbplyr        2.1.1   2021-04-06 [1] CRAN (R 4.0.5)
#>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
#>  dplyr       * 1.0.6   2021-05-05 [1] CRAN (R 4.0.4)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.0.5)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.2)
#>  fansi         0.4.2   2021-01-15 [1] CRAN (R 4.0.3)
#>  forcats     * 0.5.1   2021-01-27 [1] CRAN (R 4.0.3)
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
#>  generics      0.1.0   2020-10-31 [1] CRAN (R 4.0.2)
#>  ggplot2     * 3.3.3   2020-12-30 [1] CRAN (R 4.0.3)
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
#>  gtable        0.3.0   2019-03-25 [1] CRAN (R 4.0.2)
#>  haven         2.4.1   2021-04-23 [1] CRAN (R 4.0.5)
#>  highr         0.9     2021-04-16 [1] CRAN (R 4.0.4)
#>  hms           1.0.0   2021-01-13 [1] CRAN (R 4.0.3)
#>  htmltools     0.5.1.1 2021-01-22 [1] CRAN (R 4.0.3)
#>  httr          1.4.2   2020-07-20 [1] CRAN (R 4.0.2)
#>  jsonlite      1.7.2   2020-12-09 [1] CRAN (R 4.0.3)
#>  knitr         1.33    2021-04-24 [1] CRAN (R 4.0.5)
#>  lifecycle     1.0.0   2021-02-15 [1] CRAN (R 4.0.4)
#>  lubridate     1.7.10  2021-02-26 [1] CRAN (R 4.0.4)
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
#>  mime          0.10    2021-02-13 [1] CRAN (R 4.0.4)
#>  modelr        0.1.8   2020-05-19 [1] CRAN (R 4.0.2)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.0.2)
#>  pillar        1.6.0   2021-04-13 [1] CRAN (R 4.0.5)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.2)
#>  ps            1.6.0   2021-02-28 [1] CRAN (R 4.0.5)
#>  purrr       * 0.3.4   2020-04-17 [1] CRAN (R 4.0.2)
#>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.2)
#>  Rcpp          1.0.6   2021-01-15 [1] CRAN (R 4.0.3)
#>  readr       * 1.4.0   2020-10-05 [1] CRAN (R 4.0.3)
#>  readxl        1.3.1   2019-03-13 [1] CRAN (R 4.0.2)
#>  reprex        2.0.0   2021-04-02 [1] CRAN (R 4.0.5)
#>  rlang         0.4.11  2021-04-30 [1] CRAN (R 4.0.5)
#>  rmarkdown     2.8     2021-05-07 [1] CRAN (R 4.0.5)
#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.0.3)
#>  rvest         1.0.0   2021-03-09 [1] CRAN (R 4.0.4)
#>  scales        1.1.1   2020-05-11 [1] CRAN (R 4.0.2)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.2)
#>  stringi       1.6.1   2021-05-10 [1] CRAN (R 4.0.4)
#>  stringr     * 1.4.0   2019-02-10 [1] CRAN (R 4.0.2)
#>  styler        1.4.1   2021-03-30 [1] CRAN (R 4.0.4)
#>  tibble      * 3.1.1   2021-04-18 [1] CRAN (R 4.0.5)
#>  tidyr       * 1.1.3   2021-03-03 [1] CRAN (R 4.0.4)
#>  tidyselect    1.1.1   2021-04-30 [1] CRAN (R 4.0.5)
#>  tidyverse   * 1.3.1   2021-04-15 [1] CRAN (R 4.0.4)
#>  utf8          1.2.1   2021-03-12 [1] CRAN (R 4.0.5)
#>  vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.0.5)
#>  withr         2.4.2   2021-04-18 [1] CRAN (R 4.0.4)
#>  xfun          0.22    2021-03-11 [1] CRAN (R 4.0.4)
#>  xml2          1.3.2   2020-04-23 [1] CRAN (R 4.0.2)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.2)
#> 
#> [1] C:/Users/teebu/Rlib
#> [2] C:/Program Files/R/R-4.0.4/library

Teebusch closed this as completed Jul 5, 2021