Control character warning for weird column name in tibble in list #574
Comments
I think this is the same problem as tidyverse/dbplyr#223, and is probably a bug in base R.
I think this is possibly connected to this issue I came across on Stack Overflow, which I could only recreate using readxl. When I try to recreate the SO issue by building the tibble with tribble, I don't get the SO error (I only get that with tibbles read in through readxl). Instead I get the same warning as @cucumberry here, but only if the console is too narrow to print all the columns.

my_tibble <- tibble::tribble(~good_column, ~'very bad\ncolumn', ~'terribly\nlong column name here', ~'more', ~'and then even', ~'more than that',
1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12)
my_tibble
#> # A tibble: 2 x 6
#> good_column `very bad\ncolu~ `terribly\nlong~ more `and then even`
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 2 3 4 5
#> 2 7 8 9 10 11
#> # ... with 1 more variable: `more than that` <dbl>
list(my_tibble, my_tibble)
#> Warning message:
#> In fansi::strwrap_ctl(x, width = max(width, 0), indent = indent, :
#> Encountered a C0 control character, see `?unhandled_ctl`; you can use `warn=FALSE` to turn off these warnings.

I should add that the above is not a reprex: rendering it with reprex did not reproduce the warning, no matter how wide or narrow the console and viewer panes were. Much like the screenshot above from @cucumberry, the ÿþ mark also shows up in one of the column names in the first printing of the tibble, but not the second (see picture).

But on to the SO example I linked to above. The same bad characters result in a different error when the tibble in question is read in from Excel through readxl.

all_sheets <- readxl::excel_sheets(path = here::here("data", "Posti-Letto-Istat.xls"))
all_sheets %>%
purrr::map(.x = .,
.f = ~readxl::read_excel(path = here::here("data", "Posti-Letto-Istat.xls"),
sheet = .x,
skip = 4))
#> [[1]]
#> Error in nchar(x[is_na], type = "width") :
#> invalid multibyte string, element 1

Of course, any measure that gets rid of the bad characters before printing the list of tibbles fixes this.

all_sheets <- readxl::excel_sheets(path = here::here("data", "Posti-Letto-Istat.xls"))
all_sheets %>%
purrr::map(.x = .,
.f = ~readxl::read_excel(path = here::here("data", "Posti-Letto-Istat.xls"),
sheet = .x,
skip = 4)) %>%
  purrr::map(janitor::clean_names)
# This prints just fine, obviously

Maybe I'm mistaken and it's not connected, but I figured I'd mention it in case it is.
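The clean-up described above can also be sketched in base R, without janitor. This is only an illustration: `sanitize_names()` is a hypothetical helper, not a function from tibble, pillar, fansi, or janitor, and replacing each run of control characters with a single space is an assumption about what a demangled name should look like.

```r
# Hypothetical helper (an assumption for this sketch, not an existing API):
# replace every run of control characters in the column names with one space.
sanitize_names <- function(df) {
  names(df) <- gsub("[[:cntrl:]]+", " ", names(df))
  df
}

# A small data frame whose names contain embedded newlines, as in the
# tribble example above.
df <- data.frame(1, 2, 3)
names(df) <- c("good_column", "very bad\ncolumn", "terribly\nlong column name here")

# Clean every element of a list before printing it.
cleaned <- lapply(list(df, df), sanitize_names)
names(cleaned[[1]])
```

Unlike `janitor::clean_names()`, this leaves spaces and capitalisation untouched and only removes the characters that trigger the warning.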
@brodieG: What's the best way to deal with unsanitized user input (in the form of borked column names) for display? I'm happy with printing a demangled version and mentioning in the output that some of the names were mangled originally. Can I safely

Also, when reviewing the wrapping we need to take a look at why column names with spaces distort the output, at least in RStudio, when they appear in the footer of a tibble (too many columns).
You probably want In re: So in short In re: spaces in footer, is this something
Thanks. I don't think that column names should contain any controls -- will proceed.

Related to names with spaces, the following is an example where the tibble is too wide to fit on one line and the "with ... more variables" note is shown in the footer. Names are wrapped, and the first name in each footer row is printed badly. Not sure whose responsibility this is. (The SGR codes are stripped in the reprex; I can replicate this in a terminal and in RStudio.)

library(tidyverse)
N <- 16
data <- tibble(letter = letters[1:N], i = 1:N)
cross <- crossing(data, j = 1:N)
row <-
cross %>%
filter(i >= j) %>%
group_by(j) %>%
summarize(name = paste(letter, collapse = " ")) %>%
ungroup() %>%
select(name, j) %>%
deframe()
tbl <- tibble(!!!row)
options(crayon = TRUE)
fmt <- format(tbl)
fmt
#> [1] "# A tibble: 1 x 16"
#> [2] " `a b c d e f g … `b c d e f g h … `c d e f g h i … `d e f g h i j …"
#> [3] " <int> <int> <int> <int>"
#> [4] "1 1 2 3 4"
#> [5] "# … with 12 more variables: `e f g h i j k l m n o p` <int>, `f g h i j k l m n\n# o p` <int>, `g h i j k l m n o p` <int>, `h i j k l m n o p` <int>, `i j k\n# l m n o p` <int>, `j k l m n o p` <int>, `k l m n o p` <int>, `l m n o\n# p` <int>, `m n o p` <int>, `n o p` <int>, `o p` <int>, p <int>"
cat(fmt, sep = "\n")
#> # A tibble: 1 x 16
#> `a b c d e f g … `b c d e f g h … `c d e f g h i … `d e f g h i j …
#> <int> <int> <int> <int>
#> 1 1 2 3 4
#> # … with 12 more variables: `e f g h i j k l m n o p` <int>, `f g h i j k l m n
#> # o p` <int>, `g h i j k l m n o p` <int>, `h i j k l m n o p` <int>, `i j k
#> # l m n o p` <int>, `j k l m n o p` <int>, `k l m n o p` <int>, `l m n o
#> # p` <int>, `m n o p` <int>, `n o p` <int>, `o p` <int>, p <int>

Created on 2020-03-21 by the reprex package (v0.3.0)
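The line breaks inside backtick-quoted names in the footer above can be imitated with base R's `strwrap()`. The sketch below is not how pillar or fansi wrap the footer. It only illustrates one possible workaround, under the assumption that spaces inside names may be temporarily replaced by non-breaking spaces (U+00A0) so that the wrapper treats each quoted name as a single word. All variable names are made up for the example.

```r
# Column names with spaces, quoted with backticks as in the tibble footer.
col_names <- c(
  "e f g h i j k l m n o p",
  "f g h i j k l m n o p",
  "g h i j k l m n o p"
)
entries <- paste0("`", col_names, "` <int>")

# Plain wrapping can break a line at a space inside a quoted name.
plain <- strwrap(paste(entries, collapse = ", "), width = 40, prefix = "# ")

# Assumption for this sketch: swap the spaces inside each name for
# non-breaking spaces, wrap, then swap them back.
nbsp <- "\u00a0"
protected <- paste0("`", gsub(" ", nbsp, col_names, fixed = TRUE), "` <int>")
wrapped <- strwrap(paste(protected, collapse = ", "), width = 40, prefix = "# ")
wrapped <- gsub(nbsp, " ", wrapped, fixed = TRUE)

cat(plain, sep = "\n")
cat(wrapped, sep = "\n")
```

With the protected version, every output line should contain only complete quoted names. Whether the real footer code can use the same trick depends on how it measures and wraps styled text, which this sketch does not address.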
Ah, I see.
I can't replicate the original problem in R 3.6.3.
This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.
I've got a tibble where one of the column names is ComparedtoPHECentres(2015)valueorpercentiles (not my choice). If I print it within a list, and my console isn't wide enough to fit that column, I get a warning from fansi::strwrap_ctl(). It also prints the column name as �ÿþComparedtoPHECentres(2015)valueorpercentiles�ÿþ. I don't know if it's the length of the column name or the fact that it's got parentheses, or why it only happens if it's an element in a list.