-
Notifications
You must be signed in to change notification settings - Fork 0
/
A3-rachel-jackson.Rmd
119 lines (76 loc) · 4.16 KB
/
A3-rachel-jackson.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
---
title: "R Notebook"
output:
html_document:
df_print: paged
---
**Rachel Jackson**
**A3 - Data Analysis/Visualization**
**09/21/2020**
## Question 1
#### What's the difference between the R functions "paste" and "paste0"?
+ The difference between the two R functions is that with "paste0" it links strings without spaces, whereas if one were to use "paste()" with the same intention they would have to specify the argument sep. See examples below for further explanation.
```{r}
#paste0() Example
favorite_animal <- paste("el", "e", "phant")
favorite_animal
#paste() Example
favorite_animal_0 <- paste0("el", "e", "phant")
favorite_animal_0
#sep Example
favorite_animal_sep <- paste("el", "e", "phant", sep = "")
favorite_animal_sep
```
## Question 2
#### What does evaluating str_trim(" tacos ") == "tacos" produce? Why? (str_trim is in the package stringr, which is part of the Tidyverse.)
+ The function below produces the value 'TRUE'. The purpose of str_trim is to remove the white space from the beginning and the end of a string.Therefore, when you run the function below, it removed the white space from the start and the end of the string '(" tacos ")', leaving us with a value of 'tacos'. This value does in fact equal the value on the argument was initially testing for, which is why it produces the evaluation 'TRUE'.
+ [I referenced this site to to help me construct my answer to this question.] (https://www.rdocumentation.org/packages/stringr/versions/1.4.0/topics/str_trim)
```{r}
library(tidyverse)
str_trim(" tacos ") == "tacos" #returns TRUE
```
## Question 3
#### What does evaluating NA == NA produce? Why?
+ 'NA' represents an unknown value. Missing values are known to be contagious, because almost any operation involved with an unknown value will most likely also be unknown. Therefore the argument below produces 'NA' because we don't know if NA is equal to NA. We don't know what either value is, thus it is impossible to say whether or not they equal each other.
```{r}
NA == NA
```
## Question 4
#### What does na.rm = TRUE do in mean() and sum()?
+ The function 'na.rm' refers to the logical parameter that tells the function whether or not to remove 'NA' values from the calculation. Therefore, the parameter 'na.rm = TRUE' literally means remove NA.
+ sum() returns the sum of all the values present in its arguments. If 'na.rm = TRUE' is used with the sum function, an NA value in any of the arguments will be ignored and not factored into the sum.
+ mean() is used to calculate the mean in a data series (taking the sum of the values and dividing with the number of values in data set). If mean() includes 'na.rm = TRUE' then it will exclude the missing values from the input vector. See example below.
```{r}
# SUM(NA.RM = TRUE) example
y <- c(2,2,NA,2)
sum(y) # returns NA
sum(y, na.rm = TRUE) #returns 6
# mean(na.rm = TRUE) example
x <- c(2,2,NA,2)
mean(x) #returns NA
mean(x, na.rm=TRUE) #returns 2
```
## Question 5
#### What does mean(is.na(x)) tell you about a vector x?
+ The function is.na() determines if a value is missing. So the expression 'mean(is.na(x))' will calculate the portion of missing values in vector x.
## Question 6
#### What does evaluating 1:4 + 1:2 produce? Why?
+ When adding the two vectors '1:4 + 1:2' together it produces the values '2 4 4 6'.
+ R prefers its vectors to have the same length, so when it's given two vectors of different lengths, it merely replicates (recycles) the shorter vector until it is the same length as the longer vector, then it does the operation. This is called vector recycling.
```{r}
1:4 + 1:2 #returns 2 4 4 6
```
## Question 7
#### What does evaluating 1:4 + 1:3 produce? Why?
```{r}
1:4 + 1:3 #returns error message
```
+ Because 10 is not an integer multiple of 3, when adding the two vectors '1:4 + 1:3' together, it produced an error. If I want to use vector recycling then I should use a scalar to avoid this error message again.
## Question 8
#### Write a filter expression for flights that returns a tibble containing only flights departing on January 1.
```{r}
library(tidyverse)
library(nycflights13)
jan1 <- filter(flights, month == 1, day == 1)
jan1
```