exercise6_module2.Rmd

## Exercise 6: about Module 2...

Create the script "exercise6.R" and save it to the "Rcourse/Module2" directory: you will save all the commands of exercise 6 in that script.
<br>Remember you can comment the code using #.


<details>
<summary>
*Answer*
</summary>

```{r, eval=F}
getwd()
setwd("Rcourse/Module2")
setwd("~/Rcourse/Module2")
```

</details>

**1- Install and load package ggplot2**

<details>
<summary>
*Answer*
</summary>

```{r, eval=F}
# Install
install.packages(pkgs="ggplot2")
# Load in the environment
library("ggplot2")
```

Check with sessionInfo() that the package is indeed loaded. Was version of ggplot2 did you get?

</details>

**2- ggplot2 loads automatically the diamonds dataset in the working environment: you can use it as an object after ggplot2 is loaded. You can see it in "Environment -> change "Global Environment" to "package:ggplot2".**

What are the dimensions of diamonds? What are the column names of diamond?

<details>
<summary>
*Answer*
</summary>

```{r, eval=F}
# Dimensions of diamonds
dim(diamonds)
# Column names of diamonds
colnames(diamonds)
```

</details>

You can read the help page of the diamonds dataset to understand what it contains!<br>

Note: diamonds is a data frame: you can test it with **is.data.frame(diamonds)** (returns TRUE).

**3. How many diamonds have a "Premium" cut ?**

<details>
<summary>
*Answer*
</summary>

```{r, eval=F}
nrow(diamonds[diamonds$cut == "Premium",])
```

</details>

**4. Select diamonds that have a "Premium" cut AND an "E" color. Save in the new object diams1. How many rows does diams1 have ?**


<details>
<summary>
*Answer*
</summary>

```{r, eval=F}
diams1 <- diamonds[diamonds$cut == "Premium" & diamonds$color == "E",]
```

</details>

**5- Install and load the package dplyr from the R Console.**

<details>
<summary>
*Answer*
</summary>

```{r, eval=F}
# Install package
install.packages(pkgs="dplyr")
# Load package
library("dplyr")
```

</details>

**6- Use the function "sample_n" from the dplyr package (check the help page for sample_n) to randomly sample 200 lines of diams1: save in the new diams object.**

<details>
<summary>
*Answer*
</summary>

```{r, eval=F}
# Subset data frame
diams <- sample_n(tbl=diams1, size=200)
```

</details>

**7- What are the minimum, average and median "carat" of the 200 diamonds in diams ?**

<details>
<summary>
*Answer*
</summary>

```{r, eval=F}
min(diams$carat)
mean(diams$carat)
median(diams$carat)
summary(diams$carat)
```

</details>

**8- Save diams into a csv file (try the "write.csv" function.)**

<details>
<summary>
*Answer*
</summary>

```{r, eval=F}
# Write a csv file with csv
write.csv(x=diams, 
	file="diamonds_Premium_E_200.csv",
        row.names=FALSE,
        quote=FALSE)
```

</details>