-
Notifications
You must be signed in to change notification settings - Fork 1
/
4_exercises_pollination.qmd
161 lines (91 loc) · 4.77 KB
/
4_exercises_pollination.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
---
output:
html_document:
df_print: paged
code_download: TRUE
toc: true
toc_depth: 1
editor_options:
chunk_output_type: console
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r libraries}
library(tidyverse)
```
In the materials for this workshop there is a .csv file with pollinator observation data: pollination.csv. Read this in to RStudio using read_csv() and check out the column names and data using the code chunk below.
```{r}
pollination <- read_csv("data/pollination.csv")
names(pollination)
#View(pollination)
```
# What's in the data
Each row of these data is an observation that came from a person or a camera (Observer) that sat in front of a Oenothera harringtonii plant for a certain amount of minutes (ObsevTime), recording all visitors to the flowers in order to learn about the pollination of the plant species. Most visitors to these flowers are bees (Bees) and hawkmoths (HM) of different genera (Hyles, Manduca). Some plants have more flowers than others, so the number of flowers was also recorded (FlowObs). These observations were made for multiple populations (PopName), over multiple dates (Date).
# Subsetting and changing column names
## Exercise 1
We don't need all of the variables. We only want "PopName", Time", "Date", "Observer", "Temp..begin.", "Bee", "FlowObs", "ObsevTime", "HM", "HMrate", and "Brate".
We also want the names of the columns to be more meaningful.
Select the variables we want and rename "PopName" to "pop", Time" to "time", "Date" to "date", "Observer" to "observer", "Temp..begin." to "temp", "FlowObs" to "n_flowers", "Bee" to "n_bees", "HM" to "n_hawkmoths", and "ObsevTime" to "minutes_observed". Keep "HMrate" and "Brate" in the data too.
Save the result back to the pollinators variable (overwrite it)
Hint: when renaming while selecting columns, new_name = existing_name. No quotation marks needed.
```{r}
```
## Exercise 2
We're particularly interested in data from the David's Canyon population. Subset the data so it only includes observations where the population (PopName variable) is "David's Canyon"
Save the subset for David's Canyon in a variable called dc_data.
Hint: What function do you use to subset the observations?
```{r}
```
## Exercise 3
Which observer spent the longest time observing in a single session (remember: each row is a session - an observation of a plant).
```{r}
```
Which HUMAN observer had the longest observation period? Filter out observers with CAM in their name to find human observers. Hint: `str_detect` is a useful function for this: `str_detect(variable_name, "CAM")`
```{r}
```
## Exercise 4
Make a new variable total_pollinators in the pollinators data frame that is the total bees and hawkmoths observed.
```{r}
```
Now find which rows (observation sessions) observed the most pollinators
```{r}
```
# Creating data summaries
## Exercise 5
In the David's Canyon population (dc_data data frame, created above), who (which observer):
- Saw the most hawkmoths?
- The most bees?
- Watched the most flowers?
- Accumulated the most observation minutes?
You could query the data for each of these questions one-by-one but could also calculate all of the answers at once, and create a summary table by grouping by observer.
Hint: The same observer can have multiple observations. How can you tell how many total insects, flowers, etc. they observed?
We don't want to know about the cameras, only the people! Filter out observers with CAM in their name. Hint: `str_detect` is a useful function for this.
```{r}
```
Challenge: do the above using across()
```{r}
```
## Exercise 6
Using the full data, in 2009, in which population did researchers spend the most observation time (the most minutes)? How much time on average did researchers spend per observation in each population?
```{r}
```
# Summarizing values calculated from the data
## Exercise 7
We want to know if bees or hawkmoths are more common visitors to these flowers. In dc_data (David's Canyon), how often were there more hawkmoths than bees at an individual plant? How often were there more bees than hawkmoths?
```{r}
```
# Even More!
## Exercise 8
What proportion of observations were made at temperatures above 70 degrees?
```{r}
```
## Exercise 9
Are more pollinators observed when the temperature is above or below 70 degrees?
There are a few ways you might compare the number of pollinators. Compare the total number of pollinators in each condition, then the average number. Feeling adventurous? Compute the number of pollinators observed per observation minute and then compare that normalized rate instead.
```{r}
```
## Exercise 10
Same as above: Are more pollinators observed when the temperature is above or below 70 degrees? BUT does the answer vary based on the population?
```{r}
```