-
Notifications
You must be signed in to change notification settings - Fork 0
/
ChimpSniffingCleaning.qmd
106 lines (74 loc) · 2.75 KB
/
ChimpSniffingCleaning.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
---
title: "ChimpSniffingCleaning"
author: "Noah Stetson"
format: html
editor: visual
---
### Uploading stuff
Load library
```{r}
library(tidyverse)
library(dplyr)
```
Load data
```{r}
sniffing <- read.csv("https://raw.githubusercontent.com/nwstetson/ChimpSniffing/main/Data/sniffing_allData_2014-2021.csv")
biomarker <- read.csv("https://raw.githubusercontent.com/nwstetson/ChimpSniffing/main/Data/demographyForNetworks_14Aug2019.csv")
```
### Tidying data
Take out commas after names
```{r}
sniffing$Individual <- str_replace(sniffing$Individual, ",", "")
sniffing$ID2 <- str_replace(sniffing$ID2, ",", "")
```
change "IndividualID" in biomarker df to be called "Individual" so it matches sniffing df
```{r}
colnames(biomarker) [colnames(biomarker) == "individualID"] <- "Individual"
```
Add sex and DOB from biomarker into sniffing based on individual
```{r}
sniffing <- inner_join(sniffing, biomarker, by = "Individual")
sniffing <- sniffing[, !(colnames(sniffing) %in% c("father","mother","Maturity","ID_notes","neighborhood","Focal"))]
sniffing <- sniffing %>% relocate (DOB, Sex, .before = Behavior)
```
#### Use DOB and Date to find Age and make Age a column
Convert DOB to be in same format as data
```{r}
sniffing$DOB <- as.Date(as.character(sniffing$DOB))
sniffing$Date <- as.Date(as.character(sniffing$Date), format = "%m/%d/%y")
```
Calculate age per sample and put age next to individual
```{r}
sniffing$Age <- round(as.numeric((sniffing$Date - sniffing$DOB) / 365.25), 2)
sniffing <- sniffing %>% relocate (Age, .before = Sex)
```
Remove line that says to remove it (lines 189 and 190)
```{r}
sniffing <- sniffing[-c(189:190), ]
sniffing <- sniffing[, !(colnames(sniffing) %in% c("Delete"))]
```
Now need to summarize data into a table and group by individual and calculate the mean age and total number of sniffs
Columns will be: Individual, Mean Age, Sex, Total Sniffs
```{r}
# remane "sniff and sniff.ground" to 1 bc it's one count of a sniff
sniffing$Behavior <- str_replace(sniffing$Behavior, "Sniff.Ground", "1")
sniffing$Behavior <- str_replace(sniffing$Behavior, "Sniff", "1")
# rename "behavior" column to be called "sniffs"
sniffing <- sniffing %>% rename(Sniff = Behavior)
```
What if I just saved it as a new file and then did the summary on that
```{r}
write.csv(sniffing, "./data/sniffing_clean.csv", row.names=FALSE)
sniffing_clean <- read.csv("https://raw.githubusercontent.com/nwstetson/ChimpSniffing/main/Data/sniffing_clean.csv")
```
```{r}
sniff_summary <- sniffing_clean %>%
group_by(Individual) %>%
summarize(Age = mean(Age, na.rm = T), Sex = first(Sex), TotalSniffs = sum(Sniff, na.rm = T))
```
wtf it worked
slay
Okay now I'm going to make a visualization with it
```{r}
write.csv(sniff_summary, "./data/sniff_summary.csv", row.names=FALSE)
```