---
title: "| Data Science With an Eye Towards Sustainability \n| Activity P5: Data Import and Cleaning\n"
author: "Tane Koh"
output:
  html_document:
    theme:
      bootswatch: solar
  bookdown::html_document2:
    split_by: none
    toc: yes
    toc_float: yes
---
This is an [R Markdown](http://rmarkdown.rstudio.com) Notebook. When you execute code within the notebook, the results appear beneath the code.
Try executing this chunk by clicking the *Run* button within the chunk or by placing your cursor inside it and pressing *Ctrl+Shift+Enter*.
```{r}
library(readr)
library(ggplot2)
library(dplyr)  # provides %>%, group_by() and summarise(), used in a later chunk
```
```{r}
# Load monthly net generation for all sectors and derive total renewable output.
data <- read_csv("Net_generation_for_all_sectors_monthly.csv", show_col_types = FALSE)
data$total_renewable_output <- data$`conventional hydroelectric thousand megawatthours` +
  data$`other renewables thousand megawatthours`
# Month holds month-year labels; prepend a day so as.Date() sees a complete date.
data$Month <- as.Date(paste0("01-", data$Month), format = "%d-%b-%y")
# Extract year and month
data$year <- as.numeric(format(data$Month, "%Y"))
data$month <- as.numeric(format(data$Month, "%m"))
# The source columns are in thousand megawatthours, so the axis uses the same unit.
ggplot(data, aes(x = Month, y = total_renewable_output)) +
  geom_line() +
  labs(x = "Date", y = "Total Renewable Energy Output in MA (thousand MWh)", title = "Total Renewable Energy Output vs Date")
```
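The `Month` parsing above works because `as.Date()` needs a complete day-month-year string, so a dummy day is pasted on before parsing. The chunk below is a minimal sketch on made-up labels. The sample strings are an assumption about what the `Month` column contains, inferred from the `"%d-%b-%y"` format, and the `toy_` names are hypothetical.

```{r}
# Hypothetical month labels; the real Month column may be formatted differently.
toy_labels <- c("Jan-01", "Feb-01", "Dec-22")

# %b matches abbreviated month names in the current locale, so switch to the
# "C" locale for this example and restore the previous setting afterwards.
old_locale <- Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))

# Prepend a day-of-month so each string is a complete date.
toy_dates <- as.Date(paste0("01-", toy_labels), format = "%d-%b-%y")

invisible(Sys.setlocale("LC_TIME", old_locale))

# Pull out numeric year and month, as in the chunk above.
toy_years  <- as.numeric(format(toy_dates, "%Y"))
toy_months <- as.numeric(format(toy_dates, "%m"))
print(data.frame(toy_labels, toy_dates, toy_years, toy_months))
```

If the real labels use a four-digit year, the format would need `%Y` instead of `%y`; a column of `NA` dates after parsing is the usual sign that the format string does not match.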
```{r}
file_names <- c("solar1.csv", "solar2.csv", "solar3.csv", "solar4.csv", "solar5.csv")
# Read each CSV file and stack the rows into one data frame
combined_df <- do.call(rbind, lapply(file_names, read.csv))
# Parse the timestamp in UTC so that as.Date() below keeps the calendar day as written
combined_df$date <- as.POSIXct(combined_df$LocalTime, format = "%m/%d/%y %H:%M", tz = "UTC")
# Extract date without time
combined_df$date <- as.Date(combined_df$date)
# Sum up the values within each day
# (this is a sum of power readings in MW, not energy in MWh)
aggregated_df <- combined_df %>%
  group_by(date) %>%
  summarise(total_value = sum(Power.MW.))
ggplot(aggregated_df, aes(x = date, y = total_value)) +
  geom_line() +
  labs(x = "Date", y = "Total Solar Power Output (MW)", title = "Total Solar Power Output vs Date")
```
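The step of collapsing timestamps to dates and summing per day can be checked on a tiny in-memory example. The chunk below is a sketch only: the readings are made up, the `toy_` names are hypothetical, and base R's `aggregate()` stands in for the dplyr pipeline so the example has no package dependencies. The column names mirror the chunk above.

```{r}
# Hypothetical readings in the same shape as the solar CSV files.
toy_solar <- data.frame(
  LocalTime = c("01/01/06 00:00", "01/01/06 12:00", "01/01/06 23:55", "01/02/06 12:00"),
  Power.MW. = c(0, 10.5, 0, 8),
  stringsAsFactors = FALSE
)

# Parse in UTC so that converting to Date keeps the calendar day as written,
# whatever the time zone of the machine running the notebook.
toy_solar$date <- as.Date(
  as.POSIXct(toy_solar$LocalTime, format = "%m/%d/%y %H:%M", tz = "UTC")
)

# One row per day, with the readings for that day summed.
toy_daily <- aggregate(Power.MW. ~ date, data = toy_solar, FUN = sum)
print(toy_daily)
```

Summing readings in MW gives a daily total of power readings rather than energy. Converting to MWh would require multiplying by the sampling interval, which the notebook does not state.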
Add a new chunk by clicking the *Insert Chunk* button on the toolbar or by pressing *Ctrl+Alt+I*.
When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the *Preview* button or press *Ctrl+Shift+K* to preview the HTML file).
The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike *Knit*, *Preview* does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.
```{r}
prices <- read_csv("MA_Residential_Installed Prices (2000-2022).csv", show_col_types = FALSE)
tpo <- read_csv("TPO_Penetration.csv", show_col_types = FALSE)
# Strip the percent sign so Share can be plotted as a number
tpo$Share <- as.numeric(gsub("%", "", tpo$Share))
ggplot(prices, aes(x = `Year`, y = `Median`)) +
  geom_line() +
  labs(x = "Year", y = "Installation Prices ($/W)", title = "Installation Prices of Solar Panels Over Time")
ggplot(tpo, aes(x = `Year`, y = `Share`)) +
  geom_line() +
  labs(x = "Year", y = "Market Share of Solar Energy (%)", title = "Market Penetration of Solar Energy Over Time")
```
```{r}
camden <- read_csv("camdens big dataset.csv",show_col_types = FALSE)
```
```{r}
camden$`Date In Service` <- as.Date(camden$`Date In Service`, format = "%m/%d/%Y")
camden$`Total Cost with Design Fees` <- as.numeric(gsub("[^0-9.]", "", camden$`Total Cost with Design Fees`))
camden$`Total Grant` <- as.numeric(gsub("[^0-9.]", "", camden$`Total Grant`))
```
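The `gsub("[^0-9.]", "", x)` pattern above deletes every character that is not a digit or a decimal point before the conversion to numeric. The chunk below is a small sketch on made-up strings (the `toy_` names and sample values are hypothetical) showing the effect and one caveat: a leading minus sign is removed as well.

```{r}
# Hypothetical cost strings; the real columns may be formatted differently.
toy_cost_raw <- c("$1,234.50", "$98,000", "12000", "-$500")

# Keep only digits and decimal points, then convert to numeric.
toy_cost <- as.numeric(gsub("[^0-9.]", "", toy_cost_raw))
print(toy_cost)

# Caveat: the sign of "-$500" is lost, so the last value comes back positive.
print(toy_cost[4])
```

If negative amounts can occur in the data, adding the minus sign to the allowed characters (for example `"[^0-9.-]"`) is one way to keep the sign.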
```{r}
ggplot(camden, aes(x = `Date In Service`, y = `Estimated Annual Production (kWhr)`)) +
  geom_line()
```