class2023.qmd

---
title: "MEDS Class of 2023"
subtitle: "Program Learning Outcome (PLO) #1 Assessment - Core Knowledge"
author: "Sam Csik"
date: June 9, 2023
format: 
  html:
    
    toc: true
    toc-location: left
    code-tools: 
      source: true
      toggle: false
    theme: 
      - styles.scss
    mainfont: Nunito
execute: 
  eval: true
  echo: false
  message: false
  warning: false
editor_options: 
  chunk_output_type: console
---

```{r}
#..........................load packages.........................
library(googlesheets4)
library(tidyverse)
library(janitor)
library(showtext)
library(ggtext)
library(DT)
library(tidytext)
library(wordcloud)
library(scales)

#........................import functions........................
source("functions.R")

#..........................import data...........................
medsJune2023 <- read_sheet("https://docs.google.com/spreadsheets/d/1Sq1rOmBP-g6iOCBS7NoexCj4K-j3OpeErMJ2Hxd3AHo/edit?usp=sharing")

#...........................clean data...........................
medsJune2023_clean <- clean_PLO_data(medsJune2023)
  
#......................import Google fonts.......................
sysfonts::font_add_google(name = "Sanchez", family = "sanchez")
sysfonts::font_add_google(name = "Nunito", family = "nunito")

# automatically use showtext to render text for future devices ----
showtext::showtext_auto()
```

# **Summary**

{{< include /sections/class2023_after/summary.qmd >}}

```{r}
#| fig-align: center
scores <- medsJune2023_clean |> 
  select(sc0)

mean_score <- mean(scores$sc0)
median_score <- median(scores$sc0)

ggplot(scores, aes(x = sc0)) +
  geom_histogram(binwidth = 1, color = "white", fill = "#047C91") +
  geom_vline(xintercept = median_score,  linetype = "dashed", color = "black") +
  annotate(geom = "label", x = 10.5, y = 8, label = paste0("Median Score = ", median_score), hjust = "right") +
  annotate(geom = "segment", x = 10.5, y = 8, xend = median_score, yend = 7.5,
           arrow = arrow(length = unit(3, "mm"))) +
  scale_x_continuous(breaks = seq(1, 14, 1)) +
  labs(x = "Score", y = "Number of MEDS studnets",
       title = "Distribution of scores",
       caption = "Out of 14 available points") +
  meds_theme()
```

# **Individual Questions**

::: {.callout-note}
Questions that have a correct answer are color-coded green.
:::

## **Part 1: OS and data/document storage**

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 1: What OS?  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

q1_os_data <- clean_q1_os_pre(medsJune2023_clean)
plot_q1_os_pre(q1_os_data)
```

```{r}
#| fig-cap: "NOTE: Percentages will not sum to 100%"
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 2: Where do you store data?  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# combine written-in "server" option ----
server <- c("Server", "Taylor server", "external server")

# wrangle ----
q2_store_data <- clean_q2_store_data_pre(medsJune2023_clean)

# plot ----
plot_q2_store_data_pre(q2_store_data, survey = "Pre")
```

## **Part 2: How often do you currently use the following?**

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 3: GUI  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q3_gui_data <- clean_q3_gui_pre(medsJune2023_clean)

# plot ----
plot_frequency_use_pre(data = q3_gui_data,
                   title = "A specialized software with a point-and-click graphical user\ninterface (e.g., for statistical analysis: SPSS, SAS...;for Geospatial\nanalysis: ArcGIS, QGIS...; for Genomics analysis: Geneious, …)",
                   caption = "Question 3",
                   survey = "Pre")

```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 4: Programming Languages  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q4_prog_lang_data <- clean_q4_prog_lang_pre(medsJune2023_clean)

# plot ----
plot_frequency_use_pre(data = q4_prog_lang_data,
                   title = "Programming languages (R, Python, etc.)",
                   caption = "Question 4",
                   survey = "Pre")
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 5: Databases  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q5_databases_data <- clean_q5_databases_pre(medsJune2023_clean)

# plot ----
plot_frequency_use_pre(data = q5_databases_data, 
                   title = "Databases (SQL, Access, etc.)",
                   caption = "Question 5",
                   survey = "Pre")
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 6: Version Control  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q6_version_control_data <- clean_q6_version_control_pre(medsJune2023_clean)

# plot ----
plot_frequency_use_pre(data = q6_version_control_data,
                   title = "Version control software (Git, Subversion (SVN), Mercurial, etc.)",
                   caption = "Question 6",
                   survey = "Pre")
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 7: Command Shell  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q7_command_shell_data <- clean_q7_command_shell_pre(medsJune2023_clean)

# plot ----
plot_frequency_use_pre(data = q7_command_shell_data,
                   title = "A command shell (usually accessed through Terminal on macOS or\nPowerShell on Windows)",
                   caption = "Question 7",
                   survey = "Pre")
```

## **Part 3: Workflow satisfaction**

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 8: Workflow Satisfaction  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q8_workflow_satisfaction_data <- clean_q8_workflow_satisfaction_pre(medsJune2023_clean)

# plot ----
plot_q8_workflow_satisfaction_pre(q8_workflow_satisfaction_data)
```

## **Part 4: Rank the following from 1 (strongly disagree) to 5 (strongly agree)**

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 9: Raw Data  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q9_raw_data_data <- clean_q9_raw_data_pre(medsJune2023_clean)

# plot ----
plot_rank_data_pre(data = q9_raw_data_data,
                    title = "Having access to the original, raw data is important to be able to\nrepeat an analysis",
                    caption = "Question 9")
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 10: Small Program  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q10_small_program_data <- clean_q10_small_program(medsJune2023_clean)

# plot ----
plot_rank_data_pre(data = q10_small_program_data,
                    title = "I can write a small program, script, or macro to address a problem\nin my own work",
                    caption = "Question 10")
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 11: Find Help Online  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q11_find_help_online_data <- clean_q11_find_help_online(medsJune2023_clean)

# plot ----
plot_rank_data_pre(data = q11_find_help_online_data ,
                    title = "I know how to search for answers to my technical questions online",
                    caption = "Question 11")
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 12: Overcoming Problems  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q12_overcoming_problems_data <- clean_q12_overcoming_problems(medsJune2023_clean)

# plot ----
plot_rank_data_pre(data = q12_overcoming_problems_data,
                    title = "While working on a programming project, if I get stuck, I can find\nways of overcoming the problem",
                    caption = "Question 12")
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 13: Confidence  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q13_confidence_data <- clean_q13_confidence(medsJune2023_clean)

# plot ----
plot_rank_data_pre(data = q13_confidence_data,
                    title = "I am confident in my ability to make use of programming software\nto work with data",
                    caption = "Question 13")
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 14: Easier Analyses  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q14_easier_analysis_data <- clean_q14_easier_analysis(medsJune2023_clean)

# plot ----
plot_rank_data_pre(data = q14_easier_analysis_data,
                     title = "Using a programming language (like R or Python) can make my\nanalyses easier to reproduce",
                    caption = "Question 14")
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 15: Increase Efficiency  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q15_increase_efficiency_data <- clean_q15_increase_efficiency(medsJune2023_clean)

# plot ----
plot_rank_data_pre(data = q15_increase_efficiency_data,
                    title = "Using a programming language (like R or Python) can make me\nmore efficient at working with data",
                    caption = "Question 15")
```

## **Part 5: Stats**

<!-- NOTE TO MAINTAINERS: plotting functions expect palette to be called `pal` -->

```{r}

##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 16a: Median  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q16a_median_data <- clean_q16a_median_pre(medsJune2023_clean)

# plot ----
pal <- c("14" = "#7ECD7A", "0" = "#047C91", "10" = "#047C91")
plot_q16a_median_pre(q16a_median_data)
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 16b: Mode  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q16b_mode_data <- clean_q16b_mode_pre(medsJune2023_clean)

# plot ----
pal <- c("14" = "#7ECD7A")
plot_q16b_mode_pre(q16b_mode_data)
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 17a: Linear Regression Familiarity  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q17a_familiarity_lr_data <- clean_q17a_familiar_lr_pre(medsJune2023_clean)

# plot ----
plot_q17a_familiar_lr_pre(q17a_familiarity_lr_data)
```

{{< include /sections/class2023_after/q17-callout-inline-code.qmd >}}

<!-- {{< include /sections/all_classes/q17b-screenshot.qmd >}} -->

Below is a chunk of code showing a simple linear regression relating the number of pieces of microplastics to the number of days per year with rainfall.

```{r}
#| fig-align: center
knitr::include_graphics("images/17-microplastics-lm.png")
```

<!-- NOTE TO MAINTAINERS: see `q17_linear_regression.R` for free response cleaning; will need to be modified for each new set of survey data -->

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 17b: Microplastics  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q17b_microplastics_data <- clean_q17b_microplastics_pre(medsJune2023_clean)

# plot ----
pal <- c("47" = "#7ECD7A", "27" = "#047C91", "900" = "#047C91", "26" = "#047C91",
         "956" = "#047C91","43" = "#047C91", "911" = "#047C91", "46" = "#047C91")
plot_q17b_microplastics_pre(q17b_microplastics_data)
```

::: {.callout-note}
## Question 17b raw responses
Some respondents recorded their answers in sentence form, while others did not round their answers to the nearest integer. Cleaned responses are shown in the plot, above. Responses as they were recorded are included in the table, below:

```{r}
#............................wrangle.............................
q17_microplastics_dt <- medsJune2023_clean |>

  # select necessary cols ----
  select(microplastics_lr)

DT::datatable(q17_microplastics_dt, colnames = c("Free Response Answer to Q17b"),

               options = list(autoWidth = TRUE,
                              pageLength = 5,
                              lengthMenu = c(5, 10, 20, 30),
                              dom = 'ltp')

              )
```
:::

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 18a: Probability Distribution Familiarity  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q18a_familiar_prob_dist_data <- clean_q18a_familiar_prob_dist_pre(medsJune2023_clean)

# plot ----
plot_q18a_familiar_prob_dist_pre(q18a_familiar_prob_dist_data)
```

{{< include /sections/class2023_after/q18-callout-inline-code.qmd >}}

```{r}
#| column: margin
#| fig-cap: "62% of respondents correctly answered question 18b (i.e. chose exactly the following options: normal, uniform, bimodal, symmedtric)"
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 18b: Probabilty Distribution Terms  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# for calculating percentages in wrangling step below ----
q18b_num_answers <- medsJune2023_clean |>
  select(prob_dist_terms) |>
  count() |>
  pull()

# wrangle (FULLY CORRECT; margin plot) ----
q18b_prob_dist_FULLY_CORRECT_data <- clean_q18b_FULLY_CORRECT_pre(medsJune2023_clean)

# plot (FULLY CORRECT; margin plot) ----
pal <- c("yes" = "#7ECD7A", "no" = "#047C91")
plot_q18b_FULLY_CORRECT_pre(q18b_prob_dist_FULLY_CORRECT_data)
```

```{r}
# wrangle (INDIV RESPONSES) ----
q18b_prob_dist_terms_data <- clean_q18b_prob_dist_terms_pre(medsJune2023_clean)

# plot (INDIV RESPONSES) ----
pal <- c("normal" = "#7ECD7A", "uniform" = "#7ECD7A", "bimodal" = "#7ECD7A", "symmetric" = "#7ECD7A", "variable" = "#047C91")
plot_q18b_prob_dist_terms_pre(q18b_prob_dist_terms_data)
```

## **Part 6: Programming 1**

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 19a: Familiarity with Functions  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# for calculating percentages in wrangling step below ----
q19_num_answers <- medsJune2023_clean |>
  select(term_function) |>
  count() |>
  pull()

# wrangle ----
q19a_familiar_functions_data <- clean_q19a_familiar_functions_pre(medsJune2023_clean)

# plot ----
plot_q19a_familiar_functions_pre(q19a_familiar_functions_data)
```

{{< include /sections/class2023_after/q19b-callout-inline-code.qmd >}}

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 19b: Writing Functions  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q19b_writing_functions_data <- clean_q19b_writing_functions_pre(medsJune2023_clean)

# plot ----
plot_q19b_writing_functions_pre(q19b_writing_functions_data)
```

{{< include /sections/class2023_after/q19c-callout-inline-code.qmd >}}

{{< include /sections/all_classes/q19c-turbine-function.qmd >}}

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 19c: Function Output  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q19c_fxn_output_data <- clean_q19c_fxn_output_pre(medsJune2023_clean)

# plot ----
pal <- c("10" = "#7ECD7A")
plot_q19c_fxn_output_pre(q19c_fxn_output_data)
```

## **Part 7: Environmental Modeling**

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 20a: Run Environmental Model  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q20a_run_env_mod_data <- clean_q20a_run_env_mod_pre(medsJune2023_clean)

# plot ----
plot_q20a_run_env_mod_pre(q20a_run_env_mod_data)
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 20b: Sensitivity Analysis  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q20b_sa_data <- clean_q20b_sa_pre(medsJune2023_clean)

# plot ----
plot_q20b_sa_pre(q20b_sa_data)
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 20c: Parameter Interactions  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q20c_param_int_data <- clean_q20c_param_int_pre(medsJune2023_clean)

# plot ----
pal <- c("a global sensitivity analysis" = "#7ECD7A", "a local sensitivity analysis" = "#047C91",
         "I'm not sure" = "#047C91", "No response" = "#047C91")
plot_q20c_param_int_pre(q20c_param_int_data)
```

## **Part 8: Geospatial Analysis & Remote Sensing**

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 21a: Comfort with Spatial Data  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# for calculating percentages in wrangling step below ----
q21a_num_answers <- medsJune2023_clean |>
  select(spatial_data) |>
  count() |>
  pull()

# wrangle ----
q21a_comfort_spatial_data <- clean_q21a_comfort_spatial_pre(medsJune2023_clean)

# plot ----
plot_q21a_comfort_spatial_pre(q21a_comfort_spatial_data)
```

{{< include /sections/class2023_after/q21-callout-inline-code.qmd >}}

```{r}
#| column: margin
#| fig-cap: "100% of respondents correctly answered question 21b (i.e. chose exactly the following options: raster, vector)"

##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 21b: Representing Spatial Data  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# for calculating percentages in wrangling step below ----
q21b_num_answers <- medsJune2023_clean |>
  select(rep_spatial_data) |>
  count() |>
  pull()

# wrangle (FULLY CORRECT; margin) ----
q21b_FULLY_CORRECT_data <- clean_q21b_FULLY_CORRECT_pre(medsJune2023_clean)

# plot (FULLY CORRECT; margin) ----
pal <- c("yes" = "#7ECD7A", "no" = "#047C91")
plot_q21b_FULLY_CORRECT_pre(q21b_FULLY_CORRECT_data)
```

```{r}
# wrangle (INDIV RESPONSES) ----
q21b_rep_spatial_data <- clean_q21b_rep_spatial_pre(medsJune2023_clean)

# plot (INDIV RESPONSES) ----
pal <- c("vector" = "#7ECD7A", "raster" = "#7ECD7A", "tabular" = "#047C91", "relational" = "#047C91")
plot_q21b_rep_spatial_pre(q21b_rep_spatial_data)
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 21c: Vector vs. Raster  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q21c_vec_ras_data <- clean_q21c_vec_ras_pre(medsJune2023_clean)

# plot ----
pal <- c("vector" = "#7ECD7A", "raster" = "#047C91", "I'm not sure" = "#047C91")
plot_q21c_vec_ras_pre(q21c_vec_ras_data)
```

```{r}
#| fig-align: center
knitr::include_graphics("images/21c-vector.png")
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 22a: Comfort with Remote Sensing  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q22a_comfort_rs_data <- clean_q22a_comfort_rs_pre(medsJune2023_clean)

# plot ----
plot_q22a_comfort_rs_pre(q22a_comfort_rs_data)
```

{{< include /sections/class2023_after/q22-callout-inline-code.qmd >}}

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 22b: Reflected Radiation  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q22b_rs_sun_data <- clean_q22b_rs_sun_pre(medsJune2023_clean)

# plot ----
pal <- c("passive" = "#7ECD7A", "active" = "#047C91", "I'm not sure" = "#047C91")
plot_q22b_rs_sun_pre(q22b_rs_sun_data)
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 23a: Map Projections  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# for calculating percentages in wrangling step below ----
q23a_num_answers <- medsJune2023_clean |>
  select(map_proj_comfort) |>
  count() |>
  pull()

# wrangle ----
q23a_comfort_map_proj_data <- clean_q23a_comfort_map_proj_pre(medsJune2023_clean)

# plot ----
plot_q23a_comfort_map_proj_pre(q23a_comfort_map_proj_data)
```

{{< include /sections/class2023_after/q23-callout-inline-code.qmd >}}

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 23b: Reprojection  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q23b_reproj_data <- clean_q23b_reproj_pre(medsJune2023_clean)

# plot ----
pal <- c("3D to 2D" = "#7ECD7A", "Imprecise locations to precise locations" = "#047C91",
         "Meters to latitude/longitude" = "#047C91", "No response" = "#047C91")
plot_q23b_reproj_pre(q23b_reproj_data)
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 24a: Familiarity with Reflectance Spectra  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q24a_familiarity_rs_data <- clean_q24a_familiarity_rs_pre(medsJune2023_clean)

# plot ----
plot_q24a_familiarity_rs_pre(q24a_familiarity_rs_data)
```

{{< include /sections/class2023_after/q24-callout-inline-code.qmd >}}

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 24b: Vegetation Wavelength  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q24b_veg_wave_data <- clean_q24b_veg_wave_pre(medsJune2023_clean)

# plot ----
pal <- c("green" = "#7ECD7A", "red" = "#047C91", "blue" = "#047C91", "blue and thermal" = "#047C91",
         "I'm not sure" = "#047C91", "No response" = "#047C91")
plot_q24b_veg_wave_pre(q24b_veg_wave_data)
```

## **Part 9: Machine Learning**

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 25a: Familiarity with ML  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q25a_familiar_ml_data <- clean_q25a_familiar_ml_pre(medsJune2023_clean)

# plot ----
plot_q25a_familiar_ml_pre(q25a_familiar_ml_data)
```

{{< include /sections/class2023_after/q25a-callout-inline-code.qmd >}}

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 25b: Unsupervised Learning Algorithm  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q25b_unsup_alg_data <- clean_q25b_unsup_alg_pre(medsJune2023_clean)

# plot ----
plot_q25b_unsup_alg_pre(q25b_unsup_alg_data)
```

{{< include /sections/class2023_after/q25b-callout-inline-code.qmd >}}

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 25c: Kmeans  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q25c_kmeans_data <- clean_q25c_kmeans_pre(medsJune2023_clean)

# plot ----
pal <- c("unsupervised, does not require expert labeling of data" = "#7ECD7A",
         "unsupervised, requires expert labeling of data" = "#047C91",
         "supervised, does not require expert labeling of data" = "#047C91",
         "supervised and requires expert labeling of data" = "#047C91",
         "I'm not sure" = "#047C91")
plot_q25c_kmeans_pre(q25c_kmeans_data)
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 26a: Dividing Data  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q26a_div_data <- clean_q26a_div_data_pre(medsJune2023_clean)

# plot ----
plot_q26a_div_data_pre(q26a_div_data)
```

{{< include /sections/class2023_after/q26a-callout-inline-code.qmd >}}

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 26b: Train, Validate, Split  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q26b_tvs_data <- clean_q26b_tvs_pre(medsJune2023_clean)

# plot ----
plot_q26b_tvs_pre(q26b_tvs_data)
```

{{< include /sections/class2023_after/q26b-callout-inline-code.qmd >}}

```{r}
#| column: margin
#| fig-cap: "75.9% of respondents correctly answered question 26c (i.e. chose exactly the following options: My model is likely to perform very well when applied to new data, My test set has data entry errors in it)"

##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 26c: Model Performance  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# for calculating percentages in wrangling step below ----
q26c_num_answers <- medsJune2023_clean |>
  select(learning_from_model) |>
  count() |>
  pull()

# wrangle (FULLY CORRECT) ----
q26c_FULLY_CORRECT_data <- clean_q26c_FULLY_CORRECT_pre(medsJune2023_clean)

# plot (FULLY CORRECT) ----
pal <- c("yes" = "#7ECD7A", "no" = "#047C91")
plot_q26c_FULLY_CORRECT_pre(q26c_FULLY_CORRECT_data)
```

```{r}
# wrangle (INDIV RESPONSES) ----
q26c_mod_perf_data <- clean_q26c_mod_perf_pre(medsJune2023_clean)

# plot (INDIV RESPONSES) ----
pal <- c("My model is unlikely to perform well when applied to new data" = "#7ECD7A",
         "My model is overfitting the training set" = "#7ECD7A",
         "My model is likely to perform very well when applied to new data" = "#047C91",
         "My test set has data entry errors in it" = "#047C91")
plot_q26c_mod_perf_pre(q26c_mod_perf_data)
```

## **Part 10: Environmental Justice**

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 27: Data Justice  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q27_data_justice_data <- clean_q27_data_justice_pre(medsJune2023_clean)

# plot ----
plot_q27_data_justice_pre(q27_data_justice_data)
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 28: Bias  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q28_bias_data <- clean_q28_bias_pre(medsJune2023_clean)

# plot ----
plot_q28_bias_pre(q28_bias_data)
```

## **Part 11: Data Viz & Communication**

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 29: Create Data Viz  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q29_create_viz_data <- clean_q29_create_viz_pre(medsJune2023_clean)

# plot ----
plot_q29_create_viz_pre(q29_create_viz_data)
```

#### Identify 4 areas for improvement in the following data visualization that shows information about Michigan counties with highest college attendance.

```{r}
#| fig-align: center
knitr::include_graphics("images/30-plot.png")
```

```{r}
#| fig-width: 3
#| fig-align: center
#| fig-cap: "Wordcloud of most frequently occurring words used to describe suggested improvements to the above data visualization (Question 30)" 
#| fig-cap-location: top

##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 30: Improve Data Viz  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q30_improve_dv <- clean_q30_improve_dv_pre(medsJune2023_clean)

# plot ----
plot_q30_improve_dv_pre(q30_improve_dv)
```

::: {.callout-note}
## Question 30 raw responses
Free responses as they were recorded are included in the table, below:

```{r}
# wrangle ----
q30_improve_dv <- medsJune2023_clean |>

  # select necessary cols ----
  select(improve_data_viz)

# create table ----
DT::datatable(q30_improve_dv, colnames = c("Free Response Answer to Q30"),

               options = list(autoWidth = TRUE,
                              pageLength = 5,
                              lengthMenu = c(5, 10, 20, 30),
                              dom = 'ltp')

              )
```
:::

## **Part 12: Programming 2**

```{r}
#| eval: false
#| echo: true
# define function
def convert_F_to_C(temp_F):
  temp_C = (temp_F-32)*5/9
  return temp_C

# use function
convert_F_to_C(32)
```

```{r}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##  ~ Question 31: What Language  ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# wrangle ----
q31_lang_data <- clean_q31_lang_pre(medsJune2023_clean)

# plot ----
pal <- c("Python" = "#7ECD7A", "R" = "#047C91", "SQL" = "#047C91")
plot_q31_lang_pre(q31_lang_data)
```

<br>

::: {.center-text}
***End MEDS Class of 2023 PLO Assessment Report***
:::

<br>

::: {.center-text}
*Return to [main page](index.html)*
:::