First stab at the final data evaluation function. This includes a tem… #385
base: main
Conversation
…plate for the final data expectations form which is used in the test function. It also includes an example in QMD format based on the housing affordability final data
See overarching feedback in a comment on the .qmd. Looks good, JP!
```r
# subgroups (logical): a true or false value indicating if the final file has subgroups
# confidence_intervals (logical): a true or false value indicating if the final file has confidence intervals
# Returns:
# a series of test results that will throw an error is failed
```
Suggested change:

```r
# a series of test results that will throw an error if failed
```
```r
select(X1:X5) %>%
  mutate(quality_title = ifelse(X5 == "Yes", paste0(X2, "_", "quality"), NA_character_),
         ci_low_title = ifelse(X4 == "Yes", paste0(X2, "_", "lb"), NA_character_),
         ci_high_title = ifelse(X4 == "Yes", paste0(X2, "_", "up"), NA_character_),
```
Suggested change:

```r
ci_high_title = ifelse(X4 == "Yes", paste0(X2, "_", "ub"), NA_character_),
```
```r
  )

#For final data with multiple values expand the form results
if(exp_form_variables %>%
```
Personal preference, but I either like to have a conditional test on a single line, or to separate it out and assign the result of the test to a relevantly named variable that I then pass to the `if` expression.
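As a sketch of that preference (the data and names below are hypothetical stand-ins, not taken from the function under review):

```r
# Hypothetical stand-in for the expectations-form data under review.
exp_form_variables <- data.frame(value_count = c(1, 3, 2))

# Assign the result of the test to a relevantly named variable...
has_multiple_values <- any(exp_form_variables$value_count > 1)

# ...then pass that single, readable condition to the if expression.
if (has_multiple_values) {
  message("Expanding the form results for multiple values")
}
```

This keeps the `if` line short and makes the intent of the condition self-documenting.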
Suggested change:

```r
if (exp_form_variables %>%
```
```r
geography = "county", subgroups = FALSE, confidence_intervals = TRUE) {

  #Read in the data expectation form
  exp_form <- read_csv(here::here(exp_form_path),
```
I'd suggest renaming all variables here. I don't like working with variables named `X1`, `X2`, etc. because I never know what they're supposed to represent, and so it's difficult to know if they're treated incorrectly at any point. Can we skip the first few lines of metadata in `read_csv()` so that these are auto-named, then either manually rename each or use `janitor::clean_names()` or something similar?
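A minimal sketch of that approach, assuming the form starts with a couple of metadata lines before the real header row (the file contents and column names below are invented for illustration):

```r
library(dplyr)
library(janitor)
library(readr)

# Write a tiny stand-in for the expectations form: two metadata lines,
# then the real header row and one data row (all contents are hypothetical).
form_path <- tempfile(fileext = ".csv")
writeLines(c("Final Data Expectations Form",
             "Metric: housing affordability",
             "Variable Name,Data Type,Has CI",
             "share_affordable,numeric,Yes"),
           form_path)

# skip = 2 drops the metadata lines so the header row supplies column names,
# and clean_names() converts them to snake_case instead of X1, X2, ...
exp_form <- read_csv(form_path, skip = 2, show_col_types = FALSE) %>%
  clean_names()

names(exp_form)  # "variable_name" "data_type" "has_ci"
```

The `skip` count would need to match the actual number of metadata lines in the real form.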
```
Third, the function asks whether there are subgroups in this data. For this example we input true.

Finally, the function asks whether there are confidence intervals in this data. For this example we input false. Now we can run our tests on the data.
```
Suggested change:

```
Finally, the function asks whether there are confidence intervals in this data.
```
```
It passed! Yay!

Note that there is also an argument for geography in the function which is by default set to "county". Because we are evaluation county level data we do not have to change this but if you are evaluation place data you will.
```
Suggested change: remove this sentence.
```r
evaluate_final_data(exp_form_path = "functions/testing/example_final_data_evaluation_form_housing.csv",
                    data = final_data,
```
Suggested change:

```r
data = final_data,
geography = "county",
```
```
## Final Data Test Function

The data team for the Mobility Metrics project has created a function that tests the final data being produced by each metric program. The function tests a series of baseline requirements for how the final data should be written out. It also looks at information from the final data expectations form which is filled out by a metric lead prior to updating a metric and approved by their technical reviewer.
```
Suggested change:

```
The data team for the Mobility Metrics project has created a function that tests baseline requirements for the structure and content of data produced for each metric. This function relies in part on information provided in a metric-specific final data expectations form, which should be filled out by a metric lead prior to having the metric approved by their technical reviewer.
```
Is there guidance for where this function will be applied for each metric? I'm imagining that at the end of the metric-specific Quarto document, after they've produced their final data frame of metric results, they'll pass that data frame into this testing function, but I'm curious if you have different thoughts about what makes sense.
Note that there is also a page in the Wiki now describing this function and how to apply it. Some outstanding questions below:

- For final data expectations: do we have them read in the form at the top of each metric program, or is it fine as an argument in the function?
- Where should the final data expectations forms live? Right now I'm encouraging the final data folder for the relevant metric.
- And where should the final data expectations template live? I have it in the functions/testing folder as of now.
- Where should the test function example live? I have it in the functions/testing folder right now.