Morrigan edited this page Sep 1, 2023 · 8 revisions

Welcome to the detect_pilot_test_1y wiki!

Repository Contents

  • analysis. This folder contains code files used for analysis (e.g., for reports and manuscripts).

    • nij_reports. This folder contains code files used to create reports for the National Institute of Justice.
      • analysis_demographics_nij_report.Rmd. This file contains the code and description of the process to generate initial demographic data for the MedStar ePCR data set.
    • paper_01_evaluation. This folder contains code files used to complete the analyses for Evaluation of the Detection of Elder Mistreatment Through Emergency Care Technicians Project Screening Tool.
      • analysis_demographics_evaluation_manuscript.Rmd. This file contains the code and description of the process to generate descriptive statistics of the APS data set. This included the number of unique subjects and mean age.
      • figs_aps_change_in_reporting.Rmd. This file contains the code and description of the process to generate figures illustrating the change in reporting to APS over time. It outputs "fig_aps_ddd_change_in_reporting.eps" and "fig_aps_ddd_change_in_reporting.png".
      • table_medstar_epcr_patient_symptoms.Rmd. This file contains the code and description of the process to generate the symptom table from MedStar ePCR data. Outputs "fig_complain_symptom_heatmap.png" and "fig_relative_symptoms_heatmap.png".
    • paper_02_1year. This folder contains code files used to complete the analyses for the 1-year pilot paper.
      • analysis_medstar_aps_01_consort_table.qmd. This file contains the code and description of the process that generated the data used in creating CONSORT tables for the study-related outcomes of each MedStar Response.
      • analysis_medstar_aps_02_detect_response_patterns.qmd. This file contains the code and description of the process that generated the data used in creating response pattern frequency tables for each of the DETECT screening tool items. Includes the examination of screening performance, positive screening occurrence, specific DETECT item completion, and positive DETECT item completion ("YES" selection) over the study months.
      • analysis_medstar_aps_03_fidelity_agreement.qmd. This file contains the code and description of the process that generated the data used to calculate fidelity to reporting procedures and DETECT item completion. This included kappa agreement for three pairings: Intent to Report and Matching Reports, DETECT Screening Outcome and APS Determination, and DETECT Screening Outcome and Matching Reports. It also included an exploration of medic use of the "APS Report Number" field as a comment box.
      • analysis_medstar_aps_04_demographics_analysis.qmd. This file contains the code and description of the process that evaluated demographics for all subjects, including manual cleaning to assign a uniform designation for demographic data in the case of subjects with multiple values (e.g., one subject listed as "Male" in an encounter, and "Female" in another encounter).
  • data. This folder contains data files (e.g., csv, Rds).

    • Study data should never be pushed to GitHub.

    • aps_data.xlsx. The data provided by Texas APS, in Excel format. Contains APS Investigation and determination data for Dallas, Tarrant, and Johnson Counties.

    • medstar_epcr.xlsx. The data provided by MedStar, in Excel format. Contains ePCR records, including DETECT screening tool variable data.

    • medstar_compliance.xlsx. The data provided by MedStar, in Excel format. Contains Legal Compliance Department data on APS reports made by medics.

    • aps_cleaning. This folder contains APS data files, and files utilized in cleaning the APS data.

      • aps_01.rds. Created from "aps_data.xlsx" in "aps_cleaning_01_initial_clean.qmd". Variable names are reformatted, and initial cleaning and wrangling of the data is completed.
      • aps_01_phi_cleaning.csv. Utilized to point-clean observations when cleaning would otherwise require PHI in code. Used in "aps_cleaning_01_initial_clean.qmd" as part of creating "aps_01.rds".
      • aps_02.rds. Created from "aps_01.rds" in "aps_cleaning_02_determination_vars.qmd." Subject-level data is cleaned based on feedback from APS clarifying APS determination variables.
      • aps_03.rds. Created from "aps_02.rds" in "merge_aps_medstar_02_group_ids.qmd". Subject-IDs are revised based on "aps_id_revision_pattern_01.rds", as there were subjects that appeared to have more than one APS Person ID (confirmed by APS). Cross-set Subject IDs were added after fastLink record linkage.
      • aps_04.rds. Created from "aps_03.rds" in "aps_cleaning_03_revisions_after_merge_map.qmd". Subject-level data is corrected based on the merge-map data created in "merge_aps_medstar_02_group_ids.qmd".
      • aps_05.rds. Created from "aps_04.rds" in "merge_aps_medstar_03_refining_observations.qmd". Direct matched pairs are added.
      • aps_06.rds. Created from "aps_05.rds" in "merge_aps_medstar_04_creating_merge_datasets.qmd."
      • aps_data_isolation_01.qmd. This file was utilized to generate an Excel file for APS, for use in clarifying data regarding determinations.
      • aps_id_replacement_map.rds. This file contains the table mapping replacement APS Person IDs when more than one Person ID was found to refer to the same subject. Initiated in "aps_cleaning_01_initial_clean.qmd". It is replaced by "aps_id_revision_pattern_01.rds", created in "merge_aps_medstar_02_group_ids.qmd". Columns are reorganized to facilitate analyses.
    • medstar_cleaning. This folder contains MedStar data files, and files utilized in cleaning the MedStar data.

      • medstar_epcr_01.rds. Created from "medstar_epcr.xlsx" and "medstar_epcr_01_phi_cleaning.csv" in "medstar_epcr_cleaning_01_initial_clean.qmd". Initial cleaning, organization, and deduplication of the MedStar ePCR records.
      • medstar_epcr_01_phi_cleaning.csv. Utilized to point-clean observations when cleaning would otherwise require PHI in code. Used in "medstar_epcr_cleaning_01_initial_clean.qmd" as part of creating "medstar_epcr_01.rds".
      • medstar_epcr_02_fastlinkobj.rds. The fastLink object created from the within-set record linkage in "medstar_epcr_cleaning_02_fastLink.qmd". Utilized to create unique subject IDs within the MedStar ePCR data set.
      • medstar_epcr_02_id_initial.rds. The version of the MedStar ePCR data set with the initial Unique Subject IDs created from the within-set record linkage in "medstar_epcr_cleaning_02_fastLink.qmd". Contains record duplication errors introduced in the assignment process.
      • medstar_epcr_02_stacked_pairs.rds. The stacked pairs created from the within-set record linkage in "medstar_epcr_cleaning_02_fastLink.qmd". Utilized to create unique subject IDs within the MedStar ePCR data set.
      • medstar_epcr_03_cleaned_ids.rds. The cleaned MedStar ePCR data containing a Unique Subject ID for each subject, cleansed of duplication after manual cleaning in "medstar_epcr_cleaning_03_unique_ids.qmd."
      • medstar_compliance_01_cleaned.rds. The originally cleansed and wrangled MedStar Compliance data, created from "medstar_compliance.xlsx" in "medstar_compliance_01_cleaning.qmd."
      • medstar_01.rds. Created from "medstar_epcr_03_cleaned_ids.rds" and "medstar_compliance_01_cleaned.rds" in "medstar_merging_epcr_compliance.qmd." Contains the merged MedStar ePCR and Compliance data sets in one file.
      • medstar_02.rds. Created from "medstar_01.rds" in "merge_aps_medstar_02_group_ids.qmd." Manual verification and adjustment of cross-set unique subject IDs was performed.
      • medstar_03.rds. Created from "medstar_02.rds" in "merge_aps_medstar_03_refining_observations.qmd". Direct-pair data was added.
      • medstar_04.rds. Created from "medstar_03.rds" in "merge_aps_medstar_04_creating_merge_datasets.qmd". Subject-level data was added, and variables were reorganized.
    • merge_aps_medstar. This folder contains data files created in the process of merging the APS and MedStar data sets.

      • aps_id_revision_pattern_01.rds. Map of revisions for APS Person ID created in "merge_aps_medstar_02_group_ids.qmd".
      • medstar_id_revision_pattern_01.rds. Map of revisions for the MedStar ID created in "merge_aps_medstar_02_group_ids.qmd".
      • aps_medstar_full_subjid_map_01.rds. Created in "merge_aps_medstar_02_group_ids.qmd". Map indicating the direct linkage between the Cross-set unique subject ID, and the unique subject ID present in the MedStar and APS data sets.
      • medstar_aps_full_row_map_01.rds. Created from "medstar_02.rds", "aps_04.rds", and "aps_medstar_full_subjid_map_01.rds" in "merge_aps_medstar_03_refining_observations.qmd." Map of the direct-pairs of MedStar Responses to APS Intakes.
      • medstar_aps_merged_01_timeline_all_rows.rds. Created from "medstar_04.rds" and "aps_06.rds" in "merge_aps_medstar_04_creating_merge_datasets.qmd." Contains all observations in both data sets, facilitating a chronological view of all data points by subject.
      • medstar_aps_merged_02_response_based_row_pairs.rds. Created from "medstar_04.rds" and "aps_06.rds" in "merge_aps_medstar_04_creating_merge_datasets.qmd." Contains all MedStar response observations, with APS data contained in direct-paired APS Intakes.
      • medstar_aps_merged_03_single_subject_per_row.rds. Created from "medstar_04.rds" and "aps_06.rds" in "merge_aps_medstar_04_creating_merge_datasets.qmd." Contains subject-level aggregate data across both datasets, where each row references a single subject.
      • medstar_aps_merged_04_temporal_case_nums.rds. Created from "medstar_04.rds" and "aps_06.rds" in "merge_aps_medstar_04_creating_merge_datasets.qmd." Contains all MedStar response observations, with APS data contained in temporally-linked APS Case Numbers.
      • merge_aps_medstar_01_fastlinkobj.rds. Created from "medstar_01.rds" and "aps_02.rds" in "merge_aps_medstar_01_fastLink.qmd". It is the fastLink object created from the cross-set subject linkage.
      • merge_aps_medstar_01_stacked_pairs.rds. Created from "medstar_01.rds" and "aps_02.rds" in "merge_aps_medstar_01_fastLink.qmd". It is the stacked pairs created from the cross-set subject linkage.
      • merge_aps_medstar_00_demographics_by_subject.rds. Created from "medstar_aps_merged_04_temporal_case_nums.rds" in "analysis_medstar_aps_04_demographics_analysis.qmd"
  • data_management. This folder contains code files used to import, clean, and transform data.

    • fastlink_benchmark_medstar_epcr.qmd. This file contains the code and description of the exploration of limitations in the use of the fastLink package for record linkage, primarily CPU/RAM limitations and the number of variables. It utilized "medstar_epcr_01.rds" in testing.

    • aps. This folder contains code files used in cleaning and processing the APS data set.

      • aps_cleaning_01_initial_clean.qmd. This file contains the code and description of the process used to initially clean and wrangle the APS data. It utilizes "aps_data.xlsx" and "aps_01_phi_cleaning.csv". It outputs "aps_01.rds" and "aps_id_replacement_map.rds". It uses the functions get_unique_value_summary, get_cases_from_person, get_person_from_cases, and get_all_cases_persons.
      • aps_cleaning_02_determination_vars.qmd. This file contains the code and description of the process utilized to further clean and wrangle the APS data after receipt of clarifying information regarding APS determination data variables. It utilizes "aps_01.rds". It outputs "aps_02.rds". It uses the functions get_unique_value_summary and redefine_aps_determinations.
      • aps_cleaning_03_revisions_after_merge_map.qmd. This file contains the code and description of the process for revising subject-level data after the subject ID revisions made during the MedStar-APS merge. It utilizes "aps_03.rds". It outputs "aps_04.rds". It uses the functions get_unique_value_summary and redefine_determinations_modified.
      • aps_codebookr_01.qmd. This file contains the code and process for creating the codebook "aps_codebook_01.docx" for the data file "aps_01.rds"
      • aps_codebookr_06.qmd. This file contains the code and process for creating the codebook "aps_codebook_06.docx" for the data file "aps_06.rds"
    • medstar. This folder contains code files used in cleaning and processing the MedStar data sets

      • medstar_epcr_cleaning_01_initial_clean.qmd. This file contains the code and description of the process used to initially clean and wrangle the MedStar ePCR data. It utilizes "medstar_epcr.xlsx" and "medstar_epcr_01_phi_cleaning.csv". It outputs "medstar_epcr_01.rds". It uses the function get_unique_value_summary.
      • medstar_epcr_cleaning_02_fastLink.qmd. This file contains the code and description of the process to generate a unique subject ID within the MedStar ePCR data set using fastLink. It includes the exploration of variable selection for record linkage to establish a productive posterior probability range for manual verification and creation of initial ID assignments. It utilizes "medstar_epcr_01.rds". It outputs "medstar_epcr_02_fastlinkobj.rds", "medstar_epcr_02_stacked_pairs.rds", and "medstar_epcr_02_id_initial.rds". It uses the functions get_unique_value_summary, fmr_fastlink_stack_matches, and fmr_add_unique_id.
      • medstar_epcr_cleaning_03_unique_ids.qmd. This file contains the code and description of the process to manually verify and clean unique subject ID assignments in the MedStar ePCR data set. It utilizes "medstar_epcr_02_fastlinkobj.rds", "medstar_epcr_02_stacked_pairs.rds", and "medstar_epcr_02_id_initial.rds". It outputs "medstar_epcr_03_cleaned_ids.rds". It uses the functions get_unique_value_summary, fmr_fastlink_stack_matches, fmr_add_unique_id, and add_replacement_rows.
      • medstar_compliance_01_cleaning.qmd. This file contains the code and description of the process to initially clean and wrangle the MedStar Compliance data. It utilizes "medstar_compliance.xlsx". It outputs "medstar_compliance_01_cleaned.rds". It uses the function get_unique_value_summary.
      • medstar_merging_epcr_compliance.qmd. This file contains the code and description of the process to merge the MedStar ePCR and Compliance data sets, which was highly limited due to a lack of identifiers in the Compliance data. It utilizes "medstar_epcr_03_cleaned_ids.rds" and "medstar_compliance_01_cleaned.rds". It outputs "medstar_01.rds".
      • medstar_codebookr_01.qmd. This file contains the code and process for creating the codebook "medstar_codebook_01.docx" for the data file "medstar_01.rds"
      • medstar_codebookr_04.qmd. This file contains the code and process for creating the codebook "medstar_codebook_04.docx" for the data file "medstar_04.rds"
    • merge_aps_medstar. This folder contains code files used in the process of merging the APS and MedStar data sets.

      • merge_aps_medstar_01_fastLink.qmd. This file contains the code used to generate initial cross-set IDs linking the MedStar and APS Data sets using fastLink, and description of the process. It includes the exploration of variable selection for record linkage to establish a productive posterior probability range for manual verification and creation of initial ID assignments. It utilizes "medstar_01.rds" and "aps_02.rds". It outputs "merge_aps_medstar_01_fastlinkobj.rds", "merge_aps_medstar_01_stacked_pairs.rds", and "merge_aps_medstar_01_id_initial.rds". It uses the functions get_unique_value_summary, get_pair_data, and stack_ids.
      • merge_aps_medstar_02_group_ids.qmd. This file contains the code and description of the process for refining initial ID assignments, including manual verification and adjustment of assignments. It utilizes "merge_aps_medstar_01_id_initial.rds", "medstar_01.rds", "aps_02.rds", and "aps_id_replacement_map.rds". It outputs "aps_id_revision_pattern_01.rds", "medstar_id_revision_pattern_01.rds", "aps_medstar_full_subjid_map_01.rds", "medstar_02.rds", and "aps_03.rds". It uses the functions get_unique_value_summary, get_stacked_pairs, get_subjid_from_groups, get_groups_from_subjids, get_all_groups_subjids, search_by_group, get_subjid_from_row, build_source_map, and add_replacement_rows.
      • merge_aps_medstar_03_refining_observations.qmd. This file contains the code and description of the process for creating direct-matched pairs between MedStar responses and APS Intakes. It utilizes "medstar_02.rds", "aps_04.rds", and "aps_medstar_full_subjid_map_01.rds". It outputs "medstar_aps_full_row_map_01.rds", "medstar_03.rds", and "aps_05.rds". It uses the functions get_unique_value_summary and point_map_aps.
      • merge_aps_medstar_04_creating_merge_datasets.qmd. This file contains the code and description of the process for creating the merged datasets from the MedStar and APS data sets. It utilizes "medstar_03.rds", "aps_05.rds", and "medstar_aps_full_row_map_01.rds". It outputs "medstar_04.rds", "aps_06.rds", "medstar_aps_merged_01_timeline_all_rows.rds", "medstar_aps_merged_02_response_based_row_pairs.rds", "medstar_aps_merged_03_single_subject_per_row.rds", and "medstar_aps_merged_04_temporal_case_nums.rds".
      • medstar_aps_merged_codebookr_01_timeline_all_rows.qmd. This file contains the code and process for creating the codebook "medstar_aps_merged_codebook_01_timeline_all_rows.docx" for the data file "medstar_aps_merged_01_timeline_all_rows.rds"
      • medstar_aps_merged_codebookr_02_response_based_row_pairs.qmd. This file contains the code and process for creating the codebook "medstar_aps_merged_codebook_02_response_based_row_pairs.docx" for the data file "medstar_aps_merged_02_response_based_row_pairs.rds".
      • medstar_aps_merged_codebookr_03_single_subject_per_row.qmd. This file contains the code and process for creating the codebook "medstar_aps_merged_codebook_03_single_subject_per_row.docx" for the data file "medstar_aps_merged_03_single_subject_per_row.rds".
      • medstar_aps_merged_codebookr_04_temporal_case_nums.qmd. This file contains the code and process for creating the codebook "medstar_aps_merged_codebook_04_temporal_case_nums.docx" for the data file "medstar_aps_merged_04_temporal_case_nums.rds"
    • redcap_processing. This folder contains the code files used to process the original XLSX data files for upload into REDCap utilizing API calls, including the generation of the required DataDictionary.

    • paper_01_evaluation. This folder contains code files used to process the data for Evaluation of the Detection of Elder Mistreatment Through Emergency Care Technicians Project Screening Tool.

  • docs. This folder contains Word, PDF, and other documents that aren't direct inputs to, or outputs of, any data management or analysis code, but they do provide context or other useful information.

    • codebooks. This folder contains the codebook(s) for the DETECT 1-year pilot study data frames. The names of the codebook files should match the names of the data files, with "codebook" inserted between the prefix and numeral in the file name (e.g., the codebook for the data set "data_01_testing.rds" should be "data_codebook_01_testing.docx", created with "data_codebookr_01_testing.qmd"). Files containing the code should be in data_management or the other most appropriate code-based folder.
      • aps_codebook_01.docx
      • aps_codebook_06.docx
      • medstar_codebook_01.docx
      • medstar_codebook_04.docx
      • medstar_aps_merged_codebook_01_timeline_all_rows.docx
      • medstar_aps_merged_codebook_02_response_based_row_pairs.docx
      • medstar_aps_merged_codebook_03_single_subject_per_row.docx
      • medstar_aps_merged_codebook_04_temporal_case_nums.docx
    • Incident Complaint Table - Detect - Google Docs.webloc. This XML file contains a reference to a Google Document. This document contains the table of the frequency of incident complaints.
    • Matching Rules.xlsx. This Excel file contains a description of the original rules utilized to manually filter potentially matching subjects.
    • Merge APS and MedStar 1-Year Data.pptx. This PowerPoint file, used to orient new team members, gives an overview of the project prior to completion of the data merge (2023-03-06).
    • Results of ITS Analysis.xlsx. This Excel file gives an overview of the results of ITS analysis, including ARIMA models (crude and adjusted for other reports) and sensitivity analysis using Poisson regression with robust SEs (model, crude, and adjusted for other reports). Used in Evaluation of the Detection of Elder Mistreatment Through Emergency Care Technicians Project Screening Tool.
  • exploratory. This folder contains code for minor one-off and exploratory analyses.

    • analysis_follow_up_flow_chart_estimates.Rmd. This file contains the code and description of the exploration of follow-up interview data, such as estimating the number of screenings that would be conducted each week. Uses "custom-css.css" in rendering to HTML.
    • custom-css.css. This file contains the CSS code used to render "analysis_follow_up_flow_chart_estimates.Rmd" in HTML.
  • img. This folder contains image files.

  • r. This folder contains R scripts. Typically, R scripts are only used for writing custom functions.

    • fmr_add_match_id.R. This code adds a sequential numeric ID number to a data frame based on fastLink record matches.
    • fmr_add_pair_num.R. This code adds pair numbers to stacked pairs of matched records, essentially sequentially numbering every two rows in a data frame as a part of fuzzy matching.
    • fmr_block_data.R. This code assists in subsetting data frames into blocks for fuzzy matching.
    • fmr_stack_fastlink_matches.R. This code is meant to stack fastLink matches for ease of comparison of the variable values used in fuzzy matching.
    • incident_complaint_table_code.R. This code uses the feather package to create a table of incident complaints.
    • theme_bfuncs.R. This code generates themes utilized to control non-data display with ggplot2.
  • sas. This folder contains SAS code files. We may end up wanting to integrate these into data_management and analysis.
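
The codebook naming rule described under docs/codebooks can be written down as a small helper. The sketch below is illustrative only: it is in Python rather than the project's R, the function name `codebook_names` is hypothetical, and it assumes every data file name has the form `[prefix]_[two-digit number][_optional suffix].rds`.

```python
import re

# Hypothetical helper, not part of the repository.
# Assumes data files look like "<prefix>_<two digits>[_<suffix>].rds".
_DATA_FILE = re.compile(r"^(?P<prefix>.+?)_(?P<rest>\d{2}(?:_.+)?)\.rds$")


def codebook_names(data_file: str) -> tuple[str, str]:
    """Return (codebook .docx name, codebookr .qmd name) for a data file."""
    match = _DATA_FILE.match(data_file)
    if match is None:
        raise ValueError(f"{data_file!r} does not follow the assumed pattern")
    prefix, rest = match.group("prefix"), match.group("rest")
    # "codebook" (or "codebookr" for the code file) goes between the
    # prefix and the numeral, as the naming rule states.
    return (
        f"{prefix}_codebook_{rest}.docx",
        f"{prefix}_codebookr_{rest}.qmd",
    )


if __name__ == "__main__":
    print(codebook_names("data_01_testing.rds"))
```

Applied to the example in the rule, "data_01_testing.rds" maps to "data_codebook_01_testing.docx" and "data_codebookr_01_testing.qmd".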

File naming conventions

  • Separate data cleaning and data analysis into separate Rmd files.

    • Data cleaning files should be named:
      • data_[order number]_[purpose]
      • Example: data_03_prep_for_sna
    • Analysis files that do not directly create a table or figure should be named:
      • analysis_[order number]_[brief summary of content]
      • Example: analysis_01_exploratory
    • Analysis files that DO directly create a table or figure should be named:
      • table_[brief summary of content] or
      • fig_[brief summary of content]
      • Example: table_network_characteristics
  • Images should be PNG files, saved to the img folder, and given a descriptive name.

  • Word and PDF files should be saved to the docs folder and given a descriptive name.

  • RDS, RData, CSV, Excel, etc. files should be saved to the data folder and given a descriptive name.
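
The naming rules for code files above could be checked mechanically. The following is a minimal sketch, assuming the order number is always two digits (as in the examples "03" and "01") and that names use lowercase letters, digits, and underscores. It is written in Python rather than the project's R, and `classify_file_stem` is a hypothetical name, not a function in the repository.

```python
import re
from typing import Optional

# Patterns encode the naming rules as written above. The two-digit order
# number and the lowercase/underscore character set are assumptions.
_RULES = [
    ("data cleaning", re.compile(r"^data_\d{2}_[a-z0-9_]+$")),
    ("analysis", re.compile(r"^analysis_\d{2}_[a-z0-9_]+$")),
    ("table", re.compile(r"^table_[a-z0-9_]+$")),
    ("figure", re.compile(r"^fig_[a-z0-9_]+$")),
]


def classify_file_stem(stem: str) -> Optional[str]:
    """Return the kind of file implied by its name (without extension),
    or None if the name matches none of the conventions."""
    for kind, pattern in _RULES:
        if pattern.match(stem):
            return kind
    return None


if __name__ == "__main__":
    for stem in ("data_03_prep_for_sna", "table_network_characteristics"):
        print(stem, "->", classify_file_stem(stem))
```

A check like this only enforces the rules as stated here. For example, a stem beginning with "figs_" rather than "fig_" is not recognized and returns None.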
