The code in this repository transforms the output of dqa_library into a format that dqa_shiny can read more easily. This reduces the computation required to display visualizations on the PEDSnet Data Quality Dashboard, and it generates threshold-based performance measures to feed back to individual institutions.
The code was developed on R version 4.2.0 (2022-04-22). To execute the R code in this repository, users will need to install the packages named in the top lines of driver.R.
The data is expected to be in the format of the output from the dqa_library step of the PEDSnet DQA process.
- Configure the PEDSnet standardized R framework code in run.R and site_info.R. This includes setting up the srcr .json config file so that a connection can be established to the database containing the DQA results.
- Edit run.R:
  - `config('results_schema', 'dqa_rox')`: change 'dqa_rox' to the name of the schema containing the DQA output.
  - `config('new_site_pp', FALSE)`: if running against all sites, set to `FALSE`. If there is already output for some sites and you want to add in another site, set to `TRUE`.
  - `config('results_schema_other', NA)`: if running against all sites, set to `NA`. If there is already output for some sites and you want to add in another site, set to the name of the results schema containing the output of dqa_library for the other site.
  - `config('results_name_tag', NA)`: if the suffix on the tables output from dqa_library is anything other than the default, change to the suffix on the tables. By default, set to `NA` to expect no suffix on the table names.
  - `config('current_version', 'vxx')`: change 'vxx' to the name of the current version of the data. It should match the name assigned in the DQA library step in the column database_version in the output.
  - `config('previous_version', 'vyy')`: change 'vyy' to the name of the previous version of the data. It should match the name assigned in the DQA library step in the column database_version in the output.
- Edit site_info.R:
  - `config('db_src')` should set up the connection to the database containing the schema with the data output from dqa_library, where you will also output the results from processing.
  - You will also need to establish a connection with the database containing output from `dqa_redcap`, specifically the table `dqa_issues_redcap`. `config('db_src_prev')` will do this for you, but you need to make sure the connection information is either in a file named `config_dqa_prev.json` or, if it is contained in a file with another name, that the environment variable `PEDSNET_DB_SRC_CONFIG_BASE_PREV` is set to the name of the file. This can be done either in the console or in your .Rprofile. It is assumed that the results from the previous DQ run are in a schema named `dqa_rox`. If this is not the case, edit the call that generates redcap_prev in driver.R.
- Either set `config('execution_mode', '')` to `production` in run.R, or set it to `development` and highlight and run the contents within `.run{}` in driver.R.
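
The settings described above can be collected into a short fragment. This is only a sketch: the `config()` calls belong in run.R and rely on the `config()` function supplied by the PEDSnet standard framework, the values shown are the placeholders from the list above, and the file name `my_prev_config` is hypothetical. Whether the environment variable should include the `.json` extension is not stated here, so check it against your framework setup.

```r
# Fragment for run.R; config() is provided by the PEDSnet standard framework.
config('results_schema', 'dqa_rox')      # schema containing the dqa_library output
config('new_site_pp', FALSE)             # FALSE: run against all sites
config('results_schema_other', NA)       # NA: no separate schema for an added site
config('results_name_tag', NA)           # NA: expect no suffix on table names
config('current_version', 'vxx')         # replace with the current database_version
config('previous_version', 'vyy')        # replace with the previous database_version
config('execution_mode', 'production')

# In the console or in .Rprofile, only if the previous-run connection file
# is not named config_dqa_prev.json. 'my_prev_config' is a hypothetical name.
Sys.setenv(PEDSNET_DB_SRC_CONFIG_BASE_PREV = 'my_prev_config')
```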
All of the processing steps execute through the driver.R file. This code:
- Establishes thresholds to apply to the DQ output based on standard PEDSnet thresholds or a site-specific threshold that has been established, if one exists.
- Generates and tracks a history of threshold values.
- Generates at least one table for each table output from dqa_library, containing a version of the data with post-processing steps applied, and outputs each table with the suffix `pp`.
  - The tables with a `pp` suffix are the ones accessed by the dashboard code.
- Generates a version of each of the `pp` tables in a format to which the thresholds can be applied, and outputs each table with the prefix `thr`.
- Applies the thresholds established in the first step to the `thr` tables generated in the previous step, and outputs the violations, excluding any previously flagged as issues to stop raising. The indicator for whether to continue or stop raising a consistent issue across cycles is pulled from the REDCap review form.
- Generates an anonymous site identifier for each site name in the `pp` tables and creates a column in each of the `pp` tables with the anonymous identifier.
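
The threshold logic in the list above can be illustrated with a minimal base R sketch. It is not the code in driver.R: every table, column name, and value below is hypothetical, and it assumes a violation means a value greater than the threshold. It shows a site-specific threshold overriding the standard one, followed by removal of violations flagged to stop raising.

```r
# All data below is hypothetical and exists only to illustrate the logic.
standard <- data.frame(
  check_name = c("check_a", "check_b"),
  threshold  = c(0.05, 0.10),
  stringsAsFactors = FALSE
)

# Site-specific thresholds exist for only some site/check pairs.
site_specific <- data.frame(
  site           = "site_1",
  check_name     = "check_a",
  site_threshold = 0.08,
  stringsAsFactors = FALSE
)

results <- data.frame(
  site       = c("site_1", "site_1", "site_2", "site_2"),
  check_name = c("check_a", "check_b", "check_a", "check_b"),
  value      = c(0.07, 0.12, 0.07, 0.02),
  stringsAsFactors = FALSE
)

# Issues a reviewer has marked as "stop raising" (hypothetical stand-in
# for the indicator pulled from the REDCap review form).
stop_flags <- data.frame(
  site         = "site_1",
  check_name   = "check_b",
  stop_raising = TRUE,
  stringsAsFactors = FALSE
)

# Attach the standard threshold, then the site-specific one where it exists.
with_thr <- merge(results, standard, by = "check_name", all.x = TRUE)
with_thr <- merge(with_thr, site_specific,
                  by = c("site", "check_name"), all.x = TRUE)

# Prefer the site-specific threshold when one has been established.
with_thr$applied_threshold <- ifelse(is.na(with_thr$site_threshold),
                                     with_thr$threshold,
                                     with_thr$site_threshold)

# Assumption: a violation is a value above the applied threshold.
with_thr$violation <- with_thr$value > with_thr$applied_threshold

# Drop violations that were previously flagged to stop raising.
with_thr <- merge(with_thr, stop_flags,
                  by = c("site", "check_name"), all.x = TRUE)
with_thr$stop_raising[is.na(with_thr$stop_raising)] <- FALSE
to_raise <- with_thr[with_thr$violation & !with_thr$stop_raising,
                     c("site", "check_name", "value", "applied_threshold")]

print(to_raise)
```

In this toy data, only the `site_2` / `check_a` row remains in `to_raise`: `site_1` passes `check_a` under its site-specific threshold, and its `check_b` violation is suppressed by the stop-raising flag.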