*Add ds_* use case #25

Open
meerapatelmd opened this issue Jan 12, 2021 · 1 comment
Use Case

The most commonly seen medications with a non-null administration_dose field
are first derived from the Drug Exposures table.

WITH ct AS (
  SELECT de.drug_concept_id, COUNT(de.drug_concept_id) AS drug_concept_count
  FROM omop_cdm_1.drug_exposure de
  WHERE de.administration_dose IS NOT NULL
  GROUP BY de.drug_concept_id
)
SELECT ct.drug_concept_count, c.*
FROM ct
LEFT JOIN omop_vocabulary.concept c
ON c.concept_id = ct.drug_concept_id
-- Sort the final result: an ORDER BY inside the CTE is not guaranteed to survive the join
ORDER BY ct.drug_concept_count DESC
;

# Same query run from R; the ORDER BY sits on the outer SELECT so that
# the returned rows are sorted by descending count.
drug_concept_count <-
pg13::query(
  conn = conn,
  sql_statement =
  "
  WITH ct AS (
    SELECT de.drug_concept_id, COUNT(de.drug_concept_id) AS drug_concept_count
    FROM omop_cdm_1.drug_exposure de
    WHERE de.administration_dose IS NOT NULL
    GROUP BY de.drug_concept_id
  )
  SELECT ct.drug_concept_count, c.*
  FROM ct
  LEFT JOIN omop_vocabulary.concept c
  ON c.concept_id = ct.drug_concept_id
  ORDER BY ct.drug_concept_count DESC
  ;
  "
)
drug_concept_count

For testing, the 10 most frequently seen drugs in the Drug Exposure table are
selected from the count-sorted result.

drug_concept <-
  drug_concept_count %>%
  slice(1:10)
drug_concept

The top 10 drugs are joined back with the Drug Exposures table to retrieve the
administration_dose, administration_unit, and frequency_concept_id fields.

SELECT DISTINCT
  a.*, b.administration_dose, b.administration_unit, b.frequency_concept_id
FROM @drug_concept a
LEFT JOIN omop_cdm_1.drug_exposure b
ON a.concept_id = b.drug_concept_id;

drug_record <-
pg13::join1(
  conn = conn,
  write_schema = "patelm9",
  data = drug_concept,
  column = "concept_id",
  # frequency_concept_id is included to match the SQL above
  select_join_on_fields = c("administration_dose",
                            "administration_unit",
                            "frequency_concept_id"),
  join_on_schema = "omop_cdm_1",
  join_on_table = "drug_exposure",
  join_on_column = "drug_concept_id",
  distinct = TRUE
)
drug_record

For easier visualization, the concept attributes are merged into a single drug
string, and the concept_id field is renamed drug_id.

drug_record2 <-
  drug_record %>%
  chariot::merge_strip(into = "drug")
drug_record2

This dataset is then joined to the Drug Strength Staged table to get the staged
value and unit fields for each drug.

SELECT a.*, b.ingredient_concept_id,b.value,b.unit 
FROM @drug_record2 a 
LEFT JOIN patelm9.drug_strength_staged b 
ON a.drug_id = b.drug_concept_id;

drug_strength_record <-
  pg13::join1(
    conn = conn, 
    write_schema = "patelm9",
    data = drug_record2,
    column = "drug_id",
    select_join_on_fields = c("ingredient_concept_id",
                              "value",
                              "unit"),
    join_on_schema = "patelm9",
    join_on_table = "drug_strength_staged",
    join_on_column = "drug_concept_id"
  ) 

The resulting table tells the story of the drug exposure for a given record: the
dose of the drug at each administration, the units of administration, the corresponding
ingredient_concept_id from the Drug Strength table, and the staged value and unit
corresponding to the amount of the ingredient in 1 unit of the drug.

drug_strength_record %>%
  select(drug_id, administration_dose, administration_unit, ingredient_concept_id, value, unit)

The value field must be evaluated to obtain a numeric value, and doing so row by
row would mean looping over almost 40,000 rows. Instead, each unique value is
isolated, resulting in 9 rows. These 9 values are mapped to their corresponding
numeric values.

values <-
  drug_strength_record %>%
  select(value) %>%
  distinct()
values
values$numeric_value <- sapply(values$value, function(x) eval(rlang::parse_expr(x)))
values

The resulting dataset is joined back with the original data.

drug_strength_record2 <- 
drug_strength_record %>%
  left_join(values, by = "value")
drug_strength_record2
fantasia::dcOMOP(conn = conn)
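
The de-duplicate, evaluate, and join-back pattern used above can be illustrated on toy data. This is only a minimal sketch in base R: the `records` data frame and its expression strings are hypothetical stand-ins for drug_strength_record, and `eval(parse(text = ...))` is swapped in for `rlang::parse_expr()` so that the example has no package dependencies.

```r
# Toy stand-in for drug_strength_record; the expression strings are hypothetical.
records <- data.frame(
  drug_id = c(1, 1, 2, 3, 3),
  value   = c("250/5", "250/5", "500", "10/1", "10/1"),
  stringsAsFactors = FALSE
)

# Evaluate each distinct expression once instead of once per row.
values <- unique(records["value"])
values$numeric_value <- vapply(
  values$value,
  function(x) eval(parse(text = x)),
  numeric(1),
  USE.NAMES = FALSE
)

# Join the evaluated values back onto every record.
records2 <- merge(records, values, by = "value", all.x = TRUE, sort = FALSE)
records2
```

Because the strings are evaluated as R code, this approach is only appropriate when the value column comes from a trusted source.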

Themes

Themes can be viewed at: https://bootswatch.com/3/.

Syntax Highlighting

Syntax Highlighting Styles can be viewed at https://www.garrickadenbuie.com/blog/pandoc-syntax-highlighting-examples/.

Dataframe

Dataframe printing options include default, kable, tibble, or paged.

For paged dataframes, the chunk options include:

  • max.print: the number of rows to print
  • rows.print: the number of rows to display
  • cols.print: the number of columns to print
  • cols.min.print: the minimum number of columns to display
  • pages.print: the number of pages to display under page navigation
  • paged.print: when set to FALSE turns off paged display for the chunk
  • rownames.print: when set to FALSE turns off row names for the chunk

Figure Captions

library(tidyverse)
mpg %>%
  ggplot( aes(x=reorder(class, hwy), y=hwy, fill=class)) + 
    geom_boxplot() +
    xlab("class") +
    theme(legend.position="none")

meerapatelmd commented Jan 12, 2021

Overview

The proposed additions to the Drug Exposure table are related to the
drug administration and frequency from the source data.

The drug administration attributes administration_dose and
administration_unit are designed to provide standardized and verified values
sourced from the quantity and dose_unit_source_value fields in the Drug
Exposure table, respectively. For solid formulations, the amount would be a
mass such as 'grams', while liquid preparations would be measured by
volume such as 'milliliters'. These fields are intended to be used alongside
the Drug Strength table to calculate the total mass of the RxNorm Ingredient
in a given administration, regardless of the original formulation. When this
calculation is combined with the daily frequency given by the
frequency_concept_id, the total active ingredient administered can be returned
as a rate per day or as an aggregate spanning the timeframe of the drug exposure record.

Solid formulations taken orally have a straightforward conversion because the
active ingredient mass is a multiple of the number of tablets that were
administered. Therefore, the quantity field suffices in providing this
information.


$$ \text{quantity}_\text{de} * \text{dose_unit_source_value}_\text{de} * \text{amount_value}_\text{ds} * \text{amount_unit}_\text{ds} = \frac{\text{total active ingredient mass}}{\text{1 administration}} $$
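
As a rough illustration of the solid-formulation case, the calculation can be sketched in R. All numbers below are hypothetical (2 tablets, 500 mg of ingredient per tablet) and are not taken from the data; the variable names simply mirror the fields in the equation.

```r
# Hypothetical record: 2 tablets administered, 500 mg of ingredient per tablet.
quantity     <- 2      # drug_exposure.quantity (number of tablets)
amount_value <- 500    # drug_strength.amount_value (ingredient per tablet)
amount_unit  <- "mg"   # drug_strength.amount_unit

# Ingredient mass delivered in one administration.
mass_per_administration <- quantity * amount_value
cat(mass_per_administration, amount_unit, "per administration\n")
```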

However, for all other formulations, such as liquids reported as
concentrations (e.g. milligrams per milliliter), the volume
administered, recorded in the quantity and dose_unit_source_value fields,
requires additional conversion.


$$ \text{quantity}_\text{de} * \text{dose_unit_source_value}_\text{de} * \frac{\text{numerator_value}_\text{ds}}{\text{denominator_value}_\text{ds}} * \frac{\text{numerator_unit}_\text{ds}}{\text{denominator_unit}_\text{ds}} = \frac{\text{total active ingredient mass}}{\text{1 administration}} $$
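
The concentration case can be sketched the same way. The figures are hypothetical (a 250 mg per 5 mL liquid, with 10 mL administered), and the sketch assumes that the administered volume and the denominator are already expressed in the same unit.

```r
# Hypothetical record: 10 mL administered of a 250 mg / 5 mL liquid.
quantity          <- 10    # drug_exposure.quantity (volume administered, mL)
numerator_value   <- 250   # drug_strength.numerator_value (mg)
denominator_value <- 5     # drug_strength.denominator_value (mL)

# Concentration in mg per mL, then ingredient mass in one administration.
concentration           <- numerator_value / denominator_value
mass_per_administration <- quantity * concentration
cat(mass_per_administration, "mg per administration\n")
```

If the volume were recorded in a different unit than the denominator, a unit conversion would be needed before multiplying.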

In parallel, the frequency of drug administration is also carried over
from the source data and standardized to a concept id as frequency_concept_id.
The frequency_concept_id is used to normalize the amount of active ingredient
administered to a rate per day.


$$ \frac{\text{total active ingredient mass}}{\text{1 administration}} * \frac{\text{x administrations}_\text{de*}}{\text{day}} = \frac{\text{x * total active ingredient mass}}{\text{day}} $$
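
A small sketch of the per-day normalization follows. The number of administrations per day is assumed to have already been derived from the frequency_concept_id (for example, a "three times daily" frequency giving 3); that derivation and all values here are hypothetical.

```r
# Hypothetical inputs.
mass_per_administration <- 1000  # mg of ingredient in one administration
administrations_per_day <- 3     # assumed to be derived from frequency_concept_id

# Ingredient mass administered per day.
mass_per_day <- mass_per_administration * administrations_per_day
cat(mass_per_day, "mg per day\n")
```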

Finally, the total active ingredient for a drug exposure record is calculated by
deriving the timeframe of the drug exposure record in days and multiplying it
by the daily rate above.


$$ (\text{drug_exposure_end_date}_\text{de} - \text{drug_exposure_start_date}_\text{de}) * \frac{\text{x * total active ingredient mass}}{\text{day}} = \text{total active ingredient mass in drug exposure} $$
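
The final aggregation can be sketched as below, following the equation as written (end date minus start date, with no extra day added). The dates and the daily rate are hypothetical.

```r
# Hypothetical drug exposure record.
drug_exposure_start_date <- as.Date("2021-01-01")
drug_exposure_end_date   <- as.Date("2021-01-11")
mass_per_day             <- 3000  # mg of ingredient per day

# Length of the exposure in days.
exposure_days <- as.numeric(
  difftime(drug_exposure_end_date, drug_exposure_start_date, units = "days")
)

# Total ingredient mass over the whole exposure record.
total_mass <- exposure_days * mass_per_day
cat(total_mass, "mg over", exposure_days, "days\n")
```

Whether the end date should be counted inclusively (adding one day) is a design decision that the equation above leaves open.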

The intent of the administration_dose and administration_unit fields is to
provide a normalized and quality-checked representation of quantity and
dose_unit_source_value across all types of drug formulations within the
Drug Exposure table.


$$ \text{administration_dose}_\text{de*} \sim \text{quantity}_\text{de} $$

$$ \text{administration_unit}_\text{de*} \sim \text{dose_unit_source_value}_\text{de} $$
