Add ICD-9 / ICD-10 crosswalk #190

Open
wants to merge 16 commits into
base: master

dpritchLibre

This PR addresses issues #186 and #189. In summary, it adds a generic S3 function, icd_gem, and associated methods that take a vector of input ICD-9 or ICD-10 codes and return a data frame of the corresponding mappings.

The raw 2018 GEMs as found at https://www.cms.gov/Medicare/Coding/ICD10/2018-ICD-10-CM-and-GEMs.html are stored in the project in the following files:

  • data-raw/icd-gem-2018-convert-10-to-9.txt
  • data-raw/icd-gem-2018-convert-9-to-10.txt

These raw GEMs are translated into R data frames using the script in tools/icd-gem-import-routines.R. The GEMs are converted to an equivalent form with one scenario (using the terminology from the GEMs) per row, which we find more useful for code lookup. The object documentation for these converted GEMs is added to the R/datadocs.R file. Tests for the data import and conversion are included at the bottom of icd-gem-import-routines.R. The data frames are stored in the following files:

  • data/icd_gem_9_to_10.rda
  • data/icd_gem_10_to_9.rda
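
As a rough illustration of the conversion described above, the sketch below parses a few GEM-style lines and groups the targets by scenario and choice list. It assumes the documented GEM layout of source code, target code, and a five-digit flag field (approximate, no map, combination, scenario, choice list). The three input rows are illustrative stand-ins rather than rows copied from the CMS files, the helper name `parse_gem_lines` is hypothetical, and the actual logic in tools/icd-gem-import-routines.R may well differ.

```r
# Hypothetical helper: parse whitespace-delimited GEM lines into a data frame
# with one column per flag digit.
parse_gem_lines <- function(lines) {
  fields <- strsplit(trimws(lines), "[[:space:]]+")
  source <- vapply(fields, `[`, character(1), 1)
  target <- vapply(fields, `[`, character(1), 2)
  flags  <- vapply(fields, `[`, character(1), 3)
  data.frame(
    source      = source,
    target      = target,
    approx      = substr(flags, 1, 1) == "1",
    no_map      = substr(flags, 2, 2) == "1",
    combination = substr(flags, 3, 3) == "1",
    scenario    = substr(flags, 4, 4),
    choice_list = substr(flags, 5, 5),
    stringsAsFactors = FALSE
  )
}

# Illustrative rows only (assumed, not copied from the CMS files)
raw <- c(
  "24951 E0839  10000",
  "24951 E08311 10111",
  "24951 E0865  10112"
)
gem <- parse_gem_lines(raw)

# One element per (source, scenario) pair; each element holds the target
# codes grouped by choice list.
by_scenario <- lapply(
  split(gem, list(gem$source, gem$scenario), drop = TRUE),
  function(d) split(d$target, d$choice_list)
)
```

Here `by_scenario` has one entry for the simple scenario and one for the combination scenario, which mirrors the one-scenario-per-row idea without claiming to reproduce the stored data frames.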

The icd_gem generic function and associated methods are found in the R/icd-gem.R file. These routines are essentially convenience functions that subset the GEM data frames (and add rows for input codes that aren't found in the GEMs).
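
To make the "subset and add rows for missing codes" behavior concrete, here is a minimal sketch on a toy table. The table, its column names, and the helper `lookup_gem` are assumptions for illustration only; they are not the schema or the code in R/icd-gem.R.

```r
# Toy stand-in for a GEM table; columns are illustrative, not the PR's schema
toy_gem <- data.frame(
  source = c("24951", "24951", "0010"),
  target = c("E0839", "E0939", "A000"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: keep the rows matching the input codes and append an
# NA row for every input code that does not appear in the table.
lookup_gem <- function(codes, gem) {
  hits <- gem[gem$source %in% codes, , drop = FALSE]
  missing <- setdiff(codes, gem$source)
  if (length(missing) > 0) {
    hits <- rbind(
      hits,
      data.frame(source = missing, target = NA_character_,
                 stringsAsFactors = FALSE)
    )
  }
  rownames(hits) <- NULL
  hits
}

res <- lookup_gem(c("24951", "99999"), toy_gem)
```

With this design every input code appears at least once in the result, so a caller can tell "no mapping found" apart from "code was dropped".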

@wenjie2wang

The ICD-9 / ICD-10 crosswalk is also implemented in the touch R package: https://hub.wenjie-stat.me/touch/reference/icd_map.html. It will be interesting to compare the implementation in this PR with touch::icd_map().

@dpritchLibre
Author

Hi @wenjie2wang, thanks for mentioning the touch package. Unfortunately, I did not find your package when I was originally searching for this functionality. It sounds like we are both working with similar data, and I would be happy to collaborate on this or future projects.

At first glance it appears that there is some difference between this PR and the touch package in how combination codes are handled. For example, consider the ICD-9 code 24951. For touch, we have the following.

> icd_map(c("24951"), output = "list")
[[1]]
[1] "E08311" "E08319" "E0836"  "E0839"  "E0865"  "E09311" "E09319" "E0936"  "E0939" 

And for the icd PR we have the following, which is interpreted as either a code from c("E0839", "E0939"), or a combination of a code from c("E08311", "E08319", "E0836", "E09311", "E09319", "E0936") together with "E0865" (see the function documentation for more details on the meaning).

> conv <- as_tibble(icd_gem("24951"))
> conv
# A tibble: 2 x 5
  source scenario type        approx codes           
  <chr>  <chr>    <chr>       <lgl>  <list>          
1 24951  0        simple      TRUE   <named list [1]>
2 24951  1        combination TRUE   <named list [2]>

> conv$codes
[[1]]
[[1]]$`0`
[1] "E0839" "E0939"


[[2]]
[[2]]$`1`
[1] "E08311" "E08319" "E0836"  "E09311" "E09319" "E0936" 

[[2]]$`2`
[1] "E0865"

@wenjie2wang

Hi David, thanks for looking into the difference and providing the example!

touch::icd_map() was motivated by the need for fast conversion of hundreds of millions of ICD codes in one project. The conversion follows the MapIT toolkit from AHRQ, where the conversion can be done by GEM with its reverse mappings (see the documentation of the mapping tool for details). I think the combination flags are no longer informative in the reverse mappings, so for simplicity I ignored them completely when implementing touch::icd_map(). The implementation in this PR does provide more information when converting codes with positive combination flags using a single-step GEM lookup.
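
A minimal sketch of the reverse-mapping idea as described here: for an ICD-9 input, take the targets from the forward (9-to-10) table and add any ICD-10 codes whose backward (10-to-9) entry points to that input. The tables use made-up codes, and `map_with_reverse` is a hypothetical name; this is not the implementation in touch or in MapIT.

```r
# Toy forward (ICD-9 -> ICD-10) and backward (ICD-10 -> ICD-9) tables.
# The codes "A1", "X10", ... are made up for illustration.
fwd <- data.frame(icd9 = c("A1", "A1"), icd10 = c("X10", "X11"),
                  stringsAsFactors = FALSE)
bwd <- data.frame(icd10 = c("X11", "X12", "X20"), icd9 = c("A1", "A1", "B2"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: union of forward targets and reverse-lookup sources
map_with_reverse <- function(code, fwd, bwd) {
  forward_targets <- fwd$icd10[fwd$icd9 == code]
  reverse_targets <- bwd$icd10[bwd$icd9 == code]
  sort(unique(c(forward_targets, reverse_targets)))
}

mapped <- map_with_reverse("A1", fwd, bwd)
```

In this toy case "X12" is reachable only through the backward table, which is the kind of wider definition the reverse mappings are meant to capture.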

@wenjie2wang

Just a follow-up: I made some quick updates to touch::icd_map() for codes with positive combination flags, such as the ICD-9 code 24951.

icd_map("24951", output = "list")
#> [[1]]
#> [1] "E0839"        "E0939"        "E08311+E0865" "E08319+E0865" "E0836+E0865" 
#> [6] "E09311+E0865" "E09319+E0865" "E0936+E0865" 
#> 

where the + indicates the code combination.
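
For readers comparing the two representations, the sketch below shows one way the nested choice lists from the icd_gem() output earlier in this thread could be expanded into "+"-joined strings like these. The helper name `expand_combination` is hypothetical, and this is not how touch builds its output; it only assumes that a combination means one code from each choice list.

```r
# Choice lists for the combination scenario, copied from the icd_gem()
# output shown earlier in this thread
choice_lists <- list(
  `1` = c("E08311", "E08319", "E0836", "E09311", "E09319", "E0936"),
  `2` = "E0865"
)

# Hypothetical helper: take one code from each choice list and join with "+"
expand_combination <- function(choice_lists) {
  grid <- expand.grid(choice_lists, stringsAsFactors = FALSE,
                      KEEP.OUT.ATTRS = FALSE)
  do.call(paste, c(unname(as.list(grid)), sep = "+"))
}

combos <- expand_combination(choice_lists)
```

The result has six strings, one per code in the first choice list, each paired with "E0865".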

@dpritchLibre
Author

This is great -- thanks for sharing. I hadn't heard of the reverse mappings strategy before. It seems useful for getting a wider definition of the codes. I think we'll want to consider using these mappings for our research projects at some point.

@jackwasey
Owner

Hi all, sorry to have been absent from discussion. I haven't had time to think this through yet. From the work of the NIH hackathon group, it seemed that once the GEM mappings were converted (which is a one-off process at the package maintainer level), the ICD comorbidity engine could be used. See the PDF vignette on 'efficiency' for how this is so fast.

My philosophy is not to import other packages ( http://www.tinyverse.org/ ), but I'm glad this note is here to show users another way. I will be merging the hackathon work.

All the best,
Jack

@wenjie2wang

Thanks, Jack! Could you please provide the link to the work of the NIH hackathon group?

I am also a fan of tinyverse: the current version of the touch package depends only on R itself and imports Rcpp for integrating C++ with R.

@dpritchLibre
Author

Hi Jack, thanks for this fantastic package! It wasn't clear to me from your message -- is there any interest in merging in this PR after whatever modifications you see fit, or would it be better to continue this work as a standalone package using icd as a dependency?

@jackwasey
Owner

Absolutely. Thanks for your attention to this. I will definitely get the cross-walk code in. The NIH hackathon came up with various approaches: the simplest was using data.table. The surrounding testing will need elaboration, and this is a complicated area. Having a solid set of tests will be important for this to move forward. I'm working on a small CRAN-required update, and will spend some time on this after that is done.

David Pritchard added 5 commits April 14, 2020 14:57
Includes updates to icd-gem-import-routines.R, convert.R and datadocs.R.

Adds code to import ICD procedure code GEMs and save the files to
`data/icd_gem_9pc_to_10pc.rda` and `data/icd_gem_10pc_to_9pc.rda`.

Also changes the format of the internal representation of the GEMs to include
both short form and decimal forms for the source and target codes.
These tables are stored at `data-raw/icd-gem-2016-convert-9pc-to-10pc.txt` and
`data-raw/icd-gem-2016-convert-10pc-to-9pc.txt`.

These files are located at:

    data/icd_gem_9_to_10.rda
    data/icd_gem_10_to_9.rda
    data/icd_gem_9pc_to_10pc.rda
    data/icd_gem_10pc_to_9pc.rda

Includes modifications to reflect the updated format of the internal GEM
versions.
@dapritchard

@jackwasey I've added a crosswalk for the ICD-9 / ICD-10 procedure codes. Please let me know if you are interested in including these crosswalks in your package; otherwise I would be happy to extract them into a small add-on package. Many thanks for your work providing this resource to us all.

@dpritchLibre
Author

Hi all, have there been any thoughts on whether this PR would be considered for inclusion in the project?
