[Long-format table annotation Part 3] annotator R CLI #59
Commits on Jul 15, 2021
-
Commit 0b316f4
-
Commit 75769b5
-
Commit b750ed5
Commits on Jul 16, 2021
-
Download `Gene_full_name` and `Protein_RefSeq_ID` from https://mygene.info/ using the mygene package.
Commit 18ea05a
-
Clean up ensg-gene-full-name-refseq-protein.tsv
Remove rows that have both Gene_full_name and Protein_RefSeq_ID values missing. Write NA for missing values rather than "NA" or "", in order to be consistent with the data release.
Commit f40e997
-
Add echo commands in shell scripts
Print messages after the shell scripts finish running.
Commit f941405
-
Sort mygene returned character values
mygene API may return results in different orders, so sorting the values before output is necessary to reproduce previous results.
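A minimal base R sketch of why sorting helps reproducibility. The helper `collapse_values`, the comma separator, and the example IDs are assumptions made for illustration; they are not taken from the module.

```r
# Hypothetical helper: collapse multiple returned values into one field,
# sorting first so the result does not depend on the order of the response.
collapse_values <- function(values, sep = ",") {
  paste(sort(unique(values)), collapse = sep)
}

# Two responses that contain the same values in different orders
# (illustrative IDs only).
first_response <- c("NP_000002", "NP_000001")
second_response <- c("NP_000001", "NP_000002")

# The collapsed fields are identical regardless of the response order.
identical(collapse_values(first_response), collapse_values(second_response))
```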
Commit 96e8885
-
Commit 80b794e
-
Describe how to update annotator/annotation-data/oncokb-cancer-gene-list.tsv.
Commit 9784aae
-
Commit 0a80994
-
Commit 3adf7de
-
Commit 4e5cdde
Commits on Jul 17, 2021
-
Commit bfc8470
-
Update OncoKB annotation table source to analyses/long-format-table-utils/annotator/annotation-data/oncokb-cancer-gene-list.tsv
Commit 9817edf
-
Merge branch 'lft-utils-ann-data-download' into lft-utils
Update README.md.
Commit e402fb9
-
Commit 2202441
-
Check no NA or duplicate in annotation key columns
For an annotation table, the key column that is used to join annotations to the input table must not contain NA or duplicated values.
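A check of this kind could look roughly like the following base R sketch. The function name `check_key_column`, the error messages, and the example values are hypothetical, not the module's code. The check matters because a duplicated key would duplicate rows of the input table in a left join, and an NA key could match NA values in the input.

```r
# Hypothetical check: stop if the join key column of an annotation table
# contains NA or duplicated values; otherwise return the table invisibly.
check_key_column <- function(annotation_df, key_col) {
  keys <- annotation_df[[key_col]]
  if (any(is.na(keys))) {
    stop(paste0("NA found in key column ", key_col))
  }
  if (any(duplicated(keys))) {
    stop(paste0("Duplicated values found in key column ", key_col))
  }
  invisible(annotation_df)
}

# Illustrative annotation table with a valid key column.
ann_df <- data.frame(
  Gene_Ensembl_ID = c("ENSG_A", "ENSG_B"),
  Gene_type = c("type_1", "type_2"),
  stringsAsFactors = FALSE
)
check_key_column(ann_df, "Gene_Ensembl_ID")
```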
Commit b1da9d0
-
Rename ensg_hgsb_rmtl_df to ensg_rmtl_df
Gene symbol is not added in this function, and gene symbol is required. It is also not necessary to check the gene_symbol column.
Commit 5686fc1
-
Commit ebda13b
-
Require that input long_format_table is tibble
The annotate_long_format_table function now only supports tibble, because handling data.frame or other types of table is complex, such as preserving rownames, orders, and column types. Support for other types of table could be added if required at a later point.
Commit 2c044ed
Commits on Jul 18, 2021
-
Commit 9d0061f
-
Commit c6b4553
-
Change is_gene_level_table param to add_Gene_type
add_Gene_type is more informative.
Commit e168a0d
-
Add add_OncoKB_columns parameter
In annotate_long_format_table(), add an add_OncoKB_columns parameter. add_OncoKB_columns: TRUE or FALSE on whether to add OncoKB_cancer_gene and OncoKB_oncogene_TSG columns. Default value is FALSE.
Commit 1eb2da3
Commits on Jul 19, 2021
-
Change default to add all annotation columns
The interface of annotate_long_format_table is changed so that all annotation columns are added by default, and the columns_to_add parameter can be passed to add only certain columns. The current interface of annotate_long_format_table is designed to favor flexibility and readability over efficiency. The function can be further refactored to be more efficient, e.g. by not checking certain tables, not processing certain tables, or not adding certain columns if they are not required.
Commit d1a10ec
-
Add a note about the order of added columns
The order of added columns may not be the same as the values of columns_to_add, in annotate_long_format_table.
Commit 92bd0a8
-
Order the columns of the annotated table
In annotate_long_format_table, reorder the columns of the returned table so that the input columns keep their original order and are followed by the added columns in the order given by the columns_to_add parameter.
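The column ordering described here can be sketched in base R as follows. The helper `order_annotated_columns` and the example values are hypothetical; the sketch uses a plain data.frame for self-containment, whereas the module itself requires a tibble.

```r
# Hypothetical helper: input columns first, in their original order,
# followed by the added columns in the order given by columns_to_add.
order_annotated_columns <- function(annotated_df, input_colnames,
                                    columns_to_add) {
  annotated_df[, c(input_colnames, columns_to_add), drop = FALSE]
}

# Illustrative annotated table whose columns came out of joins in an
# arbitrary order.
annotated_df <- data.frame(
  Gene_full_name = "full name A",
  Gene_symbol = "GENE_A",
  Gene_type = "type_1",
  Gene_Ensembl_ID = "ENSG_A",
  stringsAsFactors = FALSE
)

ordered_df <- order_annotated_columns(
  annotated_df,
  input_colnames = c("Gene_symbol", "Gene_Ensembl_ID"),
  columns_to_add = c("Gene_type", "Gene_full_name")
)
colnames(ordered_df)
```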
Commit 51cbe37
-
Commit 492a30d
-
Commit 4983aff
-
Only left_join an annotation table if it is required.
Commit 79bc8ed
-
Refactor annotation data processing with %>%
Use %>% to chain multiple processing steps
Commit 1eec9bd
-
Commit 5a2811e
-
Commit fac1a84
-
Commit e64f82f
-
Commit 66e8a7c
-
Commit c735ca8
-
Add non-missing value descriptions for each annotation column.
Commit c409e1f
-
Commit 7e5a331
Commits on Jul 20, 2021
-
Commit c2fa1d3
-
Commit eac83e8
-
Commit 6820050
-
Commit 7344502
-
Commit d49344f
-
Add documentation for R CLI usage of long-format table annotator.
Commit 2f4eeed
-
Revise annotator R API example usage, so the code does not rely on any loaded package.
Commit 9387a22
-
Commit 5b76fb0
-
Commit 2a36e00
-
Commit 721c043
-
Commit 029b238
-
Update error message in download-annotation-data.R
Co-authored-by: Jo Lynne Rokita <jharenza@gmail.com>
Commit 909277d
-
Update error message in download-annotation-data.R
Co-authored-by: Jo Lynne Rokita <jharenza@gmail.com>
Commit 80b6af7
-
Update error message in download-annotation-data.R
Co-authored-by: Jo Lynne Rokita <jharenza@gmail.com>
Commit a1399bc
-
Added a note on TSV format in the help message: **NOTE** on the --input-long-format-table-tsv file: 1) the TSV file should use double quotes for field values that need escaping, e.g. "NA" for string literal "NA" and "\t" for tab; 2) only unquoted NA field values are treated as missing values internally; 3) leading and trailing white spaces in field values are **NOT** trimmed before parsing. Changed parameters in read_tsv in order to preserve the TSV content. See comments for more details.
Commit 4ef7685
-
Add notes to CLI input TSV specifications that are implemented in the last commit.
Commit 16d23b4
-
Rename update-long-format-table-utils.sh to run-update-long-format-table-utils.sh
README.md is also updated accordingly. This is suggested by @jharenza at d3b-center#55 (comment), in order to follow the shell script naming convention of analysis modules.
Commit 090f080
Commits on Jul 21, 2021
-
Specify annotation data versions in README.md
Add annotation data versions and dates of the last update in the "Update downloaded data that are used in this module" section, as suggested by @jharenza at d3b-center#55 (comment). Combine gene and disease (/cancer_group) annotations into one table. Add additional notes on annotation data versions to the "Implementation of long-format table annotator" section.
Commit a183470
-
Change the date of the last update of annotator/annotation-data/oncokb-cancer-gene-list.tsv to 07/16/2021. The 07/16/2021 annotator/annotation-data/oncokb-cancer-gene-list.tsv is identical to the previous 06/16/2021 version, even though the website at https://www.oncokb.org/cancerGenes has changed last update from 06/16/2021 to 07/16/2021.
Commit 5716523
-
Merge branch 'lft-utils-ann-data-download' into lft-utils-ann-r-api
Merge changes in the data downloading PR d3b-center#55. Rename update-long-format-table-utils.sh to run-update-long-format-table-utils.sh. Specify annotation data versions in README.md. Change the date of the last update of annotator/annotation-data/oncokb-cancer-gene-list.tsv to 07/16/2021.
Commit 64fb672
-
Merge branch 'lft-utils-ann-r-api' into lft-utils-ann-r-cli
Merge changes in the data downloading PR d3b-center#55. Rename update-long-format-table-utils.sh to run-update-long-format-table-utils.sh. Specify annotation data versions in README.md. Change the date of the last update of annotator/annotation-data/oncokb-cancer-gene-list.tsv to 07/16/2021.
Commit d61e21b
-
Commit 03516c9
-
Remove test cases in download-annotation-data.R
As suggested by @jharenza at <d3b-center#55 (comment)>, test cases should be removed from the source code file.
Commit 6f76fae
-
Add unit testing using the testthat package
Run `bash run-tests.sh` to run all tests. In order to import a function for testing from an R file without running the whole file, a helper function import_function is defined at tests/helper_import_function.R, and the import_function is also tested in the tests/test_helper_import_function.R file.
Commit 48a131d
-
Fix typos in download-annotation-data.R
Suggested by @NHJohnson at d3b-center#55 (review)
Commit 0773a1e
-
Add "Unit testing for long-format table annotator" section to describe how to use the unit testing framework.
Commit a08cf5f
-
Commit 796a26c
-
Merge branch 'lft-utils-ann-data-download' into lft-utils-ann-r-api
Merge changes from the data downloading PR <d3b-center#55>
Commit 6d6838c
-
Merge branch 'lft-utils-ann-r-api' into lft-utils-ann-r-cli
Merge changes from the data downloading PR <d3b-center#55>
Commit 5029d4b
-
Commit 29325c7
-
Commit 3fc592e
-
Commit 76dea21
-
Add note on requiring both Gene_Ensembl_ID and Gene_symbol
Clarify the reasons for requiring both Gene_Ensembl_ID and Gene_symbol in annotator/annotator-api.R, as suggested by @jharenza at d3b-center#56 (review)
Commit cb683a3
-
Group notes on annotation data versions into a list.
Commit 76b1322
-
Note that the annotation columns to be added should not already exist in the table that needs to be annotated, as suggested by @NHJohnson at d3b-center#56 (review)
Commit 24ba5d2
-
Note that the names of the annotation columns will be standardized at a later point, as suggested by @jharenza at <d3b-center#56 (review)>, so it is recommended to use the annotation column names in this module for the results.
Commit e5a653c
Commits on Jul 22, 2021
-
Change root criteria in rprojroot::find_root
Add rprojroot::has_file(".git") as an alternative root criterion to handle linked git working trees created by `git worktree add` in rprojroot::find_root. Also fixed typo identified by @NHJohnson at d3b-center#56 (comment)
Commit fd99539
-
Remove git diff in run-download-annotation-data.sh
git commands do not work in a linked working tree in a Docker image/container, because the linked working tree uses host absolute paths to locate the main working tree. Therefore, drop the git diff --stat command to improve compatibility with Docker image/container. The users could still use git diff --stat on host machine.
Commit a85a97d
-
Commit 2cdadf4
-
Commit 0f3d912
-
Refactor annotator-api.R using functional paradigm
As suggested by @NHJohnson at <https://github.com/PediatricOpenTargets/OpenPedCan-analysis/pull/56/files#r674342194>, reduce code duplication by refactoring the annotate_long_format_table function in annotator-api.R with a functional paradigm. See comments in annotator-api.R for more details. The interface is not changed, so previous examples and descriptions should all still apply.
Commit 3292b7f
-
Commit 0cd9a8c
-
Fix a bug that can add duplicated annotation cols
Duplicated .x and .y annotation columns will be added in the following scenario:
- annotation table has multiple columns
- one or more annotation columns in the annotation table are in the columns_to_add vector
- one or more other annotation columns in the annotation table are already in the table to be annotated, but these columns are not in the columns_to_add vector
Fix this bug by joining only annotation columns that are not in the table to be annotated.
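The scenario can be reproduced in a small sketch. It uses base R `merge` as a stand-in for the module's left join, an assumption made to keep the example self-contained; `merge` applies the same default `.x`/`.y` suffixes to shared non-key columns. The table contents are illustrative only.

```r
# Table to be annotated: it already has one of the annotation columns.
input_df <- data.frame(
  Gene_symbol = c("GENE_A", "GENE_B"),
  OncoKB_cancer_gene = c("Y", "N"),
  stringsAsFactors = FALSE
)

# Annotation table with multiple annotation columns.
annotation_df <- data.frame(
  Gene_symbol = c("GENE_A", "GENE_B"),
  OncoKB_cancer_gene = c("Y", "N"),
  OncoKB_oncogene_TSG = c("Oncogene", NA),
  stringsAsFactors = FALSE
)

# Joining the whole annotation table duplicates the shared non-key column
# with .x and .y suffixes. Note that merge() also sorts rows by the key.
naive <- merge(input_df, annotation_df, by = "Gene_symbol", all.x = TRUE)
colnames(naive)

# Joining only the annotation columns that are not already in the input
# table avoids the duplicated columns.
new_cols <- setdiff(colnames(annotation_df), colnames(input_df))
fixed <- merge(
  input_df,
  annotation_df[, c("Gene_symbol", new_cols), drop = FALSE],
  by = "Gene_symbol", all.x = TRUE
)
colnames(fixed)
```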
Commit 81a041e
-
Commit 7e38c19
-
Commit 53f27fd
-
Fix package version compatibility issues for tests
As found by @NHJohnson at <d3b-center#56 (comment)>, some tests failed outside of the Docker image. See comments in annotator/tests/test_annotate_long_format_table.R for more details about the package version compatibility issues.
Commit e337abd
-
Commit f9f6d17
-
Change root criteria in rprojroot::find_root
Add rprojroot::has_file(".git") as an alternative root criterion to handle linked git working trees created by `git worktree add` in rprojroot::find_root.
Commit eca90e5
-
Properly handle `-c ''` CLI calls
Previously, `-c ''` calls would fail because `""` was passed to API calls as a required column to add, which is unavailable. Now, `-c ''` calls pass `character(0)` to API calls, so the output table will not have any additional annotation column.
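One way such handling could look is sketched below. The function `parse_columns_to_add` is hypothetical, and the assumption that the `-c` value is a comma-separated list is made only for this illustration; the CLI's actual parsing is not shown in this commit message.

```r
# Hypothetical parser for the value passed to -c.
# Assumption: multiple column names are separated by commas.
parse_columns_to_add <- function(arg) {
  # An empty string means "add no annotation columns", so return an empty
  # character vector rather than a vector that contains "".
  if (identical(arg, "")) {
    return(character(0))
  }
  strsplit(arg, ",", fixed = TRUE)[[1]]
}

length(parse_columns_to_add(""))
parse_columns_to_add("Gene_type,Gene_full_name")
```

Returning `character(0)` instead of `""` means that downstream code looping over the requested columns simply does nothing, instead of looking up an annotation column with an empty name.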
Commit ebcc6ac
-
Commit 52ff63e
-
Add a note on the naming conventions of test files in the testthat package 2.1.1.
Commit a725257
Commits on Jul 23, 2021
-
Remove input file after running certain CLI tests
Remove input files after running CLI tests, if the input files are created by tests as intermediate files, in order to ensure a clean start for the next test to run.
Commit 9c4e47f
-
Test CLI calls with unavailable input/output paths
Should fail on annotator CLI calls with unavailable input files or output dirs.
Commit 3e0ef2f
-
Add comments for import_function usage
Note that nested functions cannot be imported, and unnamed functions cannot be imported.
Commit 2455609
-
Add tests for importing nested functions
Should fail on importing functions that are defined nested inside other expressions.
Commit 365fcf9
-
Add tests on importing functions defined multiple times
Should fail, even if the function is defined in different ways.
Commit 0224e3b
-
Test comments and line breaks for import_function
Comments and line breaks should not change the behaviors of import_function.
Commit ea90284
-
Corrected test helper function context. Added test/ prefix to test annotator/tests/test_annotator_cli.R.
Commit be21860
-
Put imported function in the importing environment
Put the imported function in the same environment as the import_function being called, so the imported functions can call other imported functions. Also add a test for such use cases.
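The idea of importing a single function into the caller's environment can be sketched as follows. `import_function_sketch` is a hypothetical, simplified stand-in written for this illustration, not the helper in tests/helper_import_function.R; it only recognizes top-level `name <- function(...)` or `name = function(...)` definitions.

```r
# Hypothetical sketch: evaluate only the top-level definition of one
# function from an R file, without running the rest of the file.
import_function_sketch <- function(path, function_name) {
  exprs <- parse(file = path)
  is_target <- vapply(exprs, function(e) {
    is.call(e) &&
      (identical(e[[1]], as.name("<-")) || identical(e[[1]], as.name("="))) &&
      identical(e[[2]], as.name(function_name)) &&
      is.call(e[[3]]) &&
      identical(e[[3]][[1]], as.name("function"))
  }, logical(1))
  if (sum(is_target) != 1) {
    stop("Expected exactly one top-level definition of ", function_name)
  }
  # Evaluate the assignment in the environment of the caller, so that
  # imported functions can see each other.
  eval(exprs[[which(is_target)]], envir = parent.frame())
  invisible(NULL)
}

# Illustrative file: the first line would fail if the whole file were run.
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "stop('the rest of the file should not run')",
  "add_one <- function(x) {",
  "  return(x + 1)",
  "}",
  "add_two <- function(x) {",
  "  return(add_one(add_one(x)))",
  "}"
), script_path)

import_function_sketch(script_path, "add_one")
import_function_sketch(script_path, "add_two")
add_two(1)
```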
Commit 85aa52f
Commits on Jul 24, 2021
-
Specify envir = parent.frame() in eval call
In import_function, specifying envir = parent.frame() in the eval call makes environment(imported_function) the same as the environment from which import_function is called. Even though the eval documentation says that the default value of the envir parameter is parent.frame(), leaving out envir = parent.frame() in the eval call surprisingly makes environment(imported_function) the execution environment of the import_function call instead. The tests also fail without specifying envir = parent.frame().

Maybe this is caused by the place where parent.frame() is evaluated? If specified, parent.frame() is evaluated in the calling function; if not specified, parent.frame() is evaluated in the eval call environment?

Examples executed in terminal R, in order to avoid RStudio customizations.

    > print(environment())
    <environment: R_GlobalEnv>
    >
    > foo <- function() {
    +   print(parent.frame())
    +   print(environment())
    +   return(eval(quote(function() { return(2) })))
    + }
    >
    > bar <- foo()
    <environment: R_GlobalEnv>
    <environment: 0x5645ead67670>
    > print(environment(bar))
    <environment: 0x5645ead67670>
    >
    > baz <- function() {
    +   print(parent.frame())
    +   print(environment())
    +   return(eval(quote(function() { return(2) }),
    +               envir = parent.frame()))
    + }
    >
    > qux <- baz()
    <environment: R_GlobalEnv>
    <environment: 0x5645ead5fec0>
    > print(environment(qux))
    <environment: R_GlobalEnv>
Commit 2047329
Commits on Jul 25, 2021
-
Only replace NA with empty string in columns that have NA
Previously, if replace_na_with_empty_string=TRUE in annotate_long_format_table, replace_na with empty string was applied to all columns, including columns that have no NA. If a non-character column had no NA, its data type was still converted to character, which may break backward compatibility of the JSON/JSONL tables that were generated without using the annotator API. Now, if replace_na_with_empty_string=TRUE in annotate_long_format_table, NA is replaced with empty string only in columns that have NA in them. This will not change the value types of the columns that have no NA, so the output JSON/JSONL table will be backward compatible with the ones that were generated without using the annotator API. CLI help message is also changed accordingly.
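The new behavior can be illustrated with a base R sketch. The function `replace_na_in_na_columns` and the example table are hypothetical; the module itself works on tibbles and its implementation may differ.

```r
# Hypothetical helper: replace NA with "" only in columns that contain NA.
# Columns without NA are left untouched, so their types are preserved.
replace_na_in_na_columns <- function(df) {
  for (col in colnames(df)) {
    if (any(is.na(df[[col]]))) {
      values <- as.character(df[[col]])
      values[is.na(values)] <- ""
      df[[col]] <- values
    }
  }
  df
}

# Illustrative table: only Gene_symbol and score contain NA.
example_df <- data.frame(
  Gene_symbol = c("GENE_A", NA),
  count = c(1L, 2L),
  score = c(0.5, NA),
  stringsAsFactors = FALSE
)

result <- replace_na_in_na_columns(example_df)
# count stays integer; Gene_symbol and score are character columns.
sapply(result, class)
```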
Commit 0545a41
-
Test annotator API replace_na_with_empty_string
Test that replace_na_with_empty_string = TRUE replaces NA with "". Test that non-character columns without NA are not converted to character columns. Test that replace_na_with_empty_string = FALSE does not replace NA with "".
Commit f87a4c3
-
Test annotator CLI --replace-na-with-empty-string
Test that --replace-na-with-empty-string replaces NA with "". Test that no --replace-na-with-empty-string does not replace NA with "". The --replace-na-with-empty-string does not need to be tested for column type conversions. The CLI write_tsv output table is the same even if a non-character column is converted to a character column before write_tsv, because "Values are only quoted if they contain a comma, quote or newline" (-- help("write_tsv", "readr") 1.3.1).
Commit dd72ce6