Error dec_celltype using cell2loc #11

ccruizm · 2022-09-26T13:53:38Z

Good day!

I have given this great tool a try with my data but have encountered a problem when using cell2loc. The pipeline ran and generated the cell2loca_results matrix, but once it reaches the function generate_newmeta_cell, it produces the error below:

New names:
• `` -> `...1`
Rows: 3967 Columns: 30
── Column specification ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): ...1
dbl (29): q05cell_abundance_w_sf_AC_like, q05cell_abundance_w_sf_Astrocyte, ...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Generating single-cell data for each spot 
Error in {: task 1 failed - "incompatible dimensions"
Traceback:

1. dec_celltype(object = obj, sc_data = as.matrix(GetAssayData(sc, 
 .     slot = "counts")), sc_celltype = as.character(sc@meta.data$celltype), 
 .     method = 7, env = "cell2loc_env")
2. .generate_newmeta_cell(newmeta, st_ndata, sc_ndata, sc_celltype, 
 .     iter_num, if_doParallel)
3. foreach::foreach(i = 1:length(newmeta_spotname), .combine = "rbind", 
 .     .packages = "Matrix", .export = ".generate_newmeta_spot") %dopar% 
 .     {
 .         spot_name <- newmeta_spotname[i]
 .         .generate_newmeta_spot(spot_name, newmeta, st_ndata, 
 .             sc_ndata, sc_celltype, iter_num)
 .     }
4. e$fun(obj, substitute(ex), parent.frame(), e$data)

Do you know where the problem might be?

My sessionInfo()

R version 4.0.3 (2020-10-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /hpc/pmc_stunnenberg/cruiz/miniconda3/envs/r_pHGG_project/lib/libopenblasp-r0.3.12.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] sceasy_0.0.7       reticulate_1.26    future_1.28.0      patchwork_1.1.2   
 [5] RColorBrewer_1.1-3 igraph_1.3.1       SpaTalk_1.0        doParallel_1.0.17 
 [9] iterators_1.0.14   foreach_1.5.2      ggalluvial_0.12.3  ggplot2_3.3.6     
[13] dplyr_1.0.10       sp_1.4-7           SeuratObject_4.1.0 Seurat_4.1.0      
[17] data.table_1.14.2  Matrix_1.3-3      

loaded via a namespace (and not attached):
  [1] backports_1.4.1       uuid_1.1-0            plyr_1.8.7           
  [4] repr_1.1.3            lazyeval_0.2.2        splines_4.0.3        
  [7] RcppHNSW_0.4.1        listenv_0.8.0         scattermore_0.8      
 [10] digest_0.6.29         htmltools_0.5.3       fansi_1.0.3          
 [13] magrittr_2.0.3        tensor_1.5            cluster_2.1.2        
 [16] ROCR_1.0-11           tzdb_0.3.0            globals_0.16.1       
 [19] readr_2.1.2           matrixStats_0.62.0    vroom_1.5.7          
 [22] spatstat.sparse_2.1-1 prettyunits_1.1.1     colorspace_2.0-3     
 [25] rappdirs_0.3.3        ggrepel_0.9.1         crayon_1.5.1         
 [28] jsonlite_1.8.0        scatterpie_0.1.8      progressr_0.11.0     
 [31] spatstat.data_2.2-0   survival_3.2-11       zoo_1.8-10           
 [34] glue_1.6.2            polyclip_1.10-0       gtable_0.3.1         
 [37] leiden_0.4.3          car_3.1-0             future.apply_1.9.1   
 [40] abind_1.4-5           scales_1.2.1          pheatmap_1.0.12      
 [43] DBI_1.1.2             rstatix_0.7.0         spatstat.random_2.2-0
 [46] miniUI_0.1.1.1        Rcpp_1.0.9            viridisLite_0.4.1    
 [49] xtable_1.8-4          progress_1.2.2        spatstat.core_2.4-2  
 [52] bit_4.0.4             NNLM_0.4.4            htmlwidgets_1.5.4    
 [55] httr_1.4.4            ellipsis_0.3.2        ica_1.0-3            
 [58] pkgconfig_2.0.3       farver_2.1.1          uwot_0.1.11          
 [61] deldir_1.0-6          here_1.0.1            utf8_1.2.2           
 [64] labeling_0.4.2        tidyselect_1.1.2      rlang_1.0.4          
 [67] reshape2_1.4.4        later_1.3.0           munsell_0.5.0        
 [70] tools_4.0.3           cli_3.3.0             generics_0.1.3       
 [73] broom_1.0.1           ggridges_0.5.3        evaluate_0.16        
 [76] stringr_1.4.1         fastmap_1.1.0         goftest_1.2-3        
 [79] bit64_4.0.5           fitdistrplus_1.1-8    purrr_0.3.4.9000     
 [82] RANN_2.6.1            pbapply_1.5-0         nlme_3.1-152         
 [85] mime_0.12             ggExtra_0.10.0        hdf5r_1.3.5          
 [88] compiler_4.0.3        plotly_4.10.0.9001    png_0.1-7            
 [91] ggsignif_0.6.3        spatstat.utils_2.3-1  tibble_3.1.8         
 [94] tweenr_2.0.2          stringi_1.7.8         RSpectra_0.16-1      
 [97] rgeos_0.5-9           lattice_0.20-44       IRdisplay_1.0        
[100] vctrs_0.4.1           pillar_1.8.1          lifecycle_1.0.1      
[103] spatstat.geom_2.4-0   lmtest_0.9-40         RcppAnnoy_0.0.19     
[106] cowplot_1.1.1         irlba_2.3.5           httpuv_1.6.6         
[109] R6_2.5.1              promises_1.2.0.9000   KernSmooth_2.23-20   
[112] gridExtra_2.3         parallelly_1.32.1     codetools_0.2-18     
[115] fastDummies_1.6.3     MASS_7.3-54           assertthat_0.2.1     
[118] rprojroot_2.0.3       withr_2.5.0           sctransform_0.3.3    
[121] mgcv_1.8-35           hms_1.1.2             grid_4.0.3           
[124] rpart_4.1-15          ggfun_0.0.7           IRkernel_1.1.1       
[127] tidyr_1.2.1           carData_3.0-5         Cairo_1.5-15         
[130] Rtsne_0.16            ggpubr_0.4.0          pbdZMQ_0.3-5         
[133] ggforce_0.3.4         shiny_1.7.2           base64enc_0.1-3

Thanks in advance!

ccruizm · 2022-09-27T14:32:57Z

I ran it using the example dataset, and it worked. Maybe I am not creating the SpaTalk object correctly. Could you please tell me how to import Visium data into the pipeline?

multitalk · 2022-09-28T02:10:03Z

Thanks for your feedback. For Visium data, you just need to prepare st_data and st_meta as shown in the spot-based ST data vignette.

ccruizm · 2022-09-28T13:13:36Z

Yes, I followed it, but it still gives me an error. I built the inputs according to that vignette, created the SpaTalk object with createSpaTalk, and the whole cell2loc step runs with no problem. The issue comes after the st_coef matrix is returned, when the pipeline tries to generate synthetic single-cell data for each spot (generate_newmeta_cell).

Could you please share the code you used to read Visium data and analyze it with SpaTalk?

multitalk · 2022-09-30T12:33:01Z

@ccruizm Here is my code used to read Visium data and analyze it with SpaTalk.

library(Seurat)
library(SpaTalk)

# 10X mouse kidney spatial data
rawdata <- Load10X_Spatial(data.dir = 'kidney/')

# Expression matrix (genes x spots), with gene symbols revised by rev_gene
st_data <- rawdata@assays$Spatial@data
st_data <- rev_gene(data = st_data, data_type = "count", species = "Human", geneinfo = geneinfo)

# Spot metadata with the three columns spot, x, y
st_meta <- rawdata@images[["slice1"]]@coordinates
st_meta <- st_meta[, c("tissue", "imagerow", "imagecol")]
colnames(st_meta) <- c("spot", "x", "y")
st_meta$spot <- rownames(st_meta)   # overwrite the "tissue" flag with the spot barcode
rownames(st_meta) <- 1:nrow(st_meta)

obj <- createSpaTalk(st_data = st_data, st_meta = st_meta, species = "Human", if_st_is_sc = FALSE, spot_max_cell = 30)

# sc_data: scRNA-seq data
# sc_celltype: cell type for each cell
obj <- dec_celltype(object = obj, sc_data = sc_data, sc_celltype = sc_celltype)
obj <- find_lr_path(object = obj, lrpairs = lrpairs, pathways = pathways)
obj <- dec_cci_all(object = obj)
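
Before calling createSpaTalk, a quick consistency check between the expression matrix and the spot metadata can rule out one class of input problems. The following is only a sketch on toy data in base R: the helper name check_st_inputs is hypothetical, and the requirement that st_meta$spot match colnames(st_data) is an assumption inferred from the layout built in the code above, not a documented SpaTalk rule.

```r
# Hypothetical helper (not part of SpaTalk): check that st_data and st_meta
# describe the same set of spots.
check_st_inputs <- function(st_data, st_meta) {
  # st_meta is assumed to carry the columns spot, x, y
  stopifnot(all(c("spot", "x", "y") %in% colnames(st_meta)))
  # spot names should be unique
  stopifnot(!anyDuplicated(st_meta$spot))
  missing_in_meta <- setdiff(colnames(st_data), st_meta$spot)
  missing_in_data <- setdiff(st_meta$spot, colnames(st_data))
  list(
    ok = length(missing_in_meta) == 0 && length(missing_in_data) == 0,
    missing_in_meta = missing_in_meta,
    missing_in_data = missing_in_data
  )
}

# Toy inputs: 2 genes x 3 spots
st_data <- matrix(1:6, nrow = 2,
                  dimnames = list(c("GeneA", "GeneB"), c("s1", "s2", "s3")))
st_meta <- data.frame(spot = c("s1", "s2", "s3"),
                      x = c(1, 2, 3),
                      y = c(4, 5, 6))

res <- check_st_inputs(st_data, st_meta)
print(res$ok)
```

If res$ok is FALSE, the two setdiff results list the spot names that appear on only one side.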

ccruizm · 2022-10-02T21:25:41Z

I tried using your code, and I got the same error. Is there a way I can share the files with you so we can troubleshoot where the issue is, please?

multitalk · 2022-10-04T01:59:09Z

@ccruizm You can share the files by emailing them to me (xin_shao@zju.edu.cn).

ccruizm · 2022-10-24T09:02:40Z

Hello @shaoxin0801 ,
Have you had the chance to check the files I sent you and see whether you can also reproduce the problem?
Thanks in advance!

multitalk · 2022-10-24T10:20:10Z

@ccruizm Sorry, I have been very busy recently and forgot to check the files. I have now downloaded the scRNA-seq reference data and it is okay, but I can't download the ST data because the link is invalid. Could you please share the relevant link? Thank you.

ccruizm · 2022-10-24T10:30:34Z

No worries! I understand ;) I have sent you an email with the link to download the ST data. Please let me know whether you can download it and whether it contains all the files needed for testing. Thanks for your help!

multitalk · 2022-10-24T10:37:43Z

Good. I have downloaded ST.zip and it is okay. I am going to run the SpaTalk pipeline.

ccruizm · 2022-10-24T10:40:10Z

Thank you so much! Best, Cristian


multitalk · 2022-10-25T06:49:00Z

@ccruizm I have checked the code and figured out the error in generate_newmeta_cell. It is caused by unmatched genes between sc_data and st_data when using the other deconvolution methods, including cell2location: the two expression vectors end up with different lengths when Pearson's correlation is calculated. I have fixed the bug and it works now. Below is the call I used: I first ran cell2location to generate st_coef and then ran dec_celltype with the data you provided. You can try it yourself now. I look forward to your reply. Thanks a lot.

> obj <- dec_celltype(object = obj,sc_data = sc_data,sc_celltype = sc_meta$celltype,method = 7,env = 'cell2location_env',dec_result = as.matrix(st_coef),if_doParallel = F)
Generating single-cell data for each spot 
***Done***
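
For anyone hitting the same message on an older version, the mismatch can be reproduced and worked around in base R. The sketch below uses made-up toy matrices (gene and spot names are illustrative) and is not the actual fix applied inside SpaTalk. It only shows that cor() on vectors of different length raises "incompatible dimensions", and that restricting both matrices to their shared genes avoids it.

```r
# Toy scRNA-seq reference: 5 genes x 4 cells
sc_data <- matrix(1:20, nrow = 5,
                  dimnames = list(paste0("Gene", 1:5), paste0("cell", 1:4)))

# Toy ST data: 4 genes x 3 spots, only partly overlapping with sc_data
st_data <- matrix(c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8), nrow = 4,
                  dimnames = list(paste0("Gene", c(2, 3, 4, 6)),
                                  paste0("spot", 1:3)))

# Correlating a 5-gene vector with a 4-gene vector fails
err <- tryCatch(cor(sc_data[, 1], st_data[, 1]),
                error = function(e) conditionMessage(e))
print(err)

# Keep only the genes present in both matrices, in the same order
shared_genes <- intersect(rownames(sc_data), rownames(st_data))
sc_sub <- sc_data[shared_genes, , drop = FALSE]
st_sub <- st_data[shared_genes, , drop = FALSE]

# Now both vectors have the same length and cor() succeeds
r <- cor(sc_sub[, 1], st_sub[, 1])
print(r)
```

On real data, the same two subsetting lines could be applied to sc_data and st_data before building the SpaTalk object, at the cost of dropping genes that are not shared.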

ccruizm · 2022-10-25T15:46:32Z

That's wonderful! I am running the pipeline now. So far it has run for 12 h using several threads and still has not finished (the cell2loc deconvolution is already done).

I noticed that you added dec_result = as.matrix(st_coef) and if_doParallel = F to the call. The default for if_doParallel is TRUE, but I am not sure whether I should change it and whether that would (paradoxically) speed up the computation. I will wait and see whether the pipeline takes much longer.

Thanks for helping troubleshoot this issue :)

multitalk · 2022-10-26T03:26:40Z

You can try if_doParallel = F. Also, I have fixed some bugs in the parallel functions, and when the genes in st_data and sc_data differ, the genes consistent with sc_data are now retained. Thanks for your timely feedback.

ccruizm · 2022-10-27T06:57:56Z

That's good to know! I canceled the first multithreaded run and started a new one with if_doParallel = F. However, it has been running for 32 hours and still has not finished. Is that normal? How long did it take you with the data I shared?

Also, I could not set dec_result = as.matrix(st_coef). When I did, the st_coef variable was not found in the session, so I decided not to include it in the arguments. I am using:

obj <- dec_celltype(object = obj,
                    sc_data = as.matrix(GetAssayData(sc, slot = 'counts')),
                    sc_celltype = as.character(sc@meta.data$celltype),
                    method = 7, 
                    env = "cell2loc_env_2",
                    # dec_result = TRUE,
                    if_doParallel = F
                   )

Is there a way I can generate a log file to share so we can check why it is taking this long?
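
On the missing st_coef variable: it has to exist in the R session before it can be passed as dec_result. One possible way to create it is to read the abundance table that cell2location wrote to disk. The sketch below is hedged throughout: the file is a small stand-in written to a temporary path, the column prefix q05cell_abundance_w_sf_ is taken from the read log in the first post, and whether dec_result expects exactly this spots-by-cell-types layout is an assumption that should be checked against the dec_celltype documentation.

```r
# Write a tiny stand-in for a cell2location abundance table.
# The real file name and location depend on your own run.
csv_path <- tempfile(fileext = ".csv")
toy <- data.frame(
  q05cell_abundance_w_sf_AC_like   = c(0.5, 2.0),
  q05cell_abundance_w_sf_Astrocyte = c(1.5, 0.0),
  row.names = c("spot1", "spot2")
)
write.csv(toy, csv_path)  # first column (spot names) gets an empty header

# Read it back, using the unnamed first column as row names
st_coef <- read.csv(csv_path, row.names = 1, check.names = FALSE)

# Strip the cell2location prefix so columns carry plain cell type names
colnames(st_coef) <- sub("^q05cell_abundance_w_sf_", "", colnames(st_coef))

# Convert to a numeric matrix (spots x cell types)
st_coef <- as.matrix(st_coef)
print(st_coef)
```

The resulting column names should then be compared with the values in sc_celltype, since the two need to refer to the same cell types.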

multitalk · 2022-10-27T09:22:57Z

To test your data, I randomly sampled 50 cells per cell type as the reference and tested 50 spots. I didn't run SpaTalk on all spots with all cells in the reference. It can take a long time when you set if_doParallel = F. In addition, the more spots and genes there are in sc_data and st_data, the longer it takes (several days for large Visium data). You can wait and see, or use if_doParallel = T.
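
The down-sampling described above can be sketched in base R as follows. This is an illustration on toy data, not the maintainer's actual script: the helper name subsample_cells and its seed argument are hypothetical.

```r
# Hypothetical helper: pick at most n_per_type cell indices per cell type.
subsample_cells <- function(sc_celltype, n_per_type = 50, seed = 1) {
  set.seed(seed)
  idx_by_type <- split(seq_along(sc_celltype), sc_celltype)
  picked <- lapply(idx_by_type, function(idx) {
    # sample.int on the length avoids the sample() pitfall for length-1 input
    idx[sample.int(length(idx), min(length(idx), n_per_type))]
  })
  sort(unlist(picked, use.names = FALSE))
}

# Toy reference: 190 cells from three cell types of unequal size
sc_celltype <- rep(c("A", "B", "C"), times = c(120, 60, 10))
sc_data <- matrix(0, nrow = 3, ncol = length(sc_celltype),
                  dimnames = list(paste0("Gene", 1:3),
                                  paste0("cell", seq_along(sc_celltype))))

keep <- subsample_cells(sc_celltype, n_per_type = 50)
sc_data_small <- sc_data[, keep, drop = FALSE]
sc_celltype_small <- sc_celltype[keep]

print(table(sc_celltype_small))
```

Cell types with fewer than 50 cells are kept whole. A subset of spots for a test run can be drawn the same way, by sampling spot names and subsetting both st_data and st_meta. Timing such a reduced run gives a rough feel for how long the full data set might take.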

ccruizm · 2022-10-27T09:42:26Z

Perfect! Then it is normal and I will need to be patient. Thanks for the info!

multitalk closed this as completed Oct 6, 2022

multitalk added the bug Something isn't working label Nov 20, 2022
