In addition to the simulation analysis on spatial data derived from the in situ hybridization-based STARmap technique (see folder Simulation), we also generated sequencing-based simulation data. For this dataset, we retained the spatial information from the STARmap dataset but replaced the STARmap cells with cells profiled by single-cell RNA sequencing (scRNA-seq).
To generate this sequencing-based simulation data, we used an scRNA-seq dataset from the mouse visual cortex, acquired using the inDrops technique (GSE102827). The spatial locations and cell type annotations of the original cells, the coarse-graining procedure, the true proportions of cell types in the generated square spots, and the adjacency matrix were all kept consistent with the methodology used for generating the STARmap-based simulated data. The only difference lies in the gene expression profiles observed in each spot, which result from substituting the STARmap cell gene expression profiles with those of scRNA-seq cells. The notebook detailing the creation of this simulated data is generate_simulated_spatial_data.nb.html, located in the Generate_simulation_data folder.
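The substitution and coarse-graining steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code from the notebook: the function name `simulate_spots`, its arguments, and the rule of drawing one random scRNA-seq cell of the same cell type for each spatial cell are all hypothetical choices made for the example.

```python
import numpy as np

def simulate_spots(xy, cell_types, sc_expr, sc_types, spot_size, seed=0):
    """Hypothetical sketch: build square pseudo-spots from spatial cells.

    xy          (n_cells, 2) spatial coordinates of the original cells
    cell_types  (n_cells,)   cell type label of each spatial cell
    sc_expr     (n_sc, n_genes) scRNA-seq count matrix
    sc_types    (n_sc,)      cell type label of each scRNA-seq cell
    spot_size   side length of the square spots
    """
    rng = np.random.default_rng(seed)
    xy = np.asarray(xy, dtype=float)
    cell_types = np.asarray(cell_types)
    sc_types = np.asarray(sc_types)
    sc_expr = np.asarray(sc_expr, dtype=float)
    type_names = sorted(set(cell_types.tolist()))

    # Assumption: each spatial cell is replaced by one randomly drawn
    # scRNA-seq cell of the same type (raises if a type has no match).
    picked = np.array(
        [rng.choice(np.flatnonzero(sc_types == t)) for t in cell_types]
    )

    # Coarse-graining: assign every cell to the square grid spot it falls in.
    grid = np.floor(xy / spot_size).astype(int)
    spot_keys, spot_idx = np.unique(grid, axis=0, return_inverse=True)
    spot_idx = spot_idx.ravel()
    n_spots = len(spot_keys)

    # Spot expression = sum of the substituted cells' expression profiles.
    spot_expr = np.zeros((n_spots, sc_expr.shape[1]))
    np.add.at(spot_expr, spot_idx, sc_expr[picked])

    # True proportions = per-spot cell counts by type, normalized to sum to 1.
    type_idx = np.array([type_names.index(t) for t in cell_types])
    counts = np.zeros((n_spots, len(type_names)))
    np.add.at(counts, (spot_idx, type_idx), 1)
    props = counts / counts.sum(axis=1, keepdims=True)
    return spot_keys, spot_expr, props, type_names
```

Because the spot assignment depends only on the coordinates and cell type labels, the true proportions in this sketch do not change when the expression source is swapped, which mirrors the statement that only the gene expression profiles differ between the two simulations.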
For cell type deconvolution, we use the scRNA-seq dataset (GSE102827) as the internal reference; this is the same dataset used for generating the simulated data. As the external reference, we continue to use the scRNA-seq dataset (GSE115746), consistent with our approach in the STARmap-based simulation analysis.
Notebooks for running SDePER and the corresponding deconvoluted cell type proportions are listed in folder Run_SDePER_on_simulation_data.
Single cells with the matched 12 cell types are included as reference.
Single cells from the GSE102827 dataset are used as the reference for deconvolution, so this setting is free of platform effect.
- NO platform effect removal by CVAE: see S1_ref_spatial_SDePER_NO_CVAE.ipynb
- WITH platform effect removal by CVAE: see S1_ref_spatial_SDePER_WITH_CVAE.ipynb
Single cells from the GSE115746 scRNA-seq dataset are used as the reference for deconvolution, so a platform effect exists.
- NO platform effect removal by CVAE: see S1_ref_scRNA_SDePER_NO_CVAE.ipynb
- WITH platform effect removal by CVAE: see S1_ref_scRNA_SDePER_WITH_CVAE.ipynb
In the current sequencing-based simulation, we synthesize spatial spots by aggregating neighboring single cells. This results in spots containing between 1 and 12 cells, with an average of 3.6 cells per spot.
We have also generated two additional simulation datasets with an increased number of cells per spot:
- 3x Setting: Approximately 3 to 36 cells per spot, with an average of 10.8 cells per spot.
- 6x Setting: Approximately 6 to 72 cells per spot, with an average of 21.6 cells per spot.
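As a quick arithmetic check, the figures for the two denser settings match the baseline statistics (1 to 12 cells, mean 3.6) multiplied by 3 and by 6. The helper `scale_density` below is a hypothetical name used only for this check; it does not reproduce how the datasets were actually generated, which is documented in the notebook.

```python
# Baseline cells-per-spot statistics reported for the original simulation.
base = {"min": 1, "max": 12, "mean": 3.6}

def scale_density(stats, factor):
    """Multiply each cells-per-spot statistic by an integer density factor."""
    return {name: round(value * factor, 1) for name, value in stats.items()}

for factor in (3, 6):
    print(f"{factor}x:", scale_density(base, factor))
```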
For the methods used to generate these datasets, please refer to generate_simulated_spatial_data.nb.html in folder Generate_high_density_simulation_data.
Notebooks for running SDePER and the corresponding deconvoluted cell type proportions are available in folder Run_SDePER_on_high_density_simulation_data.
Results are shown in generate_seq_based_high_density_figures.nb.html.