main.nf is the main file of the pipeline; it contains the call to the runmethods method
of the auxiliary pipeline run_methods.nf.
run_methods.nf calls the run method of the method passed as a parameter, which is defined,
for each method, in subworkflows/deconvolution/$methode/run_method.nf.
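The per-method layout described above can be sketched as a simple path lookup. This is only an illustration of the naming convention, assuming the directory layout quoted in these notes; `method_workflow_path` is a hypothetical helper and not part of the pipeline.

```python
from pathlib import PurePosixPath

# Root directory quoted in the notes above.
DECONVOLUTION_ROOT = PurePosixPath("subworkflows/deconvolution")


def method_workflow_path(method: str) -> PurePosixPath:
    """Return the per-method workflow file that run_methods.nf is said to call.

    Hypothetical helper: it only mirrors the path convention from the notes.
    """
    if not method or "/" in method:
        raise ValueError(f"invalid method name: {method!r}")
    return DECONVOLUTION_ROOT / method / "run_method.nf"


print(method_workflow_path("rctd"))
```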
Run the computations on Sila.
nextflow run -profile local,docker subworkflows_sm/subworkflows/data_generation/generate_data.nf -c my_config.config
snakemake -s subworkflows_sm/data_generation/generate_data.smk -c4 --config sc_input="standards/reference/silver_standard_1_brain_cortex.rds" dataset_type="real" reps=1 rootdir="." --use-singularity
To run:
nextflow run main.nf -profile local,docker --mode run_dataset --sc_input unit-test/test_sc_data.rds --sp_input unit-test/test_sp_data.rds --annot subclass --methods rctd
snakemake -s main.smk -c8 --config methods=cell2location sc_input="unit-test/test_sc_data.rds" sp_input="unit-test/test_sp_data.rds" output="res" --use-singularity
##############################
execute with gpu:
snakemake -s main.smk -c8 --config mode="run_dataset" methods=cell2location,rctd sc_input="unit-test/test_sc_data.rds" sp_input="unit-test/test_sp_data.rds" output="res" use_gpu="true" --use-singularity --singularity-args '\--nv'
execute with cpu:
snakemake -s main.smk -c8 --config mode="run_dataset" methods=cell2location,rctd sc_input="unit-test/test_sc_data.rds" sp_input="unit-test/test_sp_data.rds" output="res" use_gpu="false" --use-singularity
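The GPU and CPU commands above differ only in `use_gpu` and in the extra `--singularity-args '\--nv'`. A minimal sketch of how such a command line could be assembled from a config dictionary; `build_snakemake_command` is a hypothetical helper, and it uses only the flags and config keys that appear in these notes.

```python
import shlex


def build_snakemake_command(config: dict, cores: int = 8,
                            snakefile: str = "main.smk") -> list:
    """Assemble the argument list for a Snakemake run like the ones above."""
    cmd = ["snakemake", "-s", snakefile, f"-c{cores}", "--config"]
    cmd += [f"{key}={value}" for key, value in config.items()]
    cmd.append("--use-singularity")
    if config.get("use_gpu") == "true":
        # '\--nv' is copied verbatim from the GPU command in the notes.
        cmd += ["--singularity-args", r"\--nv"]
    return cmd


config = {
    "mode": "run_dataset",
    "methods": "cell2location,rctd",
    "sc_input": "unit-test/test_sc_data.rds",
    "sp_input": "unit-test/test_sp_data.rds",
    "output": "res",
    "use_gpu": "true",
}
# shlex.join quotes arguments so the line can be pasted into a shell.
print(shlex.join(build_snakemake_command(config)))
```

Switching `use_gpu` to `"false"` drops the `--singularity-args` pair, which matches the CPU command.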
# With the default parameters
Completed at: 13-Jun-2024 10:42:24
Duration : 3m 49s
CPU hours : 0.1
Succeeded : 8
With Snakemake: 55m 13s
Epoch 30000/30000: 100%|█| 30000/30000 [55:13<00:00, 9.05it/s, v_num=1, elbo_
Other metrics:
- Spearman and Pearson correlation (almost the same)
- normalised mean absolute error (NMAE): the mean absolute error divided by the mean of the true proportions
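The metrics listed above can be sketched with the standard library alone. This is a minimal illustration, not the pipeline's metric code: the function names and the example proportions are made up, and NMAE is read here as mean absolute error divided by the mean of the true proportions.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def nmae(true, pred):
    """Mean absolute error divided by the mean of the true proportions."""
    mae = mean(abs(t - p) for t, p in zip(true, pred))
    return mae / mean(true)


# Illustrative (made-up) proportions for a single spot.
true_props = [0.5, 0.3, 0.2]
pred_props = [0.4, 0.4, 0.2]
print(pearson(true_props, pred_props),
      spearman(true_props, pred_props),
      nmae(true_props, pred_props))
```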
Run the gold standard:
nextflow run main.nf -profile local,docker --mode run_standard --standard gold_standard_1 -c standards/standard.config --methods rctd
Run the gold standard with Snakemake:
snakemake -s main.smk -c12 --config mode="run_dataset" methods=rctd sc_input="standards/reference/gold_standard_1.rds" sp_input="standards/gold_standard_1/Eng2019_cortex_svz_fov5.rds" output="res" annot="celltype" use_gpu="true" --use-singularity --singularity-args '\--nv'
Runtime comparison, with the same parameters, using the GPU:
snakemake: real 7m19,582s
user 5m16,641s
sys 1m26,301s
Completed at: 13-Jun-2024 17:36:16
nextflow:
Duration : 6m 17s
CPU hours : 0.1
Succeeded : 8
real 6m21,932s
user 0m34,410s
sys 0m1,372s
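To compare these timings numerically, the `time` fields (printed with a decimal comma) can be converted to seconds. A minimal sketch; `parse_time_field` is a hypothetical helper, and the two values are the `real` times recorded above.

```python
import re

# Matches fields such as "7m19,582s" or "273m55.758s".
_TIME_RE = re.compile(r"^(?P<minutes>\d+)m(?P<seconds>\d+(?:[.,]\d+)?)s$")


def parse_time_field(text: str) -> float:
    """Convert a `time` field like '7m19,582s' to seconds."""
    match = _TIME_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised time field: {text!r}")
    seconds = float(match["seconds"].replace(",", "."))
    return int(match["minutes"]) * 60 + seconds


snakemake_real = parse_time_field("7m19,582s")
nextflow_real = parse_time_field("6m21,932s")
print(f"snakemake: {snakemake_real:.3f} s, nextflow: {nextflow_real:.3f} s")
print(f"ratio snakemake/nextflow: {snakemake_real / nextflow_real:.2f}")
```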
Execute a command inside a Docker container (through Singularity):
singularity exec "docker://csangara/seuratdisk:latest" Rscript ./scratch.R
to generate data:
snakemake -s main.smk -c12 --config mode="generate_data" sc_input="standards/reference/gold_standard_1.rds" dataset_type="real" reps=1 rootdir="." --use-singularity
Generate from the gold standard (1741 spots):
snakemake -s main.smk -c12 --config mode="generate_data" sc_input="standards/reference/gold_standard_1.rds" dataset_type="aud" reps=1 rootdir="." region_var="celltype_coarse" --use-singularity
Running ddls on the CPU with 7000 epochs and 5000 samples:
real 273m55,758s
user 627m30,081s
sys 13m9,822s
New best run: 12h30m, with 7000 epochs and 7000 samples
run with generated data:
snakemake -s main.smk -c8 --config mode="run_dataset" methods=rctd sc_input="standards/reference/gold_standard_1.rds" sp_input="synthetic_data_sm/gold_standard_1_artificial_uniform_distinct_rep1.rds" output="res" use_gpu="true" annot="celltype" --use-singularity --singularity-args '\--nv'
################################
cell2location with 1000 obs and 10000 genes
real 234m55,233s
user 99m7,484s
sys 1m18,247s
cell2location with 100 obs and 10000 genes
real 2 hours, 16 minutes
cell2location with 10000 obs and 10000 genes
real 3 hours, 37 minutes
rctd with 1000 spots and 10000 genes
real 41m1,962s
user 4m17,196s
sys 0m8,151s
rctd with 100 spots and 10000 genes
real 6m51,875s
user 1m44,752s
sys 0m5,522s
rctd with 10000 obs and 10000 genes
real 60m52,732s
user 6m54,860s
sys 0m20,528s
time Rscript subworkflows_sm/deconvolution/rctd/script_nf.R --sc_input datafiles_st_deconvolution/core_GBMap.rds --sp_input datafiles_st_deconvolution/UKF313_T_ST_1_raw.rds --annot cell_type --output proportions_rctd_sample6 -num_cores 24
Reading input scRNA-seq reference from datafiles_st_deconvolution/core_GBMap.rds
Found 17 cell types in the reference.
Converting to Reference object...
Warning message:
In Reference(counts = GetAssayData(seurat_obj_scRNA, slot = "counts"), :
Reference: number of cells per cell type is 127521, larger than maximum allowable of 10000. Downsampling number of cells to: 10000
Reading input spatial data from datafiles_st_deconvolution/UKF334_T_ST_1_raw.rds
Converting spatial data to SpatialRNA object...
'select()' returned 1:many mapping between keys and columns
oaak
Running RCTD with 24 cores...
[1] "Begin: process_cell_type_info"
[1] "process_cell_type_info: number of cells in reference: 74234"
[1] "process_cell_type_info: number of genes in reference: 28045"
Bcell                          1250
astrocyte                       173
dendriticcell                  3961
endothelialcell                 673
macrophage                    10000
malignantcell                 10000
mastcell                        373
matureTcell                   10000
microglialcell                10000
monocyte                      10000
muralcell                      1418
naturalkillercell              2489
neuron                           22
oligodendrocyte               10000
oligodendrocyteprecursorcell    496
plasmacell                      572
radialglialcell                2807
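The per-type counts printed above are consistent with the earlier log lines: 17 cell types, a cap of 10000 cells per type (from the warning message), and 74234 cells in the reference. A small sketch that re-checks this; the counts are copied from the log, while `cap_count` is only an illustration of the cap and not RCTD's own downsampling code.

```python
MAX_CELLS_PER_TYPE = 10000  # cap reported in the RCTD warning above

# Counts per cell type, copied from the log output.
counts_after_downsampling = {
    "Bcell": 1250, "astrocyte": 173,
    "dendriticcell": 3961, "endothelialcell": 673,
    "macrophage": 10000, "malignantcell": 10000,
    "mastcell": 373, "matureTcell": 10000,
    "microglialcell": 10000, "monocyte": 10000,
    "muralcell": 1418, "naturalkillercell": 2489,
    "neuron": 22, "oligodendrocyte": 10000,
    "oligodendrocyteprecursorcell": 496, "plasmacell": 572,
    "radialglialcell": 2807,
}


def cap_count(n_cells: int, cap: int = MAX_CELLS_PER_TYPE) -> int:
    """Number of cells kept for one type once the cap is applied."""
    return min(n_cells, cap)


total = sum(counts_after_downsampling.values())
capped_types = sorted(name for name, n in counts_after_downsampling.items()
                      if n == MAX_CELLS_PER_TYPE)

print(len(counts_after_downsampling), "cell types,", total, "cells")
print("types at the cap:", ", ".join(capped_types))
print("a type with 127521 cells keeps", cap_count(127521))
```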
ssh alagraoui@bb8.cbib.u-bordeaux.fr
singularity exec "docker://csangara/sp_rctd:latest" /bin/bash
time Rscript subworkflows_sm/deconvolution/rctd/script_nf.R --sc_input datafiles_st_deconvolution/core_GBMap.rds --sp_input datafiles_st_deconvolution/UKF243_T_ST_1_raw.rds --annot annotation_level_4 --output proportions_rctd_sample243 -num_cores 19
snakemake -s main.smk -c8 --config mode="run_dataset" methods=cell2location sc_input="datafiles_st_deconvolution/core_GBMap_chunk_1.rds" sp_input="datafiles_st_deconvolution/UKF243_T_ST_1_raw.rds" output="res" use_gpu="true" skip_metrics="true" annot="annotation_level_4" --use-singularity --singularity-args '\--nv'
snakemake -s main.smk -c8 --config mode="generate_vis" sp_input="UKF243_T_ST_1_raw.rds" output="vis_output" norm_weights_filepath="res_rctd_cluster/proportions_rctd_sample243" st_coords_filepath="tissue_positions_list_243.csv" data_clustered="seurat_metadata_UKF243_T_ST.csv" image_path="original_tissue_images/tissue_hires_image_243.png" scale_factor='0.24414062'
snakemake -s main.smk -c8 --config mode="generate_vis" sp_input="UKF243_T_ST_1_raw.rds" output="vis_output" norm_weights_filepaths="res_rctd_cluster/proportions_rctd_sample243,res/proportions_cell2location_UKF243_T_ST_1_raw_001_chunk_1.tsv" st_coords_filepath="tissue_positions_list_243.csv" data_clustered="seurat_metadata_UKF243_T_ST.csv" image_path="original_tissue_images/tissue_hires_image_243.png" scale_factor='0.24414062' deconv_methods=rctd,cell2location
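In the second `generate_vis` command, `deconv_methods` and `norm_weights_filepaths` are both comma-separated and appear in the same order. A minimal sketch of pairing them by position, which is an assumption about how the pipeline reads them that the notes do not confirm; `pair_methods_with_files` is a hypothetical helper.

```python
def pair_methods_with_files(deconv_methods: str,
                            norm_weights_filepaths: str) -> dict:
    """Pair each method with its proportions file, by position in the lists."""
    methods = [m.strip() for m in deconv_methods.split(",") if m.strip()]
    paths = [p.strip() for p in norm_weights_filepaths.split(",") if p.strip()]
    if len(methods) != len(paths):
        raise ValueError(
            f"{len(methods)} methods but {len(paths)} proportion files"
        )
    return dict(zip(methods, paths))


pairs = pair_methods_with_files(
    "rctd,cell2location",
    "res_rctd_cluster/proportions_rctd_sample243,"
    "res/proportions_cell2location_UKF243_T_ST_1_raw_001_chunk_1.tsv",
)
for method, path in pairs.items():
    print(f"{method}: {path}")
```

Failing early on a length mismatch avoids silently plotting one method's proportions under another method's name.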
source activate cell2loc_env && python subworkflows_sm/deconvolution/cell2location/build_model.py core_GBMap.h5ad cpu -a annotation_level_4 -o res
python subworkflows_sm/deconvolution/cell2location/fit_model.py UKF243_T_ST_1_raw.h5ad res/sc.h5ad cpu -e 100 -o res -m true && mv res/proportions.tsv res/tst.tsv
python subworkflows_sm/deconvolution/cell2location/fit_model.py UKF243_T_ST_1_raw.h5ad /abderahim/data/sc.h5ad 0 -e 30000 -o res -m true && mv res/proportions.tsv res/proportions_cell2location_ab.tsv
snakemake -s main.smk -c8 --config mode="run_dataset" methods=cell2location sc_input="unit-test/test_sc_data.rds" sp_input="unit-test/test_sp_data.rds" output="res" use_gpu="true" skip_metrics="true" annot="subclass" map_genes='false' load_model="true" model_path="res/sc_test_sc_data_test_sp_data.h5ad" --use-singularity --singularity-args '\--nv'