- Download or create the clinical data, then put the data file and the map file in a known directory on the server.
- Edit and save the clinical.params file. (Parameter definitions are listed in the Params List section.)
- Load clinical data.
./load_clinical.sh clinical.params
- Check whether the GPL is already in tranSMART.
./check_gpl.sh
- If the GPL is already in the list, skip the next 3 steps.
- (Optional) Download the annotation file from the tranSMART dataset.
- (Optional) Edit and save the annotation.params file.
- (Optional) Load annotation data.
./load_annotation.sh annotation.params
- Edit and save the expression.params file.
- Load expression data.
./load_expression.sh expression.params
Done!
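
The GPL check is the only branch in the steps above. A minimal sketch of that branch in shell, assuming `check_gpl.sh` prints one platform ID per line (its real output format is not documented here, so adjust the match if it differs). `gpl_is_loaded` and `load_study` are hypothetical helper names, not part of the provided scripts.

```shell
# Hypothetical helper: succeeds if the GPL ID given as $1 appears as a
# whole line in the list read from stdin.
# Assumption: the list contains one platform ID per line.
gpl_is_loaded() {
    grep -qxF -- "$1"
}

# Hypothetical driver for the steps above. Run it from the directory
# that holds the load_*.sh scripts and the *.params files.
load_study() {
    gpl_id=$1
    ./load_clinical.sh clinical.params || return 1
    if ./check_gpl.sh | gpl_is_loaded "$gpl_id"; then
        echo "$gpl_id is already loaded, skipping annotation"
    else
        ./load_annotation.sh annotation.params || return 1
    fi
    ./load_expression.sh expression.params
}
```

With the params files edited, `load_study GPL6480` would run the three loaders in order and skip the annotation step when the platform is already present. The whole-line match (`-x`) keeps `GPL6480` from matching a longer ID such as `GPL64800`.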
- load_clinical.sh
- load_expression.sh
- load_annotation.sh
- clinical.params
- expression.params
- annotation.params
# data
DATA_LOCATION="/home/transmart/datasets/RanchoGSE4698/clinical"
COLUMN_MAP_FILE="Acute_Lymphoblastic_Leukemia_Kirschner_Schwabe_GSE4698_Mapping_File.txt"
# info
STUDY_ID="GSE4698"
TOP_NODE="\\Public Studies\\Acute Lymphoblastic Leukemia_Kirschner_Schwabe_GSE4698"
# security
SECURITY_REQUIRED="N"
# not using
WORD_MAP_FILE=x
RECORD_EXCLUSION_FILE=x
Field Name | Meaning |
---|---|
DATA_LOCATION | Location of clinical data folder. |
COLUMN_MAP_FILE | Map file. |
STUDY_ID | Study id. |
TOP_NODE | Top node. (TOP_NODE=\\TOP_NODE_PREFIX\\STUDY_NAME) |
SECURITY_REQUIRED | Is private? |
WORD_MAP_FILE | Word map file. |
RECORD_EXCLUSION_FILE | Record exclusion file. |
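
The params files look like shell fragments, so the doubled backslashes in a double-quoted `TOP_NODE` each become a single backslash once the file is read. A small sketch of the `\\TOP_NODE_PREFIX\\STUDY_NAME` pattern follows. `TOP_NODE_PREFIX` and `STUDY_NAME` are illustrative variables here, and whether the loader reads anything other than `TOP_NODE` is an assumption.

```shell
# Illustrative variables; the loader itself may only read TOP_NODE.
TOP_NODE_PREFIX="Public Studies"
STUDY_NAME="Acute Lymphoblastic Leukemia_Kirschner_Schwabe_GSE4698"

# Inside double quotes, each \\ is one literal backslash.
TOP_NODE="\\${TOP_NODE_PREFIX}\\${STUDY_NAME}"

printf '%s\n' "$TOP_NODE"
# prints: \Public Studies\Acute Lymphoblastic Leukemia_Kirschner_Schwabe_GSE4698
```

Printing the value this way is a quick check that the node path has exactly one backslash before each level.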
# data
DATA_LOCATION="/home/transmart/datasets/RanchoGSE4698/expression"
DATA_FILE_PREFIX="Acute_Lymphoblastic_Leukemia_Kirschner_Schwabe_GSE4698_Gene_Expression_Data"
MAP_FILENAME=\
"Acute_Lymphoblastic_Leukemia_"\
"Kirschner_Schwabe_GSE4698_"\
"Subject_Sample_Mapping_File.txt"
# info
STUDY_ID="GSE4698"
TOP_NODE="\\Public Studies\\Acute Lymphoblastic Leukemia_Kirschner_Schwabe_GSE4698"
SOURCE_CD=""
# security
SECURITY_REQUIRED="N"
Field Name | Meaning |
---|---|
DATA_LOCATION | Location of expression data folder. |
DATA_FILE_PREFIX | Prefix of data file. |
MAP_FILENAME | Map file name. |
STUDY_ID | Study id. |
TOP_NODE | Top node. (TOP_NODE=\\TOP_NODE_PREFIX\\STUDY_NAME) |
SOURCE_CD | Source to include. Corresponds to the SOURCE_CD field in the map file. Default value: STD |
SECURITY_REQUIRED | Is private? |
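
The `SOURCE_CD` default can be pictured as an ordinary shell fallback. This is a sketch of the behavior described in the table, not the loader's actual code. How the script applies the default internally is an assumption, and `EFFECTIVE_SOURCE_CD` is a hypothetical name.

```shell
# As in the example expression.params: left empty on purpose.
SOURCE_CD=""

# ${VAR:-default} substitutes the default when VAR is unset or empty.
EFFECTIVE_SOURCE_CD="${SOURCE_CD:-STD}"

printf '%s\n' "$EFFECTIVE_SOURCE_CD"
# prints: STD
```
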
# data
DATA_LOCATION="/home/transmart/datasets/EtriksGSE43696/annotation"
SOURCE_FILENAME="GPL6480.txt"
# info
ANNOTATION_TITLE="Agilent-014850 Whole Human Genome Microarray 4x44K G4112F (Probe Name Version)"
GPL_ID="GPL6480"
# col numbers
PROBE_COL=2
GENE_SYMBOL_COL=3
GENE_ID_COL=4
ORGANISM_COL=5
# header?
SKIP_ROWS=0
Field Name | Meaning |
---|---|
DATA_LOCATION | Location of annotation data folder. |
SOURCE_FILENAME | Data file name. |
ANNOTATION_TITLE | Title. (copy from xz) |
GPL_ID | GPL id. |
PROBE_COL | Column index of the probe ID. |
GENE_SYMBOL_COL | Column index of the gene symbol. |
GENE_ID_COL | Column index of the gene ID. |
ORGANISM_COL | Column index of the organism. |
SKIP_ROWS | Number of rows to skip. Note: the script does not assume a header row is present. If the file has a header row, set this to 1. |
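
Before loading, it can help to preview what the column settings select. The sketch below assumes the annotation file is tab-delimited and that column numbers start at 1. Neither is stated above, so verify both against your file. `preview_annotation` is a hypothetical helper, not one of the provided scripts.

```shell
# Hypothetical helper: print the four annotation columns named by the
# *_COL settings, skipping the first SKIP_ROWS lines of the file in $1.
# Assumptions: tab-delimited input, 1-based column numbers.
preview_annotation() {
    awk -F '\t' -v OFS='\t' -v skip="$SKIP_ROWS" \
        -v p="$PROBE_COL" -v s="$GENE_SYMBOL_COL" \
        -v g="$GENE_ID_COL" -v o="$ORGANISM_COL" '
        NR > skip { print $p, $s, $g, $o }
    ' "$1"
}
```

After sourcing `annotation.params`, `preview_annotation "$DATA_LOCATION/$SOURCE_FILENAME" | head` prints probe ID, gene symbol, gene ID and organism for the first rows. If the first printed row shows column titles, `SKIP_ROWS` is too low.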