Sample dataset and preprocessing

In this tutorial, we analyze 1M reads from an Arabidopsis thaliana leaf RNA-seq dataset (SRR7947123) from Zhao et al., 2018. The example data are available from the CyVerse Data Store.

Input Data:

Input	Description	Example
Leaf RNA-seq data	1M-read subset of SRR7947123	iplantcollaborative > example_data > HAMR_tutorial > fastqfiles

Preprocessing

Evaluate the quality of your sequencing data using FastQC

Preprocessing starts by assessing the quality of the raw reads to identify possible sequencing errors or biases. FastQC provides an overview of the data quality.
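To make "read quality" concrete, here is a minimal sketch of how a per-read mean quality score can be computed from FASTQ records. It assumes Phred+33 encoding and strict four-line records; the function names and the toy record are hypothetical and are not part of FastQC or of the tutorial data.

```python
from statistics import mean


def mean_phred(quality, offset=33):
    """Mean Phred score of one FASTQ quality string (Phred+33 assumed)."""
    return mean(ord(ch) - offset for ch in quality)


def read_fastq(lines):
    """Yield (read_id, sequence, quality) from FASTQ lines, four lines per record."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between or after records
        sequence = next(it).strip()
        next(it)  # the '+' separator line
        quality = next(it).strip()
        yield header.strip()[1:], sequence, quality


# Toy record invented for illustration; not taken from SRR7947123.
toy = ["@read1", "ACGT", "+", "II5!"]
for read_id, sequence, quality in read_fastq(toy):
    print(read_id, len(sequence), mean_phred(quality))
```

FastQC reports far more than this (per-base quality, adapter content, duplication levels, and so on); the sketch only illustrates what a quality score is. It does not handle truncated files or multi-line records.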

Log in to the Discovery Environment.
Click the "Apps" tab in the Discovery Environment and search for "fastqc".
Click the app icon.
Change the name of the analysis and the output folder as needed, or keep the defaults.
Under "Input", click Add to provide the input files. The sample dataset is located at iplantcollaborative > example_data > HAMR_tutorial > fastqfiles. Check both files and click 'OK'.
In the next section, "Resource Requirements", request resources as needed or keep the defaults.
Click Launch Analysis. You will receive a notification that the job has been submitted and is running. Click the Analyses tab to check the status of your job. When the analysis completes, open the three-dots menu on the right and click 'Go to output folder' to access your output files.
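The steps above run FastQC through the Discovery Environment app. For readers who also have FastQC installed locally, the following sketch shows a roughly equivalent command-line invocation. It assumes the `fastqc` executable is on your PATH; the file names `reads_1.fastq` and `reads_2.fastq`, the output folder name, and the helper functions are placeholders for illustration, not the actual names used in the tutorial dataset.

```python
import shutil
import subprocess
from pathlib import Path


def build_fastqc_command(fastq_files, outdir, threads=1):
    """Build the argument list for a FastQC run (nothing is executed here)."""
    return ["fastqc", "--outdir", str(outdir), "--threads", str(threads),
            *map(str, fastq_files)]


def run_fastqc(fastq_files, outdir, threads=1):
    """Run FastQC locally; assumes the fastqc executable is installed."""
    if shutil.which("fastqc") is None:
        raise RuntimeError("fastqc was not found on PATH")
    # FastQC expects the output directory to exist already.
    Path(outdir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_fastqc_command(fastq_files, outdir, threads), check=True)


# Placeholder file names; substitute the FASTQ files you downloaded.
print(" ".join(build_fastqc_command(["reads_1.fastq", "reads_2.fastq"], "fastqc_out")))
```

Only the command is printed here; call `run_fastqc` yourself once FastQC is installed and the input files are in place.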

Output/Results

Output	Description	Example
html and zip files	FastQC report	SRR7947123_1M_fastqc.html

Description of output and results

Open the html report files and check whether your sequencing data has any red flags that you should be aware of. For more details on each module of the FastQC report, see the FastQC documentation.
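Besides the html report, each FastQC zip archive contains a `summary.txt` file with one tab-separated line per module (status, module name, file name). The sketch below lists the modules that were flagged WARN or FAIL, which can be handy when checking several reports. The helper names and the sample text are made up for illustration, and the statuses shown are not results from SRR7947123.

```python
import zipfile


def parse_summary(text):
    """Parse summary.txt content into (status, module) pairs."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) >= 2:
            rows.append((parts[0], parts[1]))
    return rows


def flagged_modules(text):
    """Return only the modules that FastQC marked WARN or FAIL."""
    return [(status, module) for status, module in parse_summary(text)
            if status in ("WARN", "FAIL")]


def read_summary_from_zip(zip_path):
    """Read summary.txt from a FastQC zip archive (path or file-like object)."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("summary.txt"):
                return archive.read(name).decode("utf-8")
    raise FileNotFoundError("summary.txt not found in archive")


# Invented example content; real statuses depend on your data.
example = (
    "PASS\tBasic Statistics\texample.fastq\n"
    "WARN\tPer base sequence content\texample.fastq\n"
    "FAIL\tSequence Duplication Levels\texample.fastq\n"
)
for status, module in flagged_modules(example):
    print(status, module)
```

After downloading a report archive from the output folder, `flagged_modules(read_summary_from_zip(path_to_zip))` would give the same kind of list for your own data.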

Fix or improve this documentation

Search for an answer: CyVerse Learning Center
Ask us for help: click Intercom on the lower right-hand side of the page
Report an issue or submit a change: Github Repo Link
Send feedback: learning@CyVerse.org

Learning Center Home
