Training Data for "PlugNSeq: An Easy, Rapid, and Streamlined mRNA-Seq Data Analysis Pipeline Empowering Insightful Exploration with Well-Annotated Organisms, Requiring Minimal Bioinformatic Expertise"
Date
2024
Publisher
protocols.io
Abstract
Here, we provide training data for the PlugNSeq mRNA-Seq data analysis pipeline. The dataset comprises 24 gzip-compressed FASTQ files containing the mRNA-Seq reads of a rice experiment (Kar, S., Mai, HJ. et al., 2024, doi: 10.1093/pcp/pcab018).
The experimental setup is as follows: Plants of the two rice accessions "Hacha" and "Lachit" were grown hydroponically in control medium and then exposed to either control conditions or excess iron for three days. We harvested the leaves, extracted total RNA and, after mRNA enrichment, performed RNA-Seq (Illumina).
The files represent the following samples:
[sample_file_name] >>> [accession] - [treatment] - [replicate] - [direction]
4_S5_L003_R1_001.fastq.gz >>> Hacha - excess Fe - 1 - forward
4_S5_L003_R2_001.fastq.gz >>> Hacha - excess Fe - 1 - reverse
5_S6_L003_R1_001.fastq.gz >>> Hacha - excess Fe - 2 - forward
5_S6_L003_R2_001.fastq.gz >>> Hacha - excess Fe - 2 - reverse
6_S7_L003_R1_001.fastq.gz >>> Hacha - excess Fe - 3 - forward
6_S7_L003_R2_001.fastq.gz >>> Hacha - excess Fe - 3 - reverse
10_S11_L003_R1_001.fastq.gz >>> Lachit - excess Fe - 1 - forward
10_S11_L003_R2_001.fastq.gz >>> Lachit - excess Fe - 1 - reverse
11_S12_L003_R1_001.fastq.gz >>> Lachit - excess Fe - 2 - forward
11_S12_L003_R2_001.fastq.gz >>> Lachit - excess Fe - 2 - reverse
12_S13_L003_R1_001.fastq.gz >>> Lachit - excess Fe - 3 - forward
12_S13_L003_R2_001.fastq.gz >>> Lachit - excess Fe - 3 - reverse
16_S17_L004_R1_001.fastq.gz >>> Hacha - control - 1 - forward
16_S17_L004_R2_001.fastq.gz >>> Hacha - control - 1 - reverse
17_S18_L004_R1_001.fastq.gz >>> Hacha - control - 2 - forward
17_S18_L004_R2_001.fastq.gz >>> Hacha - control - 2 - reverse
18_S19_L004_R1_001.fastq.gz >>> Hacha - control - 3 - forward
18_S19_L004_R2_001.fastq.gz >>> Hacha - control - 3 - reverse
22_S23_L004_R1_001.fastq.gz >>> Lachit - control - 1 - forward
22_S23_L004_R2_001.fastq.gz >>> Lachit - control - 1 - reverse
23_S24_L004_R1_001.fastq.gz >>> Lachit - control - 2 - forward
23_S24_L004_R2_001.fastq.gz >>> Lachit - control - 2 - reverse
24_S25_L004_R1_001.fastq.gz >>> Lachit - control - 3 - forward
24_S25_L004_R2_001.fastq.gz >>> Lachit - control - 3 - reverse
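The listing above follows a fixed pattern, so it can be turned into a small sample table programmatically. The following is a minimal Python sketch, not part of PlugNSeq; the names `Sample` and `parse_sample_line` are hypothetical and only illustrate how one line of the listing splits into its fields.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    """One entry of the listing (hypothetical helper type, not part of PlugNSeq)."""
    file_name: str
    accession: str
    treatment: str
    replicate: int
    direction: str


def parse_sample_line(line: str) -> Sample:
    """Split a 'file >>> accession - treatment - replicate - direction' line."""
    # The file name and the description are separated by '>>>'.
    file_name, description = (part.strip() for part in line.split(">>>", 1))
    # The description fields are separated by ' - ' (space, hyphen, space).
    accession, treatment, replicate, direction = (
        field.strip() for field in description.split(" - ")
    )
    return Sample(file_name, accession, treatment, int(replicate), direction)


sample = parse_sample_line(
    "4_S5_L003_R1_001.fastq.gz >>> Hacha - excess Fe - 1 - forward"
)
print(sample.accession, sample.treatment, sample.replicate, sample.direction)
```

Applying the function to every line of the listing gives one record per file, which can then be grouped by accession and treatment.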
The total size of all 24 read files is ca. 57 GB; the average individual file size is ca. 2.4 GB. These are paired-end reads: files whose names differ only in "R1" (forward reads) or "R2" (reverse reads) form a pair. If you want to check how preprocessing and analysis of single-end reads work, use only the files with "R1" in the name. Files without an "R2" partner file are automatically treated as single-end reads by the scripts in the pipeline. The original data included root samples, which are not included here.
Furthermore, we provide sample configuration files, which the user must fill in after preprocessing and quantification have finished. You may copy the respective file to the destination folder or use it as a template for your own values. Important: We provide one "configuration_samples.xlsx" file for analysis in paired-end mode (if you use all compressed read files provided here) and an equally named file for analysis in single-end mode (if you use only files with "R1" but not "R2" in the file name). In the configuration file for single-end mode, the TSV files listed in column A all have "R1" in the file name, whereas those in column A of the configuration file for paired-end mode all have "RX" in the file name instead.
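The pairing rule described above can be sketched in a few lines of Python. This is only an illustration of the naming convention, not the code used by the PlugNSeq scripts; the function name `group_reads` and the regular expression are assumptions made for this example.

```python
import re
from collections import defaultdict

# Matches the read tag '_R1_' or '_R2_' in Illumina-style file names.
READ_TAG = re.compile(r"_(R[12])_")


def group_reads(file_names):
    """Group FASTQ names by sample; a sample without an R2 file is single-end."""
    groups = defaultdict(dict)
    for name in file_names:
        match = READ_TAG.search(name)
        if match is None:
            continue  # name does not carry an R1/R2 tag
        # Everything except the read tag identifies the sample.
        key = (name[:match.start()], name[match.end():])
        groups[key][match.group(1)] = name
    modes = {}
    for key, reads in groups.items():
        if "R1" in reads and "R2" in reads:
            modes[key] = ("paired-end", reads["R1"], reads["R2"])
        elif "R1" in reads:
            modes[key] = ("single-end", reads["R1"], None)
        # An R2 file without its R1 partner is ignored in this sketch.
    return modes


files = [
    "4_S5_L003_R1_001.fastq.gz",
    "4_S5_L003_R2_001.fastq.gz",
    "5_S6_L003_R1_001.fastq.gz",
]
modes = group_reads(files)
for key, (mode, forward, reverse) in modes.items():
    print(key, mode, forward, reverse)
```

In this example the first sample has both partner files and is reported as paired-end, while the second has only an "R1" file and is reported as single-end.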
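The difference between the two configuration files can be illustrated with a short Python sketch. The exact TSV file names are produced by the pipeline and are not specified here, so the name used below ("4_S5_L003_R1_001.tsv") is hypothetical, as is the helper `config_name`; the sketch only shows the "R1" versus "RX" convention.

```python
def config_name(r1_based_name: str, paired_end: bool) -> str:
    """Return the name with 'R1' kept (single-end) or replaced by 'RX' (paired-end)."""
    if "_R1_" not in r1_based_name:
        raise ValueError("expected a name containing '_R1_'")
    if paired_end:
        # In paired-end mode, the entries in column A carry 'RX' instead of 'R1'.
        return r1_based_name.replace("_R1_", "_RX_", 1)
    return r1_based_name


# Hypothetical TSV name, used for illustration only.
print(config_name("4_S5_L003_R1_001.tsv", paired_end=True))
print(config_name("4_S5_L003_R1_001.tsv", paired_end=False))
```

If the names in column A of the configuration file do not match the files present after quantification, check first whether the file for the correct mode was copied.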
With the provided data, and given enough free disk space on your machine, you can easily explore the simplicity and efficiency of the PlugNSeq pipeline without requiring your own data.
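Before downloading, it may help to check the free disk space. The following minimal Python sketch compares the free space at a given path with the ca. 57 GB stated above (24 files of ca. 2.4 GB each). The helper `has_enough_space` is hypothetical, and the threshold covers the download only; the space needed for intermediate and output files of the pipeline is not stated here and must be added on top.

```python
import shutil


def has_enough_space(path: str = ".", required_gb: float = 57.0) -> bool:
    """Return True if the file system holding 'path' has at least 'required_gb' free."""
    free_bytes = shutil.disk_usage(path).free
    # Assumption: 1 GB is taken as 10**9 bytes.
    return free_bytes >= required_gb * 10**9


if has_enough_space("."):
    print("Enough free space for the compressed read files.")
else:
    print("Not enough free space for the compressed read files.")
```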
Keywords
PlugNSeq, mRNA-Seq, preprocessing, alignment, pseudoalignment, quantification, statistical analysis, plots, graphs, tables, pairwise comparisons, differentially expressed genes, DEG