Trout1 iSeq Test bioinformatics
Raw reads
We retrieved four files from the iSeq, one of which contains the 1x150 bp reads. The file with the raw reads of interest is in /project/microbiome/data_queue/seq/trout1/rawreads:
Trout-Pool3_S1_L001_R1_001.fastq.gz (273M) – 457,726,974 reads (1.5 GBytes uncompressed)
gunzip Trout-Pool3_S1_L001_R1_001.fastq.gz
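A quick way to sanity-check a read count is to count lines and divide by four, since each FASTQ record spans four lines. The sketch below runs on a toy file created in a temporary directory (the file name and its contents are hypothetical, not the real data) and assumes sequences are not wrapped across multiple lines.

```shell
# Hypothetical toy file standing in for the real gzipped FASTQ.
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip -c > "$tmpdir/toy.fastq.gz"

# Each FASTQ record is four lines (header, sequence, '+', qualities),
# so the number of reads is the line count divided by 4.
lines=$(gzip -dc "$tmpdir/toy.fastq.gz" | wc -l)
reads=$((lines / 4))
echo "reads: $reads"
```

Streaming through `gzip -dc` gives the count without having to decompress the file on disk first.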
Demultiplexing
Split into 1,000,000-line files (250,000 reads each)
mkdir /gscratch/grandol1/trout1
cd /gscratch/grandol1/trout1
cat /project/microbiome/data_queue/seq/trout1/Trout1_Pool3_S1_L001_R1_001.fastq | split -l 1000000 -d --suffix-length=3 --additional-suffix=.fastq - Trout1_Pool3_
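Because a FASTQ record is four lines long, the split only keeps records intact when the value passed to `-l` is a multiple of four. The following is a minimal sketch of that check on a hypothetical three-record toy file, using the same GNU `split` flags as above with a smaller chunk size; the file names are made up for illustration.

```shell
# Hypothetical toy FASTQ with three records (12 lines).
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n@r3\nGGCC\n+\nIIII\n' > "$tmpdir/toy.fastq"

# Same flags as the real command, but 8 lines (2 records) per chunk.
# -d, --suffix-length and --additional-suffix require GNU coreutils split.
split -l 8 -d --suffix-length=3 --additional-suffix=.fastq "$tmpdir/toy.fastq" "$tmpdir/toy_"

# Count chunks whose line count is not a multiple of 4 (a broken record).
bad=0
chunks=0
for f in "$tmpdir"/toy_*.fastq; do
  n=$(wc -l < "$f")
  chunks=$((chunks + 1))
  if [ $((n % 4)) -ne 0 ]; then
    bad=$((bad + 1))
  fi
done
echo "chunks: $chunks, chunks with partial records: $bad"
```

The same loop can be pointed at the real chunk files to confirm that none of them ends in the middle of a record.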
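Replace underscores with hyphens and strip extraneous spaces and trailing fields (note: without the g flag, the first sed replaces only the first underscore on each line)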
sed 's/_/-/' Trout1Pool3_Demux.csv > Trout1Pool3_Demux1.csv
sed -E 's/^([[:alnum:]-]+),([[:alnum:]-]+),([[:alnum:]-]+).*/\1,\2,\3/' Trout1Pool3_Demux1.csv > Trout1Pool3_Demux_fixed.csv
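To see what the two sed commands do to a line, here is a small sketch on a made-up two-line CSV. The column layout and sample names are assumptions for illustration only; the real Trout1Pool3_Demux.csv may differ. It also compares the result with a variant that adds the `g` flag to the first substitution, which is not what was run above.

```shell
# Hypothetical demux-style rows: two barcode-like fields, then a sample name.
tmpdir=$(mktemp -d)
printf 'ACGTACGT,TTGACCAA,Trout_001 ,note\nACGTACGT,GGCCTTAA,Trout_002_b\n' > "$tmpdir/demux.csv"

# Pattern from the second sed command: keep the first three fields
# (letters, digits, hyphens only) and drop whatever follows.
keep3='s/^([[:alnum:]-]+),([[:alnum:]-]+),([[:alnum:]-]+).*/\1,\2,\3/'

# As run above: only the first underscore on each line becomes a hyphen.
first_only=$(sed 's/_/-/' "$tmpdir/demux.csv" | sed -E "$keep3")

# Variant with the g flag: every underscore becomes a hyphen.
all_underscores=$(sed 's/_/-/g' "$tmpdir/demux.csv" | sed -E "$keep3")

printf 'first underscore only:\n%s\n' "$first_only"
printf 'all underscores (g flag):\n%s\n' "$all_underscores"
```

In this toy case the name with two underscores is cut off at the second underscore when only the first one is replaced, because the third field stops at the first character that is not a letter, digit, or hyphen. Whether that matters depends on whether any real sample names contain more than one underscore.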
Parse split files
/project/microbiome/analyses/gtl/HMAX1/demultiplex/run_parsebarcodes_onSplitInput.pl
Recombine by sample name and mid
./run_splitFastq_gbs.sh