5TroutExp bioinformatics
I split the 5Trout1Exp raw reads into chunks for faster processing:
mkdir -p /gscratch/grandol1/5Trout1Exp/rawreads
cd /gscratch/grandol1/5Trout1Exp/rawreads
unpigz --to-stdout /project/gtl/data/distribution/Wagner/Rosenthall/5Trout/1/Exp/Exp5Trout1_S1_R1_001.fastq.gz | split -l 16000000 -d --suffix-length=3 --additional-suffix=.fastq - 5Trout1Exp_
This produced 236 files.
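A note on the chunk size: FASTQ stores each read as 4 lines, so splitting at 16000000 lines puts 4,000,000 complete reads in each chunk and never cuts a record in half. The sketch below checks that property on a small made-up FASTQ (toy.fastq and the toy_ prefix are hypothetical names, not files from this project). It assumes GNU split, which the -d and --additional-suffix options above already require.

```shell
# Toy check: a chunk size that is a multiple of 4 keeps FASTQ records whole.
# toy.fastq and the toy_ prefix are made-up names for illustration only.
tmpdir=$(mktemp -d)

# Build a 5-read FASTQ (20 lines).
for i in 1 2 3 4 5; do
  printf '@read%s\nACGT\n+\nIIII\n' "$i"
done > "$tmpdir/toy.fastq"

# Same options as the real command, but 8 lines (2 reads) per chunk.
split -l 8 -d --suffix-length=3 --additional-suffix=.fastq \
  "$tmpdir/toy.fastq" "$tmpdir/toy_"

# Count the chunks and flag any whose line count is not divisible by 4.
n_chunks=0
bad_chunks=0
for f in "$tmpdir"/toy_*.fastq; do
  n=$(wc -l < "$f")
  n_chunks=$((n_chunks + 1))
  if [ $((n % 4)) -ne 0 ]; then
    bad_chunks=$((bad_chunks + 1))
  fi
done
echo "chunks: $n_chunks, chunks with partial records: $bad_chunks"
```

The same loop, pointed at the real chunk files, would be a quick sanity check before parsing.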
I split the 5Trout3 (Std) raw reads the same way:
mkdir -p /gscratch/grandol1/5Trout3/rawreads
cd /gscratch/grandol1/5Trout3/rawreads
unpigz --to-stdout /project/gtl/data/distribution/Wagner/Rosenthall/5Trout/3/Std/5Trout3_S1_R1_001.fastq.gz | split -l 16000000 -d --suffix-length=3 --additional-suffix=.fastq - 5Trout1Std_
This also produced 236 files.
Demultiplexing
In /project/gtl/data/RHIR1/rawreads/, I removed extraneous spaces from the file that maps MIDs to individual identifiers (RHIR1_Demux.csv) and saved a fixed version (RHIR1Demux_fixed.csv):
sed -E 's/^([[:alnum:]-]+),([[:alnum:]-]+),([[:alnum:]-]+).*/\1,\2,\3/' RHIR1_Demux.csv > RHIR1Demux_fixed.csv
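To show what that sed expression does, here is a small sketch on two made-up rows (the MID, barcode, and sample values are hypothetical; the real contents of RHIR1_Demux.csv are not shown in these notes). It keeps the first three comma-separated fields and drops whatever trails the third one, such as spaces or tabs. One caveat: the fields may contain only letters, digits, and hyphens, so a sample name with an underscore would be cut off at the underscore, and a line with a space inside one of the first two fields would not match and would pass through unchanged.

```shell
# Two hypothetical rows with trailing whitespace after the third field.
fixed=$(printf 'MID1,ACGTACGT,RH-001   \nMID2,TTGACCAA,RH-002\t\n' |
  sed -E 's/^([[:alnum:]-]+),([[:alnum:]-]+),([[:alnum:]-]+).*/\1,\2,\3/')

# Print the cleaned rows; the trailing spaces and the tab are gone.
printf '%s\n' "$fixed"
```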
cd /gscratch/grandol1/RHIR1/rawreads/
Parse the split files:
/project/gtl/data/raw/5Trout1Exp/demultiplex/run_parsebarcodes_onSplitInput.pl
Recombine by sample name and MID:
/project/gtl/data/raw/RHIR1/demultiplex/run_splitFastq_gbs.sh