Bioinformatics v2 for 5RM1

1st Attempt:

cat ./*_R1_001.fastq > 5RM1_R1.fastq

cat ./*_R2_001.fastq > 5RM1_R2.fastq

gzip 5RM1_R1.fastq

gzip 5RM1_R2.fastq
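
Since the concatenation above pools several per-lane files, it can be worth confirming that the R1 and R2 files still hold the same number of records before moving on. The following is only a sketch on toy data: the helper name fastq_records and the toy file names are hypothetical, and it assumes plain four-line FASTQ records.

```shell
# Hypothetical helper: count FASTQ records (4 lines each) in a plain or gzipped file.
# gzip -cdf passes uncompressed input through unchanged, so both cases work.
fastq_records() {
    n_lines=$(gzip -cdf -- "$1" | wc -l)
    echo $(( n_lines / 4 ))
}

# Toy stand-ins for 5RM1_R1.fastq and 5RM1_R2.fastq (two records each).
demo=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "$demo/toy_R1.fastq"
printf '@r1\nTGCA\n+\nIIII\n@r2\nCCAT\n+\nIIII\n' > "$demo/toy_R2.fastq"
gzip "$demo/toy_R1.fastq" "$demo/toy_R2.fastq"

n1=$(fastq_records "$demo/toy_R1.fastq.gz")
n2=$(fastq_records "$demo/toy_R2.fastq.gz")

if [ "$n1" -eq "$n2" ]; then
    echo "paired: $n1 records in each file"
else
    echo "MISMATCH: R1=$n1 R2=$n2" >&2
fi
```

On the real data the same function could be pointed at 5RM1_R1.fastq.gz and 5RM1_R2.fastq.gz.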

mkdir -p /gscratch/grandol1/loc_ad1/rawdata

cd /gscratch/grandol1/loc_ad1/rawdata

unpigz --to-stdout /project/microbiome/data_queue/seq/loc_ad1/rawdata/5RM1_R1.fastq | split -l 1000000 -d --suffix-length=3 --additional-suffix=.fastq - 5RM1_R1_ ;
unpigz --to-stdout /project/microbiome/data_queue/seq/loc_ad1/rawdata/5RM1_R2.fastq | split -l 1000000 -d --suffix-length=3 --additional-suffix=.fastq - 5RM1_R2_
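
The split commands above only keep records intact because 1000000 is a multiple of 4 (one FASTQ record is four lines). A small sketch of the same split on toy data, assuming GNU split (for -d and --additional-suffix); the toy file names and the chunk size of 8 are made up for illustration:

```shell
demo=$(mktemp -d)

# Three toy records = 12 lines.
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n@r3\nGGCC\n+\nIIII\n' > "$demo/toy_R1.fastq"

lines_per_chunk=8   # must be a multiple of 4 so no record straddles two chunks
if [ $(( lines_per_chunk % 4 )) -ne 0 ]; then
    echo "chunk size is not a multiple of 4" >&2
fi

# Same options as the real command, just with a smaller chunk size.
split -l "$lines_per_chunk" -d --suffix-length=3 --additional-suffix=.fastq \
    "$demo/toy_R1.fastq" "$demo/toy_R1_"

# Each chunk should start with a header line ('@') and have a line count divisible by 4.
for f in "$demo"/toy_R1_*.fastq; do
    printf '%s\t%s lines\tfirst char: %s\n' "$f" "$(wc -l < "$f")" "$(head -c 1 "$f")"
done
```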

/project/microbiome/data_queue/seq/loc_ad1/rawdata/run_parse_count_onSplitInput.pl

cd /project/microbiome/data_queue/seq/loc_ad1/rawdata

./run_splitFastq_fwd.sh

./run_splitFastq_rev.sh

./run_aggregate.sh

cd /project/microbiome/data_queue/seq/loc_ad1/rawdata/sample_fastq/16S/loc_ad1

rename $'\r' '' *
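
The rename above strips a stray carriage return from the sample file names. Where the util-linux rename is not available, something like the loop below might do the same job. This is a sketch on toy files in a temporary directory; the file names are invented.

```shell
demo=$(mktemp -d)
cr=$(printf '\r')   # a literal carriage return

# One toy file name carrying a trailing carriage return, one clean name.
touch "$demo/sampleA.fq$cr" "$demo/sampleB.fq"

# Rename every file whose name contains a carriage return.
for f in "$demo"/*"$cr"*; do
    [ -e "$f" ] || continue                      # glob matched nothing
    dir=$(dirname "$f")
    base=$(basename "$f" | tr -d '\r')           # strip carriage returns from the name only
    mv -- "$f" "$dir/$base"
done

# Count file names that still contain a carriage return (should be 0).
remaining=$(find "$demo" -name "*${cr}*" | wc -l)
echo "names still containing a carriage return: $remaining"
```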

cd /project/microbiome/data_queue/seq/loc_ad1/tfmergedreads

./run_slurm_mergereads.pl

cd /project/microbiome/data_queue/seq/loc_ad1/otu

./run_slurm_mkotu.pl

 

Analyzing only the trimmed R1 reads, to avoid suspected merge/join bias from the 2 x 150 sequencing:

cd /project/microbiome/data_queue/seq/loc_ad1/tfmergedreads/16S/loc_ad1/trimmed

cp *.R1.fq /project/microbiome/data_queue/seq/loc_ad1/R1only/

cd /project/microbiome/data_queue/seq/loc_ad1/R1only

sed -n '1~4s/^@/>/p;2~4p' ./*.fq > ./LocAdR1.fa
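
The sed call above converts FASTQ to FASTA by line position rather than by pattern: every 4th line starting at line 1 is a header (its leading @ becomes >) and every 4th line starting at line 2 is the sequence. The toy run below sketches that behaviour. It assumes GNU sed (the first~step address is a GNU extension), and the file names are invented. Because sed reads several input files as one continuous stream, the positions only stay aligned if every input file has a line count divisible by 4.

```shell
demo=$(mktemp -d)

# Two toy files, one record each. The quality line of the first record
# deliberately starts with '@' to show it is not mistaken for a header.
printf '@r1\nACGT\n+\n@III\n' > "$demo/a.fq"
printf '@r2\nTTGA\n+\nIIII\n' > "$demo/b.fq"

# Same sed program as in the notes, applied to both files as one stream.
fasta=$(sed -n '1~4s/^@/>/p;2~4p' "$demo/a.fq" "$demo/b.fq")
printf '%s\n' "$fasta"
```

This prints the two headers as >r1 and >r2, each followed by its sequence line, and drops the + and quality lines.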

 

vsearch --derep_fulllength $s LocAdR1.fa \
    --strand plus \
    --output $s derep.fa \
    --sizeout \
    --uc $s.derep.uc \
    --relabel $s. \
    --fasta_width 0

vsearch --cluster_unoise derep.fa --centroids zotus_vsearch.fa --sizein --sizeout
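
Because both vsearch steps write abundance annotations (--sizeout), a quick look at the centroid file can show how many ZOTUs came out and how many reads they represent. The sketch below runs on a made-up FASTA file, not on zotus_vsearch.fa itself, and assumes headers carry a size=N annotation.

```shell
demo=$(mktemp -d)

# Toy stand-in for zotus_vsearch.fa with size annotations in the headers.
printf '>toy1;size=10\nACGT\n>toy2;size=3\nTTGA\n' > "$demo/toy_zotus.fa"

# Number of sequences (one header line per ZOTU).
n_zotus=$(grep -c '^>' "$demo/toy_zotus.fa")

# Sum of the size=N annotations across all headers.
total_reads=$(awk '/^>/ && match($0, /size=[0-9]+/) {
                       sum += substr($0, RSTART + 5, RLENGTH - 5)
                   }
                   END { print sum + 0 }' "$demo/toy_zotus.fa")

echo "ZOTUs: $n_zotus, reads represented: $total_reads"
```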

3rd Attempt:

Reran the 1st attempt commands starting with “./run_slurm_mergereads.pl”.

Edited “215” to “115” in the following chunk of trim_merge.pl (shown as it was before the edit):

system("vsearch --fastx_filter $R1tmpA --fastq_trunclen 215 --fastqout $R1tmpB --threads 32");
system("vsearch --fastx_filter $R2tmpA --fastq_trunclen 215 --fastqout $R2tmpB --threads 32");
print "vsearch step1 complete\n";
if(-e $R1tmpB && -e $R1tmpB){
    system("vsearch --fastq_join $R1tmpB --reverse $R2tmpB --fastaout $joinedfile --threads 32");
    unlink($R1tmpA, $R2tmpA, $R1tmpB, $R2tmpB);
}
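
The 215 to 115 change could also be made non-interactively and checked afterwards. The sketch below works on a toy copy of the two affected lines, not on the real trim_merge.pl, and the toy file names are invented. As far as the vsearch documentation goes, --fastq_trunclen discards reads shorter than the given length, which is presumably why 215 does not suit 150 bp reads.

```shell
demo=$(mktemp -d)

# Toy copy of the two lines that carry the truncation length.
cat > "$demo/trim_merge_toy.pl" <<'EOF'
system("vsearch --fastx_filter $R1tmpA --fastq_trunclen 215 --fastqout $R1tmpB --threads 32");
system("vsearch --fastx_filter $R2tmpA --fastq_trunclen 215 --fastqout $R2tmpB --threads 32");
EOF

# Write the edited version to a new file so the original stays available for comparison.
sed 's/--fastq_trunclen 215/--fastq_trunclen 115/' \
    "$demo/trim_merge_toy.pl" > "$demo/trim_merge_toy.edited.pl"

# Confirm both occurrences changed and none of the old value is left.
n_new=$(grep -c -- '--fastq_trunclen 115' "$demo/trim_merge_toy.edited.pl")
n_old=$(grep -c -- '--fastq_trunclen 215' "$demo/trim_merge_toy.edited.pl")
echo "lines with 115: $n_new, lines still with 215: $n_old"
```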

 
