Info |
---|
Status (20 March 2021) Revisiting I repeated the trim, merge, join, and otutable generation steps, to have equivalent methods that we have now applied to Novaseq 1A, 1B, 1C, 3, and 4. Also redid otu table generation and coligo table construction. |
Table of Contents |
---|
Demultiplexing and splitting
...
We generated otu tables for 16S and ITS, which can be found in the /project/microbiome/data/seq/psomagen_17sep20_novaseq2/otu
/16S
and /project/microbiome/data/seq/psomagen_17sep20_novaseq2/otu
/ITS
directories for all projects. The script to do this is /project/microbiome/data/seq/psomagen_17sep20_novaseq2/otu/run_slurm_mkotu.pl.
This script crawls all of the projects (and loci) in the tfmerged/
folder, and concatenates all input sequences files and passes them through a UNIX pipe for analysis (rather than making another copy of all the sequence data). This was rerun on with the addition of line 5 (--search_exact
) and --relabel 'otu'
added to line 2. The key part the script is the job that is created for each project:
Code Block | ||||
---|---|---|---|---|
| ||||
$job .= "cat $tfmdirectory/*tfmergedreads.fa $tfmdirectory/joined/*tfmergedreads.fa | vsearch --derep_fulllength - --threads 32 --output uniqueSequences.fa --sizeout;\n"; $job .= "vsearch --cluster_unoise uniqueSequences.fa --relabel 'otu' --sizein --sizeout --consout zotus.fa --minsize 8 ;\n"; $job .= "vsearch --uchime3_denovo zotus.fa --nonchimeras zotus_nonchimeric.fa --threads 32;\n"; $job .= "cat $tfmdirectory/*tfmergedreads.fa $tfmdirectory/joined/*tfmergedreads.fa | vsearch --usearch_global - --db zotus_nonchimeric.fa --otutabout - --id 0.99 --threads 32 | sed 's/^#OTU ID/OTUID/' > 'otutable';\n"; $job .= 'cat $tfmdirectory/*tfmergedreads.fa $tfmdirectory/joined/*tfmergedreads.fa | vsearch --search_exact - --db zotus_nonchimeric.fa --otutabout - --id 0.99 --threads 32'. " | sed 's/^#OTU ID/OTUID/' > 'otutable.esv';\n"; |
Make coligo table
In /project/microbiome/data/seq/psomagen_17sep20_novaseq2/coligoISD
, there are 16S
and ITS
directories for all projects. These contain a file named coligoISDtable.txt
with counts of the coligos and the ISD found in the trimmed forward reads, per sample. The file run_slurm_mkcoligoISDtable.pl
has the code that passes over all of the projects and uses vsearch
for making the table.
...