Info

Status (20 March 2021)

I revisited this run and repeated the trim, merge, join, and otu table generation steps, along with coligo table construction, so that the methods are equivalent to those we have now applied to Novaseq 1A, 1B, 1C, 3, and 4.

Table of Contents

Demultiplexing and splitting

...

We generated otu tables for 16S and ITS, which can be found in the /project/microbiome/data/seq/psomagen_17sep20_novaseq2/otu/16S and /project/microbiome/data/seq/psomagen_17sep20_novaseq2/otu/ITS directories for all projects. The script to do this is /project/microbiome/data/seq/psomagen_17sep20_novaseq2/otu/run_slurm_mkotu.pl. This script crawls all of the projects (and loci) in the tfmerged/ folder, concatenates all input sequence files, and passes them through a UNIX pipe for analysis (rather than making another copy of all the sequence data). It was rerun with the addition of line 5 (--search_exact) and with --relabel 'otu' added to line 2. The key part of the script is the job that is created for each project:

    $job .= "cat $tfmdirectory/*tfmergedreads.fa $tfmdirectory/joined/*tfmergedreads.fa | vsearch --derep_fulllength - --threads 32 --output uniqueSequences.fa --sizeout;\n";
    $job .= "vsearch --cluster_unoise uniqueSequences.fa --relabel 'otu' --sizein --sizeout --consout zotus.fa --minsize 8 ;\n";
    $job .= "vsearch --uchime3_denovo zotus.fa --nonchimeras zotus_nonchimeric.fa --threads 32;\n";
    $job .= "cat $tfmdirectory/*tfmergedreads.fa $tfmdirectory/joined/*tfmergedreads.fa | vsearch --usearch_global - --db zotus_nonchimeric.fa --otutabout - --id 0.99 --threads 32 | sed 's/^#OTU ID/OTUID/' > 'otutable';\n";
    $job .= "cat $tfmdirectory/*tfmergedreads.fa $tfmdirectory/joined/*tfmergedreads.fa | vsearch --search_exact - --db zotus_nonchimeric.fa --otutabout - --id 0.99 --threads 32 | sed 's/^#OTU ID/OTUID/' > 'otutable.esv';\n";
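
The two shell idioms used in the job above (streaming concatenated files into a tool that reads standard input via "-", and renaming the "#OTU ID" header field with sed) can be tried on toy data without vsearch. The following is only a minimal sketch under stated assumptions: the file names, sample names, and counts are made up, and grep stands in for vsearch as the consumer of the stream. The header rename presumably helps downstream readers that treat "#" as a comment character (R's read.table does so by default), though the script itself does not state the reason.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical toy inputs standing in for the per-sample *tfmergedreads.fa files.
workdir=$(mktemp -d)
mkdir -p "$workdir/joined"
printf '>read1\nACGT\n>read2\nACGT\n' > "$workdir/s1.tfmergedreads.fa"
printf '>read3\nTTGA\n' > "$workdir/joined/s1.tfmergedreads.fa"

# Stream both sets of files through one pipe. The "-" argument tells the
# consumer to read standard input, so no concatenated copy is written to disk.
# grep is only a stand-in for vsearch here; it counts FASTA header lines.
n_reads=$(cat "$workdir"/*tfmergedreads.fa "$workdir"/joined/*tfmergedreads.fa | grep -c '^>' -)
echo "reads streamed: $n_reads"

# Toy stand-in for a tab-separated OTU table whose header starts with "#OTU ID".
toy_table=$(printf '#OTU ID\tsampleA\tsampleB\notu1\t12\t0\notu2\t3\t7\n')

# Same substitution as in the job: rename the first header field so the
# header line no longer begins with "#".
fixed_table=$(printf '%s\n' "$toy_table" | sed 's/^#OTU ID/OTUID/')
printf '%s\n' "$fixed_table"

rm -rf "$workdir"
```

Only the header line changes; the count rows pass through sed untouched, so otutable and otutable.esv keep the same layout as the table vsearch produced.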

Make coligo table

In /project/microbiome/data/seq/psomagen_17sep20_novaseq2/coligoISD, there are 16S and ITS directories for all projects. These contain a file named coligoISDtable.txt with counts of the coligos and the ISD found in the trimmed forward reads, per sample. The file run_slurm_mkcoligoISDtable.pl has the code that passes over all of the projects and uses vsearch for making the table.

...