Status (02 May 2022)

  • Data arrived by sftp on 28 April 2022. Everything below is a draft modified from Bioinformatics for Novaseq run 4. Data processing finished 5-0212-22. Most of the text should remain the same. Current figures are irrelevant placeholders.

Table of Contents

Demultiplexing and splitting

...

Statistics on the initial number of reads, the number of reads that merged, and the number of reads that remain after filtering are in filtermergestats.csv in each project folder. Note that these counts do not include reads that failed to merge but that we were able to join. This category is likely to include ITS sequences for which the amplicon was too long for our 2x250 bp reads to span. The greater number of reads removed for ITS (orange) in the plot below is consistent with this idea. For the full lane, these summaries were concatenated in tfmergedreads/ with

...

cat */*/filtermergestats.csv > all_filtermergestats.csv
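
If each filtermergestats.csv begins with its own header line (an assumption; the page does not describe the file layout), plain cat repeats that header once per project in the combined file. A minimal sketch of a variant that keeps only the first header is below. The function name concat_csv_keep_one_header is hypothetical and is not part of the existing scripts.

```shell
# Concatenate CSV files, keeping the header line from the first file only.
# Assumption: every input file starts with the same one-line header.
concat_csv_keep_one_header() {
    # FNR is the line number within the current file, NR the overall line number.
    # Skip line 1 of every file except the very first line read.
    awk 'FNR == 1 && NR != 1 { next } { print }' "$@"
}

# Hypothetical usage from inside tfmergedreads/:
# concat_csv_keep_one_header */*/filtermergestats.csv > all_filtermergestats.csv
```

If the files have no header line, the original cat command is the right choice, since this variant would drop the first data row of every file after the first.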

...

In /project/microbiome/data_queue/seq/psomagen_6mar20NS5/coligoISD, /project/microbiome/data/seq/psomagen_26may20NS5/coligoISD, and /project/microbiome/data/seq/psomagen_29jan21novaseq1cNS5/coligoISD, there are 16S and ITS directories for all projects. Each contains a file named coligoISDtable.txt with per-sample counts of the coligos and the ISD found in the trimmed forward reads. The script run_slurm_mkcoligoISDtable.pl contains the code that loops over all of the projects and uses vsearch to make the table.

...

5-2-2022 Alex Buerkle transferred all of this to /project/microbiome/data/seq/cu_29april22novaseq5