...
In a sequence library’s rawdata/ directory (e.g., /project/gtl/data/raw/ALF1/ITS/rawdata/) I made run_aggregate.sh to run aggregate_usearch_fastx_info.pl with a slurm job. Summaries are written to summary_sample_fastq.csv.
Stopped here on 89-3101-22
Gregg Randolph: please see /project/gtl/data/raw/ALF1/16S/tfmergedreads where I made mergereads.nf and teton.conf, and edited trim_merge.pl to trim_mergecab.pl (initially this was because I didn't have permissions to run the file, so I copied it, but I found I needed to make some changes, which are in the *cab.pl version). You run the nextflow script with: module load nextflow and nextflow run -bg mergereads.nf -c teton.config. See inside of mergereads.nf for other ways of running it (i.e., not in the background). I tried this on a pair of input files, and one of the vsearch steps in the middle fails because the inputs are too small, but the nextflow script still completes. Please have a look and see what you can find and figure out. It might be that some of the input files are genuinely too small.
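One way to check whether inputs are genuinely too small is to count the reads in each FASTQ file before running the pipeline. The following is a minimal shell sketch and is not part of mergereads.nf. The function names count_reads and flag_small_fastq and the threshold in the usage line are hypothetical. It assumes unwrapped four-line FASTQ records and a trailing newline at the end of each file.

```shell
# Count reads in a FASTQ file (plain or gzipped).
# Assumes 4 lines per record and a trailing newline on the last line.
count_reads() {
    case "$1" in
        *.gz) cr_lines=$(gzip -dc "$1" | wc -l) ;;
        *)    cr_lines=$(wc -l < "$1") ;;
    esac
    echo $(( $cr_lines / 4 ))
}

# Print "file<TAB>read_count" for every file with fewer reads than the
# threshold given as the first argument.
flag_small_fastq() {
    fs_min="$1"
    shift
    for fs_file in "$@"; do
        fs_n=$(count_reads "$fs_file")
        if [ "$fs_n" -lt "$fs_min" ]; then
            printf '%s\t%s\n' "$fs_file" "$fs_n"
        fi
    done
}
```

For example, flag_small_fastq 100 *.fastq would list the files with fewer than 100 reads. The value 100 is only a placeholder, since the minimum that the vsearch step needs is not known here.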
Trim, merge and filter reads
...
In /project/gtl/data/raw/ALF1/ITS/coligoISD and /project/gtl/data/raw/ALF1/ITS/otu there are 16S and ITS directories for all projects. These contain a file named coligoISDtable.txt with counts of the coligos and the ISD found in the trimmed forward reads, per sample. The file run_slurm_mkcoligoISDtable.pl has the code that passes over all of the projects and uses vsearch for making the table.
...