...
Eight individuals were duplicated, with different MIDs. Was this planned? I didn’t account for this in the parsing script: the info line only has the individual sample ID, not the MID. I could add the MID back in, but then the replicates would need to be merged. As it stands, all reads for an individual go into one file. There are also four tubes labeled ‘BLANK’ whose reads have all been merged (they all went into BLANK.GGATCCTT.fq).
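A quick way to list which sample IDs are shared by more than one MID, sketched in Python. This is not the parsing script itself. It assumes the MID-to-sample assignments are available as (MID, sample ID) pairs, and the function names are hypothetical. The `output_name` helper mirrors the `sample.MID.fq` naming seen in BLANK.GGATCCTT.fq and shows how including the MID would keep replicates in separate files until they are deliberately merged.

```python
from collections import defaultdict


def find_replicated_samples(mid_sample_pairs):
    """Return {sample_id: [MIDs]} for sample IDs that appear with more than one MID.

    mid_sample_pairs is an iterable of (mid, sample_id) tuples; how these are
    read from the key file is left out because that format is an assumption.
    """
    mids_by_sample = defaultdict(list)
    for mid, sample_id in mid_sample_pairs:
        mids_by_sample[sample_id].append(mid)
    # Keep only the sample IDs that were assigned to two or more MIDs.
    return {s: mids for s, mids in mids_by_sample.items() if len(mids) > 1}


def output_name(sample_id, mid):
    """Build a per-replicate file name so replicates do not collapse into one file."""
    return f"{sample_id}.{mid}.fq"
```

Any sample ID returned by `find_replicated_samples` is one whose reads are currently pooled into a single file.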
Compressed all sample_fastq/ files with pigz, using: sbatch /project/microbiome/data/seq/HMAX1/demultiplex/run_pigz.sh
Moved the fastq for all four blank samples (the data are all in one file because the names are collapsed, as noted above) to a subfolder (/project/microbiome/data/seq/HMAX1/demultiplex/sample_fastq/blanks) to get it out of the way.
Worked on getting de novo assembly started.
To do:
Do de novo assemblies
Summarize the parse report files in /gscratch with some code to iterate over all the individual reports and get an overall count.
Variant calling
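For the parse report summary, a minimal Python sketch of the iterate-and-total step. The report format is not recorded here, so this assumes each report contains lines of the form `label: integer`; the glob pattern `*report*` and the function name `summarize_reports` are also assumptions to adjust once the real files are checked.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed line format: "some label: 1234". Lines that do not match are skipped.
COUNT_LINE = re.compile(r"^\s*(?P<label>[^:]+?)\s*:\s*(?P<count>\d+)\s*$")


def summarize_reports(report_dir, pattern="*report*"):
    """Sum the per-label counts across every report file in report_dir."""
    totals = Counter()
    for path in sorted(Path(report_dir).glob(pattern)):
        if not path.is_file():
            continue
        with path.open() as handle:
            for line in handle:
                match = COUNT_LINE.match(line)
                if match:
                    totals[match["label"]] += int(match["count"])
    return totals
```

Calling `summarize_reports` on the directory holding the individual reports returns a `Counter` keyed by label, which can be printed or written out as the overall count.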
...