Raw reads
We received four files with sequence reads. Two of these contain the 1x100 bp reads, one file per lane, because two lanes were used on the instrument. The other two exist only because CU unnecessarily ran indexing reads on the fragments; I deleted these nonsense files. The two files with the raw reads of interest are (these are in /project/microbiome/data/seq/HMAX1/rawreads):
...
Ran
bwa index -a bwtsw GCF_002127325.2_HanXRQr2.0-SUNRISE_genomic.fna.gz
by hand on an interactive node.
Commands are in 0_assem.nf. Run this with
nextflow run -bg 0_assem.nf -c teton.config
These jobs use
module load swset/2018.05 gcc/7.3.0 bwa/0.7.17 samtools/1.12
as specified in teton.config in this directory (bwa is version 0.7.17-r1188). Output is in /project/microbiome/data/seq/HMAX1/assem/sambam/. Gave each job 60 minutes, which was unnecessarily long but conservative; the longest-running jobs I could see took less than 20 minutes. Moved all 477 input files through in about 30 minutes total.
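A quick way to confirm that every input made it through is to count the alignment files in the output directory and compare against the 477 inputs. The sketch below is not part of 0_assem.nf; the function name check_output_count is hypothetical, and the file pattern is an assumption because the output extension is not recorded here.

```shell
# Hypothetical helper: compare the number of files matching a pattern
# in a directory against an expected count.
check_output_count() {
    out_dir="$1"     # directory holding the alignment output
    pattern="$2"     # file name pattern, e.g. '*.bam' (assumed extension)
    expected="$3"    # number of input files that were submitted
    found=$(find "$out_dir" -maxdepth 1 -type f -name "$pattern" | wc -l | tr -d ' ')
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found files"
    else
        echo "MISMATCH: found $found, expected $expected" >&2
        return 1
    fi
}

# Example call (pattern assumed, adjust to the real extension):
# check_output_count /project/microbiome/data/seq/HMAX1/assem/sambam '*.bam' 477
```

The function returns a non-zero status on a mismatch, so it can gate a later step in a script.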
Variant calling
...
Built bcftools version 1.16 and installed it in /project/evolgen/bin/.
Copying steps from https://github.com/zgompert/DimensionsExperiment
...
Code Block |
---|
bcftools mpileup -C 50 -d 250 -f /uufs/chpc.utah.edu/common/home/gompert-group3/data/LmelGenome/Lmel_dovetailPacBio_genome.fasta -q 30 -Q 20 -I -b lmel_bams.txt -o lmel_variants.bcf -O u -a FORMAT/AD,FORMAT/DP |
bcftools call -v -c -p 0.01 -P 0.001 -O v -o lmel_variants.vcf lmel_variants.bcf |
...
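The -b option in the mpileup command above reads a text file of BAM paths, one per line. A minimal sketch of building such a list for this project follows. The function name make_bam_list and the list name hmax_bams.txt are hypothetical, and it assumes the alignments end in .bam and sit directly in the output directory.

```shell
# Hypothetical helper: write one BAM path per line, sorted,
# in the form expected by `bcftools mpileup -b`.
make_bam_list() {
    bam_dir="$1"     # directory holding the BAM files
    out_list="$2"    # list file to create
    find "$bam_dir" -maxdepth 1 -type f -name '*.bam' | sort > "$out_list"
}

# Example call (list file name is an assumption, not from the original run):
# make_bam_list /project/microbiome/data/seq/HMAX1/assem/sambam hmax_bams.txt
```

Sorting keeps the sample order in the resulting VCF reproducible between runs.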