Test Sequencing of 4AH1

Run sample on iSeq:

Dilute the pool to 1 nM based on the qPCR results. qPCR reports concentrations in pM, but the pool was diluted 1:1000 for qPCR, so the reported number is effectively the pool concentration in nM.

  • 1000 / qPCR result (nM) = ul of pool to add

1000 / 25 = 40 ul of pool to add

  • 1000 - ul of pool to add = ul of “10 mM Tris 8.5” to add

1000 - 40 = 960 ul of “10 mM Tris 8.5”
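The arithmetic above can be sketched as a small shell helper. This is only an illustration: the function name `dilute_to_1nM` and the default total volume of 1000 ul are assumptions taken from the worked numbers, not part of the original protocol.

```shell
# Hypothetical helper: volumes needed to dilute a pool to 1 nM.
# $1 = pool concentration in nM (the qPCR readout, per the note above)
# $2 = total volume in ul (optional, assumed default 1000)
dilute_to_1nM() {
  awk -v conc="$1" -v total="${2:-1000}" 'BEGIN {
    if (conc <= 0) { print "concentration must be > 0"; exit 1 }
    pool = total / conc            # C1 * V1 = C2 * V2 with C2 = 1 nM
    printf "pool_ul=%.1f tris_ul=%.1f\n", pool, total - pool
  }'
}

dilute_to_1nM 25   # prints: pool_ul=40.0 tris_ul=960.0
```

Passing a second argument changes the total volume, e.g. `dilute_to_1nM 25 500` for a 500 ul dilution.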

Dilute 1 nM full pool to loading concentration of 50 pM:

  • Add 5 ul 1 nM Pool to 85 ul “10 mM Tris 8.5” and 10 ul 50 pM PhiX

  • Remove the iSeq 100 i1 Flow Cell from refrigerator 5’s crisper drawer, open the white foil pack, and allow the flow cell to equilibrate to RT for 10-15 minutes.

  • Open the “iSeq 100 i1 Reagent Cartridge v2”. Turn on the iSeq 100.

  • Click on “Sequence”. Watch the video and do what it tells you to do. Follow the on-screen instructions until the run starts.

  • Results are located at: Data/SequencingRuns/“folder with applicable date”/Alignment_1/Fastq/*.fastq.gz
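As a sanity check on the loading dilution described earlier in this list (5 ul of 1 nM pool, 85 ul Tris, 10 ul PhiX), the final library concentration and the PhiX share of the volume can be computed with a short sketch. The function name `loading_mix` and its argument order are made up for illustration.

```shell
# Hypothetical helper: final library concentration and PhiX volume fraction.
# $1 = pool volume (ul), $2 = pool concentration (pM),
# $3 = Tris volume (ul), $4 = PhiX volume (ul)
loading_mix() {
  awk -v pv="$1" -v pc="$2" -v tv="$3" -v xv="$4" 'BEGIN {
    total = pv + tv + xv
    printf "total_ul=%d library_pM=%.1f phix_vol_pct=%.1f\n",
           total, pv * pc / total, 100 * xv / total
  }'
}

# 1 nM pool = 1000 pM
loading_mix 5 1000 85 10   # prints: total_ul=100 library_pM=50.0 phix_vol_pct=10.0
```

The 5 ul of 1 nM pool in 100 ul total gives the 50 pM loading concentration named in the protocol; the PhiX figure here is a volume percentage only.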

Bioinformatics

Demux as per usual. The parse report is in the same folder as the data. Parsing seemed to work well.

 

I am curious whether cross-contamination is less severe with the one-step protocol, since there is no chance of primer and template carryover from the first PCR to the second.

Code used to determine cross-contamination:

cutadapt -g ^GTGYCAGCMGCCGCGGTAA -o R1out_no16s parsed_4AH1-1STEP_S1_L001_R1_001.fastq -e 0.25

cutadapt -g ^GGACTACHVGGGTWTCTAAT -o R2out_no16s parsed_4AH1-1STEP_S1_L001_R2_001.fastq  -e 0.25

cutadapt -g ^CTTGGTCATTTAGAGGAAGTAA -o R1out_no16s_noits R1out_no16s -e 0.25

cutadapt -g ^GCTGCGTTCTTCATCGATGC -o R2out_no16s_noits R2out_no16s -e 0.25

sed -i 's/\s/_/g' R1out_no16s_noits

sed -i 's/^>16/>rna16/' R1out_no16s_noits

sed -i 's/-/_/g' R1out_no16s_noits

awk '{if ($1 ~ /@/) {print}else{print substr ($0, 0, 13)}}' R1out_no16s_noits > short_polyg_filtered.fa

module load vsearch

vsearch --fastq_filter short_polyg_filtered.fa -fastaout short_polyg_filteredFA.fa

echo "Making coligo table"

#NOTE that the heuristics for usearch_global don't seem to work well for short sequences (as shown by various tests JH did), therefore using search_exact

vsearch -search_exact short_polyg_filteredFA.fa -db /project/microbiome/ref_db/coligos_and_abbreviatedISD.fa -strand plus -otutabout coligoTable  -maxaccepts 0 -minseqlen 5
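To make the read-truncation step in the commands above easier to follow, here is a minimal sketch on a made-up FASTQ record. It is a variant, not the command that was run: it picks lines by their position in the four-line record (NR % 4) instead of matching "@", and it uses a 1-based start index in `substr`. The function name `truncate_fastq`, the read name, and the bases are all invented for the example.

```shell
# Toy FASTQ record (made-up read name and bases) in a temp file.
tmp=$(mktemp)
printf '@read1\nACGTACGTACGTACGTACGT\n+\nFFFFFFFFFFFFFFFFFFFF\n' > "$tmp"

# Keep the header (line 1) and separator (line 3) of each record whole;
# trim the sequence (line 2) and quality (line 4) to the first n characters.
# $1 = FASTQ file, $2 = number of characters to keep (assumed default 13)
truncate_fastq() {
  awk -v n="${2:-13}" '
    NR % 4 == 1 || NR % 4 == 3 { print; next }
    { print substr($0, 1, n) }
  ' "$1"
}

truncate_fastq "$tmp"
rm -f "$tmp"
```

Selecting by line position keeps sequence and quality strings the same length even if a quality string happens to contain "@".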

Shifted to R to examine coligo table:

#Determine the extent to which coligos occur where they are not supposed to.

dat <- read.table("~/Desktop/teton/coligoTable", stringsAsFactors = F, header = T)
dat[1:5,1:5]
dim(dat)

key <- read.csv("~/Desktop/teton/NS4_4AH1_Demux.csv", stringsAsFactors = F)

# Strip the "X" that read.table() prepends to column names starting with a digit
names(dat) <- gsub("X(.*)","\\1",names(dat))

key$forward_barcode <- toupper(key$forward_mid)
key$reverse_barcode <- toupper(key$reverse_mid)

key$combo <- paste(key$locus, key$forward_barcode, key$reverse_barcode, key$samplename, sep = "_")

for(i in 1:length(key$combo)){
names(dat)[which(key$combo[i] == names(dat))] <- key$wellposition[i]
}

dat$OTUID <- gsub("Coligo_", "",dat$OTUID)

dat <- dat[dat$OTUID != "ISD",]

percent_target <- NA

for(i in 2:length(names(dat))){
if(dat[dat$OTUID == names(dat)[i],i] > 15){
percent_target[i] <- dat[dat$OTUID == names(dat)[i],i] / sum(dat[,i])
}
}
mean(na.omit(percent_target))
summary(percent_target)
table(percent_target > 0.90)

dat[,1:15]
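The per-well calculation in the R loop above (reads of the expected coligo divided by all coligo reads in that well, for wells with more than 15 target reads) can also be sketched in awk on a toy table. Everything here is illustrative: the function name `on_target`, the well names, and the counts are made up, and the table is assumed to be tab-separated with an `OTUID` header column, columns already renamed to well positions, and the ISD row already removed.

```shell
# Toy coligo table: rows = coligos, columns = wells (made-up counts).
tbl=$(mktemp)
printf 'OTUID\tA1\tA2\tA3\nA1\t95\t2\t1\nA2\t5\t98\t0\nA3\t0\t0\t3\n' > "$tbl"

# $1 = table file, $2 = minimum target reads for a well to be reported
# (assumed default 15, mirroring the "> 15" cutoff in the R loop)
on_target() {
  awk -F'\t' -v min="${2:-15}" '
    NR == 1 { ncol = NF; for (j = 2; j <= NF; j++) well[j] = $j; next }
    {
      for (j = 2; j <= NF; j++) {
        total[j] += $j                      # all coligo reads in this well
        if ($1 == well[j]) hit[j] = $j      # reads of the expected coligo
      }
    }
    END {
      for (j = 2; j <= ncol; j++)
        if (hit[j] + 0 > min + 0)
          printf "%s\t%.3f\n", well[j], hit[j] / total[j]
    }
  ' "$1"
}

on_target "$tbl"
rm -f "$tbl"
```

On the toy table this reports 0.950 for A1 and 0.980 for A2; A3 is skipped because its target count is below the cutoff.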

Although this analysis is cursory, it appears to me that cross-contamination is somewhat reduced by the one-step PCR library prep protocol. It would be worthwhile to run the same analysis on the data for this plate that were generated using the two-step protocol, so we can compare apples to apples. I think those data are forthcoming.