Test Sequencing of 4AH1

Run sample on iSeq:

Dilute the pool to 1 nM based on the qPCR results. The qPCR reports concentrations in pM, but the pool was diluted 1:1000 for qPCR, so the reported values are effectively the concentration of the undiluted pool in nM.

1000 / qPCR result (nM) = ul of pool to add (for 1000 ul final volume at 1 nM)

1000 / 25 = 40 ul of pool to add

1000 - ul of pool to add = ul of “10 mM Tris 8.5” to add

1000 - 40 = 960 ul of “10 mM Tris 8.5”
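
The arithmetic above can be sketched as a small shell function. This is only a minimal sketch, assuming a 1000 ul final volume by default and a qPCR value that can be read directly as nM; the function name dilute_to_1nM is hypothetical and not part of the protocol.

```shell
#!/bin/sh
# Hypothetical helper: volumes needed to dilute a pool to 1 nM.
# $1 = pool concentration in nM (the qPCR result)
# $2 = final volume in ul (assumed default: 1000)
dilute_to_1nM() {
    awk -v conc="$1" -v total="${2:-1000}" 'BEGIN {
        pool = total * 1 / conc                 # C1 * V1 = C2 * V2, with C2 = 1 nM
        printf "%.1f %.1f\n", pool, total - pool
    }'
}

# Prints ul of pool, then ul of Tris, for a 25 nM pool.
dilute_to_1nM 25    # -> 40.0 960.0
```

The sketch does not guard against a pool that is already at or below 1 nM; in that case the computed pool volume would equal or exceed the final volume.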

Dilute 1 nM full pool to loading concentration of 50 pM:

Add 5 ul 1 nM Pool to 85 ul “10 mM Tris 8.5” and 10 ul 50 pM PhiX
Remove the iSeq 100 i1 Flow Cell from refrigerator 5’s crisper drawer, open the white foil pack, and allow the flow cell to equilibrate to RT for 10-15 minutes.
Open the “iSeq 100 i1 Reagent Cartridge v2”. Turn on the iSeq 100.
Click on “Sequence”. Watch Video. Do what video tells you to do. Follow on screen instructions until run starts.
Results are located in: Data/SequencingRuns/“folder with applicable date”/Alignment_1/Fastq/*.fastq.gz
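
As a check on the loading dilution in the steps above (5 ul of 1 nM pool, 85 ul Tris, 10 ul of 50 pM PhiX), the final concentrations can be worked out with C1 * V1 = C2 * V2. This is a minimal sketch; the function name final_pM is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: concentration (pM) of a stock after it is mixed into a total volume.
# $1 = stock concentration in pM, $2 = stock volume in ul, $3 = total volume in ul
final_pM() {
    awk -v c="$1" -v v="$2" -v t="$3" 'BEGIN { printf "%.1f\n", c * v / t }'
}

total=$((5 + 85 + 10))        # pool + Tris + PhiX = 100 ul

final_pM 1000 5 "$total"      # 1 nM pool is 1000 pM; library in the mix -> 50.0
final_pM 50 10 "$total"       # 50 pM PhiX stock; PhiX in the mix -> 5.0
```

Under these volumes the library is loaded at 50 pM and the PhiX ends up at 5 pM in the final 100 ul.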

Bioinformatics

Demultiplexed as per usual. The parse report is in the same folder as the data. Parsing seemed to work well.

Curious whether cross-contamination is less severe for the one-step protocol, since there is no chance of primer and template carryover from the first PCR to the second.

Code used to determine cross-contamination:

cutadapt -g ^GTGYCAGCMGCCGCGGTAA -o R1out_no16s parsed_4AH1-1STEP_S1_L001_R1_001.fastq -e 0.25

cutadapt -g ^GGACTACHVGGGTWTCTAAT -o R2out_no16s parsed_4AH1-1STEP_S1_L001_R2_001.fastq -e 0.25

cutadapt -g ^CTTGGTCATTTAGAGGAAGTAA -o R1out_no16s_noits R1out_no16s -e 0.25

cutadapt -g ^GCTGCGTTCTTCATCGATGC -o R2out_no16s_noits R2out_no16s -e 0.25

sed -i 's/\s/_/g' R1out_no16s_noits

sed -i 's/^>16/>rna16/' R1out_no16s_noits

sed -i 's/-/_/g' R1out_no16s_noits

awk '{if ($1 ~ /@/) {print}else{print substr ($0, 0, 13)}}' R1out_no16s_noits > short_polyg_filtered.fa

module load vsearch

vsearch --fastq_filter short_polyg_filtered.fa -fastaout short_polyg_filteredFA.fa

echo "Making coligo table"

#NOTE that the heuristics for usearch_global don't seem to work well for short sequences (as shown by various tests JH did), therefore using search_exact

vsearch -search_exact short_polyg_filteredFA.fa -db /project/microbiome/ref_db/coligos_and_abbreviatedISD.fa -strand plus -otutabout coligoTable -maxaccepts 0 -minseqlen 5
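
The awk step above keeps any line containing "@" and truncates every other line. The following is a sketch of a position-based variant, not the command that was run. It assumes strict four-line FASTQ records and decides what to truncate from the line number rather than from the presence of "@", so a quality string that happens to contain "@" is truncated along with its sequence. It also starts substr at 1, because awk strings are indexed from 1 and with a start of 0 some awk implementations return one character fewer than requested. The function name trim_fastq and the file toy.fastq are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep the first N characters of the sequence and quality
# lines of a four-line-per-record FASTQ file; header and "+" lines pass through.
# $1 = input FASTQ, $2 = number of characters to keep (assumed default: 13)
trim_fastq() {
    awk -v n="${2:-13}" '
        NR % 4 == 1 || NR % 4 == 3 { print; next }   # header and separator lines
        { print substr($0, 1, n) }                   # sequence and quality lines
    ' "$1"
}

# Toy record whose quality string starts with "@".
printf '@read1\nACGTACGTACGTACGTACGT\n+\n@IIIIIIIIIIIIIIIIIII\n' > toy.fastq
trim_fastq toy.fastq
```

On the toy record the sequence and quality lines both come out 13 characters long, which keeps the record valid for a downstream FASTQ reader.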

Shifted to R to examine coligo table:

#Determine the extent to which coligos occur where they are not supposed to.

dat <- read.table("~/Desktop/teton/coligoTable", stringsAsFactors = F, header = T)
dat[1:5,1:5]
dim(dat)

key <- read.csv("~/Desktop/teton/NS4_4AH1_Demux.csv", stringsAsFactors = F)

# Strip the leading "X" that read.table prepends to column names starting with a digit.
names(dat) <- gsub("^X(.*)", "\\1", names(dat))

key$forward_barcode <- toupper(key$forward_mid)
key$reverse_barcode <- toupper(key$reverse_mid)

key$combo <- paste(key$locus, key$forward_barcode, key$reverse_barcode, key$samplename, sep = "_")

# Rename each sample column to its well position.
for(i in seq_along(key$combo)){
  names(dat)[which(key$combo[i] == names(dat))] <- key$wellposition[i]
}

dat$OTUID <- gsub("Coligo_", "",dat$OTUID)

dat <- dat[dat$OTUID != "ISD",]

percent_target <- NA

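for(i in 2:length(names(dat))){
  # Reads of the coligo expected in this well (the row whose OTUID matches the column name)
  target_reads <- dat[dat$OTUID == names(dat)[i], i]
  # Only score wells where the expected coligo has more than 15 reads
  if(target_reads > 15){
    # Proportion (0 to 1) of the well's reads that belong to the expected coligo
    percent_target[i] <- target_reads / sum(dat[,i])
  }
}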
mean(na.omit(percent_target))
summary(percent_target)
table(percent_target > 0.90)

dat[,1:15]

Although this analysis is cursory, it appears to me that cross-contamination is somewhat reduced by the one-step PCR library prep protocol. It would be of interest to run the same analysis on the data for this plate that were generated using the two-step protocol (so we can compare apples to apples). I think those data are forthcoming.