Test Sequencing of 4AH1
Run sample on iSeq:
Dilute to 1 nM based on the qPCR results. The qPCR results are reported in pM, but a 1:1000 dilution was used, so the results are effectively in nM for the pool.
1000 / Results = ul of Pool to Add
1000 / 25 = 40 ul of Pool to Add
1000 - ul of Pool to Add = ul of “10 mM Tris 8.5” to Add
1000 - 40 = 960 ul of “10 mM Tris 8.5”
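The arithmetic above can be scripted so the volumes are recomputed for any qPCR result. This is only a minimal sketch, assuming the 1000 ul final volume and 1 nM target used in the example; the function name dilution_volumes is hypothetical and not part of any lab script.

```shell
#!/bin/sh
# Hypothetical helper: volumes needed to dilute a pool to 1 nM in 1000 ul.
# $1 = pool concentration in nM (the qPCR result, read as nM as described above).
# Prints "<ul of pool> <ul of 10 mM Tris 8.5>".
dilution_volumes() {
    awk -v conc="$1" 'BEGIN {
        final_ul  = 1000   # final volume in ul (assumed, taken from the example)
        target_nm = 1      # target concentration in nM
        pool_ul = final_ul * target_nm / conc
        printf "%.1f %.1f\n", pool_ul, final_ul - pool_ul
    }'
}

# Example from the notes: a qPCR result of 25
dilution_volumes 25
```

A result of 0 or an empty argument would make awk fail on division by zero, so the input should be checked before use.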
Dilute 1 nM full pool to loading concentration of 50 pM:
Add 5 ul of 1 nM Pool to 85 ul of “10 mM Tris 8.5” and 10 ul of 50 pM PhiX (100 ul total).
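As a quick check on the loading dilution, the final concentrations in the 100 ul mix can be computed as stock concentration x volume added / total volume. The sketch below assumes the volumes listed above; the function name final_pm is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: final concentration in pM after dilution.
# $1 = stock concentration (pM), $2 = stock volume added (ul), $3 = total volume (ul)
final_pm() {
    awk -v c="$1" -v v="$2" -v t="$3" 'BEGIN { printf "%.1f\n", c * v / t }'
}

# 5 ul of 1 nM (1000 pM) pool in 5 + 85 + 10 = 100 ul total
final_pm 1000 5 100
# 10 ul of 50 pM PhiX in the same 100 ul
final_pm 50 10 100
```

The first call gives the library loading concentration and the second gives the PhiX contribution to the same mix.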
Remove the iSeq 100 i1 Flow Cell from the crisper drawer of refrigerator 5, open the white foil pack, and allow the flow cell to equilibrate to RT for 10-15 minutes.
Open the “iSeq 100 i1 Reagent Cartridge v2”. Turn on the iSeq 100.
Click on “Sequence”. Watch the video and do what it tells you to do. Follow the on-screen instructions until the run starts.
Results are located at: Data/SequencingRuns/“folder with applicable date”/Alignment_1/Fastq/*.fastq.gz
Bioinformatics
Demux as per usual. The parse report is in the same folder as the data. Parsing seemed to work well.
I am curious whether cross-contamination is less severe for the one-step protocol, since there is no chance of primer and template carryover from the first PCR to the second.
Code used to determine cross-contamination:
cutadapt -g ^GTGYCAGCMGCCGCGGTAA -o R1out_no16s parsed_4AH1-1STEP_S1_L001_R1_001.fastq -e 0.25
cutadapt -g ^GGACTACHVGGGTWTCTAAT -o R2out_no16s parsed_4AH1-1STEP_S1_L001_R2_001.fastq -e 0.25
cutadapt -g ^CTTGGTCATTTAGAGGAAGTAA -o R1out_no16s_noits R1out_no16s -e 0.25
cutadapt -g ^GCTGCGTTCTTCATCGATGC -o R2out_no16s_noits R2out_no16s -e 0.25
sed -i 's/\s/_/g' R1out_no16s_noits
sed -i 's/^>16/>rna16/' R1out_no16s_noits
sed -i 's/-/_/g' R1out_no16s_noits
awk '{if ($1 ~ /@/) {print}else{print substr ($0, 0, 13)}}' R1out_no16s_noits > short_polyg_filtered.fa
module load vsearch
vsearch --fastq_filter short_polyg_filtered.fa -fastaout short_polyg_filteredFA.fa
echo "Making coligo table"
#NOTE that the heuristics for usearch_global don't seem to work well for short sequences (as shown by various tests JH did), therefore using search_exact
vsearch -search_exact short_polyg_filteredFA.fa -db /project/microbiome/ref_db/coligos_and_abbreviatedISD.fa -strand plus -otutabout coligoTable -maxaccepts 0 -minseqlen 5
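One caveat on the awk truncation step above: it recognizes header lines by the presence of "@", which is also a valid FASTQ quality character, so a quality line containing "@" would be left untruncated. A position-based variant is sketched below as a possible alternative, assuming strict four-line FASTQ records. The function name truncate_fastq and the toy read are made up for illustration, and this is not the command that was run for these data.

```shell
#!/bin/sh
# Truncate sequence and quality lines to the first 13 characters, deciding by
# line position within each 4-line FASTQ record rather than by line content.
# awk strings are 1-indexed, so substr starts at position 1.
truncate_fastq() {
    awk -v n=13 '{
        if (NR % 4 == 2 || NR % 4 == 0) print substr($0, 1, n)   # sequence, quality
        else print                                               # header, "+" line
    }' "$1"
}

# Toy record (made-up read) whose quality line happens to start with "@"
tmp=$(mktemp -d)
printf '@read1\nACGTACGTACGTACGTACGT\n+\n@IIIIIIIIIIIIIIIIIII\n' > "$tmp/toy.fastq"
truncate_fastq "$tmp/toy.fastq" > "$tmp/toy_short.fastq"
cat "$tmp/toy_short.fastq"
```

Keeping sequence and quality lines the same length matters because the later vsearch --fastq_filter step reads the file as FASTQ.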
Shifted to R to examine coligo table:
#Determine the extent to which coligos occur where they are not supposed to.
dat <- read.table("~/Desktop/teton/coligoTable", stringsAsFactors = F, header = T)
dat[1:5,1:5]
dim(dat)
key <- read.csv("~/Desktop/teton/NS4_4AH1_Demux.csv", stringsAsFactors = F)
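#Strip the "X" that read.table prepends to column names; the backreference must be written as "\\1" in an R string
names(dat) <- gsub("X(.*)","\\1",names(dat))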
key$forward_barcode <- toupper(key$forward_mid)
key$reverse_barcode <- toupper(key$reverse_mid)
key$combo <- paste(key$locus, key$forward_barcode, key$reverse_barcode, key$samplename, sep = "_")
#Rename each sample column to its well position
for(i in 1:length(key$combo)){
  names(dat)[which(key$combo[i] == names(dat))] <- key$wellposition[i]
}
dat$OTUID <- gsub("Coligo_", "",dat$OTUID)
dat <- dat[dat$OTUID != "ISD",]
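#For each well with more than 15 reads of its expected coligo, compute the fraction of reads that are on target
percent_target <- NA
for(i in 2:length(names(dat))){
  if(dat[dat$OTUID == names(dat)[i],i] > 15){
    percent_target[i] <- dat[dat$OTUID == names(dat)[i],i] / sum(dat[,i])
  }
}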
mean(na.omit(percent_target))
summary(percent_target)
table(percent_target > 0.90)
dat[,1:15]
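To make the on-target metric concrete, here is a toy version of the same calculation in awk: for each well, the count of the coligo whose name matches the well divided by the total coligo count in that well. This is a sketch on made-up numbers, assuming a tab-separated table whose column names have already been converted to well positions; the function name on_target and the file toy_table.tsv are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: fraction of reads in each column that belong to the
# coligo named after that column. Columns whose target count is <= min are
# skipped, mirroring the > 15 read threshold used in the R loop.
on_target() {
    awk -F'\t' -v min=15 '
        NR == 1 { ncol = NF; for (i = 2; i <= NF; i++) name[i] = $i; next }
        {
            for (i = 2; i <= ncol; i++) {
                total[i] += $i
                if ($1 == name[i]) hit[i] = $i
            }
        }
        END {
            for (i = 2; i <= ncol; i++)
                if (hit[i] > min) printf "%s\t%.2f\n", name[i], hit[i] / total[i]
        }' "$1"
}

# Made-up table: rows are coligos, columns are wells
tmp=$(mktemp -d)
printf 'OTUID\tA1\tB1\nA1\t95\t2\nB1\t5\t98\n' > "$tmp/toy_table.tsv"
on_target "$tmp/toy_table.tsv"
```

In the toy table, well A1 has 95 of 100 reads from its own coligo and well B1 has 98 of 100, so both would count as above the 0.90 cutoff used in the table() call.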
Although this analysis is cursory, it appears to me that cross-contamination is somewhat reduced by the one-step PCR library prep protocol. It would be of interest to run the same analysis on the data for this plate that were generated using the two-step protocol, so that we can compare apples to apples. I think those data are forthcoming.