
Code Block
languager
library(tidyverse)
library(ggplot2)
library(ggpubr)

#Read in sequencing count data

LRIIIReads <- read.table("/Volumes/Macintosh HD/Users/gregg/Downloads/LRIII.txt",

                         header=FALSE, stringsAsFactors = FALSE, na.strings = "")

#adjust names back to simple sample names

LRIIIReads <- separate(LRIIIReads, 2, sep="[.]", into = c(NA, "samplename", NA, NA, NA, NA, NA, NA, NA))

#remove the total line and rename columns

LRIIIReads <- LRIIIReads[(1:144),]

colnames(LRIIIReads) <- c("readsx4", "samplename")

#Divide read count by 4 to get the proper number

LRIIIReads$readsk <- LRIIIReads$readsx4 / 4

#Add the replicates together

LRIIIReads <- LRIIIReads %>%

  group_by(samplename) %>%

  summarise(reads = sum(readsk))

#read in the qPCR data, subset columns to sample name, mean qty, and standard deviation

DGqPCR <- read.csv("/Volumes/Macintosh HD/Users/gregg/Downloads/5DG5_COL456.csv",

                   header=TRUE, stringsAsFactors = FALSE, na.strings = "", skip =14 )

DGqPCR <- DGqPCR[ , c(2, 8, 9)]

DGqPCR$plate <- "5DG5"

LVDqPCR <- read.csv("/Volumes/Macintosh HD/Users/gregg/Downloads/5LVD5_Col123.csv",

                    header=TRUE, stringsAsFactors = FALSE, na.strings = "", skip =14 )

LVDqPCR <- LVDqPCR[ , c(2, 8, 9)]

LVDqPCR$plate <- "5LVD5"

MCqPCR <- read.csv("/Volumes/Macintosh HD/Users/gregg/Downloads/MC_MB_Comparison.csv",

                   header=TRUE, stringsAsFactors = FALSE, na.strings = "", skip =14 )

MCqPCR <- MCqPCR[ , c(2, 8, 9)]

MCqPCR$plate <- "MC"

#adjust to match naming conventions between files

MCqPCR[,1] <- gsub("MC_A", "MCA_", MCqPCR[,1])

MCqPCR[,1] <- gsub("MC_B", "MCB_", MCqPCR[,1])

MCqPCR[,1] <- gsub("MC_C", "MCC_", MCqPCR[,1])

MCqPCR[,1] <- gsub("MC_D", "MCD_", MCqPCR[,1])

#combine data

qPCR <- rbind(DGqPCR, LVDqPCR, MCqPCR)

#reduce to one incidence of each sample

qPCR <- qPCR %>%

  group_by(Sample.Name) %>%

  slice(1)

colnames(qPCR) <- c("samplename", "mean", "stddev", "plate")

#combine qPCR and read data together

LRIIIReads <- left_join(LRIIIReads, qPCR, by = "samplename")

#plot reads vs qPCR qty mean

ggplot(LRIIIReads, aes(mean, reads, color=plate, shape=plate))+
   geom_point(show.legend = TRUE)+
   geom_errorbar(aes(xmin=mean-stddev, xmax=mean+stddev), width=.2)+
   geom_smooth(method='lm', formula= y~x)+
   stat_regline_equation(label.y = 200000, aes(label = ..eq.label..)) +
   stat_regline_equation(label.y = 150000, aes(label = ..rr.label..))+
   facet_wrap(~plate)
[Figure: reads vs. mean qPCR quantity with linear fit, one panel per plate]

This initial graph points toward a plate effect more than a MagBead treatment effect. However, some of the standard deviations are large, and we have seen poor qPCR results from this machine or our prep before, so removing outliers even from such a small group does not seem unreasonable. Two outliers in the 5DG5 data appear to be reducing the fit relative to the 5LVD5 data. A confounding factor is that 5LVD5’s reads and qPCR numbers are all largely the same, so any carryover from tip sharing could only really have affected one low-read, low-qPCR sample.
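The outliers here were picked by eye from the plot. As a cross-check, one option is to flag points by their standardized residuals from the per-plate linear fit. The sketch below is only an illustration of that idea, not the method used on this page: the toy data, the `flag_outliers` helper, and the cutoff of 2 are all assumptions.

```r
# Toy data standing in for one plate of the joined table (hypothetical values).
# Column names mirror the ones used above: mean (qPCR quantity) and reads.
toy <- data.frame(
  samplename = paste0("S", 1:10),
  mean  = 1:10,
  reads = c(11, 19, 31, 39, 150, 61, 69, 81, 89, 101)
)

# Flag samples whose standardized residual from reads ~ mean exceeds a cutoff.
# The cutoff of 2 is an arbitrary illustration, not a value from this analysis.
flag_outliers <- function(df, cutoff = 2) {
  fit <- lm(reads ~ mean, data = df)
  unname(abs(rstandard(fit)) > cutoff)
}

toy$outlier <- flag_outliers(toy)

# Show only the flagged rows
toy[toy$outlier, c("samplename", "mean", "reads")]
```

With a group this small, standardized residuals cannot get very large, so a flag like this is best treated as a prompt to look at the sample rather than as grounds for dropping it.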

Code Block
languager
#Remove the 2 outliers

adjLRIIIReads <- LRIIIReads[c(1:6, 8:13, 15:72),]

ggplot(adjLRIIIReads, aes(mean, reads, color=plate, shape=plate))+
   geom_point(show.legend = TRUE)+
   geom_errorbar(aes(xmin=mean-stddev, xmax=mean+stddev), width=.2)+
   geom_smooth(method='lm', formula= y~x)+
   stat_regline_equation(label.y = 200000, aes(label = ..eq.label..)) +
   stat_regline_equation(label.y = 150000, aes(label = ..rr.label..))+
   facet_wrap(~plate)

Removing the 2 outliers from the 5DG5 (max tip use) data drastically improves the line fit, bringing it above 5LVD5’s fit.
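To put numbers on the fit comparison rather than reading R² off the plot labels, the per-plate R² can be computed directly. This is a minimal sketch in base R on made-up data. The `plate_r2` helper and the toy values are assumptions; only the column names `plate`, `mean`, and `reads` come from the code above.

```r
# Toy stand-in for the joined table (hypothetical values, two made-up plates).
toy <- data.frame(
  plate = rep(c("P1", "P2"), each = 6),
  mean  = rep(1:6, times = 2),
  reads = c(11, 19, 31, 39, 51, 59,    # P1: close to linear
            30, 10, 45, 20, 60, 35)    # P2: much noisier
)

# R-squared of reads ~ mean, fitted separately for each plate.
plate_r2 <- function(df) {
  sapply(split(df, df$plate), function(d) {
    summary(lm(reads ~ mean, data = d))$r.squared
  })
}

r2_by_plate <- plate_r2(toy)
r2_by_plate
```

Calling the same helper on `LRIIIReads` and on `adjLRIIIReads` would give the before and after values side by side. Rows with missing `mean` or `reads` are dropped by `lm` by default.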

...

I repeated the same analyses with the filtermergestats.csv data. Nothing changed.

I think we need to either do a larger experiment before sending out NovaSeq5 (option A), or pool NovaSeq5 as normal, qPCR the first 72 samples singly from 10? plates, and then use the resulting sequencing data together with the qPCR data to decide the pooling standard moving forward (option B). I would advocate for option B.

We are going to add absorbance into the mix as a cheaper and quicker tool for normalized pooling. We will get absorbance readings for these same products and do the same comparison to reads.

Files:

LRIIIanalysis.pdf
LRIIIanalysis.Rmd
MC_MB_Comparison.csv
LRIIIfiltermergestats.csv
LRIII.txt
5LVD5_Col123.csv
5DG5_COL456.csv