A common question in sequencing concerns the diminishing returns of sequencing the same material to ever greater depth: at what point has our uncertainty about the frequency of a component (an allele at a locus, or a taxonomic unit in an environmental sample) diminished sufficiently, given the trade-off between sequencing this locus more deeply and sequencing more samples or genomic regions?

We can simplify this problem to wanting to know the frequency of an allele or taxon and the frequency of everything else (the complement). This binomial categorization is a simplification of the multinomial and has everything we need: for this illustration there is nothing to gain from also considering the frequencies of the other categories, because the binomial model is entirely representative of those multinomial models.
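The reduction from multinomial to binomial can be written out as follows. The notation is my own and not from the original page: n is the sequencing depth (number of reads), k the number of categories, p_i the true frequency of category i, and X_i the number of reads assigned to it.

```latex
% Assumed notation: n = depth, k = categories, p_i = true frequency, X_i = read count
(X_1, \dots, X_k) \sim \mathrm{Multinomial}(n;\, p_1, \dots, p_k)
\quad\Longrightarrow\quad
X_i \sim \mathrm{Binomial}(n,\, p_i)

% Estimated frequency of category i and its variance
\hat{p}_i = \frac{X_i}{n},
\qquad
\mathrm{Var}(\hat{p}_i) = \frac{p_i\,(1 - p_i)}{n}
```

Because each count is marginally binomial, the uncertainty in any one frequency depends only on that frequency and the depth, not on how the remaining reads are split among the other categories.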

The greatest uncertainty exists when the category of interest is at frequency 0.5, so I present that as an upper bound below. I also illustrate the uncertainty for a frequency of 0.05, and the diminishing returns on sequencing at greater depth for that sample.
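As a rough illustration of these diminishing returns, here is a minimal sketch. It is not the code referred to on this page; it assumes reads are independent binomial draws and uses the normal (Wald) approximation for a 95% interval, which is known to be poor at low depth and at frequencies near 0 or 1. The function names `standard_error` and `ci_half_width` and the chosen depths are hypothetical.

```python
import math


def standard_error(p, depth):
    """Standard error of an estimated frequency under a binomial model.

    p     -- assumed true frequency of the allele or taxon
    depth -- number of reads covering the locus or sample
    """
    return math.sqrt(p * (1.0 - p) / depth)


def ci_half_width(p, depth, z=1.96):
    """Approximate 95% interval half-width (normal approximation; an assumption)."""
    return z * standard_error(p, depth)


if __name__ == "__main__":
    # Illustrative depths only; substitute the depths relevant to your experiment.
    depths = [10, 100, 1000, 10000]
    for p in (0.5, 0.05):
        for n in depths:
            print(f"p={p:<5} depth={n:>6} half-width={ci_half_width(p, n):.4f}")
```

The standard error shrinks with the square root of depth, so halving the uncertainty requires four times as many reads. At any given depth the value for a frequency of 0.5 is the largest, which is why it serves as an upper bound.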

Here is the code should you want to play with this.
