Info

The red Low Read II samples were renormalized via pooling, and the new MC samples were added to that same pool at 1 ul per sample. This pool was then adjusted to 1 nM. The pool of the repeated RMJan22 samples was also adjusted to 1 nM. 50 ul of the Low Read pool was added to 100 ul of the RMJan22 pool. However, when the Low Read pool was qPCRed, one replicate was much higher than the other two, so the 1:2 ratio might be off. We ran an RNaseP test plate to check the recalibration of the 7500 qPCR machine, and it looks good. Because the percentage of reads passing filter was so low (35%), we are going to re-qPCR new dilutions of the pools and try this again.

...

Code Block
/project/microbiome/data_queue/seq/LowReadIIloc_ad2/rawdata

salloc --account=microbiome -t 0-06:00

mkdir -p /gscratch/grandol1/loc_ad2/rawdata

cd /gscratch/grandol1/loc_ad2/rawdata

unpigz --to-stdout /project/microbiome/data_queue/seq/loc_ad2/rawdata/LRII-RMJAN22_S1_L001_R1_001.fastq.gz | split -l 1000000 -d --suffix-length=3 --additional-suffix=.fastq - LowReadII_R1_

unpigz --to-stdout /project/microbiome/data_queue/seq/loc_ad2/rawdata/LRII-RMJAN22_S1_L001_R2_001.fastq.gz | split -l 1000000 -d --suffix-length=3 --additional-suffix=.fastq - LowReadII_R2_

/project/microbiome/data_queue/seq/loc_ad2/rawdata/run_parse_count_onSplitInput.pl

cd /project/microbiome/data_queue/seq/loc_ad2/rawdata

./run_splitFastq_fwd.sh

./run_splitFastq_rev.sh

cd /project/microbiome/data_queue/seq/loc_ad2/rawdata

./run_aggregate.sh

...

Returns: 15371232

Divided by 8: 1921404 assigned to samples.

Assigned/Total (*100) = percent assigned: ~73%
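The percent-assigned figure above can be rechecked with a short shell sketch. The line count is the value returned above; the total of 2630678 reads is taken from the locad2 estimate elsewhere on this page and is assumed here to be the run total. Variable names are illustrative only.

```shell
# Recompute the percent of reads assigned to samples.
lines=15371232   # value returned by the count step
total=2630678    # assumed run total (from the locad2 estimate on this page)

assigned=$((lines / 8))   # 1921404 reads assigned to samples
pct=$(awk -v a="$assigned" -v t="$total" 'BEGIN { printf "%.1f", 100 * a / t }')

echo "assigned=$assigned percent_assigned=$pct"
```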

...

Divided by 8: 1412494

LRII + locad2: 1921404

Even if all the unassigned reads are from locad2, this does not restore the expected ratio of 2 locad2 : 1 LRII.

[508910+(2630678-1921404)] = 1218184 total possible locad2 reads
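The best-case ratio can be worked through in shell as a rough check. This sketch assumes that 1412494 is the LRII-only read count (so that 1921404 - 1412494 = 508910 is the locad2 count) and that 2630678 is the run total; variable names are illustrative.

```shell
# Best case for locad2: give it every unassigned read, then compare to LRII.
total=2630678      # assumed run total
assigned=1921404   # LRII + locad2 reads assigned to samples
lrii=1412494       # assumed LRII-only reads

locad2=$((assigned - lrii))           # 508910
unassigned=$((total - assigned))      # 709274
locad2_max=$((locad2 + unassigned))   # 1218184 total possible locad2 reads

ratio=$(awk -v a="$locad2_max" -v b="$lrii" 'BEGIN { printf "%.2f", a / b }')
echo "best-case locad2:LRII = $ratio:1 (expected 2:1)"
```

Under these assumptions the best case is still below 1:1, well short of the intended 2:1.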

cd /project/microbiome/data_queue/seq/loc_ad2/tfmergedreads

./run_slurm_mergereads.pl

cd /project/microbiome/data_queue/seq/LowReadIIloc_ad2/otu

./run_slurm_mkotu.pl

Assign taxonomy

Code Block
salloc --account=microbiome -t 0-02:00 --mem=500G

module load swset/2018.05  gcc/7.3.0

module load vsearch/2.15.1

vsearch --sintax zotus.fa --db /project/microbiome/users/grandol1/ref_db/gg_16s_13.5.fa -tabbedout LRII.sintax -sintax_cutoff 0.8

Output:

Reading file /project/microbiome/users/grandol1/ref_db/gg_16s_13.5.fa 100%  

1769520677 nt in 1262986 seqs, min 1111, max 2368, avg 1401

Counting k-mers 100% 

Creating k-mer index 100% 

Classifying sequences 100%   

Classified 4038 of 4042 sequences (99.90%)

Convert into useful form:

Code Block
# The three-argument match() used below is a gawk extension, so awk must be gawk here.
awk -F "\t" '{OFS=","} NR==1 {print "OTU_ID","SEQS","SIZE","DOMAIN","KINGDOM","PHYLUM","CLASS","ORDER","FAMILY","GENUS","SPECIES"} {gsub(";", ","); gsub("centroid=", ""); gsub("seqs=", ""); gsub("size=", ""); match($4, /d:[^,]+/, d); match($4, /k:[^,]+/, k); match($4, /p:[^,]+/, p); match($4, /c:[^,]+/, c); match($4, /o:[^,]+/, o); match($4, /f:[^,]+/, f); match($4, /g:[^,]+/, g); match($4, /s:[^,]+/, s); print $1, d[0]=="" ? "NA" : d[0], k[0]=="" ? "NA" : k[0], p[0]=="" ? "NA" : p[0], c[0]=="" ? "NA" : c[0], o[0]=="" ? "NA" : o[0], f[0]=="" ? "NA" : f[0], g[0]=="" ? "NA" : g[0], s[0]=="" ? "NA" : s[0] }' LRII.sintax > LRIItaxonomy.csv
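To see what the conversion does to a single record, here is a minimal sketch run on one made-up sintax line. The OTU label, taxa, and confidence values are hypothetical, only three ranks are extracted, and the helper function `rank` is an illustration rather than part of the command above. It uses the two-argument match() with RSTART/RLENGTH, so it should also run on awk implementations other than gawk.

```shell
# Hypothetical tab-separated sintax line:
# label, predictions with confidence, strand, predictions above the cutoff.
line=$(printf 'Zotu1\td:Bacteria(1.00),p:Firmicutes(0.95),c:Bacilli(0.60)\t+\td:Bacteria,p:Firmicutes')

row=$(printf '%s\n' "$line" | awk -F '\t' '
# Return the "prefix:name" entry from a comma-separated taxonomy string, or NA.
function rank(tax, prefix) {
    if (match(tax, prefix ":[^,]+")) return substr(tax, RSTART, RLENGTH)
    return "NA"
}
{
    # Column 4 holds only the ranks that passed the cutoff.
    printf "%s,%s,%s,%s\n", $1, rank($4, "d"), rank($4, "p"), rank($4, "c")
}')

echo "$row"
```

In this example the class falls below the 0.8 cutoff, so it is absent from column 4 and is reported as NA.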

...