...

Splitting the raw (uncompressed) data was accomplished with the program split (run in an interactive SLURM job), with 16 × 10^6 lines (4 × 10^6 reads, at four lines per FASTQ record) written to each file and the remainder in the final file. These files were written to /gscratch and, as intermediate files that can be readily reconstructed, will not be retained long-term.

Code Block
# Work in scratch space; the split files are intermediate and are not kept long-term.
mkdir -p /gscratch/buerkle/psomagen_9oct20_novaseq3/rawdata
cd /gscratch/buerkle/psomagen_9oct20_novaseq3/rawdata
# Write 16,000,000 lines (4,000,000 reads) per file, with 3-digit numeric suffixes (novaseq3_R1_000.fastq, ...).
split -l 16000000 -d --suffix-length=3 --additional-suffix=.fastq /pfs/tsfs1/project/microbiome/data/seq/psomagen_9oct20_novaseq3/rawdata/NovaSeq3_R1.fastq novaseq3_R1_
split -l 16000000 -d --suffix-length=3 --additional-suffix=.fastq /pfs/tsfs1/project/microbiome/data/seq/psomagen_9oct20_novaseq3/rawdata/NovaSeq3_R2.fastq novaseq3_R2_
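
Because each FASTQ record spans four lines, a line count of 16,000,000 per file keeps records intact (16,000,000 / 4 = 4,000,000 reads). The sketch below is one possible sanity check and was not part of the original workflow: the function name check_fastq_chunks is hypothetical, and it assumes standard four-line FASTQ records with no wrapped sequence lines. It confirms that every split file, including the final remainder file, has a line count divisible by four.

```shell
# Hypothetical helper (an assumption, not part of the original workflow):
# confirm that every chunk matching <prefix>*.fastq holds whole FASTQ
# records, i.e. a line count divisible by 4.
check_fastq_chunks() {
    prefix="$1"          # e.g. novaseq3_R1_
    status=0
    for f in "${prefix}"*.fastq; do
        # If the glob matched nothing, the pattern itself is returned.
        if [ ! -e "$f" ]; then
            echo "no files match ${prefix}*.fastq" >&2
            return 1
        fi
        n=$(wc -l < "$f")
        if [ $(( n % 4 )) -ne 0 ]; then
            echo "incomplete record in $f ($n lines)" >&2
            status=1
        fi
    done
    return "$status"
}
```

Run from the directory holding the split files, a call such as `check_fastq_chunks novaseq3_R1_ && check_fastq_chunks novaseq3_R2_` returns a non-zero status if any file ends partway through a record.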

...