Download and align single cell RNA-seq data: Part II
Here is the code I use to align downloaded RNA-seq datasets.
I use the popular STAR aligner, which is fast and simple to implement. Before the actual aligning step, the genome needs to be generated with the following command:
#!/bin/bash
#PBS -q hotel
#PBS -N genGen
#PBS -l nodes=1:ppn=8
#PBS -l walltime=168:00:00
#PBS -M z4li@ucsd.edu
#PBS -V
#PBS -m abe
STAR --genomeDir ~/zhen/Reference_genome/ \
--runMode genomeGenerate \
--genomeFastaFiles ~/zhen/Reference_genome/hg38.fa \
--runThreadN 7 \
The reference genome (in this case hg38) can be downloaded from UCSC.
#!/bin/bash
#PBS -q hotel
#PBS -N alignment_test
#PBS -l nodes=1:ppn=10
#PBS -l walltime=168:00:00
#PBS -M z4li@ucsd.edu
#PBS -V
#PBS -m abe
cd ~/zhen/data/raw_data/GSE81475/fastq/
names=$(ls -S | head -2 | paste -sd "," -)
#echo $names
genomeDir=~/zhen/Reference_genome/hg38/
runDir=../test/1pass/
STAR --outFileNamePrefix $runDir --genomeDir $genomeDir --readFilesIn $names --runThreadN 9
wait
echo All processes completed!