Combining Multiple Single-cell RNA-seq Datasets: Part II

In this post, I will try to combine the Nowakowski et al., 2017, BrainSpan, and Zhong et al., 2017 datasets.

All of these datasets are publicly available. For more information, please visit the Resources page.

First, let's combine the Nowakowski and BrainSpan datasets.

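# Merge the BrainSpan and Nowakowski Seurat objects, keeping genes detected in at least 10 cells
df0.all <- MergeSeurat(df0.BS, df0.nowa,
                       min.cells = 10,
                       do.normalize = TRUE,
                       scale.factor = 100000)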
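
To make the merge step more concrete, here is a minimal base-R sketch of what combining two count matrices can look like: take the union of genes, fill genes missing from one dataset with zeros, and drop genes detected in fewer than `min.cells` cells. This is only an illustration under my own assumptions, not Seurat's actual implementation; `merge_counts` and the toy matrices `a` and `b` are hypothetical names made up for this example.

```r
# Toy sketch of merging two count matrices (genes x cells).
# merge_counts is a hypothetical helper, not part of Seurat.
merge_counts <- function(m1, m2, min.cells = 0) {
  genes <- union(rownames(m1), rownames(m2))
  # Expand a matrix to the full gene set, filling absent genes with zeros
  expand <- function(m) {
    out <- matrix(0, nrow = length(genes), ncol = ncol(m),
                  dimnames = list(genes, colnames(m)))
    out[rownames(m), ] <- m
    out
  }
  merged <- cbind(expand(m1), expand(m2))
  # Keep genes detected (count > 0) in at least min.cells cells
  merged[rowSums(merged > 0) >= min.cells, , drop = FALSE]
}

# Two tiny made-up datasets that share only GAD1
a <- matrix(c(1, 0, 3, 2), nrow = 2,
            dimnames = list(c("PAX6", "GAD1"), c("a1", "a2")))
b <- matrix(c(0, 5, 4, 0), nrow = 2,
            dimnames = list(c("GAD1", "OLIG1"), c("b1", "b2")))

merged <- merge_counts(a, b, min.cells = 2)
print(merged)
```

In this toy case OLIG1 is detected in a single cell, so it is filtered out with `min.cells = 2`, while PAX6 and GAD1 are kept across all four cells.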

Normalize both datasets:

Performing log-normalization

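
The log-normalization reported above can be sketched in a few lines of base R: each cell's counts are divided by that cell's total, multiplied by the scale factor, and passed through `log1p`. This is a simplified sketch of the usual log-normalization formula, not the library's own code; `log_normalize` and the toy `counts` matrix are hypothetical names used only for illustration.

```r
# Sketch of per-cell log-normalization on a genes x cells count matrix.
# log_normalize is a hypothetical helper, not a Seurat function.
log_normalize <- function(counts, scale.factor = 1e4) {
  # Divide each cell (column) by its total count, rescale, then take log(1 + x)
  scaled <- sweep(counts, 2, colSums(counts), "/") * scale.factor
  log1p(scaled)
}

# Made-up counts for two genes in two cells
counts <- matrix(c(1, 3, 0, 10), nrow = 2,
                 dimnames = list(c("PAX6", "GAD1"), c("c1", "c2")))

norm <- log_normalize(counts, scale.factor = 100000)
print(norm)
```

Zero counts stay at zero after normalization, which is why `log1p` is used instead of a plain logarithm.
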
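# Plot the expression distribution of four marker genes in the merged object
p1 <- plotGeneDistr("PAX6", object1 = df0.all)
p2 <- plotGeneDistr("BCL11B", object1 = df0.all)
p3 <- plotGeneDistr("GAD1", object1 = df0.all)
p4 <- plotGeneDistr("OLIG1", object1 = df0.all)
# Arrange the four plots in a 2 x 2 grid
plot_grid(p1, p2, p3, p4, nrow = 2)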

Find highly variable genes to use for PCA:

Calculating gene means
Calculating gene variance to mean ratios
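
The two quantities reported above (gene means and variance-to-mean ratios) can be illustrated with a small base-R sketch. It assumes the input is log-normalized, so values are converted back with `expm1` before the mean and variance are computed, and the dispersion is the log of variance over mean. As far as I understand, the actual variable-gene selection goes further and compares each gene's dispersion to genes with a similar mean, which this sketch leaves out. `gene_dispersion` and the toy matrix `x` are hypothetical names for this example.

```r
# Sketch: per-gene mean and log variance-to-mean ratio (dispersion).
# x is assumed to be a log1p-normalized genes x cells matrix.
# gene_dispersion is a hypothetical helper, not a Seurat function.
gene_dispersion <- function(x) {
  raw <- expm1(x)            # undo the log1p transform
  mu <- rowMeans(raw)        # mean expression per gene
  v <- apply(raw, 1, var)    # sample variance per gene
  data.frame(gene.mean = log1p(mu),
             gene.dispersion = log(v / mu),
             row.names = rownames(x))
}

# Gene A is fairly stable across cells; gene B is expressed in one cell only
x <- log1p(matrix(c(1, 2, 3, 2,
                    0, 0, 0, 8),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("A", "B"), paste0("cell", 1:4))))

disp <- gene_dispersion(x)
print(disp)
```

Both toy genes have the same mean, but gene B has the higher dispersion, so it would rank above gene A as a candidate variable gene.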