I asked this question on Twitter:
what test to test if two distributions are different? I am aware of KS test. When n is large (which is common in genomic studies), the p-value is always significant. better to test against an effect size? how to do it in this context?
In genomics studies, it is very common to have large N (e.g., the number of introns, promoters in the genome, number of cells in the single-cell studies).
I asked this question on twitter.
load the package library(tidyverse) make some dummy data The dummy example: We have two groups of samples: disease and health. We treat those cells in vitro with different dosages (0, 1, 5) of a chemical X and count the cell number after 3 hours.
x <- tibble( '0' = c(8.66, 11.50, 7.01, 13.40, 11.30, 8.13, 5.92, 7.54), '1' = c(22.10, 23.00, 22.00, 35.70, 32.
I was reading Feature Selection and Dimension Reduction for Single Cell RNA-Seq based on a Multinomial Model. In the paper, the authors model the scRNAseq counts using a multinomial distribution.
I was using negative binomial distribution for modeling in my last post, so I asked the question on twitter:
for modeling RNAseq counts, what’s the difference/advantages using negative binomial and multinomial distribution? — Ming Tang (@tangming2005) November 26, 2019 some quotes from the answers I get from Matthew