RNA-seq

Batch Effect: To Correct or Not for Bulk RNA-seq Data

A colleague sent me a PCA plot last week. Three replicates per condition, control vs KO of a gene. PC1 cleanly separated the three replicates from one another. PC2 separated control from KO. The question that followed: should we run limma::removeBatchEffect() before differential expression? The short answer is “it depends,” but most of the time it is “no, do not pre-correct the matrix before DE testing.” The question hides a few subtleties worth unpacking, because I have seen people both under-correct (ignore a real block effect) and over-correct (treat true biological variation as batch and squash it).
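To make the "do not pre-correct" point concrete, here is a minimal sketch for a single gene. It assumes that replicate k of control and replicate k of KO were processed together, so the replicate acts as a block. The expression values are made up for illustration, and the function names `unpaired_t` and `blocked_t` are hypothetical. For one gene with two conditions, a paired t statistic is equivalent to fitting replicate plus condition by ordinary least squares, so it stands in here for putting the block in the design instead of editing the matrix. It does not reproduce limma's moderated statistics.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical log2 expression for one gene (illustrative values only).
# Each replicate shifts both of its samples by a large offset; KO adds about +1.
control = [5.0, 8.1, 11.0]
ko = [6.1, 9.0, 12.1]


def unpaired_t(a, b):
    """Two-sample t statistic with pooled variance; ignores the replicate structure."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var * (1 / na + 1 / nb))


def blocked_t(a, b):
    """Paired t statistic; for one gene this matches OLS on replicate + condition."""
    diffs = [y - x for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


if __name__ == "__main__":
    print("ignoring replicate:", round(unpaired_t(control, ko), 2))
    print("blocking on replicate:", round(blocked_t(control, ko), 2))
```

With these toy numbers the replicate offsets swamp the KO effect when the block is ignored, while the blocked statistic recovers it from the untouched values. The same idea applied to every gene is what a design with a replicate term expresses.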

Reviving BETA for Python 3: Integrating ChIP-seq and RNA-seq to Predict TF Targets

I started learning bioinformatics in 2012 because I needed to analyze public ChIP-seq data. That’s how I got to know Shirley Liu’s lab at Dana-Farber Cancer Institute. Little did I know that I would join her group in 2020 as a staff scientist to lead the CIDC bioinformatics project. I witnessed the development of many groundbreaking computational tools for genomics in Shirley’s lab. One tool that I found particularly elegant was BETA (Binding and Expression Target Analysis), developed by Su Wang and published in Nature Protocols in 2013.