Part 4 CITE-seq normalization using empty droplets with the dsb package

In this post, we are going to try a CITE-seq normalization method called dsb published in Normalizing and denoising protein expression data from droplet-based single cell profiling two major components of protein expression noise in droplet-based single cell experiments: (1) protein-specific noise originating from ambient, unbound antibody encapsulated in droplets that can be accurately estimated via the level of “ambient” ADT counts in empty droplets, and (2) droplet/cell-specific noise revealed via the shared variance component associated with isotype antibody controls and background protein counts in each cell.

Part 3 Centered log ratio (CLR) normalization for CITE-seq protein count data

Following my last blog post, download the CITE-seq protein and RNA count data at here. library(Seurat) library(ggplot2) library(dplyr) pbmc<- readRDS("~/blog_data/CITEseq/pbmc1k_adt.rds") How to normalize protein ADT count data? Seurat uses the centered log ratio (CLR) to normalize protein count data. In the Seurat github page: clr_function <- function(x) { return(log1p(x = x / (exp(x = sum(log1p(x = x[x > 0]), na.rm = TRUE) / length(x = x))))) } log1p(x) computes log(1+x) accurately also for |x| << 1.

Part 2 CITE-seq downstream analysis: From Alevin output to Seurat visualization

In my last post, I showed you how to get the protein and RNA counts from a CITE-seq experiment using Simpleaf. Now that we have the raw count matrices, we are ready to explore them within R. To follow the tutorial, you can download the associated data from here. Load the data suppressPackageStartupMessages({ library(fishpond) library(ggplot2) library(dplyr) library(SingleCellExperiment) library(Seurat) library(DropletUtils) }) # set the seed set.seed(123) #gex_q <- loadFry('~/blog_data/CITEseq/alevin_rna/af_quant') #fb_q <- loadFry( '~/blog_data/CITEseq/alevin_adt/af_quant') # I saved the above objs first to rds files, now just read them back fb_q<- readRDS("~/blog_data/CITEseq/fb_q.

Part 1 How to use Salmon/Alevin to preprocess CITE-seq data

Introduction A state-of-the-art method called CITE-seq (Cellular Indexing of Transcriptomes and Epitopes by Sequencing) allows surface protein levels and RNA expression to be measured simultaneously in individual cells. CITE-seq uses traditional single-cell RNA-sequencing to read out both transcriptome and proteomic information from the same cell after labeling it with oligo-conjugated antibodies. This gets over the drawbacks of techniques that just test proteins or RNA separately. CITE-seq reveals coordinated control of gene and protein activity, offering a potent multidimensional perspective of cell states.