TCGA

PCA analysis on TCGA bulk RNAseq data continued

To not miss a post like this, sign up for my newsletter to learn computational biology and bioinformatics. In my last blog post, I showed you how to download TCGA RNAseq count data, run PCA, and make a heatmap. It is interesting that some of the LUSC samples mix with the LUAD samples and vice versa. In this post, we will continue to use PCA for more exploratory data analysis (EDA).

PCA analysis on TCGA bulk RNAseq data

What is PCA? Principal Component Analysis (PCA) is a mathematical technique used to reduce the dimensionality of large datasets while preserving the most important patterns in the data. It transforms the original high-dimensional data into a smaller set of new variables called principal components (PCs), which capture most of the variation in the data.
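To make the idea concrete, here is a minimal sketch in plain Python (not the code from the post itself). It assumes a made-up toy matrix of 6 samples by 20 genes, where the last 3 samples have higher values in the first 10 genes, and it finds only the first principal component by power iteration. The function names `center` and `first_pc` are hypothetical helpers for this illustration; in practice you would use an existing PCA routine.

```python
import random

def center(X):
    """Subtract each gene's (column's) mean so every column has mean 0."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def first_pc(Xc, n_iter=200, seed=0):
    """Approximate the first principal component of centered data Xc
    (samples x genes) by power iteration on Xc^T Xc."""
    n, p = len(Xc), len(Xc[0])
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(p)]  # random starting direction
    for _ in range(n_iter):
        # s = Xc v (projection of each sample), then w = Xc^T s
        s = [sum(row[j] * v[j] for j in range(p)) for row in Xc]
        w = [sum(Xc[i][j] * s[i] for i in range(n)) for j in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]  # keep the direction at unit length
    scores = [sum(row[j] * v[j] for j in range(p)) for row in Xc]
    return v, scores

# Toy data (an assumption for illustration): 6 samples x 20 genes.
# Samples 4-6 have the first 10 genes shifted up by 5.
rng = random.Random(42)
X = []
for sample in range(6):
    shift = 5.0 if sample >= 3 else 0.0
    X.append([rng.gauss(0, 1) + (shift if gene < 10 else 0.0)
              for gene in range(20)])

Xc = center(X)
pc1, scores = first_pc(Xc)

# Fraction of the total variance captured by PC1
total_ss = sum(x * x for row in Xc for x in row)
explained = sum(s * s for s in scores) / total_ss

print("PC1 scores per sample:", [round(s, 2) for s in scores])
print("Fraction of variance explained by PC1:", round(explained, 2))
```

In this toy setup, the PC1 scores of the two groups of samples fall on opposite sides of zero, which is the same kind of separation one looks for when plotting samples from different cancer types on the first principal components.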

How to convert raw counts to TPM for TCGA data and make a heatmap across cancer types

Sign up for my newsletter to not miss a post like this: https://divingintogeneticsandgenomics.ck.page/newsletter The Cancer Genome Atlas (TCGA) project is probably one of the best-known large-scale cancer sequencing projects. It sequenced ~10,000 treatment-naive tumors across 33 cancer types. Several data types are available, including whole-exome, whole-genome, copy-number (SNP array), bulk RNAseq, protein expression (Reverse-Phase Protein Array), and DNA methylation. TCGA is a very successful large sequencing project, and I highly recommend learning from how it is organized.