Bioinformatics

Navigating Variant Calling for Disease-Causing Mutations: The state-of-art process

Disclaimer: This post is sponsored by Watershed Omics Bench platform. I have personally tested the platform. The opinions and views expressed in this post are solely those of the author and do not represent the views of my employer Variant calling is the process of identifying and categorizing genetic variants in sequencing data. It is a critical step in the analysis of whole-genome sequencing (WGS) and whole-exome sequencing (WES) data, as it allows researchers to identify potential disease-causing mutations.

Reuse the single cell data! How to create a seurat object from GEO datasets

Download the data https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE116256 cd data/GSE116256 wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE116nnn/GSE116256/suppl/GSE116256_RAW.tar tar xvf GSE116256_RAW.tar rm GSE116256_RAW.tar Depending on how the authors upload their data. Some authors may just upload the merged count matrix file. This is the easiest situation. In this dataset, each sample has a separate set of matrix (*dem.txt.gz), features and barcodes. Total, there are 43 samples. For each sample, it has an associated metadata file (*anno.txt.gz) too. You can inspect the files in command line:

10 single-cell data benchmarking papers

I tweeted it at https://twitter.com/tangming2005/status/1679120948140572672 I got asked to put all my posts in a central place and I think it is a good idea. And here it is! Benchmarking integration of single-cell differential expression Benchmarking atlas-level data integration in single-cell genomics A review of computational strategies for denoising and imputation of single-cell transcriptomic data Benchmarking spatial and single-cell transcriptomics integration methods for transcript distribution prediction and cell type deconvolution

How to add boxplots or density plots side-by-side a scatterplot: a single cell case study

introduce ggside using single cell data The ggside R package provides a new way to visualize data by combining the flexibility of ggplot2 with the power of side-by-side plots. We will use a single cell dataset to demonstrate its usage. ggside allows users to create side-by-side plots of multiple variables, such as gene expression, cell type, and experimental conditions. This can be helpful for identifying patterns and trends in scRNA-seq data that would be difficult to see in individual plots.

Unlock the Power of Genomics Data Analysis: Watershed's Seamless Cloud Computing Solution

Disclaimer: This post is sponsored by Watershed Omics Bench platform. I have personally tested the platform. The opinions and views expressed in this post are solely those of the author and do not represent the views of my employer As an experienced bioinformatician who understands the needs of biotech startups, I know the challenges that arise when analyzing genomics data. The first solution that comes to mind is cloud computing. Unsurprisingly, AWS and Google Cloud Platform (GCP) are commonly used options.

How to do neighborhood/cellular niches analysis with spatial transcriptome data

Sign up for my newsletter to not miss a post like this https://divingintogeneticsandgenomics.ck.page/newsletter In a previous blog post, I showed you how to make a Seurat spatial object from Vizgen spatial transcriptome data. In this post, I am going to show you how to identify clusters of neighborhood or cellular niches where specific cell types tend to co-localize. read in the data and pre-process library(Seurat) library(here) library(ggplot2) library(dplyr) # the LoadVizgen function requires the raw segmentation files which is too big.

How to classify MNIST images with convolutional neural network

Introduction An artificial intelligence system called a convolutional neural network (CNN) has gained a lot of popularity recently. For jobs like image recognition, where we want to teach a computer to recognize things in a picture, they are especially well suited. CNNs operate by dissecting an image into increasingly minute components, or “features.” The network then examines each feature and searches for patterns shared by various objects. For instance, a CNN might come to understand that some pixel patterns are frequently linked to faces, while others are linked to vehicles or trees.

How to construct a spatial object in Seurat

Sign up for my newsletter to not miss a post like this https://divingintogeneticsandgenomics.ck.page/newsletter Single-cell spatial transcriptome data is a new and advanced technology that combines the study of individual cells’ genes and their location in a tissue to understand the complex cellular and molecular differences within it. This allows scientists to investigate how genes are expressed and how cells interact with each other with much greater detail than before.

Deep learning to predict cancer from healthy controls using TCRseq data

Sign up for my newsletter to not miss a post like this https://divingintogeneticsandgenomics.ck.page/newsletter The T-cell receptor (TCR) is a special molecule found on the surface of a type of immune cell called a T-cell. Think of T-cells like soldiers in your body’s defense system that help identify and attack foreign invaders like viruses and bacteria. The TCR is like a sensor or antenna that allows T-cells to recognize specific targets, kind of like how a key fits into a lock.

Deep learning with Keras using MNIST dataset

Sign up for my newsletter to not miss a post like this https://divingintogeneticsandgenomics.ck.page/newsletter Introduction Are you a machine learning practitioner or data analyst looking to broaden your skill set? Look nowhere else! This blog post will offer an introduction to deep learning, which is currently the hottest topic in machine learning. Using the well-known MNIST dataset) and the Keras package, we will investigate the potential of deep learning.