There are so many public datasets out there waiting for us to mine! That is both a blessing and a curse for a computational biologist!
Metadata, the data that describe the data (e.g., whether a patient is a responder or non-responder to the treatment), are critical for interpreting an analysis. Without metadata, your data are useless.
how to download GEO metadata again? I remember there is a way to click and download a table with GSM ids and other associated metadata.— Ming (Tommy) Tang (@tangming2005) November 29, 2021
Use the SRA Run Selector
Go to https://www.ncbi.nlm.nih.gov/Traces/study/ and type in the accession number. Click Metadata below the Download column, and a SraRunTable.txt file will be downloaded.
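Once you have the table, a few lines of Python are enough to see which runs belong to which group. This is only a minimal sketch: it assumes the downloaded file is comma-separated with a header row and has a `Run` column, and the `treatment_response` column and all accessions in the example are made up for illustration. The helper names `read_run_table` and `group_runs` are hypothetical, not part of any library.

```python
import csv
import io
from collections import defaultdict


def read_run_table(handle):
    """Parse a run metadata table into a list of dicts, one per run.

    Assumes a comma-separated table with a header row.
    """
    return list(csv.DictReader(handle))


def group_runs(rows, column, run_column="Run"):
    """Map each value found in `column` to the list of run accessions with that value."""
    groups = defaultdict(list)
    for row in rows:
        # Rows with a missing or empty value are collected under "NA".
        key = row.get(column) or "NA"
        groups[key].append(row[run_column])
    return dict(groups)


# Made-up table for illustration only; real column names vary by study.
EXAMPLE_TABLE = """Run,BioSample,treatment_response
SRR0000001,SAMN0000001,responder
SRR0000002,SAMN0000002,non-responder
SRR0000003,SAMN0000003,responder
"""

if __name__ == "__main__":
    rows = read_run_table(io.StringIO(EXAMPLE_TABLE))
    for group, runs in group_runs(rows, "treatment_response").items():
        print(group, runs)
```

For a real download, open the file with `open("SraRunTable.txt", newline="")` and pass the handle to `read_run_table`, then group by whichever column holds the phenotype you care about.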
Use GEOquery or GEOmetadb (R/Bioconductor packages)
A Nextflow pipeline: nf-core/fetchngs
Command-line tools
pip install ffq
ffq -t GSE GSE176021
pip install pysradb
pysradb metadata SRP000002 --detailed
- recount3: summaries and queries for large-scale RNA-seq expression and splicing. The paper was recently published: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-021-02533-6
- Digital Expression Explorer 2 from Mark Ziemann.
- Other databases I have curated: https://github.com/crazyhottommy/RNA-seq-analysis#databases