2025-06-18
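In single-cell sequencing data mining, how do we pick out the key information from massive datasets and reveal cellular heterogeneity and functional differences? The first tool that comes to mind is probably the Venn diagram. With no more than 5 sets, a Venn diagram is very intuitive, but once the number of sets grows it becomes hard to read the information you want from the figure. In that case we can use the petal plot (flower plot), which is both visually appealing and easy to read.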
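To illustrate why Venn diagrams get crowded, recall that n sets can overlap in up to 2^n - 1 distinct non-empty regions. The short R check below simply tabulates that number for 2 to 7 sets; it is an illustration of the arithmetic, not part of the plotting workflow.

```r
# Number of possible non-empty overlap regions for n sets: 2^n - 1
n_sets <- 2:7
n_regions <- 2^n_sets - 1

# Tabulate set count against region count
data.frame(n_sets, n_regions)
```

Going from 5 sets to 7 sets raises the number of regions from 31 to 127, which is why labels become difficult to read.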
Each petal represents the differential genes of one group or one cell type, and the numbers give the differential gene counts. This makes it easier to see the expression patterns and distribution of differential genes across cell types or treatments, so that key genes can be identified quickly and the underlying biological mechanisms explored in depth.
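The counts shown on such a plot can also be computed directly. The sketch below uses base R on a small made-up list (`demo_sets` and its gene assignments are hypothetical, chosen only for illustration) to count the genes shared by all sets and the genes found in exactly one set.

```r
# Toy gene sets (hypothetical assignments, for illustration only)
demo_sets <- list(
  BCells      = c("Rps29", "Rpl37a", "Cd79a"),
  TCells      = c("Rps29", "Rpl37a", "Cd3e"),
  Neutrophils = c("Rps29", "S100a8", "S100a9")
)

# Genes shared by every set (the center of the flower)
core_genes <- Reduce(intersect, demo_sets)

# In how many sets does each gene occur?
gene_freq <- table(unlist(lapply(demo_sets, unique)))

# Genes that occur in exactly one set, counted per set
unique_counts <- sapply(demo_sets, function(g) sum(gene_freq[unique(g)] == 1))

core_genes
unique_counts
```

Here `core_genes` contains only "Rps29", and the per-set unique counts are 1, 1 and 2.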
Let's walk through how to draw a petal plot:
The two main packages required are ggVennDiagram and ggplot2.
1. Data preparation: we need the differential gene lists and the corresponding cluster names:
library(ggplot2)
library(ggVennDiagram)  # provides ggVennDiagram(); the VennDiagram package does not
> str(sub_gene_list)
List of 7
$ BCells : chr [1:3487] "Rps29" "Rpl37a" "Rps27" "Sub1" ...
$ DCs : chr [1:3427] "Rpl37a" "Rps29" "Rbm3" "Pfn1" ...
$ ECs : chr [1:4803] "Rps29" "Rpl37a" "Rplp1" "Rps15" ...
$ EpithelialCells: chr [1:9566] "Trp63" "Serpinb5" "Krt6b" "Krt6a" ...
$ Macrophages : chr [1:3523] "Rps29" "Pfn1" "Rpl37a" "Rbm3" ...
$ Neutrophils : chr [1:2298] "Pfn1" "Col1a1" "Cd52" "S100a8" ...
$ TCells : chr [1:2972] "Rpl37a" "Rps29" "S100a9" "Rpl38" ...
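As a minimal sketch of how a list with this shape can be built, the example below starts from a small made-up table (`toy_diff` and its values are hypothetical) that is assumed to have a `gene` column and a `cluster` column, then splits it into one character vector per cluster.

```r
# Toy differential-gene table (hypothetical values); the real table is
# assumed to contain at least a `gene` and a `cluster` column
toy_diff <- data.frame(
  gene    = c("Rps29", "Rpl37a", "Rps29", "Pfn1", "Pfn1", "Pfn1"),
  cluster = c("BCells", "BCells", "TCells", "TCells", "Neutrophils", "Neutrophils")
)

# One vector of genes per cluster, named by cluster
toy_list <- split(toy_diff$gene, toy_diff$cluster)

# Drop repeated genes within a cluster so each gene counts once per set
toy_list <- lapply(toy_list, unique)

str(toy_list)
lengths(toy_list)
```

Removing duplicates is optional, but it keeps the number of genes per set easy to check with `lengths()` before plotting.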
2. Plotting
# Read the differential gene table for all clusters
all_diff_genes <- read.table("/PERSONALBIO/work/All_diff_gene.xls", sep = "\t", header = TRUE)

# Three cell types
sub_diff_genes <- all_diff_genes[all_diff_genes$cluster %in% c("BCells", "TCells", "Neutrophils"), ]
unique(sub_diff_genes$cluster)
sub_gene_list <- split(sub_diff_genes$gene, sub_diff_genes$cluster)
p1 <- ggVennDiagram(sub_gene_list, label_alpha = 0) + scale_fill_distiller(palette = "RdBu")

# Five cell types (stored as p2 so that p1 is not overwritten)
sub_diff_genes <- all_diff_genes[all_diff_genes$cluster %in% c("BCells", "TCells", "Neutrophils", "DCs", "Macrophages"), ]
unique(sub_diff_genes$cluster)
sub_gene_list <- split(sub_diff_genes$gene, sub_diff_genes$cluster)
p2 <- ggVennDiagram(sub_gene_list, label_alpha = 0) + scale_fill_distiller(palette = "RdBu")

# All seven cell types
sub_diff_genes <- all_diff_genes[all_diff_genes$cluster %in% c("BCells", "TCells", "Neutrophils", "DCs", "Macrophages", "EpithelialCells", "ECs"), ]
unique(sub_diff_genes$cluster)
sub_gene_list <- split(sub_diff_genes$gene, sub_diff_genes$cluster)
# CustomCol2() is a custom color helper that is not defined in this post;
# any vector of seven colors can be passed to set_color instead
p <- ggVennDiagram(sub_gene_list, label_size = 5, label = "count", label_geom = "text",
                   set_color = CustomCol2(1:7)) + scale_fill_gradient(low = "#222F75", high = "firebrick")
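The `CustomCol2()` helper used above is not defined in this post. As a hedged sketch, the example below swaps in seven colors from base R's `hcl.colors()` and draws a seven-set plot on randomly generated placeholder sets (`toy_sets`, the `gene1`...`gene200` IDs and the `Cluster1`...`Cluster7` names are all hypothetical). The plot is drawn only if ggVennDiagram and ggplot2 are installed.

```r
# Stand-in for the undefined CustomCol2(): seven distinct colors from base R
set_cols <- grDevices::hcl.colors(7, palette = "Dark 3")

# Toy input: seven hypothetical gene sets sampled from 200 placeholder IDs
set.seed(1)
toy_sets <- lapply(1:7, function(i) sample(paste0("gene", 1:200), 80))
names(toy_sets) <- paste0("Cluster", 1:7)

# Draw the plot only when both packages are available
if (requireNamespace("ggVennDiagram", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  p_demo <- ggVennDiagram::ggVennDiagram(
    toy_sets,
    label = "count", label_geom = "text", label_size = 3,
    set_color = set_cols
  ) + ggplot2::scale_fill_gradient(low = "#222F75", high = "firebrick")
  print(p_demo)
}
```

On real data, the same `set_cols` vector could be passed as `set_color = set_cols` in place of `CustomCol2(1:7)`.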
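A petal plot presents the data in a more intuitive and vivid way, so the key information about the differential genes can be picked out quickly. The size, color, and other attributes of the petals can be customized to show several biological characteristics of the differential genes, such as expression level, significance, and functional annotation. This multi-dimensional view helps build a fuller and deeper understanding of the data.
Give it a try! If you run into questions while drawing the figures or reproducing the code, feel free to call our hotline or contact our local sales representative.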