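There are a few typical kinds of network in bioinformatics:
- PPI: the protein-protein interaction network, which can be downloaded directly from the STRING database;
- TF correlation network: built from correlations computed on transcriptome data;
- TF target network: this is what tools such as SCENIC produce.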
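Designing one from scratch is a bit hard, so let's switch on the Sharingan and start imitating.
Here is a paper I recommend: 2015 - PNAS - Human cerebral organoids recapitulate gene expression programs of fetal neocortex development [single-cell comparison of in vivo and in vitro]
See Fig. 2C. Why do wet-lab PIs like this kind of figure so much? Because it matches intuition and also delivers the truly essential information.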
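- the trajectory already reveals the differentiation path of the cells;
- the TF network then reveals which TFs play key roles at each stage.
What matters is not how much information there is but how precise it is: list all the core TFs at the end and the wet-lab PI will be thrilled.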
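How do you implement this in code? The paper is quite considerate here: the Methods section describes the procedure fairly clearly.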
For Fig. 2C, for the TF network analysis, we computed a pairwise correlation matrix for TFs annotated as such in the “Animal Transcription Factor Database” (www.bioguo.org/AnimalTFDB/) (39) and identified those TFs with a correlation of greater than 0.3 with at least three other TFs (99 TFs).
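The selection step quoted above can be sketched in a few lines of Python. This is only a minimal sketch under assumptions: the paper does not say which correlation coefficient was used, so Pearson (`np.corrcoef`) is assumed here, and the function name `select_correlated_tfs`, the toy data, and the `TF0`..`TF6` labels are hypothetical.

```python
import numpy as np

def select_correlated_tfs(expr, tf_names, min_corr=0.3, min_partners=3):
    """expr: cells x TFs matrix. Returns (selected TF names, TF-by-TF correlation)."""
    corr = np.corrcoef(expr, rowvar=False)   # assumption: Pearson correlation
    np.fill_diagonal(corr, 0.0)              # a TF is not its own partner
    n_partners = (corr > min_corr).sum(axis=0)
    keep = n_partners >= min_partners
    return [name for name, k in zip(tf_names, keep) if k], corr

# Toy data (hypothetical): 4 co-expressed TFs plus 3 independent ones.
rng = np.random.default_rng(0)
signal = rng.normal(size=200)
module = signal[:, None] + 0.5 * rng.normal(size=(200, 4))
noise = rng.normal(size=(200, 3))
expr = np.hstack([module, noise])
names = [f"TF{i}" for i in range(7)]

selected, corr = select_correlated_tfs(expr, names)
print(selected)
```

With real data, `expr` would be the cells-by-TFs expression matrix restricted to the genes annotated as TFs in AnimalTFDB.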
They then ran a rigorous check to justify the threshold:
We used a permutation approach to determine the probability of finding TFs meeting this threshold by chance. We randomly shuffled the columns (TFs) of each row (cells) 500 times and calculated the pairwise correlation matrix for each permutation of the input data frame. After each permutation, we counted the number of TFs meeting our threshold. The majority of randomized data frames (96%) resulted in 0 TFs that met our threshold. The maximum number of TFs that met our threshold was 2, which occurred in only 0.2% of the permutations. In contrast, our data resulted in 99 TFs that met this threshold, which suggests that our threshold was strict, but all nodes and connections that we present in the TF network are highly unlikely to be by chance.
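Here is a rough sketch of that permutation check, again in Python. Assumptions: Pearson correlation, a toy matrix with hypothetical sizes (4 co-expressed TFs among 40), and 100 permutations instead of 500 to keep it quick. The names `count_tfs_passing` and `permutation_null` are made up for illustration. The shuffle follows the quoted description literally: within each cell (row), values are shuffled across TFs (columns).

```python
import numpy as np

def count_tfs_passing(expr, min_corr=0.3, min_partners=3):
    """Number of TFs correlated > min_corr with at least min_partners other TFs."""
    corr = np.corrcoef(expr, rowvar=False)   # assumption: Pearson correlation
    np.fill_diagonal(corr, 0.0)
    return int(((corr > min_corr).sum(axis=0) >= min_partners).sum())

def permutation_null(expr, n_perm=500, seed=0):
    """Shuffle TF values within every cell, then recount; repeat n_perm times."""
    rng = np.random.default_rng(seed)
    counts = np.empty(n_perm, dtype=int)
    for i in range(n_perm):
        order = rng.random(expr.shape).argsort(axis=1)    # independent permutation per row
        shuffled = np.take_along_axis(expr, order, axis=1)
        counts[i] = count_tfs_passing(shuffled)
    return counts

# Toy data (hypothetical): 4 co-expressed TFs among 40.
rng = np.random.default_rng(1)
signal = rng.normal(size=200)
expr = np.hstack([signal[:, None] + 0.5 * rng.normal(size=(200, 4)),
                  rng.normal(size=(200, 36))])

observed = count_tfs_passing(expr)
null_counts = permutation_null(expr, n_perm=100)
# Empirical p-value with the usual +1 correction (my addition, not from the paper).
p_value = (np.sum(null_counts >= observed) + 1) / (len(null_counts) + 1)
print(observed, null_counts.max(), p_value)
```

One thing to keep in mind: shuffling within a row keeps each cell's overall expression level, so if a large fraction of the columns belong to one co-expressed module, the shuffled data can still show some residual correlation. That is why the toy matrix uses many background TFs.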
We used the pairwise correlation matrix for the selected TFs as input into the function graph.adjacency() of igraph implemented in R (igraph.sf.net) to generate a weighted network graph, in which the selected TFs are presented as vertices and all pairwise correlations >0.2 are presented as edges linking the respective vertices.
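The paper does this step with igraph's graph.adjacency() in R. As a library-free illustration of the same idea, the sketch below turns a correlation matrix into a weighted edge list, keeping only pairs with correlation > 0.2. The function name `correlation_edges`, the toy matrix, and the `TF_A`..`TF_C` labels are hypothetical; the resulting list could be handed to whatever graph library you prefer.

```python
import numpy as np

def correlation_edges(corr, names, min_weight=0.2):
    """Return (tf_a, tf_b, weight) for every TF pair with correlation > min_weight."""
    corr = np.asarray(corr, dtype=float)
    rows, cols = np.triu_indices_from(corr, k=1)   # each unordered pair once, no self-loops
    keep = corr[rows, cols] > min_weight
    return [(names[i], names[j], float(corr[i, j]))
            for i, j in zip(rows[keep], cols[keep])]

# Toy correlation matrix (hypothetical values).
corr = np.array([[1.0, 0.6, 0.1],
                 [0.6, 1.0, 0.25],
                 [0.1, 0.25, 1.0]])
edges = correlation_edges(corr, ["TF_A", "TF_B", "TF_C"])
for a, b, w in edges:
    print(a, b, w)
```

Note the two thresholds in the paper play different roles: 0.3 decides which TFs become vertices, while 0.2 decides which pairs among those vertices get an edge.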
The network graph was visualized using the fruchterman reingold layout.
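To get a feel for what a Fruchterman-Reingold layout does, here is a simplified NumPy version. It is a teaching sketch, not igraph's implementation: the cooling schedule, the starting temperature, and the way edge weights scale the attraction are all assumptions of mine, and `fr_layout` is a hypothetical name. The idea is that every pair of vertices repels, linked vertices attract, and the step size shrinks over the iterations.

```python
import numpy as np

def fr_layout(adj, n_iter=50, seed=0):
    """Toy force-directed layout. adj: symmetric (n x n) weights with zero diagonal."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-0.5, 0.5, size=(n, 2))    # random start inside a unit square
    k = np.sqrt(1.0 / n)                         # ideal distance between vertices
    temp = 0.1                                   # maximum step length, cooled linearly
    cooling = temp / (n_iter + 1)
    for _ in range(n_iter):
        delta = pos[:, None, :] - pos[None, :, :]
        dist = np.maximum(np.linalg.norm(delta, axis=-1), 1e-9)
        # Repulsion k^2/d for all pairs, attraction w*d^2/k along edges,
        # both expressed as a multiplier on the displacement vector delta.
        strength = k**2 / dist**2 - adj * dist / k
        disp = (strength[:, :, None] * delta).sum(axis=1)
        length = np.maximum(np.linalg.norm(disp, axis=1), 1e-9)
        pos += disp / length[:, None] * np.minimum(length, temp)[:, None]
        temp -= cooling
    return pos

# Toy graph (hypothetical): two separate triangles with edge weight 0.5.
adj = np.zeros((6, 6))
for a, b in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    adj[a, b] = adj[b, a] = 0.5
pos = fr_layout(adj)
print(pos.round(2))
```

For a real figure you would call the layout routine of your graph library rather than this sketch; the returned coordinates are what you pass to the plotting step.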
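This next part feels a bit like stating the obvious: TFs whose expression peaks in the same cells are bound to be correlated, so they will inevitably end up grouped together in the network.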
TF vertices were manually color coded based on the expression pattern along the monocle lineage. Green, teal, and blue represent highest average expression in APs, BPs, and neurons, respectively.
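The paper colored the vertices by hand. An automated stand-in, sketched here under assumptions, is to average each TF over the cells of each group and pick the group with the highest mean. The `COLORS` mapping mirrors the quoted green/teal/blue scheme; the function name `color_by_peak_group`, the group label strings, and the toy matrix are hypothetical.

```python
import numpy as np

# Colour scheme from the quoted passage; the label strings themselves are assumptions.
COLORS = {"AP": "green", "BP": "teal", "Neuron": "blue"}

def color_by_peak_group(expr, cell_labels, tf_names):
    """expr: cells x TFs. Colour each TF by the group with the highest mean expression."""
    cell_labels = np.asarray(cell_labels)
    groups = list(COLORS)
    means = np.vstack([expr[cell_labels == g].mean(axis=0) for g in groups])  # groups x TFs
    peak = means.argmax(axis=0)
    return {tf: COLORS[groups[p]] for tf, p in zip(tf_names, peak)}

# Toy data (hypothetical): 6 cells, 3 TFs.
expr = np.array([[5.0, 0.0, 0.0],
                 [4.0, 1.0, 0.0],
                 [1.0, 6.0, 0.0],
                 [0.0, 5.0, 1.0],
                 [0.0, 1.0, 7.0],
                 [0.0, 0.0, 6.0]])
labels = ["AP", "AP", "BP", "BP", "Neuron", "Neuron"]
colors = color_by_peak_group(expr, labels, ["TF_A", "TF_B", "TF_C"])
print(colors)
```

In practice the group labels would come from the cell assignments along the monocle lineage.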
Below is my attempt at reproducing it:
To be continued~