How to visualize a large network in R?

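Network visualization is becoming commonplace in scientific practice. But as networks grow in size, common visualizations become less useful: there are simply too many nodes/vertices and links/edges. Often, visualization efforts end up producing "hairballs".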

Some new approaches have been proposed to overcome this issue, e.g.:

Edge bundling:
http://vis.stanford.edu/papers/divided-edge-bundling or
https://gephi.org/tag/edge-bundling/
Hierarchical edge bundling:
http://graphics.cs.illinois.edu/sites/graphics.dev.engr.illinois.edu/files/edgebundles.pdf
Group attributes layout:
http://wiki.cytoscape.org/Cytoscape_3/UserManual
How to make grouped layout in igraph?

I am sure there are many more approaches. Thus, my question is: How to overcome the hairball issue, i.e. how to visualize large networks by using R?

Here is some code that simulates an example network:

# Load packages
lapply(c("devtools", "sna", "intergraph", "igraph", "network"), install.packages)
library(devtools)
# install_github() takes a "user/repo" string; the username= argument is deprecated
devtools::install_github("ggobi/ggally")
lapply(c("sna", "intergraph", "GGally", "igraph", "network"), 
       require, character.only = TRUE)

# Set up data
set.seed(123)
g <- barabasi.game(1000)

# Plot data
g.plot <- ggnet(g, mode = "fruchtermanreingold")
g.plot


This question is related to "Visualizing Undirected Graph That's Too Large for GraphViz?". However, I am not looking for general software recommendations here, but for concrete examples (using the data provided above) of which techniques help to make a good visualization of a large network by using R (comparable to the examples in this thread: "R: Scatterplot with too many points").

4 Answers

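Another way to visualize very large networks is with BioFabric (www.BioFabric.org), which uses horizontal lines instead of points to represent nodes. Edges are then shown as vertical line segments. A quick D3 demo of this technique is available at: http://www.biofabric.org/gallery/pages/SuperQuickBioFabric.html.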

BioFabric is a Java application, but a simple R version is available at: https://github.com/wjrl/RBioFabric.

Here is a snippet of R code:

# You need 'devtools':
 install.packages("devtools")
 library(devtools)

 # you need igraph:
 install.packages("igraph")
 library(igraph)

 # install and load 'RBioFabric' from GitHub
 install_github('wjrl/RBioFabric')
 library(RBioFabric)

 #
 # This is the example provided in the question:
 #

 set.seed(123)
 bfGraph = barabasi.game(1000)

 # This example has 1000 nodes, just like the provided example, but it 
 # adds 6 edges in each step, making for an interesting shape; play
 # around with different values.

 # bfGraph = barabasi.game(1000, m=6, directed=FALSE)

 # Plot it up! For best results, make the PDF in the same
 # aspect ratio as the network, though a little extra height
 # covers the top labels. Given the size of the network,
 # a PDF width of 100 gives us good resolution.

 height <- vcount(bfGraph)
 width <- ecount(bfGraph)
 aspect <- height / width;
 plotWidth <- 100.0
 plotHeight <- plotWidth * (aspect * 1.2)
 pdf("myBioFabricOutput.pdf", width=plotWidth, height=plotHeight)
 bioFabric(bfGraph)
 dev.off()

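Here is a picture of the BioFabric version of the data provided by the asker, though networks created with values of m > 1 are more interesting. The inset detail shows a close-up of the upper left corner of the network. Node BF4 is the highest-degree node in the network, and the default layout is a breadth-first search of the network (ignoring edge directions) starting from that node, with neighboring nodes traversed in order of decreasing node degree. Note that we can immediately see that, for example, about 60% of node BF4's neighbors have degree 1. We can also see from the strict 45-degree lower edge that this 1000-node network has 999 edges, and is therefore a tree.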

BioFabric presentation of example data
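
The two observations above (the share of degree-1 neighbors around the hub, and the tree structure) can also be checked numerically. The following is a minimal sketch using only igraph, not RBioFabric; it assumes `sample_pa()` (the current igraph name for `barabasi.game()`) reproduces the question's graph, and the variable names are illustrative.

```r
library(igraph)

set.seed(123)
g <- sample_pa(1000)   # same model as barabasi.game(1000) in the question

# Degree ignoring edge direction, as in the default BioFabric layout
deg <- degree(g, mode = "all")

# Highest-degree node and its neighbors (in either direction)
hub <- which.max(deg)
nb  <- as.integer(neighbors(g, hub, mode = "all"))

# Share of the hub's neighbors that have degree 1
frac_leaf <- mean(deg[nb] == 1)

# A connected graph with exactly (n - 1) edges is a tree
is_tree <- is_connected(g, mode = "weak") && ecount(g) == vcount(g) - 1

cat("hub:", hub, " degree-1 share:", round(frac_leaf, 2), " tree:", is_tree, "\n")
```

Because the graph is rebuilt here rather than read off the BioFabric plot, the hub's index and the exact share may not match the labels in the picture.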

Full disclosure: BioFabric is a tool that I wrote.

Answered 2024-05-10T01:09:10+08:00

16

That's an interesting question; I didn't know most of the tools you listed, so thanks. You can add HivePlot to the list. It's a deterministic method that projects nodes onto a fixed number of axes (usually 2 or 3). Look at the linked page; there are a lot of visual examples.

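It works better if your dataset has a categorical node attribute, so that you can use it to select which axis a node goes on. For instance, when studying the social network of a university: students on one axis, teachers on another, and administrative staff on the third. But of course, it can also work with a discretized numerical attribute (e.g. young, middle-aged and old people on their respective axes).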

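Then you need another attribute, and this time it has to be numerical (or at least ordinal). It is used to determine the position of a node on its axis. You can also use a topological measure, such as degree or transitivity (clustering coefficient).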

How to build a hiveplot http://www.hiveplot.net/img/hiveplot-undirected-01.png
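
The construction described above can be sketched with igraph and base graphics alone. This is a simplified illustration, not the output of a dedicated hive plot package. It rests on these assumptions: the example graph has no categorical attribute, so degree is discretized into three classes to choose the axis; closeness rank sets the position along the axis; and edges are drawn as straight segments rather than curves.

```r
library(igraph)

set.seed(123)
g <- sample_pa(1000, directed = FALSE)

deg <- degree(g)
cl  <- closeness(g)

# Axis assignment (assumption): 1 = leaves, 2 = degree 2-3, 3 = degree > 3
axis_id <- cut(deg, breaks = c(0, 1, 3, Inf), labels = FALSE)

# Position on the axis: closeness rank within each axis, scaled to (0, 1]
pos <- ave(cl, axis_id,
           FUN = function(v) rank(v, ties.method = "first") / length(v))
r <- 0.1 + 0.9 * pos                    # keep nodes away from the center

# Three axes at 90, 210 and 330 degrees
angle <- c(90, 210, 330)[axis_id] * pi / 180
x <- r * cos(angle)
y <- r * sin(angle)

# Draw edges first, then nodes colored by axis
el <- as_edgelist(g, names = FALSE)
plot(NA, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), asp = 1,
     axes = FALSE, xlab = "", ylab = "")
segments(x[el[, 1]], y[el[, 1]], x[el[, 2]], y[el[, 2]],
         col = rgb(0, 0, 0, 0.1))
points(x, y, pch = 16, cex = 0.4, col = axis_id + 1)
```

Because every coordinate is a function of node attributes only, re-running the layout on a comparable network yields a directly comparable picture. Edges between two nodes on the same axis overlap that axis in this sketch.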

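The fact that the method is deterministic is interesting, because it lets you compare different networks representing distinct (but comparable) systems. For instance, you can compare two universities, provided you use the same attributes/measures to determine axes and positions. It also lets you describe the same network in various ways, by choosing different combinations of attributes/measures to generate the visualization. This is, in fact, the recommended way of visualizing a network, through a so-called hive panel.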

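The page I mentioned at the beginning of this post lists several pieces of software that can generate these hive plots, including implementations in Java and R.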

Answered 2024-05-10T01:09:10+08:00
7
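I've been dealing with this problem recently, and as a result I came up with another solution: collapse the graph by communities/clusters. This approach is similar to the third option outlined by the OP above. As a warning, it works best with undirected graphs. For example: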
```
library(igraph)

set.seed(123)
g <- barabasi.game(1000) %>%
  as.undirected()

#Choose your favorite algorithm to find communities.  The algorithm below is great for large networks but only works with undirected graphs
c_g <- fastgreedy.community(g)

#Collapse the graph by communities.  This insight is due to this post http://stackoverflow.com/questions/35000554/collapsing-graph-by-clusters-in-igraph/35000823#35000823

res_g <- simplify(contract(g, membership(c_g)))
```
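The result of this process is the figure below, where the vertices' names represent community membership.
```
plot(res_g, margin = -.5)
```
The above is clearly better than this hideous mess:
```
plot(g, margin = -.5)
```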
To link the communities back to the original vertices, you will need something akin to the following:
```
mem <- data.frame(vertices = 1:vcount(g), member = as.numeric(membership(c_g)))
```
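IMO this is a nice approach for two reasons. First, it can theoretically deal with a graph of any size: the process of finding communities can be repeated on the collapsed graph again and again. Second, adopting an interactive approach would yield very readable results. For example, one can imagine the user clicking on a vertex in the collapsed graph to expand that community and reveal all of its original vertices.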
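
The idea of repeating the collapse can be sketched as a small loop. This is an illustration under assumptions, not part of the original approach: the helper `collapse_once()` and the size threshold `target` are hypothetical names, and `cluster_fast_greedy()` is used as the current igraph name for `fastgreedy.community()`.

```r
library(igraph)

set.seed(123)
g <- sample_pa(1000, directed = FALSE)

# One round: detect communities, merge each into a single vertex,
# then drop the resulting loops and multi-edges
collapse_once <- function(graph) {
  comm <- cluster_fast_greedy(graph)
  simplify(contract(graph, as.integer(membership(comm))))
}

target <- 20          # hypothetical: stop once the graph is this small
levels <- list(g)     # keep every level so a community can be expanded later
cur <- g
while (vcount(cur) > target && ecount(cur) > 0) {
  nxt <- collapse_once(cur)
  if (vcount(nxt) >= vcount(cur)) break   # no further reduction possible
  cur <- nxt
  levels[[length(levels) + 1]] <- cur
}

sapply(levels, vcount)   # vertex count at each level of the hierarchy
```

Keeping the intermediate graphs in `levels` is one way to support the "click to expand" interaction described above. The membership vectors would also need to be stored to map a collapsed vertex back to its members.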
Answered 2024-05-10T01:09:10+08:00
2

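Another interesting package is networkD3. This library offers a myriad of ways to represent graphs. In particular, I find forceNetwork an interesting option. It is interactive, so it lets you really explore your network. This is great for exploratory data analysis (EDA), but it may be less well suited to final work.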
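
A minimal sketch of `forceNetwork` on the question's example data might look like the following. Coloring nodes by fast-greedy community is an assumption added for illustration; any grouping vector would do.

```r
library(igraph)
library(networkD3)

set.seed(123)
g <- sample_pa(1000, directed = FALSE)

# Assumption: use detected communities as the node grouping
comm <- cluster_fast_greedy(g)

# Convert the igraph object to the links/nodes data frames networkD3 expects
d3 <- igraph_to_networkD3(g, group = as.integer(membership(comm)))

widget <- forceNetwork(Links = d3$links, Nodes = d3$nodes,
                       Source = "source", Target = "target",
                       NodeID = "name", Group = "group",
                       opacity = 0.8, zoom = TRUE)

# Show the widget in the viewer/browser when run interactively
if (interactive()) print(widget)
```

With `zoom = TRUE` the widget can be panned and zoomed, which helps when exploring a 1000-node graph.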

Answered 2024-05-10T01:09:10+08:00
