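Previously in this series:
Reading Li Xia's textbook "Bioinformatics"
A minimal Gephi network graph tutorial
Applications of networks in single-cell transcriptome data analysis
Notes on statistical analysis of network data || Why study networks
Notes on statistical analysis of network data || Manipulating network data
Notes on statistical analysis of network data || Visualizing network data
Notes on statistical analysis of network data || Descriptive analysis of network data
Notes on statistical analysis of network data || Mathematical models for network graphs
Notes on statistical analysis of network data || Statistical models for network graphs
Notes on statistical analysis of network data || Network topology inference
Notes on statistical analysis of network data || Modeling and prediction for processes on network graphs
Notes on statistical analysis of network data || Dynamic networks
Notes on statistical analysis of network data || Case study 1: analyzing single-cell transcriptome data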
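Anyone learning network analysis in R will sooner or later come across this site: https://kateto.net/
It offers many worked examples of network analysis, and its igraph workshop is updated every year. The workshop is good introductory material and has become something of a standard route into the field. So we follow it here too and work through the whole pipeline to see what is possible.
The example code and data can be downloaded from GitHub: https://github.com/kateto
Goals of network visualization
Types of network visualization
Network attributes
Network layouts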
Before we start
- Required R packages
install.packages("igraph")
install.packages("network")
install.packages("sna")
install.packages("ggraph")
install.packages("visNetwork")
install.packages("threejs")
install.packages("networkD3")
install.packages("ndtv")
- Download the example data and code:
git clone https://github.com/kateto/R-Network-Visualization-Workshop.git
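Colors in R
Colors are pretty, but more importantly they help people tell apart types of objects or levels of an attribute. In most R functions you can use named colors, hex codes, or RGB values.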
plot(x=1:10, y=rep(5,10), pch=19, cex=3, col="dark red")
points(x=1:10, y=rep(6, 10), pch=19, cex=3, col="#557799")
points(x=1:10, y=rep(4, 10), pch=19, cex=3, col=rgb(.25, .5, .3))
Check how many colors are built into R:
length(colors()) # all named colors
[1] 657
The blues alone include:
grep("blue", colors(), value=T) # colors that have 'blue' in the name
[1] "aliceblue" "blue" "blue1"
[4] "blue2" "blue3" "blue4"
[7] "blueviolet" "cadetblue" "cadetblue1"
[10] "cadetblue2" "cadetblue3" "cadetblue4"
[13] "cornflowerblue" "darkblue" "darkslateblue"
[16] "deepskyblue" "deepskyblue1" "deepskyblue2"
[19] "deepskyblue3" "deepskyblue4" "dodgerblue"
[22] "dodgerblue1" "dodgerblue2" "dodgerblue3"
[25] "dodgerblue4" "lightblue" "lightblue1"
[28] "lightblue2" "lightblue3" "lightblue4"
[31] "lightskyblue" "lightskyblue1" "lightskyblue2"
[34] "lightskyblue3" "lightskyblue4" "lightslateblue"
[37] "lightsteelblue" "lightsteelblue1" "lightsteelblue2"
[40] "lightsteelblue3" "lightsteelblue4" "mediumblue"
[43] "mediumslateblue" "midnightblue" "navyblue"
[46] "powderblue" "royalblue" "royalblue1"
[49] "royalblue2" "royalblue3" "royalblue4"
[52] "skyblue" "skyblue1" "skyblue2"
[55] "skyblue3" "skyblue4" "slateblue"
[58] "slateblue1" "slateblue2" "slateblue3"
[61] "slateblue4" "steelblue" "steelblue1"
[64] "steelblue2" "steelblue3" "steelblue4"
Converting between color formats:
rgb(10, 100, 100, maxColorValue=255)
[1] "#0A6464"
Exercise: how would you convert in the other direction?
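One possible answer to the reverse conversion is col2rgb() from grDevices (loaded by default), which turns a color name or hex string into its red, green, and blue components. A minimal sketch; the variable names are chosen only for illustration:

```r
# Convert a hex color back to its RGB components (0-255 scale).
hex <- rgb(10, 100, 100, maxColorValue = 255)   # "#0A6464"
rgb.vals <- col2rgb(hex)                        # 3 x 1 matrix with rows red, green, blue
rgb.vals["red", 1]     # 10
rgb.vals["green", 1]   # 100
rgb.vals["blue", 1]    # 100

# Named colors work too, and the components can be passed back to rgb():
navy <- col2rgb("navyblue")
rgb(navy[1], navy[2], navy[3], maxColorValue = 255)
```

The round trip through rgb() and col2rgb() is lossless as long as the values stay on the 0-255 scale.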
plot(x=1:5, y=rep(5,5), pch=19, cex=16, col=rgb(.25, .5, .3, alpha=.5), xlim=c(0,6))
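If we have a hex color, we can set its transparency with adjustcolor() from the grDevices package. For fun, we also use par() to set the plot background to black. We won't do it below, but par() can also set the plot margins (mar=c(bottom, left, top, right)) or tell R not to clear the previous plot before adding a new one (new=TRUE).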
par(bg="black")
col.tr <- grDevices::adjustcolor("#557799", alpha=0.7)
plot(x=1:5, y=rep(5,5), pch=19, cex=20, col=col.tr, xlim=c(0,6))
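Palettes
In many cases we need a set of contrasting colors, or several shades of one color. R ships with predefined palette functions that generate these for us. For example: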
pal1 <- heat.colors(5, alpha=1) # generate 5 colors from the heat palette, opaque
pal2 <- rainbow(5, alpha=.5) # generate 5 colors from the rainbow palette, semi-transparent
plot(x=1:10, y=1:10, pch=19, cex=10, col=pal1)
par(new=TRUE) # tells R not to clear the first plot before adding the second one
plot(x=10:1, y=1:10, pch=19, cex=10, col=pal2)
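We can also build our own gradients with colorRampPalette(). Note that colorRampPalette() returns a function, which we can then call to get as many colors from that palette as we need.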
palf <- colorRampPalette(c("gray70", "dark red"))
plot(x=10:1, y=1:10, pch=19, cex=10, col=palf(10))
palf <- colorRampPalette(c(rgb(1,1,1, .2),rgb(.8,0,0, .7)), alpha=TRUE)
plot(x=10:1, y=1:10, pch=19, cex=10, col=palf(10))
Finding good color combinations is hard, and R's built-in palettes are rather limited. Thankfully, other packages can help:
library("RColorBrewer")
display.brewer.all()
The package has one main function, brewer.pal(). To use it, you only need to pick the palette you want and the number of colors. Let's look at a few RColorBrewer palettes:
par(mfrow=c(1,3))
display.brewer.pal(8, "Set3")
display.brewer.pal(8, "Spectral")
display.brewer.pal(8, "Blues")
par(mfrow=c(1,2)) # plot two figures - 1 row, 2 columns
pal3 <- brewer.pal(10, "Set3")
plot(x=10:1, y=10:1, pch=19, cex=6, col=pal3)
plot(x=10:1, y=10:1, pch=19, cex=6, col=rev(pal3)) # backwards
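Data formats
In this tutorial we work mainly with two example datasets, both about media organizations. The first contains hyperlinks and mentions among news sources. The second is a network of ties between media outlets and their consumers.
The example data are small, but many of the ideas behind the visualizations apply to medium and large networks as well. That is also why we rarely use some visual properties, such as node symbol shape: in larger graphs they become impossible to distinguish. In fact, when drawing very large networks we may even want to hide the edges and focus on identifying and visualizing communities of nodes.
At this point, the size of the networks you can visualize in R is limited mainly by the RAM of your machine. One thing to stress, though: in many cases, drawing a large network as a giant hairball is less helpful than providing charts that show its key characteristics.
Clear the environment and import the data: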
rm(list = ls())
# Set the working directory to the folder containing the workshop files:
setwd("F:/Rstudio/network/R-Network-Visualization-Workshop-master") # use forward slashes; "\R" is an invalid escape in an R string
#setwd("C:/sunbelt2019")
# If you don't know the path to the folder and you're in RStudio, go to the
# "Session" menu -> "Set Working Directory" -> "To Source File Location"
library("igraph")
# Read in the data:
nodes <- read.csv("./Data files/Dataset1-Media-Example-NODES.csv", header=T, as.is=T)
links <- read.csv("./Data files/Dataset1-Media-Example-EDGES.csv", header=T, as.is=T)
head(nodes)
id media media.type type.label audience.size
1 s01 NY Times 1 Newspaper 20
2 s02 Washington Post 1 Newspaper 25
3 s03 Wall Street Journal 1 Newspaper 30
4 s04 USA Today 1 Newspaper 32
5 s05 LA Times 1 Newspaper 20
6 s06 New York Post 1 Newspaper 50
head(links)
from to type weight
1 s01 s02 hyperlink 22
2 s01 s03 hyperlink 22
3 s01 s04 hyperlink 21
4 s01 s15 mention 20
5 s02 s01 hyperlink 23
6 s02 s03 hyperlink 21
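Creating an igraph object
Next, we convert the raw data into an igraph network object. To do that we use graph_from_data_frame(), which takes two data frames: d and vertices.
- d describes the edges of the network. Its first two columns are the IDs of the source and target node of each edge. The remaining columns are edge attributes (weight, type, label, or anything else).
- vertices starts with a column of node IDs. All remaining columns are interpreted as node attributes.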
net <- graph_from_data_frame(d=links, vertices=nodes, directed=T)
# Examine the resulting object:
class(net)
[1] "igraph"
net
IGRAPH d431b72 DNW- 17 49 --
+ attr: name (v/c), media (v/c), media.type (v/n),
| type.label (v/c), audience.size (v/n), type (e/c),
| weight (e/n)
+ edges from d431b72 (vertex names):
[1] s01->s02 s01->s03 s01->s04 s01->s15 s02->s01 s02->s03
[7] s02->s09 s02->s10 s03->s01 s03->s04 s03->s05 s03->s08
[13] s03->s10 s03->s11 s03->s12 s04->s03 s04->s06 s04->s11
[19] s04->s12 s04->s17 s05->s01 s05->s02 s05->s09 s05->s15
[25] s06->s06 s06->s16 s06->s17 s07->s03 s07->s08 s07->s10
[31] s07->s14 s08->s03 s08->s07 s08->s09 s09->s10 s10->s03
+ ... omitted several edges
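The description of an igraph object starts with four letters:
- D or U, for a directed or undirected graph
- N for a named graph (where nodes have a name attribute)
- W for a weighted graph (where edges have a weight attribute)
- B for a bipartite (two-mode) graph (where nodes have a type attribute)
The two numbers that follow (17 49) are the number of nodes and edges in the graph. The description also lists node and edge attributes, for example:
- (g/c) - graph-level character attribute
- (v/c) - vertex-level character attribute
- (e/n) - edge-level numeric attribute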
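The four descriptor letters can also be checked programmatically. The sketch below builds a tiny graph from made-up data frames (toy.links and toy.nodes are hypothetical, not part of the workshop files) and queries the same properties with igraph's predicate functions:

```r
library(igraph)

# Toy data (hypothetical, for illustration only)
toy.links <- data.frame(from = c("a", "a", "b"),
                        to = c("b", "c", "c"),
                        weight = c(1, 2, 3))
toy.nodes <- data.frame(id = c("a", "b", "c"),
                        kind = c("x", "y", "y"))

toy <- graph_from_data_frame(d = toy.links, vertices = toy.nodes, directed = TRUE)

is_directed(toy)   # TRUE  -> "D"
is_named(toy)      # TRUE  -> "N" (the first column of 'vertices' becomes the name attribute)
is_weighted(toy)   # TRUE  -> "W" (an edge attribute called 'weight' exists)
is_bipartite(toy)  # FALSE -> "-" (there is no vertex attribute called 'type')
vcount(toy)        # 3
ecount(toy)        # 3
```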
E(net) # The edges of the "net" object
+ 49/49 edges from d431b72 (vertex names):
[1] s01->s02 s01->s03 s01->s04 s01->s15 s02->s01 s02->s03
[7] s02->s09 s02->s10 s03->s01 s03->s04 s03->s05 s03->s08
[13] s03->s10 s03->s11 s03->s12 s04->s03 s04->s06 s04->s11
[19] s04->s12 s04->s17 s05->s01 s05->s02 s05->s09 s05->s15
[25] s06->s06 s06->s16 s06->s17 s07->s03 s07->s08 s07->s10
[31] s07->s14 s08->s03 s08->s07 s08->s09 s09->s10 s10->s03
[37] s12->s06 s12->s13 s12->s14 s13->s12 s13->s17 s14->s11
[43] s14->s13 s15->s01 s15->s04 s15->s06 s16->s06 s16->s17
[49] s17->s04
V(net) # The vertices of the "net" object
+ 17/17 vertices, named, from d431b72:
[1] s01 s02 s03 s04 s05 s06 s07 s08 s09 s10 s11 s12 s13 s14
[15] s15 s16 s17
E(net)$type # Edge attribute "type"
[1] "hyperlink" "hyperlink" "hyperlink" "mention"
[5] "hyperlink" "hyperlink" "hyperlink" "hyperlink"
[9] "hyperlink" "hyperlink" "hyperlink" "hyperlink"
[13] "mention" "hyperlink" "hyperlink" "hyperlink"
[17] "mention" "mention" "hyperlink" "mention"
[21] "mention" "hyperlink" "hyperlink" "mention"
[25] "hyperlink" "hyperlink" "mention" "mention"
[29] "mention" "hyperlink" "mention" "hyperlink"
[33] "mention" "mention" "mention" "hyperlink"
[37] "mention" "hyperlink" "mention" "hyperlink"
[41] "mention" "mention" "mention" "hyperlink"
[45] "hyperlink" "hyperlink" "hyperlink" "mention"
[49] "hyperlink"
V(net)$media # Vertex attribute "media"
[1] "NY Times" "Washington Post"
[3] "Wall Street Journal" "USA Today"
[5] "LA Times" "New York Post"
[7] "CNN" "MSNBC"
[9] "FOX News" "ABC"
[11] "BBC" "Yahoo News"
[13] "Google News" "Reuters.com"
[15] "NYTimes.com" "WashingtonPost.com"
[17] "AOL.com"
Find nodes and edges by attribute:
V(net)[media=="BBC"]
+ 1/17 vertex, named, from d431b72:
[1] s11
E(net)[type=="mention"]
+ 20/49 edges from d431b72 (vertex names):
[1] s01->s15 s03->s10 s04->s06 s04->s11 s04->s17 s05->s01
[7] s05->s15 s06->s17 s07->s03 s07->s08 s07->s14 s08->s07
[13] s08->s09 s09->s10 s12->s06 s12->s14 s13->s17 s14->s11
[19] s14->s13 s16->s17
Access the network matrix by index:
net[1,]
s01 s02 s03 s04 s05 s06 s07 s08 s09 s10 s11 s12 s13 s14 s15
0 22 22 21 0 0 0 0 0 0 0 0 0 0 20
s16 s17
0 0
net[5,7]
[1] 0
It is also easy to extract an edge list or a matrix from an igraph network:
head(as_edgelist(net, names=T))
[,1] [,2]
[1,] "s01" "s02"
[2,] "s01" "s03"
[3,] "s01" "s04"
[4,] "s01" "s15"
[5,] "s02" "s01"
[6,] "s02" "s03"
as_adjacency_matrix(net, attr="weight")
17 x 17 sparse Matrix of class "dgCMatrix"
[[ suppressing 17 column names ‘s01’, ‘s02’, ‘s03’ ... ]]
s01 . 22 22 21 . . . . . . . . . . 20 . .
s02 23 . 21 . . . . . 1 5 . . . . . . .
s03 21 . . 22 1 . . 4 . 2 1 1 . . . . .
s04 . . 23 . . 1 . . . . 22 3 . . . . 2
s05 1 21 . . . . . . 2 . . . . . 21 . .
s06 . . . . . 1 . . . . . . . . . 21 21
s07 . . 1 . . . . 22 . 21 . . . 4 . . .
s08 . . 2 . . . 21 . 23 . . . . . . . .
s09 . . . . . . . . . 21 . . . . . . .
s10 . . 2 . . . . . . . . . . . . . .
s11 . . . . . . . . . . . . . . . . .
s12 . . . . . 2 . . . . . . 22 22 . . .
s13 . . . . . . . . . . . 21 . . . . 1
s14 . . . . . . . . . . 1 . 21 . . . .
s15 22 . . 1 . 4 . . . . . . . . . . .
s16 . . . . . 23 . . . . . . . . . . 21
s17 . . . 4 . . . . . . . . . . . . .
Or data frames describing the edges and the nodes:
head(as_data_frame(net, what="edges"))
from to type weight
1 s01 s02 hyperlink 22
2 s01 s03 hyperlink 22
3 s01 s04 hyperlink 21
4 s01 s15 mention 20
5 s02 s01 hyperlink 23
6 s02 s03 hyperlink 21
head(as_data_frame(net, what="vertices"))
name media media.type type.label
s01 s01 NY Times 1 Newspaper
s02 s02 Washington Post 1 Newspaper
s03 s03 Wall Street Journal 1 Newspaper
s04 s04 USA Today 1 Newspaper
s05 s05 LA Times 1 Newspaper
s06 s06 New York Post 1 Newspaper
audience.size
s01 20
s02 25
s03 30
s04 32
s05 20
s06 50
Now that we have our igraph network object, let's make a first attempt to plot it.
plot(net) # not pretty!
That doesn't look very good. Let's start fixing things by removing the loops in the graph.
# Removing loops from the graph:
net <- simplify(net, remove.multiple = F, remove.loops = T)
# Let's reduce the arrow size and remove the labels:
plot(net, edge.arrow.size=.4,vertex.label=NA)
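The second dataset: a matrix
Our second dataset is a network of links between news outlets and media consumers. It consists of two files, "Dataset2-Media-User-Example-NODES.csv" and "Dataset2-Media-User-Example-EDGES.csv".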
# Read in the data:
nodes2 <- read.csv("./Data files/Dataset2-Media-User-Example-NODES.csv", header=T, as.is=T)
links2 <- read.csv("./Data files/Dataset2-Media-User-Example-EDGES.csv", header=T, row.names=1)
# Examine the data:
head(nodes2)
id media media.type media.name audience.size
1 s01 NYT 1 Newspaper 20
2 s02 WaPo 1 Newspaper 25
3 s03 WSJ 1 Newspaper 30
4 s04 USAT 1 Newspaper 32
5 s05 LATimes 1 Newspaper 20
6 s06 CNN 2 TV 56
head(links2)
U01 U02 U03 U04 U05 U06 U07 U08 U09 U10 U11 U12 U13 U14
s01 1 1 1 0 0 0 0 0 0 0 0 0 0 0
s02 0 0 0 1 1 0 0 0 0 0 0 0 0 0
s03 0 0 0 0 0 1 1 1 1 0 0 0 0 0
s04 0 0 0 0 0 0 0 0 1 1 1 0 0 0
s05 0 0 0 0 0 0 0 0 0 0 1 1 1 0
s06 0 0 0 0 0 0 0 0 0 0 0 0 1 1
U15 U16 U17 U18 U19 U20
s01 0 0 0 0 0 0
s02 0 0 0 0 0 1
s03 0 0 0 0 0 0
s04 0 0 0 0 0 0
s05 0 0 0 0 0 0
s06 0 0 1 0 0 0
Two-mode (bipartite) networks in igraph
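Next, we convert the second network into an igraph object.
As we can see, links2 is an incidence matrix (the adjacency matrix of a two-mode network). Two-mode, or bipartite, graphs have two different types of actors, and links go across types but never within a type. Our second media example is such a network, looking at links between news sources and their consumers.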
links2 <- as.matrix(links2)
dim(links2)
[1] 10 20
dim(nodes2)
[1] 30 5
# Create an igraph network object from the two-mode matrix:
net2 <- graph_from_incidence_matrix(links2)
# A built-in vertex attribute 'type' shows which mode vertices belong to.
table(V(net2)$type)
FALSE TRUE
10 20
plot(net2,vertex.label=NA)
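Plotting with igraph: network plots have a wide range of parameters you can set. These include node options (starting with vertex.) and edge options (starting with edge.). A list of selected options is given below, but you can also check ?igraph.plotting for more information.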
NODES
vertex.color Node color
vertex.frame.color Node border color
vertex.shape One of “none”, “circle”, “square”, “csquare”, “rectangle”
“crectangle”, “vrectangle”, “pie”, “raster”, or “sphere”
vertex.size Size of the node (default is 15)
vertex.size2 The second size of the node (e.g. for a rectangle)
vertex.label Character vector used to label the nodes
vertex.label.family Font family of the label (e.g.“Times”, “Helvetica”)
vertex.label.font Font: 1 plain, 2 bold, 3, italic, 4 bold italic, 5 symbol
vertex.label.cex Font size (multiplication factor, device-dependent)
vertex.label.dist Distance between the label and the vertex
vertex.label.degree The position of the label in relation to the vertex, where
0 is right, “pi” is left, “pi/2” is below, and “-pi/2” is above
EDGES
edge.color Edge color
edge.width Edge width, defaults to 1
edge.arrow.size Arrow size, defaults to 1
edge.arrow.width Arrow width, defaults to 1
edge.lty Line type, could be 0 or “blank”, 1 or “solid”, 2 or “dashed”,
3 or “dotted”, 4 or “dotdash”, 5 or “longdash”, 6 or “twodash”
edge.label Character vector used to label edges
edge.label.family Font family of the label (e.g.“Times”, “Helvetica”)
edge.label.font Font: 1 plain, 2 bold, 3, italic, 4 bold italic, 5 symbol
edge.label.cex Font size for edge labels
edge.curved Edge curvature, range 0-1 (FALSE sets it to 0, TRUE to 0.5)
edge.arrow.mode Vector specifying whether edges should have arrows,
possible values: 0 no arrow, 1 back, 2 forward, 3 both
OTHER
margin Empty space margins around the plot, vector with length 4
frame if TRUE, the plot will be framed
main If set, adds a title to the plot
sub If set, adds a subtitle to the plot
asp Numeric, the aspect ratio of a plot (y/x).
palette A color palette to use for vertex color
rescale Whether to rescale coordinates to [-1,1]. Default is TRUE.
# Plot with curved edges (edge.curved=.1) and reduce arrow size:
plot(net, edge.arrow.size=.4,vertex.color = 'white', edge.curved=.1)
# Set node color to orange and the border color to hex #555555
# Replace the vertex label with the node names stored in "media"
plot(net, edge.arrow.size=.2, edge.curved=0,
vertex.color="orange", vertex.frame.color="#555555",
vertex.label=V(net)$media, vertex.label.color="black",
vertex.label.cex=.7)
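The second way to set attributes is to add them to the igraph object. Say we want to color the network nodes by media type and size them by degree centrality (more links -> larger node). We will also vary the edge width according to weight.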
# Generate colors based on media type:
colrs <- c("gray50", "tomato", "gold")
V(net)$color <- colrs[V(net)$media.type]
# Compute node degrees (#links) and use that to set node size:
deg <- degree(net, mode="all")
V(net)$size <- deg*3
# Alternatively, we can set node size based on audience size:
V(net)$size <- V(net)$audience.size*0.7
# The labels are currently node IDs.
# Setting them to NA will render no labels:
V(net)$label.color <- "black"
V(net)$label <- NA
# Set edge width based on weight:
E(net)$width <- E(net)$weight/6
#change arrow size and edge color:
E(net)$arrow.size <- .2
E(net)$color <- "gray80" # the edge attribute is named "color"; plot() ignores "edge.color" stored on edges
# We can even set the network layout:
graph_attr(net, "layout") <- layout_with_lgl
plot(net)
# We can also override the attributes explicitly in the plot:
plot(net, edge.color="orange", vertex.color="gray50")
Sometimes, especially with semantic networks, we may be interested in plotting only the node labels:
plot(net, vertex.shape="none", vertex.label=V(net)$media,
vertex.label.font=2, vertex.label.color="gray40",
vertex.label.cex=.7, edge.color="gray85")
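Let's color the edges of the graph by the color of their source node. We can get the starting node of each edge with the igraph function ends(). It returns the start and end vertex of the edges listed in the es parameter. The names parameter controls whether the function returns vertex names or numeric vertex IDs.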
edge.start <- ends(net, es=E(net), names=F)[,1]
edge.col <- V(net)$color[edge.start]
plot(net, edge.color=edge.col, edge.curved=.1)
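Network layouts
A network layout is simply an algorithm that returns coordinates for each node in a network.
To explore layouts, we generate a slightly larger graph with 100 nodes. We use sample_pa(), which builds a simple graph starting from one node and adding more nodes and links according to a preset level of preferential attachment (the Barabási-Albert model).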
net.bg <- sample_pa(100, 1.2)
V(net.bg)$size <- 8
V(net.bg)$frame.color <- "white"
V(net.bg)$color <- "orange"
V(net.bg)$label <- ""
E(net.bg)$arrow.mode <- 0
plot(net.bg)
You can set the layout in the plot function:
plot(net.bg, layout=layout_randomly)
# Or calculate the vertex coordinates in advance:
l <- layout_in_circle(net.bg)
plot(net.bg, layout=l)
# Randomly placed vertices
l <- layout_randomly(net.bg)
plot(net.bg, layout=l)
#3D sphere layout
l <- layout_on_sphere(net.bg)
plot(net.bg, layout=l)
l <- cbind(1:vcount(net.bg), c(1, vcount(net.bg):2))
plot(net.bg, layout=l)
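As the last example suggests, a layout is nothing more than a numeric matrix with one row per vertex and one column per coordinate. A minimal sketch on a made-up six-node ring (g, l.circle, and l.diag are illustrative names only):

```r
library(igraph)

g <- make_ring(6)                 # small example graph (hypothetical)
l.circle <- layout_in_circle(g)   # numeric matrix, one row per vertex
dim(l.circle)                     # 6 2

# Any matrix of the same shape works as a layout, e.g. vertices on a diagonal:
l.diag <- cbind(1:vcount(g), 1:vcount(g))
plot(g, layout = l.diag)
```

This is why coordinates computed once can be stored, edited, and reused across several plots.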
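Fruchterman-Reingold is one of the most widely used force-directed layout algorithms.
Force-directed layouts try to produce a good-looking graph in which edges are similar in length and cross each other as little as possible. They simulate the graph as a physical system. Nodes are electrically charged particles that repel each other when they get too close.
The edges act as springs that pull connected nodes together. As a result, nodes are evenly distributed over the plot area, and the layout is intuitive in that nodes sharing more connections sit closer to each other. The disadvantage of these algorithms is that they are rather slow, so they are less often used for graphs larger than about 1000 vertices.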
l <- layout_with_fr(net.bg)
plot(net.bg, layout=l)
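For force-directed layouts, the niter parameter controls the number of iterations to perform. The default is 500. For large graphs you can lower that number to get results faster and check whether they look reasonable.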
l <- layout_with_fr(net.bg, niter=50)
plot(net.bg, layout=l)
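The layout can also take edge weights into account. You can set the weights parameter, which increases the attraction between nodes connected by heavier edges.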
ws <- c(1, rep(100, ecount(net.bg)-1))
lw <- layout_with_fr(net.bg, weights=ws)
plot(net.bg, layout=lw)
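You will also notice that the Fruchterman-Reingold layout is not deterministic: different runs produce slightly different configurations. Saving the layout in l lets us get exactly the same result every time, which helps if you want to plot the time evolution of a graph or different relationships and need the nodes to stay in the same place across plots.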
par(mfrow=c(2,2), mar=c(1,1,1,1))
plot(net.bg, layout=layout_with_fr)
plot(net.bg, layout=layout_with_fr)
plot(net.bg, layout=l)
plot(net.bg, layout=l)
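Besides saving the coordinates, reproducibility can usually be had by fixing the random seed before computing the layout. This is a sketch under the assumption that igraph draws its random starting positions from R's own random number generator, so that set.seed() applies; the graph g is a hypothetical stand-in:

```r
library(igraph)

g <- sample_pa(50)    # hypothetical example graph

set.seed(123)         # fix the seed before the first run
l.a <- layout_with_fr(g)
set.seed(123)         # reset to the same seed before the second run
l.b <- layout_with_fr(g)

isTRUE(all.equal(l.a, l.b))   # the two runs should give the same coordinates
```

Saving the layout matrix remains the safer option when figures must match exactly across R sessions or igraph versions.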
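By default, the plot coordinates are rescaled to the [-1,1] interval for both x and y. You can change that with rescale=FALSE and rescale the plot manually by multiplying the coordinates by a scalar. norm_coords() normalizes the layout to the boundaries you want. This way you can create more compact or more spread-out versions of a layout.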
# Get the layout coordinates:
l <- layout_with_fr(net.bg)
# Normalize them so that they are in the -1, 1 interval:
l <- norm_coords(l, ymin=-1, ymax=1, xmin=-1, xmax=1)
par(mfrow=c(2,2), mar=c(0,0,0,0))
plot(net.bg, rescale=F, layout=l*0.4)
plot(net.bg, rescale=F, layout=l*0.8)
plot(net.bg, rescale=F, layout=l*1.2)
plot(net.bg, rescale=F, layout=l*1.6)
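The effect of norm_coords() is easiest to see on a handful of numbers. The sketch below uses made-up raw coordinates (the matrix raw is hypothetical) and checks the resulting ranges:

```r
library(igraph)

# Hypothetical raw coordinates for four nodes (x in column 1, y in column 2)
raw <- cbind(c(0, 5, 10, 20), c(100, 150, 175, 200))
nrm <- norm_coords(raw, xmin = -1, xmax = 1, ymin = -1, ymax = 1)
range(nrm[, 1])   # -1 1
range(nrm[, 2])   # -1 1

# Multiplying by a scalar then shrinks or stretches the drawing
range(nrm[, 1] * 0.4)   # -0.4 0.4
```

With rescale=FALSE, these scaled coordinates are used as they are, which is what produces the four differently sized plots above.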
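Some layouts have 3D versions, which you can use with the parameter dim=3. As you might expect, a 3D layout returns a matrix with three columns holding the X, Y, and Z coordinates of each node.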
# Some layouts have 3D versions that you can use with parameter 'dim=3'
l <- layout_with_fr(net.bg, dim=3)
plot(net.bg, layout=l)
# Another popular force-directed algorithm that produces nice results for
# connected graphs is Kamada Kawai. Like Fruchterman Reingold, it attempts to
# minimize the energy in a spring system.
l <- layout_with_kk(net.bg)
plot(net.bg, layout=l)
Graphopt is a nice force-directed layout implemented in igraph that uses layering to help with visualizations of large networks.
l <- layout_with_graphopt(net.bg)
plot(net.bg, layout=l)
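The available graphopt parameters can change the mass and electric charge of nodes, as well as the optimal spring length and the spring constant for edges. The parameter names are charge (default 0.001), mass (default 30), spring.length (default 0), and spring.constant (default 1). Tweaking these can lead to considerably different layouts.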
l1 <- layout_with_graphopt(net.bg, charge=0.02)
l2 <- layout_with_graphopt(net.bg, charge=0.00000001)
par(mfrow=c(1,2), mar=c(1,1,1,1))
plot(net.bg, layout=l1)
plot(net.bg, layout=l2)
The LGL algorithm is meant for large connected graphs. Here you can also specify a root: a node that will be placed in the middle of the layout.
plot(net.bg, layout=layout_with_lgl)
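The MDS (multidimensional scaling) algorithm tries to place nodes based on some measure of similarity or distance between them. More similar nodes are plotted closer to each other. By default, the measure is based on the shortest paths between nodes in the network. We can change that by supplying our own distance matrix (however defined) through the dist parameter. MDS layouts are nice because positions and distances have a clear interpretation. The problem with them is visual clarity: nodes often overlap or sit on top of each other.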
plot(net.bg, layout=layout_with_mds)
Let's look at all the layouts available in igraph:
layouts <- grep("^layout_", ls("package:igraph"), value=TRUE)[-1]
# Remove layouts that do not apply to our graph.
layouts <- layouts[!grepl("bipartite|merge|norm|sugiyama|tree", layouts)]
par(mfrow=c(3,5), mar=c(1,1,1,1))
for (layout in layouts) {
print(layout)
l <- do.call(layout, list(net))
plot(net, edge.arrow.mode=0, layout=l, main=layout) }
[1] "layout_as_star"
[1] "layout_components"
[1] "layout_in_circle"
[1] "layout_nicely"
[1] "layout_on_grid"
[1] "layout_on_sphere"
[1] "layout_randomly"
[1] "layout_with_dh"
[1] "layout_with_drl"
[1] "layout_with_fr"
[1] "layout_with_gem"
[1] "layout_with_graphopt"
[1] "layout_with_kk"
[1] "layout_with_lgl"
[1] "layout_with_mds"
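Highlighting aspects of the network
Notice that our network plot is still not very helpful. We can identify the type and size of nodes, but cannot see much of the structure because the links we are examining are so dense. One way to approach this is to sparsify the network, keeping only the most important ties and discarding the rest.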
hist(links$weight)
mean(links$weight)
[1] 12.40816
sd(links$weight)
[1] 9.905635
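There are more sophisticated ways to extract the key edges, but for this exercise we simply keep the ones whose weight is above the network mean. In igraph, we can delete edges with delete_edges(net, edges):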
cut.off <- mean(links$weight)
net.sp <- delete_edges(net, E(net)[weight<cut.off])
plot(net.sp, layout=layout_with_kk)
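The same thresholding step is easy to verify on a graph small enough to check by hand. A minimal sketch with made-up weights (toy, cut.toy, and toy.sp are hypothetical names):

```r
library(igraph)

# Four weighted edges among three nodes (hypothetical data)
toy <- graph_from_data_frame(
  data.frame(from = c("a", "a", "b", "c"),
             to = c("b", "c", "c", "a"),
             weight = c(1, 10, 3, 10)))

cut.toy <- mean(E(toy)$weight)                          # (1 + 10 + 3 + 10) / 4 = 6
toy.sp <- delete_edges(toy, E(toy)[weight < cut.toy])   # drops the edges with weight 1 and 3
ecount(toy.sp)      # 2
E(toy.sp)$weight    # 10 10
vcount(toy.sp)      # 3: deleting edges never removes vertices
```

Note that nodes left without any edges stay in the graph as isolates, which can matter for the layout.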
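Another way to think about this is to plot the two tie types (hyperlinks and mentions) separately. We do that in the section on plotting multiplex networks. We can also make the network map more useful by showing the communities within it: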
# Community detection (by optimizing modularity over partitions):
clp <- cluster_optimal(net)
class(clp)
[1] "communities"
clp$membership
[1] 1 1 1 1 1 2 3 3 3 3 1 4 4 4 1 2 2
# Community detection returns an object of class "communities"
# which igraph knows how to plot:
plot(clp, net)
# We can also plot the communities without relying on their built-in plot:
V(net)$community <- clp$membership
colrs <- adjustcolor( c("gray50", "tomato", "gold", "yellowgreen"), alpha=.6)
plot(net, vertex.color=colrs[V(net)$community])
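Highlighting specific nodes or links
Sometimes we want to focus the visualization on a particular node or group of nodes. In our example media network, we can examine the spread of information from focal actors. For instance, let's represent the distance from the NY Times.
The distances() function returns a matrix of shortest paths from the nodes listed in the v parameter to those included in the to parameter.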
dist.from.NYT <- distances(net, v=V(net)[media=="NY Times"], to=V(net), weights=NA)
# Set colors to plot the distances:
oranges <- colorRampPalette(c("dark red", "gold"))
col <- oranges(max(dist.from.NYT)+1)
col <- col[dist.from.NYT+1]
plot(net, vertex.color=col, vertex.label=dist.from.NYT, edge.arrow.size=.6,
vertex.label.color="white")
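The distance-to-color mapping used above can be traced on a five-node ring, where the distances are easy to work out by hand. A small sketch; ring, d, pal, and node.col are illustrative names:

```r
library(igraph)

ring <- make_ring(5)                                      # undirected cycle 1-2-3-4-5-1
d <- distances(ring, v = 1, to = V(ring), weights = NA)   # hop counts from node 1
d                                                         # 1 x 5 matrix: 0 1 2 2 1

# Map each distance to a color: distance 0 gets the first color, and so on
pal <- colorRampPalette(c("dark red", "gold"))(max(d) + 1)
node.col <- pal[d + 1]
length(node.col)                                          # 5, one color per node
```

Adding 1 to the distances is needed because R vectors are indexed from 1 while the focal node sits at distance 0.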
# We can also highlight paths between the nodes in the network.
# Say here between MSNBC and the New York Post:
news.path <- shortest_paths(net,
from = V(net)[media=="MSNBC"],
to = V(net)[media=="New York Post"],
output = "both") # both path nodes and edges
# Generate edge color variable to plot the path:
ecol <- rep("gray80", ecount(net))
ecol[unlist(news.path$epath)] <- "orange"
# Generate edge width variable to plot the path:
ew <- rep(2, ecount(net))
ew[unlist(news.path$epath)] <- 4
# Generate node color variable to plot the path:
vcol <- rep("gray40", vcount(net))
vcol[unlist(news.path$vpath)] <- "gold"
plot(net, vertex.color=vcol, edge.color=ecol,
edge.width=ew, edge.arrow.mode=0)
We can highlight the edges going into or out of a vertex, for instance the Wall Street Journal. For a single node use incident(); for multiple nodes use incident_edges().
inc.edges <- incident(net, V(net)[media=="Wall Street Journal"], mode="all")
# Set colors to plot the selected edges.
ecol <- rep("gray80", ecount(net))
ecol[inc.edges] <- "orange"
vcol <- rep("grey40", vcount(net))
vcol[V(net)$media=="Wall Street Journal"] <- "gold"
plot(net, vertex.color=vcol, edge.color=ecol)
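We can also point to the immediate neighbors of a vertex, say WSJ. The neighbors() function finds all nodes one step away from the focal actor. To find the neighbors of multiple nodes, use adjacent_vertices() instead of neighbors(). To find node neighborhoods more than one step out, use ego() with the order parameter set to the number of steps away from the focal node(s).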
neigh.nodes <- neighbors(net, V(net)[media=="Wall Street Journal"], mode="out")
# Set colors to plot the neighbors:
vcol[neigh.nodes] <- "#ff9d00"
plot(net, vertex.color=vcol)
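The difference between neighbors() and ego() shows up clearly on a short directed path. A minimal sketch on a made-up four-node graph (the name path is hypothetical):

```r
library(igraph)

path <- make_graph(c(1,2, 2,3, 3,4), directed = TRUE)   # 1 -> 2 -> 3 -> 4

as.integer(neighbors(path, 2, mode = "out"))         # 3
sort(as.integer(neighbors(path, 2, mode = "all")))   # 1 3

# ego() returns a list with one element per focal node,
# and each neighborhood includes the focal node itself
nb2 <- ego(path, order = 2, nodes = 1, mode = "out")[[1]]
sort(as.integer(nb2))                                # 1 2 3
```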
One way to draw attention to a group of nodes (we saw this before with communities) is to "mark" them:
par(mfrow=c(1,2))
# Mark a single group of nodes:
plot(net, mark.groups=c(1,4,5,8), mark.col="#C5E5E7", mark.border=NA)
# Mark multiple groups:
plot(net, mark.groups=list(c(1,4,5,8), c(15:17)),
mark.col=c("#C5E5E7","#ECD89A"), mark.border=NA)
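Interactive plotting with tkplot
R and igraph allow interactive plotting of networks. This can be useful if you want to tweak the layout of a small graph slightly. After adjusting the layout manually, you can get the node coordinates and use them for other plots.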
tkid <- tkplot(net) #tkid is the id of the tkplot
After adjusting the layout by hand, extract the coordinates:
l <- tkplot.getcoords(tkid) # grab the coordinates from tkplot
plot(net, layout=l)
Plotting two-mode networks
As you may remember, our second media example is a two-mode network examining links between news sources and their consumers.
head(nodes2)
id media media.type media.name audience.size
1 s01 NYT 1 Newspaper 20
2 s02 WaPo 1 Newspaper 25
3 s03 WSJ 1 Newspaper 30
4 s04 USAT 1 Newspaper 32
5 s05 LATimes 1 Newspaper 20
6 s06 CNN 2 TV 56
head(links2)
U01 U02 U03 U04 U05 U06 U07 U08 U09 U10 U11 U12 U13 U14
s01 1 1 1 0 0 0 0 0 0 0 0 0 0 0
s02 0 0 0 1 1 0 0 0 0 0 0 0 0 0
s03 0 0 0 0 0 1 1 1 1 0 0 0 0 0
s04 0 0 0 0 0 0 0 0 1 1 1 0 0 0
s05 0 0 0 0 0 0 0 0 0 0 1 1 1 0
s06 0 0 0 0 0 0 0 0 0 0 0 0 1 1
U15 U16 U17 U18 U19 U20
s01 0 0 0 0 0 0
s02 0 0 0 0 0 1
s03 0 0 0 0 0 0
s04 0 0 0 0 0 0
s05 0 0 0 0 0 0
s06 0 0 1 0 0 0
net2
IGRAPH 8d46ba2 UN-B 30 31 --
+ attr: type (v/l), name (v/c)
+ edges from 8d46ba2 (vertex names):
[1] s01--U01 s01--U02 s01--U03 s02--U04 s02--U05 s02--U20
[7] s03--U06 s03--U07 s03--U08 s03--U09 s04--U09 s04--U10
[13] s04--U11 s05--U11 s05--U12 s05--U13 s06--U13 s06--U14
[19] s06--U17 s07--U14 s07--U15 s07--U16 s08--U16 s08--U17
[25] s08--U18 s08--U19 s09--U06 s09--U19 s09--U20 s10--U01
[31] s10--U11
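plot(net2)
As with one-mode networks, we can modify the network object to include the visual properties that will be used by default when plotting. Note that this time we also change the shape of the nodes: media outlets will be squares and their users will be circles.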
# This time we will make nodes look different based on their type.
# Media outlets are blue squares, audience nodes are orange circles:
V(net2)$color <- c("steel blue", "orange")[V(net2)$type+1]
V(net2)$shape <- c("square", "circle")[V(net2)$type+1]
# Media outlets will have name labels, audience members will not:
V(net2)$label <- ""
V(net2)$label[V(net2)$type==F] <- nodes2$media[V(net2)$type==F]
V(net2)$label.cex=.6
V(net2)$label.font=2
plot(net2, vertex.label.color="white", vertex.size=(2-V(net2)$type)*8)
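igraph also has a special layout for bipartite networks (though it doesn't always work great, and you may be better off generating your own two-mode layout).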
plot(net2, vertex.label=NA, vertex.size=7, layout=layout_as_bipartite)
par(mar=c(0,0,0,0))
plot(net2, vertex.shape="none", vertex.label=nodes2$media,
vertex.label.color=V(net2)$color, vertex.label.font=2,
vertex.label.cex=.95, edge.color="gray70", edge.width=2)
You can also use your own icons as nodes:
library("png")
img.1 <- readPNG("./Data files/images/news.png")
img.2 <- readPNG("./Data files/images/user.png")
V(net2)$raster <- list(img.1, img.2)[V(net2)$type+1]
par(mar=c(3,3,3,3))
plot(net2, vertex.shape="raster", vertex.label=NA,
vertex.size=16, vertex.size2=16, edge.width=2)
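By the way, we can also add any image we like to a plot. For example, many network graphs can be greatly improved by a photo of a puppy in a teacup, which is admittedly a bit cheeky.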
img.3 <- readPNG("./Data files/images/puppy.png")
rasterImage(img.3, xleft=-1.4, xright=-0.4, ybottom=-1.4, ytop=-0.2)
We can also generate and plot bipartite projections for the two-mode network. Co-memberships are easy to calculate by multiplying the network matrix by its transpose, or by using igraph's bipartite.projection() function.
net2.bp <- bipartite.projection(net2)
# We can calculate the projections manually as well:
# as_incidence_matrix(net2) %*% t(as_incidence_matrix(net2))
# t(as_incidence_matrix(net2)) %*% as_incidence_matrix(net2)
par(mfrow=c(1,2))
plot(net2.bp$proj1, vertex.label.color="black", vertex.label.dist=1,
vertex.label=nodes2$media[!is.na(nodes2$media.type)])
plot(net2.bp$proj2, vertex.label.color="black", vertex.label.dist=1,
vertex.label=nodes2$media[ is.na(nodes2$media.type)])
dev.off()
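The matrix route to the projections can be checked by hand on a tiny case. The sketch below uses a made-up incidence matrix with two outlets and three users (inc, outlets, and users are hypothetical names) and base R only:

```r
# Hypothetical 2 x 3 incidence matrix: rows are outlets, columns are users
inc <- matrix(c(1, 1, 0,
                0, 1, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("m1", "m2"), c("u1", "u2", "u3")))

outlets <- inc %*% t(inc)   # 2 x 2: number of users shared by each pair of outlets
users   <- t(inc) %*% inc   # 3 x 3: number of outlets shared by each pair of users

outlets["m1", "m2"]   # 1 (both outlets reach u2)
outlets["m1", "m1"]   # 2 (the diagonal counts the users of m1)
users["u1", "u3"]     # 0 (no outlet in common)
```

The off-diagonal entries are the tie weights of the projection; the diagonal simply counts each node's own links and is usually ignored.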
Plotting multiplex networks
library("igraph")
E(net)$width <- 2
plot(net, edge.color=c("dark red", "slategrey")[(E(net)$type=="hyperlink")+1],
vertex.color="gray40", layout=layout_in_circle, edge.curved=.3)
# Another way to delete edges using the minus operator:
net.m <- net - E(net)[E(net)$type=="hyperlink"]
net.h <- net - E(net)[E(net)$type=="mention"]
# Plot the two links separately:
par(mfrow=c(1,2))
plot(net.h, vertex.color="orange", layout=layout_with_fr, main="Tie: Hyperlink")
plot(net.m, vertex.color="lightsteelblue2", layout=layout_with_fr, main="Tie: Mention")
dev.off()
# Make sure the nodes stay in the same place in both plots:
par(mfrow=c(1,2),mar=c(1,1,4,1))
l <- layout_with_fr(net)
plot(net.h, vertex.color="orange", layout=l, main="Tie: Hyperlink")
plot(net.m, vertex.color="lightsteelblue2", layout=l, main="Tie: Mention")
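In our example network, there happen to be no node pairs connected by more than one tie type. That is, we never have both a "hyperlink" and a "mention" tie between the same two news outlets. This can easily happen in a multiplex network, however.
One challenge in visualizing multigraphs is that multiple edges between the same two nodes may be drawn on top of each other so that they cannot be told apart. For example, let's generate a very simple multiplex network with two nodes and three ties between them: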
multigtr <- graph( edges=c(1,2, 1,2, 1,2), n=2 )
l <- layout_with_kk(multigtr)
# Let's just plot the graph:
plot(multigtr, vertex.color="lightsteelblue", vertex.frame.color="white",
vertex.size=40, vertex.shape="circle", vertex.label=NA,
edge.color=c("gold", "tomato", "yellowgreen"), edge.width=10,
edge.arrow.size=5, edge.curved=0.1, layout=l)
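Because all edges in the graph have the same curvature, they are drawn over each other and we see only one of them. What we can do is assign each edge a different curvature. A handy igraph function called curve_multiple() helps here. For a graph G, curve_multiple(G) generates a curvature for each edge that maximizes visibility.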
plot(multigtr, vertex.color="lightsteelblue", vertex.frame.color="white",
vertex.size=40, vertex.shape="circle", vertex.label=NA,
edge.color=c("gold", "tomato", "yellowgreen"), edge.width=10,
edge.arrow.size=5, edge.curved=curve_multiple(multigtr), layout=l)
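To see what curve_multiple() hands to edge.curved, its return value can be inspected directly. A small sketch on a made-up two-node multigraph (multi and curv are illustrative names):

```r
library(igraph)

multi <- make_graph(c(1,2, 1,2, 1,2), n = 2)   # three parallel edges from node 1 to node 2
curv <- curve_multiple(multi)
curv                    # one curvature value per edge
length(curv)            # 3
length(unique(curv))    # 3, so no two parallel edges share a curvature
```

Because the result is an ordinary numeric vector, it can also be scaled or edited by hand before being passed to plot().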
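Beyond igraph: Statnet, ggraph, and simple charts
The igraph package is only one of many network visualization options in R. This section gives a few quick examples of other available approaches to static network visualization.
Plotting with the network package is very similar to plotting with igraph, although the notation is slightly different (a whole new set of parameter names!). The package also relies less on defaults set by modifying the network object and more on explicit parameters passed to the plotting function.
Here is a quick example using the (by now familiar) media network. We begin by converting the data into the network format used by the Statnet family of packages (including network, sna, ergm, stergm, and others).
As in igraph, we can generate a "network" object from an edge list, an adjacency matrix, or an incidence matrix. See ?edgeset.constructors for the specifics. Here we use the edge list and the node attribute data frame to create the network object. Pay special attention to the ignore.eval parameter. It is TRUE by default, and that setting causes the network object to disregard edge weights.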
library("network")
head(links)
from to type weight
1 s01 s02 hyperlink 22
2 s01 s03 hyperlink 22
3 s01 s04 hyperlink 21
4 s01 s15 mention 20
5 s02 s01 hyperlink 23
6 s02 s03 hyperlink 21
head(nodes)
id media media.type type.label audience.size
1 s01 NY Times 1 Newspaper 20
2 s02 Washington Post 1 Newspaper 25
3 s03 Wall Street Journal 1 Newspaper 30
4 s04 USA Today 1 Newspaper 32
5 s05 LA Times 1 Newspaper 20
6 s06 New York Post 1 Newspaper 50
# Remember to set the ignore.eval to F for weighted networks.
net3 <- network(links, vertex.attr=nodes, matrix.type="edgelist",
loops=F, multiple=F, ignore.eval = F)
net3
Network attributes:
vertices = 17
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 49
missing edges= 0
non-missing edges= 49
Vertex attribute names:
audience.size id media media.type type.label vertex.names
Edge attribute names:
type weight
net3[,]
s01 s02 s03 s04 s05 s06 s07 s08 s09 s10 s11 s12 s13 s14
s01 0 1 1 1 0 0 0 0 0 0 0 0 0 0
s02 1 0 1 0 0 0 0 0 1 1 0 0 0 0
s03 1 0 0 1 1 0 0 1 0 1 1 1 0 0
s04 0 0 1 0 0 1 0 0 0 0 1 1 0 0
s05 1 1 0 0 0 0 0 0 1 0 0 0 0 0
s06 0 0 0 0 0 1 0 0 0 0 0 0 0 0
s07 0 0 1 0 0 0 0 1 0 1 0 0 0 1
s08 0 0 1 0 0 0 1 0 1 0 0 0 0 0
s09 0 0 0 0 0 0 0 0 0 1 0 0 0 0
s10 0 0 1 0 0 0 0 0 0 0 0 0 0 0
s11 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s12 0 0 0 0 0 1 0 0 0 0 0 0 1 1
s13 0 0 0 0 0 0 0 0 0 0 0 1 0 0
s14 0 0 0 0 0 0 0 0 0 0 1 0 1 0
s15 1 0 0 1 0 1 0 0 0 0 0 0 0 0
s16 0 0 0 0 0 1 0 0 0 0 0 0 0 0
s17 0 0 0 1 0 0 0 0 0 0 0 0 0 0
s15 s16 s17
s01 1 0 0
s02 0 0 0
s03 0 0 0
s04 0 0 1
s05 1 0 0
s06 0 1 1
s07 0 0 0
s08 0 0 0
s09 0 0 0
s10 0 0 0
s11 0 0 0
s12 0 0 0
s13 0 0 1
s14 0 0 0
s15 0 0 0
s16 0 0 1
s17 0 0 0
net3 %n% "net.name" <- "Media Network" # network attribute
net3 %v% "media" # Node attribute
[1] "NY Times" "Washington Post"
[3] "Wall Street Journal" "USA Today"
[5] "LA Times" "New York Post"
[7] "CNN" "MSNBC"
[9] "FOX News" "ABC"
[11] "BBC" "Yahoo News"
[13] "Google News" "Reuters.com"
[15] "NYTimes.com" "WashingtonPost.com"
[17] "AOL.com"
net3 %e% "type" # Edge attribute
[1] "hyperlink" "hyperlink" "hyperlink" "mention"
[5] "hyperlink" "hyperlink" "hyperlink" "hyperlink"
[9] "hyperlink" "hyperlink" "hyperlink" "hyperlink"
[13] "mention" "hyperlink" "hyperlink" "hyperlink"
[17] "mention" "mention" "hyperlink" "mention"
[21] "mention" "hyperlink" "hyperlink" "mention"
[25] "hyperlink" "hyperlink" "mention" "mention"
[29] "mention" "hyperlink" "mention" "hyperlink"
[33] "mention" "mention" "mention" "hyperlink"
[37] "mention" "hyperlink" "mention" "hyperlink"
[41] "mention" "mention" "mention" "hyperlink"
[45] "hyperlink" "hyperlink" "hyperlink" "mention"
[49] "hyperlink"
net3 %v% "col" <- c("gray70", "tomato", "gold")[net3 %v% "media.type"]
# plot the network:
plot(net3, vertex.cex=(net3 %v% "audience.size")/7, vertex.col="col")
Note that plot() here returns the node position coordinates. You can reuse them in other plots through the coord parameter.
l <- plot(net3, vertex.cex=(net3 %v% "audience.size")/7, vertex.col="col")
plot(net3, vertex.cex=(net3 %v% "audience.size")/7, vertex.col="col", coord=l)
The network package also offers the option to edit a plot interactively by setting the parameter interactive=T:
plot(net3, vertex.cex=(net3 %v% "audience.size")/7, vertex.col="col", interactive=T)
ggraph
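The ggplot2 package and its extensions are known for offering a meaningful, structured, and advanced way to visualize data in R. In ggplot2 you choose from a variety of visual building blocks and add them to the graphic one by one, a layer at a time.
The ggraph package takes this principle and extends it to network data. In this section we cover only the basics, without a detailed overview of the grammar of graphics approach. For a deeper look, it is best to get familiar with ggplot2 first and then learn the specifics of ggraph.
One piece of good news is that we can use igraph objects directly with the ggraph package. The following code gets the data and adds separate layers for nodes and links.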
library(ggraph)
library(igraph)
# We can use our 'net' igraph object directly with the 'ggraph' package.
# The following code gets the data and adds layers for nodes and links.
ggraph(net) +
geom_edge_link() + # add edges to the plot
geom_node_point() # add nodes to the plot
Here you will also recognize some familiar network layouts from igraph plotting: 'star', 'circle', 'grid', 'sphere', 'kk', 'fr', 'mds', 'lgl', etc.
ggraph(net, layout="lgl") +
geom_edge_link() +
ggtitle("Look ma, no nodes!") # add title to the plot
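Here we can use geom_edge_link() for straight edges, geom_edge_arc() for curved ones, and geom_edge_fan() when we want to make sure any overlapping multiplex edges are fanned out.
As in other packages, we can set visual properties of the network plot through key function parameters. For instance, nodes have color, fill, shape, size, and stroke. Edges have color, width, and linetype. Here too the alpha parameter controls transparency.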
ggraph(net, layout="lgl") +
geom_edge_fan(color="gray50", width=0.8, alpha=0.5) +
geom_node_point(color=V(net)$color, size=8) +
theme_minimal()
As in ggplot2, we can add different themes to the plot. For a cleaner look, you can use a minimal or empty theme with theme_minimal() or theme_void().
ggraph(net, layout = 'linear') +
geom_edge_arc(color = "orange", width=0.7) +
geom_node_point(size=5, color="gray50") +
theme_void()
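The ggraph package also uses the traditional ggplot2 way of mapping aesthetics: that is, specifying which elements of the data should correspond to different visual properties of the graphic. This is done with the aes() function, which matches visual parameters with attribute names from the data. In the code below, the edge attribute type and the node attribute audience.size are taken from the data contained in the igraph object.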
ggraph(net, layout="lgl") +
geom_edge_link(aes(color = type)) + # colors by edge type
geom_node_point(aes(size = audience.size)) + # size by audience size
theme_void()
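To see aesthetic mapping in isolation, here is a minimal sketch on a made-up four-node graph. The data frames, the attribute names `kind` and `reach`, and the object names `toy` and `p` are all hypothetical stand-ins for `type` and `audience.size` in the media network.

```r
library(igraph)
library(ggraph)

# Hypothetical node and edge tables; 'reach' and 'kind' are invented attributes
toy_nodes <- data.frame(name  = c("a", "b", "c", "d"),
                        reach = c(10, 20, 30, 40))
toy_edges <- data.frame(from = c("a", "a", "b", "c"),
                        to   = c("b", "c", "d", "d"),
                        kind = c("hyperlink", "mention", "mention", "hyperlink"))

# Extra columns become edge and vertex attributes of the igraph object
toy <- graph_from_data_frame(toy_edges, vertices = toy_nodes, directed = TRUE)

# Inside aes(), bare names refer to those attributes
p <- ggraph(toy, layout = "kk") +
  geom_edge_link(aes(color = kind)) +   # edge color mapped to 'kind'
  geom_node_point(aes(size = reach)) +  # node size mapped to 'reach'
  theme_void()
print(p)
```

Values set outside aes(), such as color="gray50", are constants applied to every element, while values inside aes() are looked up per node or per edge and get a legend.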
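One great thing about ggplot2 and ggraph that you can see above is that they automatically generate a legend, which makes plots easier to interpret.
We can add a layer with node labels using geom_node_text() or geom_node_label(), which correspond to the similarly named functions in ggplot2.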
ggraph(net, layout = 'lgl') +
geom_edge_arc(color="gray", strength=0.3) + # 'strength' replaces 'curvature', which is deprecated in ggraph >= 2.0
geom_node_point(color="orange", aes(size = audience.size)) +
geom_node_text(aes(label = media), color="gray50", repel=T) +
theme_void()
Other approaches
# First, we'll extract a matrix from our igraph network object.
netm <- as_adjacency_matrix(net, attr="weight", sparse=F)
colnames(netm) <- V(net)$media
rownames(netm) <- V(net)$media
# Generate a color palette to use in the heatmap:
palf <- colorRampPalette(c("gold", "dark orange"))
# The Rowv & Colv parameters turn dendrograms on and off
heatmap(netm[,17:1], Rowv = NA, Colv = NA, col = palf(20),
scale="none", margins=c(10,10) )
# Degree distribution
deg.dist <- degree_distribution(net, cumulative=T, mode="all")
plot( x=0:max(degree(net)), y=1-deg.dist, pch=19, cex=1.4, col="orange",
xlab="Degree", ylab="Cumulative Frequency")
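Interactive networks
If you have already installed "ndtv", you should also have a package it uses called "animation". If not, now is the time to install it with install.packages('animation'). Note that this package provides a simple technique for creating various (not necessarily network-related) animations in R. It works by generating multiple plots and combining them into an animated GIF.
The catch here is that for this to work you need not only the R package but also an additional piece of software called ImageMagick (imagemagick.org). You probably don't want to install it during the workshop, but you can try it at home.
The good news is that once you have figured this out, you can turn any series of R plots (network or otherwise!) into an animated GIF.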
library("animation")
library("igraph")
# In order for this to work, you need not only the R package, but also
# an additional software called ImageMagick from imagemagick.org
# If you don't already have it, skip this part of the tutorial for now.
ani.options("convert") # Check that the package knows where to find ImageMagick
ani.options(convert="C:/Progra~1/ImageMagick-7.0.6-Q16/convert.exe")
# You can use this technique to create various (not necessarily network-related)
# animations in R by generating multiple plots and combining them in an animated GIF.
l <- layout_with_lgl(net)
saveGIF( { col <- rep("grey40", vcount(net))
plot(net, vertex.color=col, layout=l)
step.1 <- V(net)[media=="Wall Street Journal"]
col[step.1] <- "#ff5100"
plot(net, vertex.color=col, layout=l)
step.2 <- unlist(neighborhood(net, 1, step.1, mode="out"))
col[setdiff(step.2, step.1)] <- "#ff9d00"
plot(net, vertex.color=col, layout=l)
step.3 <- unlist(neighborhood(net, 2, step.1, mode="out"))
col[setdiff(step.3, step.2)] <- "#FFDD1F"
plot(net, vertex.color=col, layout=l) },
interval = .8, movie.name="network_animation.gif" )
detach("package:igraph")
detach("package:animation")
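Interactive JS visualization with visNetwork
These days it is fairly easy to export R plots to HTML/JavaScript output. There are a number of packages, such as rCharts and htmlwidgets, that help you create interactive web charts right from R. One thing to keep in mind, though, is that network visualizations created this way are most useful as a starting point for further work. If you know a little JavaScript, you can use them as a first step and tweak the results to get closer to what you want.
Here we take a quick look at visNetwork, which generates interactive network visualizations using the vis.js JavaScript library. You can install the package with install.packages('visNetwork').
We can visualize our media network right away: visNetwork() will accept our node and link data frames. As usual, the node data frame needs an id column, and the link data needs from and to columns denoting the start and end of each tie.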
library("visNetwork")
head(nodes)
head(links)
# We can visualize the network right away - visNetwork() will accept
# our node and link data frames (it needs node data with an 'id' column,
# and edge data with 'from' and 'to' columns).
visNetwork(nodes, links)
# We can set the height and width of the visNetwork() window
# with parameters 'height' and 'width', the back color with 'background',
# the title, subtitle, and footer with 'main', 'submain', and 'footer'
visNetwork(nodes, links, height="600px", width="100%", background="#eeefff",
main="Network", submain="And what a great network it is!",
footer= "Hyperlinks and mentions among media sources")
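The column requirements are easiest to see on a toy example. The sketch below uses invented data frames (`toy_nodes`, `toy_edges`) rather than the media data; only the column names id, from, and to are required by visNetwork().

```r
library(visNetwork)

# Hypothetical minimal inputs: nodes need 'id', edges need 'from' and 'to'.
# 'label' is optional and is shown as the node text.
toy_nodes <- data.frame(id = 1:3, label = c("A", "B", "C"))
toy_edges <- data.frame(from = c(1, 1, 2), to = c(2, 3, 3))

# The result is an htmlwidget; printing it opens the interactive view
toy_vis <- visNetwork(toy_nodes, toy_edges, width = "100%")
toy_vis
```

Values in from and to must match values in the id column. Any extra columns in either data frame (shape, color, width, and so on) are read as per-node or per-edge visual settings.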
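In the code below we change some of the visual parameters of the nodes. We start with the node shape (the available options include ellipse, circle, database, box, text, image, circularImage, diamond, dot, star, triangle, triangleDown, square, and icon). We also change the color of several node elements. In this package, background controls the node color and border changes the frame color; highlight sets the color on click, and hover sets the color on mouseover.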
# We'll start by adding new node and edge attributes to our dataframes.
vis.nodes <- nodes
vis.links <- links
# The options for node shape include 'ellipse', 'circle',
# 'database', 'box', 'text', 'image', 'circularImage', 'diamond',
# 'dot', 'star', 'triangle', 'triangleDown', 'square', and 'icon'
vis.nodes$shape <- "dot"
vis.nodes$shadow <- TRUE # Nodes will drop shadow
vis.nodes$title <- vis.nodes$media # Text on click
vis.nodes$label <- vis.nodes$type.label # Node label
vis.nodes$size <- vis.nodes$audience.size # Node size
vis.nodes$borderWidth <- 2 # Node border width
# We can set the color for several elements of the nodes:
# "background" changes the node color, "border" changes the frame color;
# "highlight" sets the color on click, "hover" sets the color on mouseover.
vis.nodes$color.background <- c("slategrey", "tomato", "gold")[nodes$media.type]
vis.nodes$color.border <- "black"
vis.nodes$color.highlight.background <- "orange"
vis.nodes$color.highlight.border <- "darkred"
visNetwork(vis.nodes, vis.links)
# Below we change some of the visual properties of the edges:
vis.links$width <- 1+links$weight/8 # line width
vis.links$color <- "gray" # line color
vis.links$arrows <- "middle" # arrows: 'from', 'to', or 'middle'
vis.links$smooth <- FALSE # should the edges be curved?
vis.links$shadow <- FALSE # edge shadow
visNetwork(vis.nodes, vis.links)
vis.links$arrows <- ""
vis.links$width <- 1
visnet <- visNetwork(vis.nodes, vis.links)
visnet
# We can also set the visualization options directly with visNodes() and visEdges()
visnet2 <- visNetwork(nodes, links)
visnet2 <- visNodes(visnet2, shape = "square", shadow = TRUE,
color=list(background="gray", highlight="orange", border="black"))
visnet2 <- visEdges(visnet2, color=list(color="black", highlight = "orange"),
smooth = FALSE, width=2, dashes= TRUE, arrows = 'middle' )
visnet2
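visNetwork offers a number of other options in the visOptions() function. For instance, we can highlight all neighbors of the selected node (highlightNearest), or add a drop-down menu to select a subset of nodes (selectedBy). The subsets are based on a column from our data; here we use the type label.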
visOptions(visnet, highlightNearest = TRUE, selectedBy = "type.label")
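visNetwork can also work with predefined groups of nodes. The visual characteristics of the nodes belonging to each group can be set with visGroups(). We can add an automatically generated group legend with visLegend().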
nodes$group <- nodes$type.label
visnet3 <- visNetwork(nodes, links)
visnet3 <- visGroups(visnet3, groupname = "Newspaper", shape = "square",
color = list(background = "gray", border="black"))
visnet3 <- visGroups(visnet3, groupname = "TV", shape = "dot",
color = list(background = "tomato", border="black"))
visnet3 <- visGroups(visnet3, groupname = "Online", shape = "diamond",
color = list(background = "orange", border="black"))
visLegend(visnet3, main="Legend", position="right", ncol=1)
threejs
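Another good package for exporting networks from R to JavaScript is threejs, which generates interactive network visualizations using the three.js JavaScript library and the htmlwidgets R package. One nice thing about threejs is that it can read igraph objects directly.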
# install.packages('threejs')
library(threejs)
library(htmlwidgets)
library(igraph)
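The main network plotting function here, graphjs, takes an igraph object. We can use our initial net object with one slight modification: we delete its graph layout and let threejs generate one on its own. We cheated a bit earlier by assigning a function to the layout attribute of the igraph object rather than giving it a table of node coordinates. That is fine in igraph, but threejs will not let us do it.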
net.js <- net
graph_attr(net.js, "layout") <- NULL
gjs <- graphjs(net.js, main="Network!", bg="gray10", showLabels=F, stroke=F,
curvature=0.1, attraction=0.9, repulsion=0.8, opacity=0.9)
print(gjs)
saveWidget(gjs, file="Media-Network-gjs.html")
browseURL("Media-Network-gjs.html")
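Once we open the resulting visualization in the browser, we can use the mouse scroll wheel to zoom in and out, the left mouse button to rotate the network, and the right mouse button to pan.
We can also create simple animations by supplying lists of layouts, vertex colors, and edge colors that switch at each step.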
gjs.an <- graphjs(net.js, bg="gray10", showLabels=F, stroke=F,
layout=list(layout_randomly(net.js, dim=3),
layout_with_fr(net.js, dim=3),
layout_with_drl(net.js, dim=3),
layout_on_sphere(net.js)),
vertex.color=list(V(net.js)$color, "gray", "orange", V(net.js)$color),
main=list("Random Layout", "Fruchterman-Reingold", "DrL layout", "Sphere" ) )
print(gjs.an)
saveWidget(gjs.an, file="Media-Network-gjs-an.html")
browseURL("Media-Network-gjs-an.html")
networkD3
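We will also take a quick look at networkD3 which, as its name suggests, generates interactive network visualizations using the D3 JavaScript library. If you do not have the networkD3 library, install it with install.packages("networkD3").
The data this library needs from us is in the standard edge list form, with a few small tweaks. For things to work, the node IDs have to be numeric, and they also have to start from 0. An easy way to get there is to convert our character IDs to a factor variable, convert that to numeric, and subtract 1 so that it starts from 0.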
#install.packages("networkD3")
library(networkD3)
# forceNetwork() expects node IDs that are numeric and start from 0,
# so we have to transform our character node IDs. Both columns use the
# same factor levels (the order of nodes$id) so they share one numbering:
links.d3 <- data.frame(from=as.numeric(factor(links$from, levels=nodes$id))-1,
to=as.numeric(factor(links$to, levels=nodes$id))-1 )
# The nodes need to be in the same order as the zero-based IDs used in links:
nodes.d3 <- cbind(idn=factor(nodes$media, levels=nodes$media), nodes)
# The `Group` parameter is used to color the nodes.
# Nodesize is not (as you might think) the size of the node, but the
# number of the column in the node data that should be used for sizing.
# The `charge` parameter guides node repulsion (if negative) or
# attraction (if positive).
forceNetwork(Links = links.d3, Nodes = nodes.d3, Source="from", Target="to",
NodeID = "idn", Group = "type.label",linkWidth = 1,
linkColour = "#afafaf", fontSize=12, zoom=T, legend=T,
Nodesize=6, opacity = 1, charge=-600,
width = 600, height = 600)
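The zero-based conversion can be checked on a tiny made-up example in base R. The IDs and object names below are hypothetical. The sketch also shows why it may be safer to give both columns the same factor levels: when from and to are converted independently, the same number can end up referring to different nodes.

```r
# Hypothetical node IDs and links; "s02" never appears as a source
toy_ids   <- c("s01", "s02", "s03")
toy_links <- data.frame(from = c("s01", "s03"),
                        to   = c("s02", "s02"),
                        stringsAsFactors = FALSE)

# Shared levels: every ID maps to its position in 'toy_ids', minus 1
from_idx <- as.numeric(factor(toy_links$from, levels = toy_ids)) - 1
to_idx   <- as.numeric(factor(toy_links$to,   levels = toy_ids)) - 1
from_idx  # 0 2
to_idx    # 1 1

# Independent levels: "s03" becomes 1 here, which collides with the
# index that "s02" has in the node table
naive_from <- as.numeric(factor(toy_links$from)) - 1
naive_from  # 0 1
```

With shared levels, index k always refers to row k+1 of the node table, which is what the Nodes argument of forceNetwork() relies on.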
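Dynamic network visualization with ndtv-d3
Here we will create D3 visualizations with the ndtv package. You do not need additional software to make web animations with ndtv. If you want to save the animations as video files (see ?saveVideo), you have to install a video converter called FFmpeg (http://ffmpeg.org). To find out how to get the right installation for your OS, check ?install.ffmpeg. To use all available layouts, you also need to have Java installed on your machine.
Most of the parameters below are self-explanatory at this point (bg is the background color of the plot). Two new parameters we have not used before are vertex.tooltip and edge.tooltip. These contain the information we see when moving the mouse over network elements. Note that the tooltip parameters accept HTML tags; for example, we will use the line break tag <br>. The parameter launchBrowser tells R whether to open the resulting visualization file (filename) in the browser.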
# install.packages('ndtv', dependencies=T)
library('ndtv')
net3
par(mar=c(0,0,0,0))
render.d3movie(net3, usearrows = F, displaylabels = F, bg="#111111",
vertex.border="#ffffff", vertex.col = net3 %v% "col",
vertex.cex = (net3 %v% "audience.size")/8,
edge.lwd = (net3 %e% "weight")/3, edge.col = '#55555599',
vertex.tooltip = paste("<b>Name:</b>", (net3 %v% 'media') , "<br>",
"<b>Type:</b>", (net3 %v% 'type.label')),
edge.tooltip = paste("<b>Edge type:</b>", (net3 %e% 'type'), "<br>",
"<b>Edge weight:</b>", (net3 %e% "weight" ) ),
launchBrowser=F, filename="Media-Network.html" )
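The tooltip arguments above are ordinary character vectors, one HTML string per vertex or edge, and paste() builds them element by element. A small base R sketch with made-up names and types (not read from net3) shows what such a vector looks like:

```r
# Hypothetical vertex attributes standing in for net3 %v% 'media'
# and net3 %v% 'type.label'
toy_media <- c("CNN", "BBC")
toy_type  <- c("TV", "TV")

# paste() is vectorized: the fixed HTML fragments are recycled and
# combined with each element, separated by single spaces
tip <- paste("<b>Name:</b>", toy_media, "<br>",
             "<b>Type:</b>", toy_type)
tip[1]  # "<b>Name:</b> CNN <br> <b>Type:</b> TV"
length(tip)  # 2
```

Because the result has one entry per vertex, it lines up with the vertex order of the network object, so each node gets its own tooltip.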
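The later part of the article consists of several interactive examples; the more interactive the output, the more it demands of the device. So, is interactivity the future of network visualization?
Overlaying networks on geographic maps
The example in this section uses only base R and the maps package. If you have experience with ggplot2, that package does offer a more versatile way of approaching this task. Code using ggplot() would be similar to what you see below, but you would use borders() to draw the map and geom_path() to draw the edges.
To plot on a map, we need a few more packages. As you will see below, maps lets us generate a geographic map to use as a background, and geosphere lets us generate the arcs representing our network edges. If you do not have them yet, install both packages and then load them.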
#install.packages("maps")
#install.packages("geosphere")
library("maps")
library("geosphere")
# Package 'maps' has built-in maps it can plot for you. For example:
# ('col' is map fill, 'border' is border color, 'bg' is background color)
par(mfrow = c(2,2))
map("usa", col="tomato", border="gray10", fill=TRUE, bg="gray30")
map("state", col="orange", border="gray10", fill=TRUE, bg="gray30")
map("county", col="palegreen", border="gray10", fill=TRUE, bg="gray30")
map("world", col="skyblue", border="gray10", fill=TRUE, bg="gray30")
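The data we use here contains US airports and flights among them. The airport file includes geographic coordinates: latitude and longitude. If you do not have those in your data, you can use the geocode() function from package ggmap to obtain the latitude and longitude of an address.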
airports <- read.csv("./Data Files/Dataset3-Airlines-NODES.csv", header=TRUE)
flights <- read.csv("./Data Files/Dataset3-Airlines-EDGES.csv", header=TRUE, as.is=TRUE)
head(flights)
Source Target Freq
1 0 109 10
2 1 36 10
3 1 61 10
4 2 152 10
5 3 104 10
6 4 132 10
> head(airports)
ID Label Code City latitude longitude
1 0 Adams Field Airport LIT Little Rock, AR 34.72944 -92.22444
2 1 Akron/canton Regional CAK Akron/Canton, OH 40.91611 -81.44222
3 2 Albany International ALB Albany 42.73333 -73.80000
4 3 Albemarle CHO Charlottesville 38.13333 -78.45000
5 4 Albuquerque International ABQ Albuquerque 35.04028 -106.60917
6 5 Alexandria International AEX Alexandria, LA 31.32750 -92.54861
ToFly Visits
1 0 105
2 0 123
3 0 129
4 1 114
5 0 105
6 0 93
# Select only large airports: ones with more than 10 connections in the data.
tab <- table(flights$Source)
big.id <- names(tab)[tab>10]
airports <- airports[airports$ID %in% big.id,]
flights <- flights[flights$Source %in% big.id &
flights$Target %in% big.id, ]
# Plot a map of the united states:
map("state", col="grey20", fill=TRUE, bg="black", lwd=0.1)
# Add a point on the map for each airport:
points(x=airports$longitude, y=airports$latitude, pch=19,
cex=airports$Visits/80, col="orange")
# Generate edge colors: lighter color means higher flight volume.
col.1 <- adjustcolor("orange red", alpha=0.4)
col.2 <- adjustcolor("orange", alpha=0.4)
edge.pal <- colorRampPalette(c(col.1, col.2), alpha = TRUE)
edge.col <- edge.pal(100)
# For each flight, we will generate the coordinates of an arc that connects
# its start and end point, using gcIntermediate() from package 'geosphere'.
# Then we will plot that arc over the map using lines().
for(i in 1:nrow(flights)) {
node1 <- airports[airports$ID == flights[i,]$Source,]
node2 <- airports[airports$ID == flights[i,]$Target,]
arc <- gcIntermediate( c(node1[1,]$longitude, node1[1,]$latitude),
c(node2[1,]$longitude, node2[1,]$latitude),
n=1000, addStartEnd=TRUE )
edge.ind <- round(100*flights[i,]$Freq / max(flights$Freq))
lines(arc, col=edge.col[edge.ind], lwd=edge.ind/30)
}
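The step that turns a flight frequency into a palette index can be examined on its own in base R. The frequencies below are invented for illustration. The pmax() guard is an addition that is not in the loop above: it assumes you want very rare routes to fall back to the first palette color rather than produce index 0, which selects no color at all.

```r
# Hypothetical flight frequencies, including one very rare route
freq <- c(10, 55, 250, 1)

# Same kind of 100-step semi-transparent palette as in the map code
col.1 <- adjustcolor("orangered", alpha.f = 0.4)
col.2 <- adjustcolor("orange", alpha.f = 0.4)
edge.pal <- colorRampPalette(c(col.1, col.2), alpha = TRUE)
edge.col <- edge.pal(100)

# Scale each frequency to 1..100; without pmax() the last value would
# round to 0 and edge.col[0] would be an empty vector
edge.ind <- pmax(1, round(100 * freq / max(freq)))
edge.ind  # 4 22 100 1

# One color and one line width per route
route.col <- edge.col[edge.ind]
route.lwd <- edge.ind / 30
```

Computing the indices for all routes at once like this also makes it easy to inspect the range of colors and widths before drawing anything on the map.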
https://ggraph.data-imaginist.com/articles/Layouts.html