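```{r libraries, echo = FALSE, include = FALSE}
# library() stops with an error if a package is missing, whereas require()
# only returns FALSE and lets later chunks fail in less obvious ways.
library("ggplot2")
library("ggokabeito")
library("tidyr")
library("dplyr")
library("patchwork")
library("latex2exp")
```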

\section{Network clustering of simulated networks}\label{sec:network-clustering-of-simulated-networks}

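```{r impoting-data, echo = FALSE}
# `pattern` is a regular expression, not a shell glob, so the glob is
# converted with glob2rx() to match file names starting with this prefix.
filenames <- list.files(
  path = "./data",
  pattern = glob2rx("simulated_collection_data_clustering_*"),
  full.names = TRUE
)

# data_list <- lapply(filenames, function(file) lapply(readRDS(file), function(model) model$list_clustering))
# Stack the results read from each file row-wise into a single data frame.
df_netclust <- do.call("rbind", lapply(filenames, readRDS))
# Fix the order of the models as it should appear in the plot legend.
df_netclust$model <- factor(df_netclust$model, levels = c(
  "iid", "pi",
  "rho", "pirho"
))
```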
\paragraph{Simulation settings} For every model, we simulate $M = 9$ networks with
$n^m_1 = n^m_2 = 75$ for all $m \in \{ 1, \dots, M \}$ and $Q_1 = Q_2 = 3$ blocks. The
block proportions of the first network are
\begin{align*}
  \bm{\pi}^1 = \left( 0.2, 0.3, 0.5 \right), & & \bm{\rho}^1 = \left( 0.2, 0.3, 0.5 \right),
\end{align*}
and, for all $m = 2, \dots, 9$,
\begin{align*}
  \bm{\pi}^m = \begin{cases}
    \bm{\pi}^1 & \text{for } iid\text{-}colBiSBM \text{ and } \rho\text{-}colBiSBM \\
    \sigma^1_m(\bm{\pi}^1) & \text{for } \pi\text{-}colBiSBM \text{ and } \pi\rho\text{-}colBiSBM
  \end{cases}\\
  \bm{\rho}^m =
  \begin{cases}
    \bm{\rho}^1 & \text{for } iid\text{-}colBiSBM \text{ and } \pi\text{-}colBiSBM \\
    \sigma^2_m(\bm{\rho}^1) & \text{for } \rho\text{-}colBiSBM \text{ and } \pi\rho\text{-}colBiSBM
  \end{cases}
\end{align*}
where $\sigma^1_m$ and $\sigma^2_m$ are permutations of $\{1, 2, 3\}$ specific to network $m$,
acting on the proportion vectors as
$\sigma^1_m (\bm{\pi}) = {(\pi_{\sigma^1_m (i)})}_{i = 1, \dots, 3}$
and $\sigma^2_m (\bm{\rho}) = {(\rho_{\sigma^2_m (i)})}_{i = 1, \dots, 3}$.
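
The proportion scheme above can be sketched in R as follows. This is a minimal illustration, not the code used for the simulations: it assumes that the permutations $\sigma^1_m$ and $\sigma^2_m$ are drawn uniformly at random, which the text does not specify, and that a proportion vector not listed as permuted for a given model stays equal to that of the first network. The helper name `simulate_proportions` is hypothetical.

```r
# Hypothetical helper: block proportions of the M networks for one model.
# Assumption: permutations are drawn uniformly at random with sample().
simulate_proportions <- function(model = c("iid", "pi", "rho", "pirho"),
                                 M = 9,
                                 pi1 = c(0.2, 0.3, 0.5),
                                 rho1 = c(0.2, 0.3, 0.5)) {
  model <- match.arg(model)
  # Reorder the entries of v according to a random permutation of its indices.
  permute <- function(v) v[sample(length(v))]
  lapply(seq_len(M), function(m) {
    # The first network always keeps the reference proportions.
    vary_pi <- m > 1 && model %in% c("pi", "pirho")
    vary_rho <- m > 1 && model %in% c("rho", "pirho")
    list(
      pi = if (vary_pi) permute(pi1) else pi1,
      rho = if (vary_rho) permute(rho1) else rho1
    )
  })
}

set.seed(1)
props <- simulate_proportions("pi")
```

With `model = "pi"`, only the `pi` entries differ across networks, and each of them is a reordering of $(0.2, 0.3, 0.5)$.
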
The networks are divided into 3 sub-collections of 3
networks with connectivity parameters as follows:
\begin{align*}
  \bm{\alpha}^{as} = 0.3 + \begin{pmatrix}
    \epsilon & - \frac{\epsilon}{2} & - \frac{\epsilon}{2}\\
    - \frac{\epsilon}{2} & \epsilon & - \frac{\epsilon}{2}\\
    - \frac{\epsilon}{2} & - \frac{\epsilon}{2} & \epsilon
  \end{pmatrix}, &&
  \bm{\alpha}^{cp} = 0.3 + \begin{pmatrix}
    \frac{3 \epsilon}{2} & \epsilon & \frac{\epsilon}{2}\\
    \epsilon & \frac{\epsilon}{2} & 0\\
    \frac{\epsilon}{2} & 0 & - \frac{\epsilon}{2}
  \end{pmatrix}, &&
  \bm{\alpha}^{dis} = 0.3 + \begin{pmatrix}
    - \frac{\epsilon}{2} & \epsilon & \epsilon\\
    \epsilon & - \frac{\epsilon}{2} & \epsilon\\
    \epsilon & \epsilon & - \frac{\epsilon}{2}
  \end{pmatrix},
\end{align*}
with $\epsilon \in [0.1, 0.4]$. $\bm{\alpha}^{as}$ represents a classical
assortative community structure,
while $\bm{\alpha}^{cp}$ is a layered core-periphery structure in which block 2
acts as a semi-core. Finally, $\bm{\alpha}^{dis}$ is a disassortative
community structure, with stronger
connections between blocks than within blocks. If $\epsilon = 0$, the three
matrices are equal and all 9 networks share the same connectivity structure.
Increasing $\epsilon$ makes the 3 sub-collections of networks increasingly distinct.
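
To make the role of $\epsilon$ concrete, the following R sketch builds the three connectivity matrices defined above and draws one network from them. It is an illustration under stated assumptions rather than the simulation code itself: the function names `connectivity` and `sample_bipartite` are hypothetical, and the sampler assumes binary bipartite networks whose edges are independent Bernoulli variables given the block memberships.

```r
# Hypothetical helper: connectivity matrix of one sub-collection.
connectivity <- function(epsilon, structure = c("as", "cp", "dis")) {
  structure <- match.arg(structure)
  e <- epsilon
  offsets <- switch(structure,
    as = matrix(c(   e, -e / 2, -e / 2,
                  -e / 2,    e, -e / 2,
                  -e / 2, -e / 2,    e), nrow = 3, byrow = TRUE),
    cp = matrix(c(3 * e / 2,     e,  e / 2,
                          e, e / 2,      0,
                      e / 2,     0, -e / 2), nrow = 3, byrow = TRUE),
    dis = matrix(c(-e / 2,      e,      e,
                        e, -e / 2,      e,
                        e,      e, -e / 2), nrow = 3, byrow = TRUE)
  )
  0.3 + offsets
}

# Hypothetical sampler (assumption: Bernoulli edges given block memberships).
sample_bipartite <- function(n1, n2, pi, rho, alpha) {
  z <- sample(length(pi), n1, replace = TRUE, prob = pi)    # row blocks
  w <- sample(length(rho), n2, replace = TRUE, prob = rho)  # column blocks
  probs <- alpha[z, w, drop = FALSE]  # n1 x n2 matrix of edge probabilities
  matrix(rbinom(n1 * n2, size = 1, prob = probs), nrow = n1, ncol = n2)
}

set.seed(42)
alpha_as <- connectivity(0.4, "as")
net <- sample_bipartite(75, 75, c(0.2, 0.3, 0.5), c(0.2, 0.3, 0.5), alpha_as)
```

For every $\epsilon \in [0.1, 0.4]$ all entries stay within $[0.1, 0.9]$, so the three matrices are valid probability matrices, and `connectivity(0, s)` returns the same constant matrix for the three structures `s`.
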
```{r netclustering-ARI-boxplot, echo = FALSE}
#| dpi = 300,
#| fig.asp = 0.5,
#| fig.cap = "\\label{}ARI of the partition obtained by network clustering as a function of $\\epsilon$"
# One box per model and per value of epsilon, filled by model.
df_netclust %>%
  ggplot() +
  aes(x = as.factor(epsilon), y = ARI) +
  scale_color_okabe_ito() +
  scale_fill_okabe_ito() +
  xlab(TeX("$\\epsilon$")) +
  guides(fill = guide_legend(title = "Model")) +
  ylab("ARI of obtained netclustering") +
  geom_boxplot(aes(fill = model))
```

\paragraph{Results} We evaluate our method by comparing the partition of the
network collection that it recovers with the simulated partition, using the
adjusted Rand index (ARI). As $\epsilon$ increases, the sub-collections become
easier to distinguish, and the recovery becomes nearly
perfect in all setups of the $colBiSBM$.
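
For reference, the comparison above can be reproduced on toy partitions with the short R sketch below. It implements the standard ARI formula directly in base R. The function name `ari` and the example partitions are hypothetical, and the code is not taken from the analysis pipeline, which may rely on an existing package instead.

```r
# Hypothetical helper: adjusted Rand index between two partitions x and y,
# given as label vectors of equal length.
# The ratio is undefined (NaN) when both partitions have a single cluster.
ari <- function(x, y) {
  stopifnot(length(x) == length(y))
  tab <- table(x, y)                          # contingency table of labels
  n_pairs <- choose(sum(tab), 2)              # number of pairs of networks
  sum_cells <- sum(choose(tab, 2))            # pairs grouped together in both
  sum_rows <- sum(choose(rowSums(tab), 2))    # pairs grouped together in x
  sum_cols <- sum(choose(colSums(tab), 2))    # pairs grouped together in y
  expected <- sum_rows * sum_cols / n_pairs   # expected value under chance
  (sum_cells - expected) / ((sum_rows + sum_cols) / 2 - expected)
}

# Simulated partition: 3 sub-collections of 3 networks.
truth <- rep(1:3, each = 3)
# Same grouping with relabelled clusters: the ARI ignores label names.
relabelled <- c(2, 2, 2, 3, 3, 3, 1, 1, 1)
# One network of the first sub-collection is misplaced.
one_error <- c(1, 1, 2, 2, 2, 2, 3, 3, 3)

ari_perfect <- ari(truth, relabelled)
ari_partial <- ari(truth, one_error)
```

Here `ari_perfect` equals 1, since the two partitions coincide up to relabelling, while `ari_partial` lies strictly between 0 and 1.
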