```{r libraries, echo = FALSE, include = FALSE}
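# library() stops with an error if a package is missing, whereas require()
# only returns FALSE and lets later chunks fail with less obvious messages.
library("ggplot2")
library("ggokabeito")
library("tidyr")
library("dplyr")
library("patchwork")
library("latex2exp")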
```

\section{Network clustering of simulated networks}\label{sec:network-clustering-of-simulated-networks}

```{r importing-data, echo = FALSE}
filenames <- list.files(
    path = "./data",
    # `pattern` is a regular expression, not a glob, so no trailing `*`
    pattern = "simulated_collection_data_clustering_",
    full.names = TRUE
)

# data_list <- lapply(filenames, function(file) lapply(readRDS(file), function(model) model$list_clustering))
df_netclust <- do.call("rbind", lapply(filenames, readRDS))
df_netclust$model <- factor(df_netclust$model, levels = c(
    "iid", "pi",
    "rho", "pirho"
))

```
\paragraph{Simulation settings} For every model we simulate $M = 9$ networks with
$n^m_1 = n^m_2 = 75$ nodes for all $m \in \{ 1, \dots, M \}$ and $Q_1 = Q_2 = 3$
blocks. The block proportions used in the simulations are the following:

\begin{align*}
\bm{\pi}^1 = \left( 0.2, 0.3, 0.5 \right) & &  \bm{\rho}^1 = \left( 0.2, 0.3, 0.5 \right)
\end{align*}
and for all $m = 2,\dots,9$
\begin{align*}
\bm{\pi}^m = \begin{cases}
    \bm{\pi}^1 & \text{for } iid\text{-}colBiSBM \text{ and } \rho\text{-}colBiSBM \\
    \sigma^1_m(\bm{\pi}^1) & \text{for } \pi\text{-}colBiSBM \text{ and } \pi\rho\text{-}colBiSBM
\end{cases}\\
\bm{\rho}^m =
\begin{cases}
    \bm{\rho}^1 & \text{for } iid\text{-}colBiSBM \text{ and } \pi\text{-}colBiSBM \\
    \sigma^2_m(\bm{\rho}^1) & \text{for } \rho\text{-}colBiSBM \text{ and } \pi\rho\text{-}colBiSBM
\end{cases}
\end{align*}
where $\sigma^1_m$ and $\sigma^2_m$ are permutations of $\{1, 2, 3\}$ specific to network $m$, with
$\sigma^1_m (\bm{\pi}) = {(\pi_{\sigma^1_m (i)})}_{i = 1, \dots, 3}$
and $\sigma^2_m (\bm{\rho}) = {(\rho_{\sigma^2_m (i)})}_{i = 1, \dots, 3}$.
The networks are divided into 3 sub-collections of 3
networks with connectivity parameters as follows:

\begin{align*}
\bm{\alpha}^{as} = .3 + \begin{pmatrix}
    \epsilon & - \frac{\epsilon}{2} & - \frac{\epsilon}{2}\\
    - \frac{\epsilon}{2} & \epsilon & - \frac{\epsilon}{2}\\
    - \frac{\epsilon}{2} & - \frac{\epsilon}{2} & \epsilon
\end{pmatrix}, &&
\bm{\alpha}^{cp} = .3 + \begin{pmatrix}
    \frac{3 \epsilon}{2} & \epsilon & \frac{\epsilon}{2}\\
    \epsilon & \frac{\epsilon}{2} & 0\\
    \frac{\epsilon}{2} & 0 & - \frac{\epsilon}{2}
\end{pmatrix}, &&
\bm{\alpha}^{dis} = .3 + \begin{pmatrix}
    - \frac{\epsilon}{2} & \epsilon & \epsilon\\
    \epsilon & - \frac{\epsilon}{2} & \epsilon\\
    \epsilon & \epsilon & - \frac{\epsilon}{2}
\end{pmatrix},
\end{align*}
with $\epsilon \in [.1, .4]$. $\bm{\alpha}^{as}$ represents a classical
assortative community structure, 
while $\bm{\alpha}^{cp}$ is a layered core-periphery structure with block 2
acting as a semi-core. Finally, $\bm{\alpha}^{dis}$ is a disassortative
community structure with stronger
connections between blocks than within blocks. For $\epsilon = 0$, the three
matrices would be equal and the 9 networks would share the same connectivity
structure. Increasing $\epsilon$ makes the 3 sub-collections increasingly distinct.
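
To make these settings concrete, the following is a minimal sketch of how one network of the collection could be generated. It assumes Bernoulli edges drawn independently given the block memberships; the helper names `make_alphas` and `simulate_bisbm` are hypothetical and are not taken from the code used to produce the results of this section.

```r
# Illustrative sketch only: helper names are hypothetical.
set.seed(1)

# Connectivity matrices of the three sub-collections for a given epsilon.
make_alphas <- function(eps, base = 0.3) {
  list(
    as = base + matrix(c(eps, -eps / 2, -eps / 2,
                         -eps / 2, eps, -eps / 2,
                         -eps / 2, -eps / 2, eps), 3, 3, byrow = TRUE),
    cp = base + matrix(c(3 * eps / 2, eps, eps / 2,
                         eps, eps / 2, 0,
                         eps / 2, 0, -eps / 2), 3, 3, byrow = TRUE),
    dis = base + matrix(c(-eps / 2, eps, eps,
                          eps, -eps / 2, eps,
                          eps, eps, -eps / 2), 3, 3, byrow = TRUE)
  )
}

# Simulate one bipartite incidence matrix (assumption: Bernoulli edges).
simulate_bisbm <- function(n1, n2, pi, rho, alpha) {
  z1 <- sample(seq_along(pi), n1, replace = TRUE, prob = pi)   # row blocks
  z2 <- sample(seq_along(rho), n2, replace = TRUE, prob = rho) # column blocks
  p <- alpha[z1, z2] # n1 x n2 matrix of edge probabilities
  x <- matrix(rbinom(n1 * n2, size = 1, prob = p), n1, n2)
  list(incidence = x, row_blocks = z1, col_blocks = z2)
}

alphas <- make_alphas(eps = 0.4)
pi1 <- c(0.2, 0.3, 0.5)
rho1 <- c(0.2, 0.3, 0.5)

# pi-rho setting: the network gets its own permutation of each proportion vector.
net <- simulate_bisbm(
  n1 = 75, n2 = 75,
  pi = pi1[sample(3)], rho = rho1[sample(3)],
  alpha = alphas$cp
)
```

With $\epsilon = 0.4$, the upper end of the range, every entry of the three matrices stays within $[0.1, 0.9]$, so all of them are valid connection probabilities.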

```{r netclustering-ARI-boxplot, echo = FALSE}
#| dpi = 300,
#| fig.asp = 0.5,
#| fig.cap = "ARI of the partition of the network collection obtained by clustering, as a function of $\\epsilon$"
df_netclust %>%
    ggplot() +
    aes(x = as.factor(epsilon), y = ARI) +
    scale_color_okabe_ito() +
    scale_fill_okabe_ito() +
    xlab(TeX("$\\epsilon$")) +
    guides(fill = guide_legend(title = "Model")) +
    ylab("ARI of obtained netclustering") +
    geom_boxplot(aes(fill = model))
```

\paragraph{Results} We evaluate the method by comparing the partition of the
network collection it recovers with the simulated partition, using the Adjusted
Rand Index (ARI). As $\epsilon$ increases, the three sub-collections become
easier to distinguish, and the recovered partition becomes nearly
perfect in all setups of the $colBiSBM$.
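
For reference, the ARI between two partitions can be computed from their contingency table. The sketch below is a base-R version of the standard formula; the function name `adjusted_rand_index` is hypothetical, and the section does not state which implementation produced the reported values.

```r
# Adjusted Rand Index between two partitions given as label vectors.
# Illustrative sketch: not necessarily the implementation behind the figure.
adjusted_rand_index <- function(labels_a, labels_b) {
  stopifnot(length(labels_a) == length(labels_b))
  tab <- table(labels_a, labels_b)              # contingency table
  n_pairs <- choose(length(labels_a), 2)        # total number of pairs
  sum_cells <- sum(choose(as.vector(tab), 2))   # pairs together in both
  sum_rows <- sum(choose(rowSums(tab), 2))      # pairs together in A
  sum_cols <- sum(choose(colSums(tab), 2))      # pairs together in B
  expected <- sum_rows * sum_cols / n_pairs     # expected index under chance
  max_index <- (sum_rows + sum_cols) / 2
  if (max_index == expected) return(1)          # degenerate identical partitions
  (sum_cells - expected) / (max_index - expected)
}

# Simulated partition: 3 sub-collections of 3 networks.
truth <- rep(1:3, each = 3)
perfect <- adjusted_rand_index(truth, c(2, 2, 2, 3, 3, 3, 1, 1, 1))
partial <- adjusted_rand_index(truth, c(1, 1, 2, 2, 2, 3, 3, 3, 1))
```

The ARI is invariant to relabelling of the clusters, so `perfect` equals 1 even though the labels differ from `truth`, while `partial` is well below 1 because one network of each sub-collection is misplaced.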