
\paragraph{Simulation settings} We want to assess how well the block memberships
of the nodes are recovered when some edges are missing (missing entries are
labeled \texttt{NA} in the incidence matrix).

For this purpose we generate collections of networks with the following
parameters:
\begin{align*}
    \bm{\pi}^m = \begin{cases}
                     \bm{\pi} = \left( 0.5, 0.3, 0.2 \right) & \text{for } iid\text{-}colBiSBM                                      \\
                     \sigma_1^m(\bm{\pi})                    & \text{for } \pi\text{-}colBiSBM \text{ and } \pi\rho\text{-}colBiSBM
                 \end{cases} \\
    \bm{\rho}^m =
    \begin{cases}
        \bm{\rho}  = \left( 0.5, 0.3, 0.2 \right) & \text{for } iid\text{-}colBiSBM                                        \\
        \sigma_2^m(\bm{\rho})                     & \text{for } \rho\text{-}colBiSBM \text{ and } \pi\rho\text{-}colBiSBM,
    \end{cases}
\end{align*}
for the block proportions, and two different structures with the corresponding
$\bm{\alpha}$,
\begin{align*}
    \bm{\alpha}^{modular} = \begin{pmatrix}
                                0.9  & 0.05 & 0.05 \\
                                0.05 & 0.2  & 0.05 \\
                                0.05 & 0.05 & 0.8
                            \end{pmatrix}, &
    \bm{\alpha}^{nested} = \begin{pmatrix}
                               0.9 & 0.25 & 0.1  \\
                               0.3 & 0.15 & 0.05 \\
                               0.1 & 0.05 & 0.05
                           \end{pmatrix},
\end{align*}

where $\bm{\alpha}^{modular}$ represents networks made of look-alike
communities, whose members tend to interact preferentially within their own
community and less with the other communities, and $\bm{\alpha}^{nested}$
represents a structure commonly detected in ecology, with generalist and
specialist species forming a \enquote{nested} pattern.
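
As a rough sanity check on these two structures (an illustration, not part of
the simulation protocol), one can compute the expected density they induce. We
assume here Bernoulli edges with connection probabilities $\bm{\alpha}$ and the
$iid$ proportions $\bm{\pi} = \bm{\rho} = \left( 0.5, 0.3, 0.2 \right)$, so that
the expected density is $\bm{\pi}^\top \bm{\alpha} \bm{\rho}$:
\begin{align*}
    \bm{\pi}^\top \bm{\alpha}^{modular} \bm{\rho} & = 0.5 \times 0.475 + 0.3 \times 0.095 + 0.2 \times 0.2 = 0.306,   \\
    \bm{\pi}^\top \bm{\alpha}^{nested} \bm{\rho}  & = 0.5 \times 0.545 + 0.3 \times 0.205 + 0.2 \times 0.075 = 0.349,
\end{align*}
where the second factor of each product is the corresponding entry of
$\bm{\alpha} \bm{\rho}$. Under this assumption both structures yield networks of
comparable density (about $31\%$ and $35\%$ of the possible edges), so the two
settings should differ mainly in their block structure rather than in the amount
of observed interactions.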

Each collection contains two networks, of sizes $n^{m=1}_1 = n^{m=1}_2 = 40$ and
$n^{m=2}_1 = n^{m=2}_2 = 120$. One collection is generated for each $colBiSBM$
model, and the block memberships of the nodes (i.e., the row and column blocks
they belong to) are saved.

In network $m=1$ (i.e., the smaller one), a proportion $p_{\texttt{NA}}$ of the
edge values (entries of the incidence matrix) is replaced by \texttt{NA}; the
\enquote{forgotten} values are stored so that predictions can later be compared
with them.
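
To fix ideas, the masking step can be written as follows. The notations $X$ for
the incidence matrix and $\mathcal{M}$ for the set of masked entries, as well as
the value $p_{\texttt{NA}} = 0.2$, are introduced here for illustration only and
are not taken from the experiments. If the proportion is applied exactly, then
\begin{align*}
    \mathcal{M} \subset \{1, \dots, 40\} \times \{1, \dots, 40\}, \qquad
    |\mathcal{M}| = p_{\texttt{NA}} \times 40 \times 40 = 0.2 \times 1600 = 320,
\end{align*}
and $X_{ij}$ is set to \texttt{NA} for every $(i, j) \in \mathcal{M}$, which
leaves $1600 - 320 = 1280$ observed entries in the smaller network, while the
$120 \times 120 = 14400$ entries of the network $m=2$ remain fully observed.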

\paragraph{Test procedure} A Latent Block Model (LBM) is fitted on the first
network alone, and the predicted block memberships are saved, along with the
links predicted from the inferred parameters. This serves as a baseline to
assess whether using the collection improves the predictions.

A $colBiSBM$ model is then fitted on the whole collection (using the model that
matches the one used to generate the data), and the same predictions are stored.
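
A common way to predict links from a fitted block model, which we assume here
for illustration, is to average the estimated connection probabilities over the
estimated memberships. With hypothetical notations $\hat{\tau}_{iq}$ and
$\hat{\nu}_{jr}$ for the estimated probabilities that row node $i$ belongs to
row block $q$ and that column node $j$ belongs to column block $r$, and $Q_1$,
$Q_2$ for the numbers of row and column blocks, the predicted value of a masked
entry would be
\begin{align*}
    \hat{X}_{ij} = \sum_{q=1}^{Q_1} \sum_{r=1}^{Q_2} \hat{\tau}_{iq} \, \hat{\alpha}_{qr} \, \hat{\nu}_{jr}.
\end{align*}
The same formula applies to both fits, so any difference between the two sets of
predictions comes from the estimated parameters and memberships.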

\paragraph{Quality metrics} To benchmark the performance we use the
\emph{Area Under the Curve} (AUC) to compare the predicted and real link values,
and the Adjusted Rand Index (ARI) to compare the predicted and real block
memberships.
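
For reference, both metrics have standard definitions, recalled here with
notations that are ours and are only used in this paragraph. Let $s^{+}$ and
$s^{-}$ be the predicted scores of a randomly chosen true link and of a randomly
chosen true non-link. The AUC is then
\begin{align*}
    \mathrm{AUC} = \Pr\left( s^{+} > s^{-} \right) + \tfrac{1}{2} \Pr\left( s^{+} = s^{-} \right),
\end{align*}
so that $0.5$ corresponds to random guessing and $1$ to a perfect ranking. For
the ARI, let $n_{kl}$ be the number of nodes in real block $k$ and predicted
block $l$, with margins $a_k = \sum_l n_{kl}$ and $b_l = \sum_k n_{kl}$, and let
$N$ be the number of nodes:
\begin{align*}
    \mathrm{ARI} =
    \frac{\sum_{k,l} \binom{n_{kl}}{2} - \left[ \sum_k \binom{a_k}{2} \sum_l \binom{b_l}{2} \right] / \binom{N}{2}}
    {\frac{1}{2} \left[ \sum_k \binom{a_k}{2} + \sum_l \binom{b_l}{2} \right] - \left[ \sum_k \binom{a_k}{2} \sum_l \binom{b_l}{2} \right] / \binom{N}{2}}.
\end{align*}
The ARI equals $1$ when the two partitions agree up to a relabeling of the
blocks and has expectation $0$ for random partitions. We assume it is computed
separately for the row and the column memberships.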

\paragraph{Results}