mia-rapport-2024/rapport/chapter4-simulations/inference.tex

\section{Efficiency of the inference}
\label{sec:efficiency-of-the-inference}

The goal here is to assess the quality of the inference procedure.

\paragraph{Simulation settings} For this simulation the data is simulated with
$M = 2, n_{1}^{m} = 120,~n_{2}^{m} = 120,~Q_1 = Q_2 = 4$, $\bm{\alpha}, \bm{\pi}$
and $\bm{\rho}$ are set as follows:
\begin{align*}
     & \bm{\alpha} = .25 +
    \begin{pmatrix}
        3 \eps[\alpha] & 2 \eps[\alpha] & \eps[\alpha]   & - \eps[\alpha] \\
        2 \eps[\alpha] & 2 \eps[\alpha] & - \eps[\alpha] & \eps[\alpha]   \\
        \eps[\alpha]   & - \eps[\alpha] & \eps[\alpha]   & 2 \eps[\alpha] \\
        - \eps[\alpha] & \eps[\alpha]   & 2 \eps[\alpha] & 0
    \end{pmatrix},
\end{align*}
\begin{align*}
    \bm{\pi}^1 = \sigma_1
    \begin{pmatrix}
        0.2 & 0.4 & 0.4 & 0
    \end{pmatrix},
                               &  & \bm{\pi}^2 =
    \begin{pmatrix}
        0.25 & 0.25 & 0.25 & 0.25
    \end{pmatrix},                    \\
    \bm{\rho}^1 =
    \begin{pmatrix}
        0.25 & 0.25 & 0.25 & 0.25
    \end{pmatrix}, &  &
    \bm{\rho}^2 = \sigma_2
    \begin{pmatrix}
        0 & 0.33 & 0.33 & 0.33
    \end{pmatrix},    &  &
\end{align*}
with $\eps[\alpha]$ taking nine equally spaced values ranging from 0 to 0.24.
For each value of $\eps[\alpha]$, 108 datasets ($X_1, X_2$) are simulated,
resulting in $9 \times 108 = 972$ datasets. More precisely, for each dataset,
we pick uniformly at random two permutations of $\{ 1, \dots , 4 \}$
($\sigma_1, \sigma_2$) with the constraint that $\sigma_1(4) \neq \sigma_2(1)$.
This ensures that each of the two networks have a non-empty block that is empty
in the other one. Then the networks are simulated with
$\mathcal{B}$ern-$BiSBM_{120,120}(Q_1 = 4, Q_2 = 4,
    \bm{\alpha}, \bm{\pi}^m, \bm{\rho}^m)$
with the previous parameters. Each network has 2 blocks in common and their
connectivity structures encompass a mix of core-periphery, assortative
community and dis-assortative community structures, depending on which 3 of the 4
blocks are selected for each network. $\eps[\alpha]$ represents the strength of
these structures, the larger, the easier it is to tell apart one block from
another.
The true model of all the simulation is a $\pi\rho$-colBiSBM.

\paragraph{Inference} We want to measure the quality of the
inference procedure, for this we use the inference described in the section
\ref{sec:variational-estimation-of-the-parameters}.

\paragraph{Quality indicators} To assess the quality of the inference, we will
use the following indicators:
\begin{itemize}
    \item First, for each dataset, we put in competition $\pi$-colBiSBM with
          $sep\text{-}BiSBM$, $iid$-colBiSBM, $\rho$-colBiSBM,
          $\pi\rho$-colBiSBM
          respectively. To do so, for each dataset, we compute the
          BIC-L of each model $\pi$-colBiSBM is preferred to $sep\text{-}BiSBM$
          (resp. $iid$-colBiSBM, $\rho$-colBiSBM,
          $\pi\rho$-colBiSBM) if
          its BIC-L is greater.
    \item When considering our colBiSBM models we compare
          $\widehat{Q_1}$, $\widehat{Q_2}$ to
          their true values. ($Q_1 = 4$ and $Q_2 = 4$)
    \item Finally, we assess the quality of the node grouping by computing the
          Adjusted Rand Index \parencite{hubertComparingPartitions1985}, ARI = 0
          for a random grouping, ARI = 1 for a
          perfect match between groupings\footnote{Please note that even if Rand
              Index can only yield values between 0 and 1, ARI can return
              negative values if the RI is less than the expected value. This
              indicates a structure in grouping discordance.}.
          For each network, for the
          $\pi$-colBiSBM, $\rho$-colBiSBM,
          $\pi\rho$-colBiSBM we compare the inferred block memberships to
          the real ones by computing the mean of the ARI per axis over the two
          networks
          \begin{equation*}
              \overline{\text{ARI}}_d = \frac{1}{2} \big( \text{ARI}(\widehat{\bm{Z}^1_d},\bm{Z}^1_d) + \text{ARI}(\widehat{\bm{Z}^2_d},\bm{Z}^2_d) \big),
          \end{equation*}
          where $d$ is the dimension or axis (i.e., rows, $d=1$, or columns, $d=2$) of
          the block memberships.
          And we compute the ARI of the whole set of nodes to account for block
          pairing between networks
          \begin{equation*}
              \text{ARI}_d = \text{ARI}\big((\widehat{\bm{Z}^1_d},\widehat{\bm{Z}^2_d}),(\bm{Z}^1_d,\bm{Z}^2_d) \big).
          \end{equation*}
\end{itemize}

All these quality indicators are averaged over the 108 datasets. The results are
provided in the tables \ref{tab:inference_results_iid} to \ref{tab:inference_results_pirho}. Each line corresponds to the
108 datasets for a given value of $\eps[\alpha]$. Graphical representation of
some results are shown on figures~\ref{fig:inference-prop-modele-pref}
and~\ref{fig:inference-ari-plots}.

\begin{figure}[ht]
    \centering
    \includestandalone{tikz/simulations/inference/model-proportions}
    \caption{Preferred model proportions over all datasets in function of
        $\eps[\alpha]$}
    \label{fig:inference-prop-modele-pref}
\end{figure}

\begin{figure}[H]
    \centering
    \includestandalone{tikz/simulations/inference/ari-plots}
    \caption{Plot of the ARI quality indicators in function of
        $\eps[\alpha]$}
    \label{fig:inference-ari-plots}
\end{figure}

\paragraph{Results}
For the model comparison, when $\eps[\alpha]$ is small
($\eps[\alpha]\in[0, .04]$), the simulation model is close to the
Erd\H{o}s-Reńyi network, and it is very hard to find any structure beyond the one
of a single block on each dimension.

On the figure \ref{fig:inference-prop-modele-pref} one can see that from
$\eps[\alpha] = 0.06$ around $70\%$ of the time the
$\pi\rho$-colBiSBM model (i.e., the correct one) is selected.

An interesting result we can read in the tables is that our models outperform
the $sep\text{-}BiSBM$ when considering the ARI on the whole set of nodes
($\text{ARI}_d$). This means that our models are able to recover the block
pairing \emph{between the networks} in addition to recovering the blocks and
their parameters.