mia-rapport-2024/rapport/chapter4-simulations/inference.tex

\section{Efficiency of the inference}

The goal here is to assess the quality of the inference procedure.

\paragraph{Simulation settings} For this simulation the data is simulated with
$M = 2, n_{1}^{m} = 120, n_{2}^{m} = 120, Q_1 = Q_2 = 4$, $\bm{\alpha}, \bm{\pi}$
and $\bm{\rho}$ are set as follows:
\begin{align*}
     & \bm{\alpha} = .25 +
    \begin{pmatrix}
        3 \eps[\alpha] & 2 \eps[\alpha] & \eps[\alpha]   & - \eps[\alpha] \\
        2 \eps[\alpha] & 2 \eps[\alpha] & - \eps[\alpha] & \eps[\alpha]   \\
        \eps[\alpha]   & - \eps[\alpha] & \eps[\alpha]   & 2 \eps[\alpha] \\
        - \eps[\alpha] & \eps[\alpha]   & 2 \eps[\alpha] & 0
    \end{pmatrix},
\end{align*}
\begin{align*}
    \bm{\pi}^1 = \sigma_1
    \begin{pmatrix}
        0.2 & 0.4 & 0.4 & 0
    \end{pmatrix},
                               &  & \bm{\pi}^2 =
    \begin{pmatrix}
        0.25 & 0.25 & 0.25 & 0.25
    \end{pmatrix},                    \\
    \bm{\rho}^1 =
    \begin{pmatrix}
        0.25 & 0.25 & 0.25 & 0.25
    \end{pmatrix}, &  &
    \bm{\rho}^2 = \sigma_2
    \begin{pmatrix}
        0 & 0.33 & 0.33 & 0.33
    \end{pmatrix},    &  &
\end{align*}
with $\eps[\alpha]$ taking nine equally spaced values ranging from 0 to 0.24.
For each value of $\eps[\alpha]$, 108 datasets ($X_1, X_2$) are simulated,
resulting in $9 \times 108 = 972$ datasets. More precisely, for each dataset,
we pick uniformly at random two permutations of $\{ 1, \dots , 4 \}$
($\sigma_1, \sigma_2$) with the constraint that $\sigma_1(4) \neq \sigma_2(1)$.
This ensures that each of the two networks have a non-empty block that is empty
in the other one. Then the networks are simulated with
$\mathcal{B}$ern-$BiSBM_{120,120}(Q_1 = 4, Q_2 = 4,
    \bm{\alpha}, \bm{\pi}^m, \bm{\rho}^m)$
with the previous parameters. Each network has 2 blocks in common and their
connectivity structures encompass a mix of core-periphery, assortative
community and dis-assortative community structures, depending on which 3 of the 4
blocks are selected for each network. $\eps[\alpha]$ represents the strength of
these structures, the larger, the easier it is to tell apart one block from
another.
The true model of all the simulation is a $\pi\rho\text{-}colBiSBM$.

\paragraph{Inference} We want to measure the quality of the
inference procedure, for this we use the inference described in the section
\ref{sec:variational-estimation-of-the-parameters}.

\paragraph{Quality indicators} To assess the quality of the inference, we will
use the following indicators:
\begin{itemize}
    \item First, for each dataset, we put in competition $\pi\text{-}colBiSBM$ with
          $sep\text{-}BiSBM$, $iid\text{-}colBiSBM$, $\rho\text{-}colBiSBM$,
          $\pi\rho\text{-}colBiSBM$
          respectively. To do so, for each dataset, we compute the
          BIC-L of each model $\pi\text{-}colBiSBM$ is preferred to $sep\text{-}BiSBM$
          (resp. $iid\text{-}colBiSBM$, $\rho\text{-}colBiSBM$,
          $\pi\rho\text{-}colBiSBM$) if
          its BIC-L is greater.
    \item When considering our \emph{colBiSBM} models we compare
          $\widehat{Q_1}$, $\widehat{Q_2}$ to
          their true values. ($Q_1 = 4$ and $Q_2 = 4$)
    \item Finally, we assess the quality of the node grouping by computing the
          Adjusted Rand Index \parencite{hubertComparingPartitions1985}, ARI = 0
          for a random grouping, ARI = 1 for a perfect recovery. For each
          network, for the
          $\pi\text{-}colBiSBM$, $\rho\text{-}colBiSBM$,
          $\pi\rho\text{-}colBiSBM$ we compare the inferred block memberships to
          the real ones by computing the mean of the ARI per axis over the two
          networks
          \begin{equation*}
              \overline{\text{ARI}}_d = \frac{1}{2} \text{ARI}\big( \text{ARI}(\widehat{\bm{Z}^1_d},\bm{Z}^1_d) + \text{ARI}(\widehat{\bm{Z}^2_d},\bm{Z}^2_d) \big),
          \end{equation*}
          where $d$ is the dimension or axis (i.e., rows, $d=1$, or columns, $d=2$) of
          the block memberships.
          And we compute the ARI of the whole set of nodes to account for block
          pairing between networks
          \begin{equation*}
              \text{ARI}_d = \text{ARI}\big((\widehat{\bm{Z}^1_d},\widehat{\bm{Z}^2_d}),(\bm{Z}^1_d,\bm{Z}^2_d) \big).
          \end{equation*}
\end{itemize}

All these quality indicators are averaged over the 108 datasets. The results are
provided in the tables \ref{tab:per_model_sep} to \ref{tab:per_model_pirho}. Each line corresponds to the
108 datasets for a given value of $\eps[\alpha]$.

\begin{figure}[ht]
    \centering
    \input{../tikz/simulations/inference/model-proportions.tex}
    \caption{Preferred model proportions over all datasets in function of
        $\eps[\alpha]$}
    \label{fig:prop-modele-pref}
\end{figure}

\foreach \modelname in {sep, iid, pi, rho, pirho}{
        \input{../tables/simulations/inference/\modelname.tex}
    }

\paragraph{Results}
For the model comparison, when $\eps[\alpha]$ is small
($\eps[\alpha]\in[0, .04]$), the simulation model is close to the
Erd\H{o}s-Reńyi network, and it is very hard to find any structure beyond the one
of a single block on each dimension.

On the figure \ref{fig:inference-proportion-preferred} and table
\ref{tab:proportion-preferred-table} we can see that from
$\eps[\alpha] = 0.06$ around $70\%$ of the time the $\pi\rho\text{-}colBiSBM$
model (i.e., the correct one) is selected.

An interesting result we can read in the tables is that our models outperform
the $sep\text{-}BiSBM$ when considering the ARI on the whole set of nodes
($\text{ARI}_d$). This means that our models are able to recover the block
pairing \emph{between the networks} in addition to recovering the blocks and
their parameters.
\clearpage