\section{Efficiency of the inference}
\label{sec:efficiency-of-the-inference}

The goal here is to assess the quality of the inference procedure.

\paragraph{Simulation settings} For this simulation the data is simulated with
$M = 2, n_{1}^{m} = 120,~n_{2}^{m} = 120,~Q_1 = Q_2 = 4$, $\bm{\alpha}, \bm{\pi}$
and $\bm{\rho}$ are set as follows:
\begin{align*}
     & \bm{\alpha} = .25 +
    \begin{pmatrix}
        3 \eps[\alpha] & 2 \eps[\alpha] & \eps[\alpha]   & - \eps[\alpha] \\
        2 \eps[\alpha] & 2 \eps[\alpha] & - \eps[\alpha] & \eps[\alpha]   \\
        \eps[\alpha]   & - \eps[\alpha] & \eps[\alpha]   & 2 \eps[\alpha] \\
        - \eps[\alpha] & \eps[\alpha]   & 2 \eps[\alpha] & 0
    \end{pmatrix},
\end{align*}
\begin{align*}
    \bm{\pi}^1 = \sigma_1
    \begin{pmatrix}
        0.2 & 0.4 & 0.4 & 0
    \end{pmatrix},
                               &  & \bm{\pi}^2 =
    \begin{pmatrix}
        0.25 & 0.25 & 0.25 & 0.25
    \end{pmatrix},                    \\
    \bm{\rho}^1 =
    \begin{pmatrix}
        0.25 & 0.25 & 0.25 & 0.25
    \end{pmatrix}, &  &
    \bm{\rho}^2 = \sigma_2
    \begin{pmatrix}
        0 & 0.33 & 0.33 & 0.33
    \end{pmatrix},    &  &
\end{align*}
with $\eps[\alpha]$ taking nine equally spaced values ranging from 0 to 0.24.
For each value of $\eps[\alpha]$, 108 datasets ($X_1, X_2$) are simulated,
resulting in $9 \times 108 = 972$ datasets. More precisely, for each dataset,
we pick uniformly at random two permutations of $\{ 1, \dots , 4 \}$
($\sigma_1, \sigma_2$) with the constraint that $\sigma_1(4) \neq \sigma_2(1)$.
This ensures that each of the two networks have a non-empty block that is empty
in the other one. Then the networks are simulated with
$\mathcal{B}$ern-$BiSBM_{120,120}(Q_1 = 4, Q_2 = 4,
    \bm{\alpha}, \bm{\pi}^m, \bm{\rho}^m)$
with the previous parameters. Each network has 2 blocks in common and their
connectivity structures encompass a mix of core-periphery, assortative
community and dis-assortative community structures, depending on which 3 of the 4
blocks are selected for each network. $\eps[\alpha]$ represents the strength of
these structures, the larger, the easier it is to tell apart one block from
another.
The true model of all the simulation is a $\pi\rho$-colBiSBM.

\paragraph{Inference} We want to measure the quality of the
inference procedure, for this we use the inference described in the section
\ref{sec:variational-estimation-of-the-parameters}.

\paragraph{Quality indicators} To assess the quality of the inference, we will
use the following indicators:
\begin{itemize}
    \item First, for each dataset, we put in competition $\pi$-colBiSBM with
          $sep\text{-}BiSBM$, $iid$-colBiSBM, $\rho$-colBiSBM,
          $\pi\rho$-colBiSBM
          respectively. To do so, for each dataset, we compute the
          BIC-L of each model $\pi$-colBiSBM is preferred to $sep\text{-}BiSBM$
          (resp. $iid$-colBiSBM, $\rho$-colBiSBM,
          $\pi\rho$-colBiSBM) if
          its BIC-L is greater.
    \item When considering our colBiSBM models we compare
          $\widehat{Q_1}$, $\widehat{Q_2}$ to
          their true values. ($Q_1 = 4$ and $Q_2 = 4$)
    \item Finally, we assess the quality of the node grouping by computing the
          Adjusted Rand Index \parencite{hubertComparingPartitions1985}, ARI = 0
          for a random grouping, ARI = 1 for a
          perfect match between groupings\footnote{Please note that even if Rand
              Index can only yield values between 0 and 1, ARI can return
              negative values if the RI is less than the expected value. This
              indicates a structure in grouping discordance.}.
          For each network, for the
          $\pi$-colBiSBM, $\rho$-colBiSBM,
          $\pi\rho$-colBiSBM we compare the inferred block memberships to
          the real ones by computing the mean of the ARI per dimension over the two
          networks
          \begin{equation*}
              \overline{\text{ARI}}_d = \frac{1}{2} \big( \text{ARI}(\widehat{\bm{Z}^1_d},\bm{Z}^1_d) + \text{ARI}(\widehat{\bm{Z}^2_d},\bm{Z}^2_d) \big),
          \end{equation*}
          where $d$ is the dimension (i.e., rows, $d=1$, or columns, $d=2$) of
          the block memberships.
          And we compute the ARI of the whole set of nodes to account for block
          pairing between networks
          \begin{equation*}
              \text{ARI}_d = \text{ARI}\big((\widehat{\bm{Z}^1_d},\widehat{\bm{Z}^2_d}),(\bm{Z}^1_d,\bm{Z}^2_d) \big).
          \end{equation*}
          The purpose of this metric is to verify that the block labels found
          in one network match the block labels in the second network.
\end{itemize}

All these quality indicators are averaged over the 108 datasets. The results are
provided in the tables \ref{tab:inference_results_iid} to \ref{tab:inference_results_pirho}. Each line corresponds to the
108 datasets for a given value of $\eps[\alpha]$. Graphical representation of
some results are shown on figures~\ref{fig:inference-prop-modele-pref}
and~\ref{fig:inference-ari-plots}.

\begin{figure}[ht]
    \centering
    \includestandalone{tikz/simulations/inference/model-proportions}
    \caption{Preferred model proportions over all datasets in function of
        $\eps[\alpha]$}
    \label{fig:inference-prop-modele-pref}
\end{figure}

\begin{figure}[H]
    \centering
    \includestandalone{tikz/simulations/inference/ari-plots}
    \caption{Plot of the ARI quality indicators in function of
        $\eps[\alpha]$}
    \label{fig:inference-ari-plots}
\end{figure}

\paragraph{Results}
For the model comparison, when $\eps[\alpha]$ is small
($\eps[\alpha]\in[0, .03]$), the simulation model is close to an
Erd\H{o}s-Reńyi network~\parencite{erdosRandomGraphs1959}, and it is very hard
to find any structure beyond the one
of a single block on each dimension.

On the figure~\ref{fig:inference-prop-modele-pref} one can see that from
$\eps[\alpha] = 0.06$ around $75\%$ of the time the
$\pi\rho$-colBiSBM model (i.e., the correct one) is selected.

The figure~\ref{fig:inference-ari-plots} shows that for $\eps[\alpha] \geq 0.09$,
all the models, even the sep, have a
$\overline{\text{ARI}}$ around $0.94$. This indicates that the models are able to
assign correct nodes group memberships and thus that the inference works
correctly.

An interesting result we can read in the tables is that our models outperform
the $sep\text{-}BiSBM$ when considering the ARI on the whole set of nodes
($\text{ARI}_d$). This means that our models are able to recover the block
pairing \emph{between the networks} in addition to recovering the blocks and
their parameters.