\section{Efficiency of the inference} \label{sec:efficiency-of-the-inference} The goal here is to assess the quality of the inference procedure. \paragraph{Simulation settings} For this simulation the data is simulated with $M = 2, n_{1}^{m} = 120,~n_{2}^{m} = 120,~Q_1 = Q_2 = 4$, $\bm{\alpha}, \bm{\pi}$ and $\bm{\rho}$ are set as follows: \begin{align*} & \bm{\alpha} = .25 + \begin{pmatrix} 3 \eps[\alpha] & 2 \eps[\alpha] & \eps[\alpha] & - \eps[\alpha] \\ 2 \eps[\alpha] & 2 \eps[\alpha] & - \eps[\alpha] & \eps[\alpha] \\ \eps[\alpha] & - \eps[\alpha] & \eps[\alpha] & 2 \eps[\alpha] \\ - \eps[\alpha] & \eps[\alpha] & 2 \eps[\alpha] & 0 \end{pmatrix}, \end{align*} \begin{align*} \bm{\pi}^1 = \sigma_1 \begin{pmatrix} 0.2 & 0.4 & 0.4 & 0 \end{pmatrix}, & & \bm{\pi}^2 = \begin{pmatrix} 0.25 & 0.25 & 0.25 & 0.25 \end{pmatrix}, \\ \bm{\rho}^1 = \begin{pmatrix} 0.25 & 0.25 & 0.25 & 0.25 \end{pmatrix}, & & \bm{\rho}^2 = \sigma_2 \begin{pmatrix} 0 & 0.33 & 0.33 & 0.33 \end{pmatrix}, & & \end{align*} with $\eps[\alpha]$ taking nine equally spaced values ranging from 0 to 0.24. For each value of $\eps[\alpha]$, 108 datasets ($X_1, X_2$) are simulated, resulting in $9 \times 108 = 972$ datasets. More precisely, for each dataset, we pick uniformly at random two permutations of $\{ 1, \dots , 4 \}$ ($\sigma_1, \sigma_2$) with the constraint that $\sigma_1(4) \neq \sigma_2(1)$. This ensures that each of the two networks have a non-empty block that is empty in the other one. Then the networks are simulated with $\mathcal{B}$ern-$BiSBM_{120,120}(Q_1 = 4, Q_2 = 4, \bm{\alpha}, \bm{\pi}^m, \bm{\rho}^m)$ with the previous parameters. Each network has 2 blocks in common and their connectivity structures encompass a mix of core-periphery, assortative community and dis-assortative community structures, depending on which 3 of the 4 blocks are selected for each network. $\eps[\alpha]$ represents the strength of these structures, the larger, the easier it is to tell apart one block from another. The true model of all the simulation is a $\pi\rho$-colBiSBM. \paragraph{Inference} We want to measure the quality of the inference procedure, for this we use the inference described in the section \ref{sec:variational-estimation-of-the-parameters}. \paragraph{Quality indicators} To assess the quality of the inference, we will use the following indicators: \begin{itemize} \item First, for each dataset, we put in competition $\pi$-colBiSBM with $sep\text{-}BiSBM$, $iid$-colBiSBM, $\rho$-colBiSBM, $\pi\rho$-colBiSBM respectively. To do so, for each dataset, we compute the BIC-L of each model $\pi$-colBiSBM is preferred to $sep\text{-}BiSBM$ (resp. $iid$-colBiSBM, $\rho$-colBiSBM, $\pi\rho$-colBiSBM) if its BIC-L is greater. \item When considering our colBiSBM models we compare $\widehat{Q_1}$, $\widehat{Q_2}$ to their true values. ($Q_1 = 4$ and $Q_2 = 4$) \item Finally, we assess the quality of the node grouping by computing the Adjusted Rand Index \parencite{hubertComparingPartitions1985}, ARI = 0 for a random grouping, ARI = 1 for a perfect match between groupings\footnote{Please note that even if Rand Index can only yield values between 0 and 1, ARI can return negative values if the RI is less than the expected value. This indicates a structure in grouping discordance.}. For each network, for the $\pi$-colBiSBM, $\rho$-colBiSBM, $\pi\rho$-colBiSBM we compare the inferred block memberships to the real ones by computing the mean of the ARI per axis over the two networks \begin{equation*} \overline{\text{ARI}}_d = \frac{1}{2} \big( \text{ARI}(\widehat{\bm{Z}^1_d},\bm{Z}^1_d) + \text{ARI}(\widehat{\bm{Z}^2_d},\bm{Z}^2_d) \big), \end{equation*} where $d$ is the dimension or axis (i.e., rows, $d=1$, or columns, $d=2$) of the block memberships. And we compute the ARI of the whole set of nodes to account for block pairing between networks \begin{equation*} \text{ARI}_d = \text{ARI}\big((\widehat{\bm{Z}^1_d},\widehat{\bm{Z}^2_d}),(\bm{Z}^1_d,\bm{Z}^2_d) \big). \end{equation*} \end{itemize} All these quality indicators are averaged over the 108 datasets. The results are provided in the tables \ref{tab:inference_results_iid} to \ref{tab:inference_results_pirho}. Each line corresponds to the 108 datasets for a given value of $\eps[\alpha]$. Graphical representation of some results are shown on figures~\ref{fig:inference-prop-modele-pref} and~\ref{fig:inference-ari-plots}. \begin{figure}[ht] \centering \includestandalone{tikz/simulations/inference/model-proportions} \caption{Preferred model proportions over all datasets in function of $\eps[\alpha]$} \label{fig:inference-prop-modele-pref} \end{figure} \begin{figure}[H] \centering \includestandalone{tikz/simulations/inference/ari-plots} \caption{Plot of the ARI quality indicators in function of $\eps[\alpha]$} \label{fig:inference-ari-plots} \end{figure} \paragraph{Results} For the model comparison, when $\eps[\alpha]$ is small ($\eps[\alpha]\in[0, .04]$), the simulation model is close to the Erd\H{o}s-Reńyi network, and it is very hard to find any structure beyond the one of a single block on each dimension. On the figure \ref{fig:inference-prop-modele-pref} one can see that from $\eps[\alpha] = 0.06$ around $70\%$ of the time the $\pi\rho$-colBiSBM model (i.e., the correct one) is selected. An interesting result we can read in the tables is that our models outperform the $sep\text{-}BiSBM$ when considering the ARI on the whole set of nodes ($\text{ARI}_d$). This means that our models are able to recover the block pairing \emph{between the networks} in addition to recovering the blocks and their parameters.