\section{Efficiency of the inference}
\label{sec:efficiency-of-the-inference}
The goal here is to assess the quality of the inference procedure.
\paragraph{Simulation settings}
For this simulation, the data are simulated with $M = 2$, $n_{1}^{m} = 120$, $n_{2}^{m} = 120$ and $Q_1 = Q_2 = 4$. The parameters $\bm{\alpha}$, $\bm{\pi}$ and $\bm{\rho}$ are set as follows:
\begin{align*}
& \bm{\alpha} = 0.25 + \begin{pmatrix}
3 \eps[\alpha] & 2 \eps[\alpha] & \eps[\alpha] & - \eps[\alpha] \\
2 \eps[\alpha] & 2 \eps[\alpha] & - \eps[\alpha] & \eps[\alpha] \\
\eps[\alpha] & - \eps[\alpha] & \eps[\alpha] & 2 \eps[\alpha] \\
- \eps[\alpha] & \eps[\alpha] & 2 \eps[\alpha] & 0
\end{pmatrix},
\end{align*}
\begin{align*}
\bm{\pi}^1 = \sigma_1 \begin{pmatrix} 0.2 & 0.4 & 0.4 & 0 \end{pmatrix}, & & \bm{\pi}^2 = \begin{pmatrix} 0.25 & 0.25 & 0.25 & 0.25 \end{pmatrix}, \\
\bm{\rho}^1 = \begin{pmatrix} 0.25 & 0.25 & 0.25 & 0.25 \end{pmatrix}, & & \bm{\rho}^2 = \sigma_2 \begin{pmatrix} 0 & 0.33 & 0.33 & 0.33 \end{pmatrix},
\end{align*}
with $\eps[\alpha]$ taking nine equally spaced values ranging from 0 to 0.24. For each value of $\eps[\alpha]$, 108 datasets ($X_1, X_2$) are simulated, resulting in $9 \times 108 = 972$ datasets.
More precisely, for each dataset, we draw two permutations $\sigma_1, \sigma_2$ of $\{ 1, \dots , 4 \}$ uniformly at random, subject to the constraint that $\sigma_1(4) \neq \sigma_2(1)$. This ensures that each of the two networks has a non-empty block that is empty in the other one.
The networks are then simulated from $\mathcal{B}$ern-$BiSBM_{120,120}(Q_1 = 4, Q_2 = 4, \bm{\alpha}, \bm{\pi}^m, \bm{\rho}^m)$ with the parameters above.
Each network has 2 blocks in common with the other, and their connectivity structures encompass a mix of core-periphery, assortative community and disassortative community structures, depending on which 3 of the 4 blocks are selected for each network.
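
As a worked instance of this grid, and assuming the nine equally spaced values are $\eps[\alpha] \in \{0, 0.03, \dots, 0.24\}$ (a step of $0.03$, which is what the stated range implies), the largest value gives
\begin{equation*}
\bm{\alpha}\big|_{\eps[\alpha] = 0.24} = \begin{pmatrix}
0.97 & 0.73 & 0.49 & 0.01 \\
0.73 & 0.73 & 0.01 & 0.49 \\
0.49 & 0.01 & 0.49 & 0.73 \\
0.01 & 0.49 & 0.73 & 0.25
\end{pmatrix},
\end{equation*}
while $\eps[\alpha] = 0$ gives a constant matrix with all entries equal to $0.25$. Every entry of $\bm{\alpha}$ lies in $[0.25 - \eps[\alpha],\, 0.25 + 3 \eps[\alpha]]$, so the entries remain valid probabilities as long as $\eps[\alpha] \leq 0.25$.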
The parameter $\eps[\alpha]$ represents the strength of the block structures encoded in $\bm{\alpha}$: the larger it is, the easier it is to tell one block apart from another. The true model for all simulations is the $\pi\rho$-colBiSBM.
\paragraph{Inference}
To measure the quality of the inference procedure, we use the inference described in Section~\ref{sec:variational-estimation-of-the-parameters}.
\paragraph{Quality indicators}
To assess the quality of the inference, we use the following indicators:
\begin{itemize}
\item First, for each dataset, we put $\pi$-colBiSBM in competition with $sep\text{-}BiSBM$, $iid$-colBiSBM, $\rho$-colBiSBM and $\pi\rho$-colBiSBM respectively. To do so, for each dataset, we compute the BIC-L of each model; $\pi$-colBiSBM is preferred to $sep\text{-}BiSBM$ (resp. $iid$-colBiSBM, $\rho$-colBiSBM, $\pi\rho$-colBiSBM) if its BIC-L is greater.
\item When considering our colBiSBM models, we compare $\widehat{Q_1}$ and $\widehat{Q_2}$ to their true values ($Q_1 = 4$ and $Q_2 = 4$).
\item Finally, we assess the quality of the node grouping by computing the Adjusted Rand Index (ARI) \parencite{hubertComparingPartitions1985}: ARI $= 0$ in expectation for a random grouping and ARI $= 1$ for a perfect match between groupings\footnote{Note that although the Rand Index (RI) can only take values between 0 and 1, the ARI can be negative when the RI is less than its expected value. This indicates a structured discordance between the groupings.}.
For each network, for the $\pi$-colBiSBM, $\rho$-colBiSBM and $\pi\rho$-colBiSBM, we compare the inferred block memberships to the true ones by computing the mean of the ARI per dimension over the two networks
\begin{equation*}
\overline{\text{ARI}}_d = \frac{1}{2} \big( \text{ARI}(\widehat{\bm{Z}^1_d},\bm{Z}^1_d) + \text{ARI}(\widehat{\bm{Z}^2_d},\bm{Z}^2_d) \big),
\end{equation*}
where $d$ is the dimension (i.e., rows, $d=1$, or columns, $d=2$) of the block memberships.
We also compute the ARI on the whole set of nodes to account for block pairing between networks
\begin{equation*}
\text{ARI}_d = \text{ARI}\big((\widehat{\bm{Z}^1_d},\widehat{\bm{Z}^2_d}),(\bm{Z}^1_d,\bm{Z}^2_d) \big).
\end{equation*}
The purpose of this metric is to verify that the block labels found in one network match the block labels in the second network.
\end{itemize}
All these quality indicators are averaged over the 108 datasets. The results are provided in Tables~\ref{tab:inference_results_iid} to~\ref{tab:inference_results_pirho}. Each row corresponds to the 108 datasets for a given value of $\eps[\alpha]$. Graphical representations of some results are shown in Figures~\ref{fig:inference-prop-modele-pref} and~\ref{fig:inference-ari-plots}.
\begin{figure}[ht]
\centering
\includestandalone{tikz/simulations/inference/model-proportions}
\caption{Preferred model proportions over all datasets as a function of $\eps[\alpha]$}
\label{fig:inference-prop-modele-pref}
\end{figure}
\begin{figure}[H]
\centering
\includestandalone{tikz/simulations/inference/ari-plots}
\caption{Plot of the ARI quality indicators as a function of $\eps[\alpha]$}
\label{fig:inference-ari-plots}
\end{figure}
\paragraph{Results}
For the model comparison, when $\eps[\alpha]$ is small ($\eps[\alpha]\in[0, .03]$), the simulation model is close to an Erd\H{o}s--R\'enyi network~\parencite{erdosRandomGraphs1959}, and it is very hard to find any structure beyond that of a single block on each dimension. Figure~\ref{fig:inference-prop-modele-pref} shows that from $\eps[\alpha] = 0.06$ onwards, the $\pi\rho$-colBiSBM model (i.e., the correct one) is selected around $75\%$ of the time.
Figure~\ref{fig:inference-ari-plots} shows that for $\eps[\alpha] \geq 0.09$, all the models, even $sep\text{-}BiSBM$, have an $\overline{\text{ARI}}$ of around $0.94$. This indicates that the models are able to assign nodes to the correct groups and thus that the inference works correctly.
An interesting result that can be read from the tables is that our models outperform $sep\text{-}BiSBM$ when considering the ARI on the whole set of nodes ($\text{ARI}_d$). This means that our models are able to recover the block pairing \emph{between the networks} in addition to recovering the blocks and their parameters.
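
For reference, the ARI \parencite{hubertComparingPartitions1985} between two partitions of the same $N$ nodes can be written from their contingency table. The notation below ($N$, $N_{k\ell}$, $a_k$, $b_\ell$, $E$) is introduced here for illustration only and is not used elsewhere in the text: $N_{k\ell}$ is the number of nodes placed in group $k$ by the first partition and in group $\ell$ by the second, with margins $a_k = \sum_{\ell} N_{k\ell}$ and $b_\ell = \sum_{k} N_{k\ell}$. Then
\begin{align*}
E &= \frac{\sum_{k} \binom{a_k}{2} \, \sum_{\ell} \binom{b_\ell}{2}}{\binom{N}{2}}, \\
\text{ARI} &= \frac{\sum_{k,\ell} \binom{N_{k\ell}}{2} - E}{\frac{1}{2} \Big( \sum_{k} \binom{a_k}{2} + \sum_{\ell} \binom{b_\ell}{2} \Big) - E}.
\end{align*}
The numerator compares the number of node pairs grouped together by both partitions with its expected value $E$ under a random grouping with fixed group sizes. This is why the ARI is $0$ in expectation for a random grouping, equals $1$ when the two partitions coincide up to a relabelling of the groups, and can be negative. For $\text{ARI}_d$, the memberships of both networks are concatenated and therefore share a single set of labels, so a model that recovers the blocks of each network but pairs them inconsistently across networks is penalized, whereas $\overline{\text{ARI}}_d$ is unaffected by such a mismatch.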