1107 lines
No EOL
55 KiB
TeX
1107 lines
No EOL
55 KiB
TeX
\documentclass[12pt,a4paper]{report}
|
|
%====En-tête====
|
|
% Ajout des packages
|
|
\usepackage[english]{babel} % pour dire que le texte est en francais
|
|
\usepackage{a4} % pour la taille
|
|
\usepackage[T1]{fontenc} % pour les font postscript
|
|
\usepackage[cyr]{aeguill} % Police vectorielle TrueType, guillemets francais
|
|
\usepackage{epsfig} % pour gérer les images
|
|
\usepackage{amsmath,amsthm, mathtools} % très bon mode mathématique
|
|
\usepackage{amsfonts,amssymb,bm, bbold}% permet la definition des ensembles
|
|
\usepackage{algorithm2e} % pour les algorithmes
|
|
\usepackage{algpseudocode} % pour les algorithmes
|
|
\usepackage{float} % pour le placement des figure
|
|
\usepackage{url} % pour une gestion efficace des url
|
|
\usepackage[citecolor=blueind,urlcolor=blue,bookmarks=false,hypertexnames=true]{hyperref} % pour les hyperliens dans le document
|
|
\usepackage{tocbibind} % Pour avoir des index pour table des matières, biblio
|
|
\usepackage{tikz} % For graph plots
|
|
\usepackage[super]{nth}
|
|
\usepackage{caption} % Figures
|
|
\usepackage{subcaption} % And Subfigures
|
|
\usepackage{longtable}
|
|
\usepackage{rotating} % For allowing to rotate figures
|
|
\usepackage{svg} % To allow svg inclusions
|
|
\usepackage{float} % To allow Pandoc to control figure placement
|
|
% \usepackage{booktabs} % For good tables
|
|
|
|
%% Bibliography
|
|
\usepackage[style=apa,citestyle=authoryear-comp]{biblatex}
|
|
\addbibresource{../references.bib}
|
|
|
|
%% For good md to tex conversion
|
|
\providecommand{\tightlist}{%
|
|
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
|
|
\usepackage{booktabs}
|
|
|
|
% Images
|
|
|
|
|
|
% Figure placement
|
|
\floatplacement{figure}{H}
|
|
|
|
%% Tikz Related
|
|
\usetikzlibrary{calc,shapes,backgrounds,arrows,automata,shadows,positioning}
|
|
\usetikzlibrary{arrows,shapes,positioning,shadows,trees,calc,backgrounds,automata,positioning}
|
|
|
|
|
|
|
|
\tikzset{
|
|
basic/.style = {draw, text width=3cm, font=\sffamily, rectangle},
|
|
root/.style = {basic, rounded corners=2pt, thin, align=center,
|
|
fill=green!30},
|
|
level 2/.style = {basic, rounded corners=6pt, thin,align=center, fill=green!60,
|
|
text width=8em},
|
|
level 3/.style = {basic, thin, align=left, fill=pink!60, text width=3.5cm}
|
|
}
|
|
|
|
|
|
% pour tickz multilevel
|
|
\definecolor{redorg}{RGB}{215,48,39}
|
|
\definecolor{orangeorg}{RGB}{253,174,97}
|
|
|
|
\definecolor{blueind}{RGB}{69,117,233}
|
|
\definecolor{cyanind}{RGB}{116,173,209}
|
|
\definecolor{electricblue}{RGB}{125, 249, 255}
|
|
|
|
\definecolor{greenind}{RGB}{112,130,56}
|
|
|
|
\definecolor{burntorange}{RGB}{204, 85, 0}
|
|
\definecolor{goldenyellow}{RGB}{255, 192, 0}
|
|
\definecolor{yellow}{RGB}{255,255,0}
|
|
|
|
\definecolor{gray}{RGB}{128,128,128}
|
|
|
|
% Nouvelles commandes
|
|
\newcommand{\Tau}{\mathcal{T}}
|
|
\newcommand{\eps}[1][]{\ensuremath{\epsilon_{#1}}}
|
|
|
|
% titre et auteur
|
|
\title{Rapport de stage dans l'UMR MIA Paris-Saclay}
|
|
\author{Louis Lacoste}
|
|
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
|
|
\begin{document}
|
|
\maketitle
|
|
\tableofcontents
|
|
|
|
\section{Remerciements}
|
|
|
|
Je tiens à remercier en premier lieu Pierre Barbillon pour son encadrement
|
|
remarquable, sa disponibilité, ses conseils avisés et sa gentillesse.
|
|
Saint-Clair Chabert-Liddell pour son accompagnement, ses remarques, ses
|
|
explications et le temps qu'il m'a consacré. Merci à Sophie Donnet, pour les
|
|
cours et les idées qu'elle m'a donné. Merci à Julien pour sa sympathie et tout
|
|
son travail en tant que DU!
|
|
|
|
Merci à Farida, Christelle et Sébastien pour avoir expliqué et mené les
|
|
démarches administratives.
|
|
|
|
Un merci tout particulier à tous les doctorants (ou futurs doctorants): Mary,
|
|
Marina, Emré, Tam, Caroline, Jérémy, Florian, Annaïg, Jules, Tanguy, Barbara,
|
|
Bastien et Armand. Merci à tous les autres stagiaires, particulièrement:
|
|
Francois, Pierre, Camille, Maxime.
|
|
|
|
Merci à tous les permanents du \nth{3} étage, parmi lesquels: Christophe,
|
|
Stéphane et Vincent.
|
|
|
|
Merci à Hugo, Théodore, Éric, Jean-Benoist, Nicolas, Tristan, Sarah, Jade et
|
|
Pierre Gloaguen.
|
|
|
|
Un grand merci à tous ceux qui ont participé de près ou de loin au bon
|
|
déroulement de ce stage.
|
|
|
|
% TODO Compléter les remerciements
|
|
|
|
\chapter{L'UMR MIA Paris-Saclay}
|
|
|
|
\include{presentation_UMR}
|
|
|
|
\chapter{Context of the study}
|
|
|
|
\section{Usage and importance of bipartite graphs}
|
|
\label{sec:usage-and-importance-of-bipartite-graphs}
|
|
Bipartite graphs, denoted as $G = (U,V,E)$ with $U$ and $V$ two disjoint and
|
|
independent sets of vertices and $E$ the set of edges connecting $U$ vertices to
|
|
$V$ vertices.
|
|
|
|
\begin{minipage}{0.5\linewidth}
|
|
\centering
|
|
Bipartite network\\
|
|
\begin{tikzpicture}[scale=.6]
|
|
\tikzstyle{every edge}=[-,>=stealth',shorten >=1pt,auto,draw,line width=1.5pt]
|
|
\tikzstyle{every state}=[draw, text=black,scale=0.95, transform shape]
|
|
\tikzstyle{every state}=[draw=none,text=black,scale=0.75, transform shape]
|
|
\tikzstyle{every node}=[fill=blueind]
|
|
|
|
\node[state, draw=black!50] (A1) at (0,5) {\textbf{R1}};
|
|
\node[state, draw=black!50] (A2) at (2.5,5) {\textbf{R2}};
|
|
\node[state, draw=black!50] (A3) at (5,5) {\textbf{R3}};
|
|
|
|
\tikzstyle{every node}=[fill=greenind, shape=rectangle]
|
|
\tikzstyle{every state}=[draw=none,text=black,scale=0.75, transform shape, shape=rectangle]
|
|
\node[state, draw=black!50] (B1) at (0,0) {\textbf{C1}};
|
|
\node[state, draw=black!50] (B2) at (1.25,0) {\textbf{C2}};
|
|
\node[state, draw=black!50] (B3) at (2.5,0) {\textbf{C3}};
|
|
\node[state, draw=black!50] (B4) at (3.75,0) {\textbf{C4}};
|
|
\node[state, draw=black!50] (B5) at (5,0) {\textbf{C5}};
|
|
\path (A1) edge [] (B1);
|
|
\path (A1) edge (B2);
|
|
\path (A1) edge (B3);
|
|
\path (A1) edge (B4);
|
|
\path (A2) edge (B3);
|
|
\path (A2) edge (B4);
|
|
\path (A3) edge (B5);
|
|
\path (A2) edge (B5);
|
|
\end{tikzpicture}
|
|
\end{minipage}
|
|
\begin{minipage}{0.5\linewidth}
|
|
\begin{center}
|
|
Incidence matrix
|
|
$X=\left(
|
|
\begin{array}{rrrrr}
|
|
1 & 1 & 1 & 1 & 0 \\
|
|
0 & 0 & 1 & 1 & 1 \\
|
|
0 & 0 & 0 & 0 & 1 \\
|
|
\end{array}\right)
|
|
$\\
|
|
\end{center}
|
|
\end{minipage}
|
|
|
|
$X$ is the \emph{incidence matrix} and is the mathematical object on which
|
|
computations are performed. It is filled with the following rule:
|
|
\begin{equation*}
|
|
\begin{cases}
|
|
X_{ij} = 0 & \text{if no interaction is observed between species }i\text{ and }j \\
|
|
X_{ij} \neq 0 & \text{otherwise}
|
|
\end{cases}
|
|
\end{equation*}
|
|
If the network represents binary observation (like presence-absence observation) then
|
|
$X_{ij}\in\mathcal{K}=\{0,1\},\forall(i,j)$; if the interactions are weighted
|
|
(like an abundance count), $X_{ij}\in\mathcal{K}=\mathbb{N},\forall(i,j)$.
|
|
|
|
This representation can be used to represent various forms of interactions were
|
|
two kinds of ``actors`` interact. Those interactions can be binary or valued
|
|
and a numeric representation is the incidence matrix, in the above example
|
|
$X$.\\
|
|
|
|
Among the use case of bipartite graphs one can find the Netflix Problem, which
|
|
was a prize organized by Netflix to improve its Recommender system. The row
|
|
nodes are the movies and the columns are the user, at the intersection the
|
|
value is the review of the user $j$ for the movie $i$.\\
|
|
|
|
Another use is the representation of ecological interactions like
|
|
plant-pollinator \parencite{ramos-jilibertoTopologicalChangeAndean2010},
|
|
birds-seed dispersion, prey-predator or host-parasite
|
|
\parencite{kaszewska-gilasGlobalStudiesHostParasite2021}. In those cases, the
|
|
rows are pollinator species and the columns are plant species, and the
|
|
intersection is a value, binary if it is a presence/absence or a value if it is
|
|
an abundance count.
|
|
|
|
Bipartite graphs are widely used in biology, in various fields, among which the
|
|
previously cited ecological networks, but also in medicine with biomedical
|
|
networks, biomolecular networks or epidemiological networks.
|
|
\parencite{pavlopoulosBipartiteGraphsSystems2018}
|
|
|
|
Some interesting results can arise when applying a tool widely used on a
|
|
particular kind of interactions is used on another kind of interactions.
|
|
Companies like Netflix use recommender system, to recommend another product to
|
|
consumers based on their previous interactions. In
|
|
~\cite{desjardins-proulxEcologicalInteractionsNetflix2017} the authors use the
|
|
\emph{K-nearest neighbour} (KNN) algorithm as a Recommender to predict missing
|
|
preys for predators in a predator-prey network.
|
|
|
|
\section{Latent Block Model}
|
|
\label{sec:latent-block-model}
|
|
The Latent Block Model (LBM) introduced by ~\cite{govaertLatentBlockModel2010}
|
|
adapts the Stochastic Block Model (SBM)
|
|
(~\cite{hollandStochasticBlockmodelsFirst1983};~\cite{snijdersEstimationPredictionStochastic1997})
|
|
to bipartite graphs.
|
|
|
|
\begin{small}
|
|
Please note that we prefer the term ``BiSBM`` and will use both LBM and BiSBM to
|
|
designate the Stochastic Block model applied on bipartite networks.
|
|
\end{small}
|
|
|
|
This model supposes that:
|
|
\begin{itemize}
|
|
\item Row nodes are members of row blocks and column nodes are members of column
|
|
blocks.
|
|
\item The connectivity of two individuals is determined by their block memberships.
|
|
\item An interaction can only occur between a row and a column node.
|
|
\end{itemize}
|
|
|
|
\begin{figure}[H]
|
|
\center
|
|
\begin{tikzpicture}[scale=.6]
|
|
\tikzstyle{every state}=[draw, text=black,scale=0.95, transform shape]
|
|
\tikzstyle{every state}=[draw=none,text=black,scale=0.75, transform shape]
|
|
\tikzset{edge_proba/.style={draw=white, fill=none, text=black}}
|
|
|
|
\tikzstyle{every node}=[fill=blueind]
|
|
\node[edge_proba] (pi1) at (1,5.7) {\textbf{$\pi_{{\color{blueind}\bullet}}$}};
|
|
\node[state, draw=black!50] (R11) at (0,5) {\textbf{R11}};
|
|
\node[state, draw=black!50] (R12) at (1,5) {\textbf{R12}};
|
|
\node[state, draw=black!50] (R13) at (2,5) {\textbf{R13}};
|
|
|
|
\tikzstyle{every node}=[fill=cyanind]
|
|
\node[edge_proba] (pi2) at (6.75,5.7) {\textbf{$\pi_{{\color{cyanind}\bullet}}$}};
|
|
\node[state, draw=black!50] (R21) at (6.25,5) {\textbf{R21}};
|
|
\node[state, draw=black!50] (R22) at (7.25,5) {\textbf{R22}};
|
|
|
|
\tikzstyle{every node}=[fill=electricblue]
|
|
\node[edge_proba] (pi3) at (10,5.7) {\textbf{$\pi_{{\color{electricblue}\bullet}}$}};
|
|
\node[state, draw=black!50] (R31) at (10,5) {\textbf{R31}};
|
|
|
|
\tikzstyle{every node}=[fill=burntorange, shape=rectangle]
|
|
\node[edge_proba] (pi3) at (0.5,-0.7) {\textbf{$\rho_{{\color{burntorange}\bullet}}$}};
|
|
\tikzstyle{every state}=[draw=none,text=black,scale=0.75, transform shape, shape=rectangle]
|
|
\node[state, draw=black!50] (B1) at (0,0) {\textbf{C11}};
|
|
\node[state, draw=black!50] (B2) at (1,0) {\textbf{C12}};
|
|
\tikzstyle{every node}=[fill=goldenyellow, shape=rectangle]
|
|
\node[edge_proba] (pi3) at (4,-0.7) {\textbf{$\rho_{{\color{goldenyellow}\bullet}}$}};
|
|
\node[state, draw=black!50] (B3) at (3.5,0) {\textbf{C21}};
|
|
\node[state, draw=black!50] (B4) at (4.5,0) {\textbf{C22}};
|
|
\tikzstyle{every node}=[fill=yellow, shape=rectangle]
|
|
\node[edge_proba] (pi3) at (10,-0.7) {\textbf{$\rho_{{\color{yellow}\bullet}}$}};
|
|
\node[state, draw=black!50] (B5) at (10,0) {\textbf{C31}};
|
|
|
|
\tikzstyle{every edge}=[-,>=stealth',shorten >=1pt,auto,draw,line width=1.5pt,draw opacity=0.2]
|
|
|
|
\path (R11) edge (B2);
|
|
\path (R11) edge (B3);
|
|
\path (R11) edge (B4);
|
|
|
|
\path (R12) edge [] (B1);
|
|
\path (R12) edge (B2);
|
|
\path (R12) edge (B3);
|
|
\path (R12) edge (B4);
|
|
|
|
\path (R13) edge [] (B1);
|
|
\path (R13) edge (B2);
|
|
\path (R13) edge (B3);
|
|
|
|
\path (R21) edge (B4);
|
|
\path (R21) edge (B5);
|
|
|
|
\path (R22) edge (B3);
|
|
\path (R22) edge (B4);
|
|
|
|
\path (R11) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[left, fill=none] {$\alpha_{{\color{blueind}\bullet}{\color{burntorange}\bullet}}$} (B1);
|
|
\path (R13) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, left, fill=none] {$\alpha_{{\color{blueind}\bullet}{\color{goldenyellow}\bullet}}$} (B4);
|
|
\path (R21) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, right, fill=none] {$\alpha_{{\color{cyanind}\bullet}{\color{goldenyellow}\bullet}}$} (B3);
|
|
\path (R22) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, left, fill=none] {$\alpha_{{\color{cyanind}\bullet}{\color{yellow}\bullet}}$} (B5);
|
|
\path (R31) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, right, fill=none] {$\alpha_{{\color{electricblue}\bullet}{\color{yellow}\bullet}}$} (B5);
|
|
|
|
\end{tikzpicture}
|
|
\caption{An LBM model visualization}
|
|
\label{fig:LBMvisu}
|
|
\end{figure}
|
|
|
|
\begin{itemize}
|
|
\item $Q_1 = |\{{\color{blueind}\bullet},{\color{cyanind}\bullet},{\color{electricblue}\bullet}\}|$ \emph{given} blocks in rows
|
|
\item $Q_2 = |\{{\color{burntorange}\bullet},{\color{goldenyellow}\bullet},{\color{yellow}\bullet}\}|$ \emph{given} blocks in columns
|
|
\end{itemize}
|
|
Parameters
|
|
\begin{itemize}
|
|
\item $\pi_{\bullet} = \mathbb{P}(Z_i = \bullet)$ for rows and $\rho_{\bullet} = \mathbb{P}(W_j = \bullet)$ for columns
|
|
\item $\alpha_{{\color{blueind}\bullet}{\color{burntorange}\bullet}} = \mathbb{P}(X_{ij} = 1 | Z_i = {\color{blueind}\bullet}, W_j = {\color{burntorange}\bullet})$, probability of connectivity knowing node membership blocks.
|
|
\end{itemize}
|
|
|
|
On \ref{fig:LBMvisu}, $\bm{\pi}$ are the probabilities for a row node to belong
|
|
to the row block of corresponding color, $\bm{\rho}$ are the probabilities for
|
|
a column node to belong to the column block of corresponding color and
|
|
$\bm{\alpha}$ are the connectivity parameters between the row and column
|
|
blocks.
|
|
|
|
This model can be used to easily generate bipartite graphs with complex and
|
|
very varied structures. But when trying to determine the structure of a given
|
|
network we need to find those parameters and as the row and column block
|
|
memberships are \emph{latent} i.e.,\ they are not known and must be inferred.
|
|
|
|
For this a common approach is to use a VEM algorithm (proposed for SBM in
|
|
~\cite{daudinMixtureModelRandom2008} and for LBM in
|
|
~\cite{govaertEMAlgorithmBlock2005}) those groups and the required parameters
|
|
can be inferred by maximizing a lower bound of the likelihood minus a penalty.
|
|
|
|
\section{colSBM model, a joint model for a collection of networks}
|
|
\label{sec:colsbm-model-a-joint-model-for-a-collection-of-networks}
|
|
The \emph{colSBM} model introduced by ~\cite{chabert-liddellLearningCommonStructures2023}
|
|
propose an extension of the SBM model to collections of SBMs. A collection is a
|
|
set of networks which nodes are not common or linked between different networks,
|
|
the interactions have the same valuations and are of the same type.
|
|
|
|
The model can retrieve the shared structure in a collection, indicate if
|
|
networks should be grouped in a collection and in a large pool of networks,
|
|
collections with common structures.
|
|
|
|
The next step after designing this collection model for unipartite was to adapt
|
|
it to the bipartite case.
|
|
|
|
\chapter{Structure detection in a collection of bipartite networks : Adjustment of colSBM to the bipartite case}
|
|
\section{Definition of a collection}\label{sec:definition-of-a-collection}
|
|
|
|
We define a collection of bipartite networks as $\bm{X} = (X^1, \dots, X^M)$
|
|
the collection of incidence matrix. Moreover, all the networks in the
|
|
collection have the same type of interaction (e.g., all interactions are
|
|
binary).
|
|
|
|
\section{Separate BiSBM (sep-BiSBM)}\label{sec:separate-bisbm-sepbisbm}
|
|
|
|
A first approach to deal with a collection of networks is to adjust separate
|
|
BiSBM for each network of the collection.
|
|
|
|
For network $m$, let $n_1^m$ (resp. $n_2^m$) be the number of nodes in row
|
|
(resp. column) divided into $Q_1^m$ row clusters (resp. $Q_2^m$ column
|
|
clusters).\\ Let $Z^m~=~(Z^m_i, \dots, Z^m_{n_1^m})$ and $W^m~=~(W^m_j, \dots,
|
|
W^m_{n_2^m})$ be independent latent variables such that $Z^m_i = q$ if row node
|
|
$i$ of network $m$ belongs to row cluster $q$ ($q\in\{1,\dots,Q_1^m\}$) and
|
|
$W^m_j = r$ if column node $j$ of network $m$ belong to column block $r$
|
|
($r\in\{1,\dots,Q_2^m\}$). And we have
|
|
\begin{align}\label{eqn:lbm-block-membership-prob}
|
|
\mathbb{P}(Z_i^m=q)=\pi_q^m, & & \mathbb{P}(W_j^m=r)=\rho_r^m
|
|
\end{align}
|
|
where $\pi_q^m > 0$, $\rho_r^m > 0$, $\sum_{q=1}^{Q_1^m}\pi_q^m = 1$ and
|
|
$\sum_{r=1}^{Q_2^m}\rho_r^m = 1$. Given the latent variables
|
|
$Z^m, W^m$, the $X_{ij}^m$s are assumed to be independent and distributed
|
|
as
|
|
|
|
\begin{align}\label{eqn:lbm-conditional-to-latent}
|
|
X_{ij}^m|Z_i^m = q,W_j^m = r \sim \mathcal{F}(.;\alpha_{qr}^m)
|
|
\end{align}
|
|
where $\mathcal{F}$ is referred to as the emission distribution. $\mathcal{F}$ is chosen to
|
|
be the Bernoulli distribution for binary interactions, and the Poisson
|
|
distribution for weighted interactions such as counts. Let $f$ be the density of
|
|
the emission distribution, then:
|
|
|
|
\begin{equation}\label{eqn:lbm-emission}
|
|
\log f(X^m_{ij};\alpha_{qr}^m) =
|
|
\begin{cases}
|
|
X_{ij}^m \log(\alpha_{qr}^m) + (1-X_{ij}^m) \log(1-\alpha_{qr}^m) & \text{for Bernoulli emission} \\
|
|
-\alpha_{qr}^m + X_{ij}^m \log(\alpha_{qr}^m) - \log(X_{ij}^m!) & \text{for Poisson emission}
|
|
\end{cases}
|
|
\end{equation}
|
|
|
|
Equations \eqref{eqn:lbm-block-membership-prob},
|
|
\eqref{eqn:lbm-conditional-to-latent} and \eqref{eqn:lbm-emission} defines the
|
|
BiSBM model and we will now use a short notation:
|
|
|
|
\begin{equation}
|
|
\tag{\emph{sep-BiSBM}}
|
|
X^m \sim \mathcal{F}\text{-BiSBM}_{n_1^m,n_2^m}(Q_1^m, Q_2^m, \bm{\pi^m}, \bm{\rho^m}, \bm{\alpha^m})
|
|
\end{equation}
|
|
where $\mathcal{F}$ encodes the emission distribution, $n_1^m,n_2^m$ are the row
|
|
and column nodes, $Q_1^m, Q_2^m$ are the number of row and column blocks in
|
|
network $m$, $\bm{\pi}^m~=~{(\pi^m_q)}_{q=1,\dots,Q_1^m}$ and
|
|
$\bm{\rho}^m~=~{(\rho^m_r)}_{r=1,\dots,Q_2^m}$ are the vectors of their
|
|
proportions. The $Q_1^m \times Q_2^m$ matrix
|
|
$\bm{\alpha}^m = {(\alpha^m_{qr})}_{\substack{q = 1,\dots,Q_1^m \\ r = 1,\dots,Q_2^m}}$
|
|
are the connectivity parameters, the parameters of the emission distribution.
|
|
$\alpha^m_{qr}\in\mathcal{A}_{\mathcal{F}}$ where, for the Bernoulli
|
|
(resp. Poisson) emission distribution, $\mathcal{A}_{\mathcal{F}} = (0,1)$ (resp.
|
|
$\mathcal{A}_{\mathcal{F}} = \mathbb{R}^{*+}$). In this $sep$-$BiSBM$ each
|
|
network $m$ is assumed to follow a $BiSBM$ with its own parameters ($\bm{\pi}^m,
|
|
\bm{\rho}^m, \bm{\alpha}^m$).
|
|
% DONE Finish explaining
|
|
|
|
\section{Definition of the colBiSBM models}\label{sec:definition-of-the-colbisbm-models}
|
|
% Here are some common notations and conventions that we will use in the following
|
|
% sections.
|
|
|
|
\subsection{A collection of i.i.d bipartite SBM}\label{ssec:a-collection-of-i-i-d-bipartite-sbm}
|
|
As for \emph{colSBM} this first model is the most constrained. It assumes that
|
|
all the networks are the independent realizations of the same $Q_1$-$Q_2$-BiSBM
|
|
with identical parameters. The \emph{iid-colBiSBM} is defined as follows:
|
|
|
|
\begin{align}
|
|
\tag{\emph{iid-colBiSBM}}
|
|
X^m \sim \mathcal{F}-BiSBM_{n_1^m,n_2^m} (Q_1, Q_2, \bm{\pi}, \bm{\rho}, \bm{\alpha}), & & \forall m = 1, \dots M
|
|
\end{align}
|
|
where $\forall (q,r) \in \{1,\dots,Q_1\}\times\{1,\dots,Q_2\}$, $\alpha_{qr} \in \mathcal{A}_{\mathcal{F}}$,
|
|
$\pi_q \in \left( 0,1 \right], \sum_{q=1}^{Q_1} \pi_q = 1 $ and $\rho_r \in \left( 0,1 \right], \sum_{r=1}^{Q_2} \rho_r = 1 $.
|
|
This model involves $(Q_1 - 1) + (Q_2 - 1) + Q_1\times Q_2$ parameters, the two
|
|
first terms corresponding to block proportions on the row and column dimensions
|
|
and the third term to connectivity parameters.
|
|
|
|
But the assumption that block proportions are the same among the networks is a
|
|
strong assumption. In plant-pollinator networks, the proportion of specialist
|
|
species can differ between networks and thus the model may benefit from not
|
|
having the same block proportions but sharing a common connectivity structure.
|
|
The following models relaxes this assumption on either row, column or both.
|
|
|
|
\subsection{A collection of bipartite SBM with varying block size on either rows or columns}\label{ssec:a-collection-of-bipartite-sbm-with-varying-block-size-on-either-rows-or-columns}
|
|
% DONE Finish explaining
|
|
|
|
$\pi$-colBiSBM model still assumes that the networks share a common connectivity
|
|
structure represented by $\bm{\alpha}$ but that each network has its own row
|
|
block proportions. For $m \in \{1,\dots,M\}$, the $X^m$ are independent and
|
|
\begin{align}
|
|
\tag{\emph{$\pi$-colBiSBM}}
|
|
X^m \sim \mathcal{F}-BiSBM_{n_1^m,n_2^m} (Q_1, Q_2, \bm{\pi^m}, \bm{\rho}, \bm{\alpha}), & & \forall m = 1, \dots, M
|
|
\end{align}
|
|
where $\forall (q,r) \in \{1,\dots,Q_1\}\times\{1,\dots,Q_2\}$, $\alpha_{qr} \in \mathcal{A}_{\mathcal{F}}$,
|
|
$\pi^m_q \in \left[ 0,1 \right], \sum_{q=1}^{Q_1} \pi^m_q~=~1, \forall m \in \{1,\dots,M\}$ and $\rho_r \in \left( 0,1 \right], \sum_{r=1}^{Q_2} \rho_r = 1 $.
|
|
This model is more flexible than the iid-colBiSBM as it allows some row block
|
|
proportions to be null
|
|
in certain networks ($\pi^m_q\in\left[ 0,1 \right]$): if $\pi_q^m = 0$ then the
|
|
block $q$ is not represented in the network $m$. The connectivity structure is
|
|
thus a subset of a large connectivity structure common to all networks. We face
|
|
the same problems as~\cite{chabert-liddellLearningCommonStructures2023} and
|
|
adapt the support $S$ they define for the $\pi$-colSBM to the bipartite case by
|
|
having $S^1$ of size $M\times Q_1$ the support for the rows and $S^2$ of size
|
|
$M\times Q_2$ the support for the columns. Thus
|
|
$S^1_{mq} = \mathbb{1}_{\pi^m_q > 0}$ and
|
|
$S^2_{mr} = \mathbb{1}_{\rho^m_r > 0}$. In this case, $S^2 = \bm{1}$, because
|
|
there is no freedom on the column dimension.
|
|
|
|
For a given number of blocks $Q_1$, $Q_2$ and matrix $S^1$ ($S^2$ being in this
|
|
case the matrix full of ones), the number of parameters is:
|
|
\begin{equation*}
|
|
\text{NP}(\pi\text{-}colBiSBM) = \sum_{m=1}^{M}\Bigg( \sum_{q=1}^{Q_1} S^1_{mq} - 1 \Bigg) + (Q_2 - 1) + \sum_{\substack{q=1,\dots,Q_1 \\ r=1,\dots,Q_2}} \mathbb{1}_{{(S^{1\prime}S^2)}_{qr}>0}
|
|
\end{equation*}
|
|
The first term corresponds to the non-null block proportions in each network.
|
|
The third quantity accounts for the fact that some blocks may never be
|
|
represented simultaneously in any network, so the corresponding connection
|
|
parameters $\alpha_{qr}$ are not useful for defining the model.
|
|
|
|
$\rho$-colBiSBM model still assumes that the networks share a common connectivity
|
|
structure represented by $\bm{\alpha}$ but that each network has its own column
|
|
block proportions. For $m \in \{1,\dots,M\}$, the $X^m$ are independent and
|
|
\begin{align}
|
|
\tag{\emph{$\rho$-colBiSBM}}
|
|
X^m \sim \mathcal{F}-BiSBM_{n_1^m,n_2^m} (Q_1, Q_2, \bm{\pi}, \bm{\rho^m}, \bm{\alpha}), & & \forall m = 1, \dots, M
|
|
\end{align}
|
|
where $\forall (q,r) \in \{1,\dots,Q_1\}\times\{1,\dots,Q_2\}$, $\alpha_{qr} \in \mathcal{A}_{\mathcal{F}}$,
|
|
$\pi_q \in \left( 0,1 \right], \sum_{q=1}^{Q_1} \pi_q = 1 $ and
|
|
$\rho^m_r \in \left[ 0,1 \right], \sum_{r=1}^{Q_2} \rho^m_r = 1 $.
|
|
This model is more flexible than the iid-colBiSBM as it allows some column block
|
|
proportions to be
|
|
null in certain networks ($\rho^m_r\in\left[ 0,1 \right]$): if $\rho_r^m = 0$
|
|
then the column block $r$ is not represented in the network $m$.
|
|
"Mirroring" the formulas for the $\pi$-$colBiSBM$ we relax the constraints on
|
|
the column dimension.
|
|
|
|
For a given number of blocks $Q_1$, $Q_2$ and matrix $S^2$ ($S^1$ being in this
|
|
case the matrix full of ones), the number of parameters is:
|
|
\begin{equation*}
|
|
\text{NP}(\rho\text{-}colBiSBM) = (Q_1 - 1) + \sum_{m=1}^{M}\Bigg( \sum_{r=1}^{Q_2} S^2_{mr} - 1 \Bigg) + \sum_{\substack{q=1,\dots,Q_1 \\ r=1,\dots,Q_2}} \mathbb{1}_{{(S^{1\prime}S^2)}_{qr}>0}
|
|
\end{equation*}
|
|
|
|
$\pi\rho$-colBiSBM model still assumes that the networks share a common connectivity
|
|
structure represented by $\bm{\alpha}$ but that each network has its own row and
|
|
column block proportions, it is the less constrained model.
|
|
For $m \in \{1,\dots,M\}$, the $X^m$ are independent and
|
|
\begin{align}
|
|
\tag{\emph{$\pi\rho$-colBiSBM}}
|
|
X^m \sim \mathcal{F}-BiSBM_{n_1^m,n_2^m} (Q_1, Q_2, \bm{\pi^m}, \bm{\rho^m}, \bm{\alpha}), & & \forall m = 1, \dots, M
|
|
\end{align}
|
|
where $\forall (q,r) \in \{1,\dots,Q_1\}\times\{1,\dots,Q_2\}$, $\alpha_{qr} \in \mathcal{A}_{\mathcal{F}}$,
|
|
$\pi^m_q \in \left[ 0,1 \right], \sum_{q=1}^{Q_1} \pi^m_q~=~1, \forall m \in \{1,\dots,M\}$ and
|
|
$\rho^m_r \in \left[ 0,1 \right], \sum_{r=1}^{Q_2} \rho^m_r = 1 $.
|
|
|
|
For a given number of blocks $Q_1$, $Q_2$ and matrices $S^1$, $S^2$, the number
|
|
of parameters is:
|
|
\begin{equation*}
|
|
\text{NP}(\pi\rho\text{-}colBiSBM) = \sum_{m=1}^{M}\Bigg( \sum_{q=1}^{Q_1} S^1_{mq} - 1 \Bigg) + \sum_{m=1}^{M}\Bigg( \sum_{r=1}^{Q_2} S^2_{mr} - 1 \Bigg) + \sum_{\substack{q=1,\dots,Q_1 \\ r=1,\dots,Q_2}} \mathbb{1}_{{(S^{1\prime}S^2)}_{qr}>0}
|
|
\end{equation*}
|
|
|
|
\section{Variational estimation of the parameters}\label{sec:variational-estimation-of-the-parameters}
|
|
|
|
In practice, the estimation of the likelihood is not tractable. Following the
|
|
classical approach defined in~\cite{daudinMixtureModelRandom2008} we use a
|
|
variatonal version of the Expectation Maximization (VEM) algorithm.
|
|
|
|
We maximize a variational lower bound of the log-likelihood of the observed
|
|
data by approximating $p(\bm{Z,W}|\bm{X};\bm{\theta})$ with a distribution on
|
|
$\bm{Z}$ and $\bm{W}$ named $\mathcal{R}$ defined as $\mathcal{R} =
|
|
\otimes_{m=1}^M \mathcal{R}_m$.\
|
|
|
|
The lower bound is defined as:
|
|
\begin{equation*}
|
|
\mathcal{J}(\mathcal{R};\bm{\theta}) \coloneqq \sum_{m=1}^{M} \bigg( \mathbb{E}_{\mathcal{R}_m}[\ell(X^m,Z^m,W^m;\bm{\theta})] + \mathcal{H}(\mathcal{R}_m) \bigg) \leq \ell(\bm{X};\bm{\theta})
|
|
\end{equation*}
|
|
|
|
$\bm{Z}$ and $\bm{W}$ are
|
|
redefined using the \emph{one-hot encoded} conversion (i.e., $Z_i^m = q
|
|
\rightarrow Z_{iq}^m = 1$ and $W_j^m = r \rightarrow W_{jr}^m = 1$).\\ % W_{jr\prime}^m pour r != r égal 0
|
|
|
|
When $\mathcal{R}_m$ is issued from the set of the factorizable distributions,
|
|
we denote $\tau_{iq}^{1,m} = \mathbb{P}_{\mathcal{R}_m}(Z_{iq}^m = 1|X_{ij}^m)$
|
|
and $\tau_{jr}^{2,m} = \mathbb{P}_{\mathcal{R}_m}(W_{jr}^m = 1|X_{ij}^m)$, thus
|
|
we have: $\mathbb{P}_{\mathcal{R}_m} (Z_{iq}^m = 1, W_{jr}^m = 1|X_{ij}^m) =
|
|
\mathbb{P}_{\mathcal{R}_m}(Z_{iq}^m = 1|X_{ij}^m) {\color{red}\times}
|
|
\mathbb{P}_{\mathcal{R}_m}(W_{jr}^m = 1|X_{ij}^m) = \tau_{iq}^{1,m}
|
|
{\color{red}\times} \tau_{jr}^{2,m}$.
|
|
|
|
The formula for the entropy per network is thus:
|
|
\begin{equation*}
|
|
\mathcal{H}(\mathcal{R}_m) = - \sum_{i=1}^{n_1} \tau^{1,m}_{i,q} \log \tau^{1,m}_{i,q} - \sum_{j=1}^{n_2} \tau^{2,m}_{j,r} \log \tau^{2,m}_{j,r}
|
|
\end{equation*}
|
|
|
|
And the expectation of the completed log-likelihood under the $\mathcal{R}_m$
|
|
variational distribution for network $m$ is:
|
|
\begin{align*}
|
|
\mathbb{E}_{\mathcal{R}_m}[\ell(X^m,Z^m,W^m;\bm{\theta})] = \sum_{i = 1}^{n_1^m}\sum_{j=1}^{n_2^m}\sum_{q \in \mathcal{Q}_{1,m}} \sum_{r \in \mathcal{Q}_{2,m}} \tau^{1,m}_{i,q} \tau^{2,m}_{j,r} \log f(X^{m}_{ij}; \alpha_{qr}) \\
|
|
+ \sum_{i=1}^{n_1^m} \sum_{q \in \mathcal{Q}_{1,m}} \tau^{1,m}_{i,q} \log \pi_{\color{black}q}^{\color{gray}m} + \sum_{j=1}^{n_2^m} \sum_{r \in \mathcal{Q}_{2,m}} \tau^{2,m}_{j,r} \log \rho_{\color{black}r}^{\color{gray}m}
|
|
\end{align*}
|
|
|
|
And thus the lower bound becomes:
|
|
|
|
\begin{align*}
|
|
\mathcal{J}(\bm{\tau};\bm{\theta}) \coloneqq \sum_{m=1}^{M} \bigg(\sum_{i = 1}^{n_1^m}\sum_{j=1}^{n_2^m}\sum_{q \in \mathcal{Q}_{1,m}} \sum_{r \in \mathcal{Q}_{2,m}} \tau^{1,m}_{i,q} \tau^{2,m}_{j,r} \log f(X^{m}_{ij}; \alpha_{qr}) \\
|
|
+ \sum_{i=1}^{n_1^m} \sum_{q \in \mathcal{Q}_{1,m}} \tau^{1,m}_{i,q} \log \pi_{\color{black}q}^{\color{gray}m} + \sum_{j=1}^{n_2^m} \sum_{r \in \mathcal{Q}_{2,m}} \tau^{2,m}_{j,r} \log \rho_{\color{black}r}^{\color{gray}m} \\
|
|
- \sum_{i=1}^{n_1} \tau^{1,m}_{i,q} \log \tau^{1,m}_{i,q} - \sum_{j=1}^{n_2} \tau^{2,m}_{j,r} \log \tau^{2,m}_{j,r} \bigg) \color{black}
|
|
\end{align*}
|
|
|
|
where we identify the variational distribution $\mathcal{R}$ with its parameter
|
|
$\bm{\tau}$. \\
|
|
|
|
% \begin{equation*}
|
|
% \mathcal{J}(\mathcal{R};\bm{\theta}) \coloneqq \mathbb{E}_{\mathcal{R}}[\ell(\bm{X},\bm{Z},\bm{W};\bm{\theta})] + \mathcal{H}(\bm{Z,W}) \leq \ell(\bm{X};\bm{\theta})
|
|
% \end{equation*}
|
|
|
|
The VEM algorithm alternates between two steps, the variational E step and the
|
|
M step. The E steps consists in optimizing $\mathcal{J}(\bm{\tau};\bm{\theta})$
|
|
for a current value of $\bm{\theta}$ with respect to $\bm{\tau}$. And the M
|
|
step consists of maximizing $\mathcal{J}(\bm{\tau};\bm{\theta})$ with respect
|
|
to $\bm{\theta}$ and for a given variational distribution $\bm{\tau}$.
|
|
|
|
\subsection{Variational E step}
|
|
\label{ssec:variational-e-step}
|
|
|
|
At this step we maximize with respect to the variational distribution
|
|
$\bm{\tau}$: $$\widehat{\bm{\tau}}^{(t+1)} = \arg \max_{\bm{\tau}}
|
|
\mathcal{J}(\mathcal{\bm{\tau}},\bm{\widehat{\theta}}^{(t)}).$$
|
|
|
|
And we obtain the following formulae for the $\bm{\tau^m}$:
|
|
|
|
\begin{align*}
|
|
\widehat{\tau}_{iq}^{1,m} \propto \widehat{\pi}_{q}^{m(t)} \prod_{j=1}^{n_2^m}\prod_{r\in\mathcal{Q}_2^m} f(X_{ij}^m;\widehat{\alpha}_{qr}^{(t)})^{\widehat{\tau}_{jr}^{2,m(t+1)}} & \forall i = 1, \dots , n_1^m, q \in \mathcal{Q}_1^m \\
|
|
\widehat{\tau}_{jr}^{2,m} \propto \widehat{\rho}_{r}^{m(t)} \prod_{i=1}^{n_1^m}\prod_{q\in\mathcal{Q}_1^m} f(X_{ij}^m;\widehat{\alpha}_{qr}^{(t)})^{\widehat{\tau}_{iq}^{1,m(t+1)}} & \forall j = 1, \dots , n_2^m, r \in \mathcal{Q}_2^m
|
|
\end{align*}
|
|
which are used to update iteratively the values by a fixed point algorithm with
|
|
only one step.
|
|
|
|
% TODO move to technical.tex
|
|
% From the above formulae we obtain for the Bernoulli distribution:
|
|
% \begin{itemize}
|
|
% \item[-] \textit{iid} :
|
|
% \[ \bm{\tau}^{m,1} = ~^{t}\pi + \exp((\text{Mask}^{m} \odot A^{m})
|
|
% \bm{\tau}^{m,2} ~^{t}(\text{logit}(\alpha)) + \text{Mask}^{m}
|
|
% \bm{\tau}^{m,2} ~^{t}\log(\bm{1} - \alpha)) \]
|
|
% \[ \bm{\tau}^{m,2} = ~^{t}\rho + \exp(~^{t}(\text{Mask}^{m} \odot A^{m})
|
|
% \bm{\tau}^{m,1} \text{logit}(\alpha) + ~^{t}\text{Mask}^{m}
|
|
% \bm{\tau}^{m,1} \log(\bm{1} - \alpha)) \]
|
|
% \item[-] $\rho\pi$ :
|
|
% \[ \bm{\tau}^{m,1} = ~^{t}\pi^{m} + \exp((\text{Mask}^{m} \odot A^{m})
|
|
% \bm{\tau}^{m,2} ~^{t}(\text{logit}(\alpha)) + \text{Mask}^{m}
|
|
% \bm{\tau}^{m,2} ~^{t}\log(\bm{1} - \alpha)) \]
|
|
% \[ \bm{\tau}^{m,2} = ~^{t}\rho^{m} + \exp(~^{t}(\text{Mask}^{m} \odot A^{m})
|
|
% \bm{\tau}^{m,1} \text{logit}(\alpha) + ~^{t}\text{Mask}^{m}
|
|
% \bm{\tau}^{m,1} \log(\bm{1} - \alpha)) \]
|
|
% \end{itemize}
|
|
|
|
% with $\text{Mask}^{m}$ the matrix containing $0$ if the value is a NA and a 1
|
|
% otherwise.
|
|
|
|
\subsection{M step of the algorithm}
|
|
\label{ssec:m-step-of-the-algorithm}
|
|
At iteration $(t)$ the M-step maximizes the variational bound with respect to
|
|
the model parameters $\bm{\theta}$:
|
|
\[
|
|
\widehat{\bm{\theta}}^{(t+1)} = \arg \max_{\bm{\theta}} \mathcal{J}(\mathcal{\bm{\widehat{\tau}}}^{(t+1)},\bm{\theta})
|
|
\]
|
|
|
|
The following quantities are involved in the obtained formulae:
|
|
|
|
\begin{align*}
|
|
e^{m}_{qr} = \sum_{i=1}^{n_1^m} \sum_{j=1}^{n_2^m} \tau_{iq}^{1,m} \tau_{jr}^{2,m} X_{ij}^m
|
|
& , & n^{m}_{qr} = \sum_{i=1}^{n_1^m} \sum_{j=1}^{n_2^m} \tau_{iq}^{1,m} \tau_{jr}^{2,m}
|
|
& , & n^{1,m}_{q} = \sum_{i=1}^{n_1^m} \tau_{iq}^{1,m}
|
|
& , & n^{2,m}_{r} = \sum_{j=1}^{n_2^m} \tau_{jr}^{2,m}
|
|
\end{align*}
|
|
|
|
The block proportions, in free mixture models,
|
|
$(\pi_q^m)_{q\in\mathcal{Q}_1^m}, (\rho_r^m)_{r\in\mathcal{Q}_2^m}$ are
|
|
estimated as
|
|
\begin{align*}
|
|
\widehat{\pi}_q^{m}= \frac{n^{1,m}_{q}}{n_1^m} & & \text{for } \pi\text{-}colBiSBM \text{ and } \pi\rho\text{-}colBiSBM \\
|
|
\widehat{\rho}_r^{m}= \frac{n^{2,m}_{r}}{n_2^m} & & \text{for } \rho\text{-}colBiSBM \text{ and } \pi\rho\text{-}colBiSBM
|
|
\end{align*}
|
|
while on the other hand,
|
|
\begin{align*}
|
|
\widehat{\pi}_q = \frac{\sum_{m=1}^{M} n^{1,m}_{q}}{\sum_{m=1}^{M} n_1^m} & & \text{for } iid\text{-}colBiSBM \text{ and } \rho\text{-}colBiSBM \\
|
|
\widehat{\rho}_r = \frac{\sum_{m=1}^{M} n^{2,m}_{r}}{\sum_{m=1}^{M} n_2^m} & & \text{for } iid\text{-}colBiSBM \text{ and } \pi\text{-}colBiSBM
|
|
\end{align*}
|
|
the parameters takes into account all the networks at the same time. The
|
|
connectivity parameters $\alpha_{qr}$ for all models are estimated as the ratio
|
|
of the number of interactions between row block $q$ and column block $r$ among
|
|
all networks over the number of number of possible interactions:
|
|
\begin{align*}
|
|
\widehat{\alpha}_{qr} = \frac{\sum_{m=1}^{M} e^{m}_{qr}}{\sum_{m=1}^{M} n^{m}_{qr}}
|
|
\end{align*}
|
|
|
|
\section{Model selection}\label{sec:model-selection}
|
|
% DONE
|
|
% Adapt bicl, methode explo car defi
|
|
% 1 bicl 2 model exploration
|
|
% Citer la conclusion de l'article de St Clair discussion sur bipartite
|
|
As discussed in~\cite{chabert-liddellLearningCommonStructures2023}, the
|
|
algorithmic aspect becomes complex when dealing with the bipartite case. Due to
|
|
the size of the latent space being $\mathbb{N}^2$, conducting a complete
|
|
exploration of the latent space is practically infeasible. Therefore, in
|
|
addition to adapting the existing formulas, our contribution to addressing this
|
|
challenge involved making significant choices, which are outlined below.
|
|
|
|
The below procedures are implemented in the \emph{colSBM} package, available on
|
|
\url{https://github.com/Chabert-Liddell/colSBM}.
|
|
|
|
\subsection{The BIC-L criterion for model selection}
|
|
\label{ssec:the-bic-l-criterion-for-model-selection}
|
|
The Integrated Classified Likelihood (ICL) is a well-established tool in the SBM
|
|
and LBM domains for selecting the appropriate number of blocks. It was
|
|
introduced by~\cite{biernackiAssessingMixtureModel2000};
|
|
~\cite{daudinMixtureModelRandom2008}. The ICL is derived from an asymptotic
|
|
approximation of the marginal complete likelihood. In this approach, the model
|
|
parameters are integrated out using a prior distribution, resulting in a
|
|
penalized likelihood criterion. By employing the ICL, one can effectively
|
|
determine the optimal number of blocks for the given problem in a systematic
|
|
manner.
|
|
We obtain the following expression
|
|
\[
|
|
\text{ICL} = \max_{\theta} \mathbb{E}_{\widehat{\mathcal{R}}} [\ell(\bm{X,Z,W;\theta})] - \frac{1}{2}\text{pen}
|
|
\]
|
|
with pen the penalties.\\ Using the formula $\mathbb{E}_{\widehat{\mathcal{R}}}
|
|
[\ell(\bm{X,Z,W;\theta})] \approx \ell (\bm{X;\theta}) -
|
|
\mathcal{H(\widehat{R})}$, it becomes clearer, as highlighted in the existing
|
|
literature, that the Integrated Classified Likelihood (ICL) gives preference to
|
|
well-separated blocks by imposing a penalty on the entropy of node grouping.
|
|
However, the objective of our study extends beyond grouping nodes into coherent
|
|
blocks. We also aim to assess the similarity of connectivity patterns across
|
|
different networks. Consequently, we aim to permit models that offer more
|
|
flexible node grouping without penalizing entropy. This leads us to formulate a
|
|
BIC-like criterion in the following manner:
|
|
|
|
\[
|
|
\text{BIC-L} = \max_{\bm{\theta}} \mathbb{E}_{\widehat{\mathcal{R}}} [\ell(\bm{X,Z,W;\theta})] + \mathcal{H(\widehat{R})} - \frac{1}{2}\text{pen} = \max_{\bm{\theta}} \mathcal{J(\widehat{R}, \bm{\theta})} - \frac{1}{2}\text{pen}
|
|
\]
|
|
|
|
We provide below the expression for the penalties for the 4 models that we
|
|
propose.
|
|
|
|
\paragraph*{\textit{iid-colBiSBM}}
|
|
For the \textit{iid-colBiSBM} the penalties were modified in the following way:
|
|
|
|
\begin{itemize}
|
|
\item For the $\pi$s and $\rho$s:
|
|
\[\text{pen}_{\pi}(Q_1) = (Q_1 - 1)\log(\sum_{m=1}^{M}n_{1}^{m})\]
|
|
\[\text{pen}_{\rho}(Q_2) = (Q_2 - 1)\log(\sum_{m=1}^{M}n_{2}^{m})\]
|
|
\item For the $\alpha$s :
|
|
\[\text{pen}_{\alpha}(Q_1, Q_2) = Q_1 \times Q_2 \log(N_M)\]
|
|
with
|
|
\[ N_M = \sum_{m = 1}^{M} n_{1}^{m} \times n_{2}^{m} \]
|
|
\end{itemize}
|
|
And thus the $\text{BIC-L}$ formula is now:
|
|
\[ \text{BIC-L}(\bm{X},Q_1, Q_2) = \max_{\theta} \mathcal{J} (\mathcal{\hat{R}}, \bm{\theta})
|
|
- \frac{1}{2} [\text{pen}_{\pi}(Q_1) + \text{pen}_{\rho}(Q_2) + \text{pen}_{\alpha}(Q_1, Q_2)]\]
|
|
|
|
\paragraph*{\textit{$\rho\pi$-colBiSBM}}
|
|
For the \textit{$\rho\pi$-colBiSBM} the penalties are the following:
|
|
|
|
\begin{itemize}
|
|
\item The support penalties are:
|
|
\[ \text{pen}_{S_1}(Q_1) = -2 \log p_{Q_1} (S_1) \]
|
|
\[ \text{pen}_{S_2}(Q_2) = -2 \log p_{Q_2} (S_2) \]
|
|
with
|
|
\[ \log p_{Q_1}(S_1) = - M \log(Q_1) - \sum_{m=1}^{M} \log {Q_1 \choose Q_1^{(m)}} \]
|
|
\[ \log p_{Q_2}(S_2) = - M \log(Q_2) - \sum_{m=1}^{M} \log {Q_2 \choose Q_2^{(m)}} \]
|
|
\item Penalties for the $\rho$s and $\pi$s:
|
|
\[ \text{pen}_{\pi}(Q_1, S_1) = \sum_{m=1}^{M} (Q_{1}^{(m)} - 1) \log n_{1}^{m} \]
|
|
\[ \text{pen}_{\rho}(Q_2, S_2) = \sum_{m=1}^{M} (Q_{2}^{(m)} - 1) \log n_{2}^{m} \]
|
|
\item Penalties for the $\alpha$s:
|
|
\[ \text{pen}_{\alpha}(Q_1, Q_2, S_1, S_2) = (\sum_{q=1}^{Q_1} \sum_{r=1}^{Q_2} \mathbb{1}_{(S_1)'S_2 > 0}) \log (N_M) \]
|
|
\end{itemize}
|
|
And the corresponding BIC-L formula:
|
|
\[
|
|
\begin{aligned}
|
|
\text{BIC-L}(\bm{X},Q_1, Q_2) =
|
|
\max_{S_1,S_2} [
|
|
& \max_{\theta_{S_1,S_2} \in \Theta_{S_1,S_2}} \mathcal{J}(\mathcal{\hat{R}},\theta_{S_1,S_2}) \\
|
|
- \frac{1}{2} & (\text{pen}_{\pi}(Q_1, S_1) + \text{pen}_{\rho}(Q_2, S_2) \\
|
|
& + \text{pen}_{\alpha}(Q_1, Q_2, S_1, S_2) \\
|
|
& + \text{pen}_{S_1}(Q_1) + \text{pen}_{S_2}(Q_2))] \\
|
|
\end{aligned}
|
|
\]
|
|
|
|
\subsection{Initialization and pairing of the models}
|
|
\label{ssec:initialization-and-pairing-of-the-models}
|
|
First to combine the information from the $M$ networks we fit a collection model
|
|
for each network at the two points $Q = (1, 2)$ and $Q = (2, 1)$. Using the
|
|
previously described VEM algorithm we obtain for each network its parameters
|
|
($\bm{\rho,\pi,\alpha}$).
|
|
|
|
We then compute the marginal laws for each dimension, for each network. Then we
|
|
order the network blocks by the probabilities obtained in decreasing order.
|
|
\begin{itemize}
|
|
\item For the memberships on the columns: $col~order_m = order\left(\pi_m \times
|
|
\alpha_m\right)$
|
|
\item For the memberships on the rows: $row~order_m = order\left(\rho_m \times
|
|
~^{t}(\alpha_m)\right)$
|
|
\end{itemize}
|
|
|
|
Using this order we relabel the memberships for the $M$ fitted collection of a
|
|
single network. Then we use the $M$ memberships to fit a collection containing
|
|
the $M$ networks.
|
|
\subsection{Greedy exploration to find an estimation of the mode}
|
|
\label{ssec:greedy-exploration-to-find-an-estimation-of-the-mode}
|
|
Using the previously fitted models for $Q = (1,2)$ and $Q = (2,1)$ we choose to
|
|
perform a greedy exploration to find a first mode.
|
|
|
|
Meaning that for a given $Q = (Q_1, Q_2)$ we will compute all the possible
|
|
memberships for the points $Q \in \{(Q_1 + 1, Q_2),(Q_1, Q_2 + 1),(Q_1 - 1,
|
|
Q_2), (Q_1, Q_2 - 1)\}$, fit the corresponding models and choose the one that
|
|
maximizes the BIC-L as the next point from which to repeat the procedure. We
|
|
repeat the procedure until the BIC-L stops increasing $2$ times in a row.
|
|
|
|
\begin{algorithm}[H]
|
|
\caption{Greedy Exploration for Mode Estimation}
|
|
\SetAlgoLined
|
|
\SetKwInOut{Input}{Input}
|
|
\SetKwInOut{Output}{Output}
|
|
|
|
\Input{Fitted models for $Q = (1,2)$ and $Q = (2,1)$}
|
|
\Output{Estimation of the mode using greedy exploration}
|
|
|
|
\BlankLine
|
|
Initialize $Q = (1,2)$ as the starting point
|
|
Initialize $\text{BIC-L}_{\text{max}}$ as the maximum achieved BIC-L value
|
|
Initialize $consecutive\_count$ as 0
|
|
|
|
\BlankLine
|
|
\While{$consecutive\_count < 2$}{
|
|
Compute possible memberships for $Q \in \{(Q_1 + 1, Q_2), (Q_1, Q_2 + 1), (Q_1 - 1, Q_2), (Q_1, Q_2 - 1)\}$\;
|
|
Fit models with the computed memberships
|
|
Choose the model with the maximum BIC-L as the next point
|
|
|
|
\BlankLine
|
|
\If{$\text{BIC-L} > \text{BIC-L}_{\text{max}}$}{
|
|
$\text{BIC-L}_{\text{max}} \leftarrow \text{BIC-L}$
|
|
$consecutive\_count \leftarrow 0$
|
|
}
|
|
\Else{
|
|
$consecutive\_count \leftarrow consecutive\_count + 1$
|
|
}
|
|
|
|
\BlankLine
|
|
$Q \leftarrow$ Next selected point
|
|
}
|
|
|
|
\BlankLine
|
|
\textbf{Output:} Estimation of the mode using greedy exploration
|
|
\end{algorithm}
|
|
|
|
When this first estimation of the BIC-L mode has been find we apply the moving
|
|
window on it.
|
|
\subsection{Moving window to update the block memberships and the BIC-L}
|
|
\label{ssec:moving-window-to-update-the-block-memberships-and-the-bic-l}
|
|
The \emph{moving window} is used to update the block memberships on rows and
|
|
columns and fit new models with those changes.
|
|
To define the window, we use a center point and a \emph{depth}, giving us the
|
|
bottom left corner ($Q_{1,center} - depth, Q_{2,center} - depth$) and the top right corner of the
|
|
window ($Q_{1,center} + depth, Q_{2,center} + depth$). All the points in this square will be
|
|
updated and contribute to the update of the others.
|
|
This procedure is repeated until convergence of the BIC-L.
|
|
|
|
The figure \ref{fig:moving-window-procedure} illustrates the procedure. It
|
|
consists of two alternating steps:
|
|
\begin{itemize}
|
|
\item the \emph{forward pass}: repeatedly computing the possible splits to fit the
|
|
current model.
|
|
\item the \emph{backward pass}: computing the possible merges to fit the current
|
|
model.
|
|
\end{itemize}
|
|
|
|
\begin{algorithm}[H]
|
|
\caption{Moving Window Procedure}
|
|
\SetAlgoLined
|
|
\SetKwInOut{Input}{Input}
|
|
\SetKwInOut{Output}{Output}
|
|
|
|
\Input{Center point $(Q_{1,\text{center}}, Q_{2,\text{center}})$, depth}
|
|
\Output{Best model with maximum BIC-L in the window}
|
|
|
|
\BlankLine
|
|
Define bottom left corner $(Q_{1,\text{center}} - \text{depth}, Q_{2,\text{center}} - \text{depth})$\\
|
|
Define top right corner $(Q_{1,\text{center}} + \text{depth}, Q_{2,\text{center}} + \text{depth})$
|
|
|
|
\BlankLine
|
|
\While{not converged}{
|
|
\textbf{Forward pass:}
|
|
|
|
\For{$Q_1 \in \left[ Q_{1,\text{center}} - \text{depth} ; Q_{1,\text{center}} + \text{depth} \right]$}{
|
|
\For{$Q_2 \in \left[ Q_{2,\text{center}} - \text{depth}; Q_{2,\text{center}} + \text{depth} \right] $}{
|
|
Compute possible splits from predecessors $(Q_1 - 1, Q_2)$ and $(Q_1, Q_2 - 1)$
|
|
Fit models with the block membership changes
|
|
Compare and keep the best model based on BIC-L
|
|
}
|
|
}
|
|
|
|
\BlankLine
|
|
\textbf{Backward pass:}
|
|
|
|
\For{$Q_1 \in \left[ Q_{1,\text{center}} + \text{depth} ; Q_{1,\text{center}} - \text{depth} \right]$}{
|
|
\For{$Q_2 \in \left[ Q_{2,\text{center}} + \text{depth}; Q_{2,\text{center}} - \text{depth} \right] $}{
|
|
Compute possible merges from predecessors $(Q_1 + 1, Q_2)$ and $(Q_1, Q_2 + 1)$
|
|
Fit models with the block membership changes
|
|
Compare and keep the best model based on BIC-L
|
|
}
|
|
}
|
|
|
|
\BlankLine
|
|
Update the best model based on the maximum BIC-L
|
|
}
|
|
|
|
\BlankLine
|
|
\textbf{Output:} Best model with maximum BIC-L in the window
|
|
\end{algorithm}
|
|
|
|
\begin{figure}[H]
|
|
\definecolor{mypurple}{RGB}{128,0,128}
|
|
\begin{subfigure}[b]{0.48\textwidth}
|
|
\begin{tikzpicture}[scale=1.5]
|
|
\tikzstyle{model}=[circle,draw=none,fill=gray]
|
|
\tikzstyle{split}=[>=stealth,->,thick, draw=blueind]
|
|
\tikzstyle{merge}=[>=stealth,->,thick, draw=red]
|
|
\draw[step=1cm, help lines] (-2,-2) grid (2,2);
|
|
\node[model] (mode) at (0,0) {{\color{red}X}};
|
|
|
|
\draw[color=red, line width=1pt, dashed] (-1.5,-1.5) rectangle ++(3,3);
|
|
|
|
\node[model] (bottom_left) at (-1,-1) {};
|
|
\node[model, draw=blue] (row_1) at (0,-1) {};
|
|
\node[model, draw=blue] (col_1) at (-1,0) {};
|
|
\node[model, draw=blue] (row_2) at (1,-1) {};
|
|
\node[model, draw=blue] (col_2) at (-1,1) {};
|
|
\node[model, draw=blue] (mode) at (0,0) {{\color{red}X}};
|
|
\node[model, draw=blue] (row_3) at (1,0) {};
|
|
\node[model, draw=blue] (col_3) at (0,1) {};
|
|
\node[model, draw=blue] (top_right) at (1,1) {};
|
|
|
|
\draw[split] (bottom_left) -- (col_1);
|
|
\draw[split] (-1.75,0) -- (col_1);
|
|
\draw[split] (bottom_left) -- (row_1);
|
|
\draw[split] (0,-1.75) -- (row_1);
|
|
|
|
\draw[split] (col_1) -- (col_2);
|
|
\draw[split] (-1.75,1) -- (col_2);
|
|
\draw[split] (row_1) -- (row_2);
|
|
\draw[split] (1,-1.75) -- (row_2);
|
|
\draw[split] (row_1) -- (mode);
|
|
\draw[split] (col_1) -- (mode);
|
|
|
|
\draw[split] (col_2) -- (col_3);
|
|
\draw[split] (row_2) -- (row_3);
|
|
\draw[split] (mode) -- (row_3);
|
|
\draw[split] (mode) -- (col_3);
|
|
|
|
\draw[split] (col_3) -- (top_right);
|
|
\draw[split] (row_3) -- (top_right);
|
|
\end{tikzpicture}
|
|
\caption[forward]{Visualisation of a forward pass of moving window}\label{fig:visualisation-forward-pass}
|
|
\end{subfigure}
|
|
\hfill
|
|
\begin{subfigure}[b]{0.48\textwidth}
|
|
\begin{tikzpicture}[scale=1.5]
|
|
\tikzstyle{model}=[circle,draw=none,fill=gray]
|
|
\tikzstyle{split}=[>=stealth,->,thick, draw=blueind]
|
|
\tikzstyle{merge}=[>=stealth,->,thick, draw=red]
|
|
\draw[step=1cm, help lines] (-2,-2) grid (2,2);
|
|
\draw[color=red, line width=1pt, dashed] (-1.5,-1.5) rectangle ++(3,3);
|
|
|
|
\node[model, draw=mypurple] (top_right) at (1,1) {};
|
|
\node[model, draw=mypurple] (row_3) at (1,0) {};
|
|
\node[model, draw=mypurple] (col_3) at (0,1) {};
|
|
\node[model, draw=mypurple] (row_2) at (1,-1) {};
|
|
\node[model, draw=mypurple] (col_2) at (-1,1) {};
|
|
\node[model, draw=mypurple] (mode) at (0,0) {{\color{red}X}};
|
|
\node[model, draw=red] (bottom_left) at (-1,-1) {};
|
|
\node[model, draw=mypurple] (row_1) at (0,-1) {};
|
|
\node[model, draw=mypurple] (col_1) at (-1,0) {};
|
|
|
|
\draw[merge] (1,1.75) -- (top_right);
|
|
\draw[merge] (1.75,1) -- (top_right);
|
|
\draw[merge] (0,1.75) -- (col_3);
|
|
\draw[merge] (1.75,0) -- (row_3);
|
|
\draw[merge] (1.75,-1) -- (row_2);
|
|
\draw[merge] (-1,1.75) -- (col_2);
|
|
|
|
\draw[merge] (top_right) -- (col_3);
|
|
\draw[merge] (top_right) -- (row_3);
|
|
\draw[merge] (col_3) -- (col_2);
|
|
\draw[merge] (row_3) -- (row_2) ;
|
|
\draw[merge] (row_3) -- (mode);
|
|
\draw[merge] (col_3) -- (mode);
|
|
\draw[merge] (col_2) --(col_1);
|
|
\draw[merge] (row_2) -- (row_1);
|
|
\draw[merge] (mode) -- (row_1);
|
|
\draw[merge] (mode) -- (col_1);
|
|
\draw[merge] (col_1) -- (bottom_left);
|
|
\draw[merge] (row_1) -- (bottom_left);
|
|
\end{tikzpicture}
|
|
\caption[forward]{Visualisation of a backward pass of moving window}\label{fig:visualisation-backward-pass}
|
|
\end{subfigure}
|
|
\caption{Moving window procedure, the center node marked with an {\color{red}X} is the mode of BIC-L}\label{fig:moving-window-procedure}
|
|
\end{figure}
|
|
|
|
\paragraph*{Forward pass} The forward pass consists for a model at $(Q_1, Q_2)$
|
|
to compute the possible splits from the block memberships of its ``predecessors``.
|
|
The predecessors are the point at the left $(Q_1 - 1, Q_2)$ and below
|
|
$(Q_1, Q_2 - 1)$ the current model (if they exist). To update the current model,
|
|
we take its predecessors block memberships and try to split one of the blocks in
|
|
two. Then the current model is fitted using this clustering as a starting
|
|
clustering. Once all the possible splits are fitted, they are compared, keeping
|
|
the best, in the sense of the BIC-L. If a model was already present it is also
|
|
compared and the best is chosen as the model for this round at $(Q_1, Q_2)$.\\
|
|
The procedure then repeats for the point at $(Q_1 + 1, Q_2)$ until it reaches
|
|
$(Q_{1,center} + depth, Q_2)$ from which it repeats from
|
|
$(Q_{1,center} - depth, Q_2 + 1)$. This repeats until computing the best model
|
|
for $(Q_{1,center} + depth, Q_{2,center} + depth)$.
|
|
\textit{Note on the initialization:} The forward pass starts from the point
|
|
$(Q_{1,center} + depth, Q_{2,center} + depth)$, so this points needs to have at
|
|
least a model fitted. In the best case, the greedy exploration will have visited
|
|
this point. But if the point has not been visited, a model will be fitted from
|
|
a spectral initialization (i.e the block memberships is computed by using a
|
|
spectral clustering). From this point, the next model will have at least one
|
|
predecessor and the procedure can iterate.
|
|
|
|
\paragraph*{Backward pass} The backward pass consists for a model at $(Q_1, Q_2)$
|
|
to compute the possible merges from the block memberships of its ``predecessors``.
|
|
The predecessors are the point at the right $(Q_1 + 1, Q_2)$ and on top
|
|
$(Q_1, Q_2 + 1)$ of the current model (if the predecessors exist). To update the
|
|
current model, we take its predecessors block memberships and try to merge two
|
|
blocks in one. Then the current model is fitted using this clustering as a starting clustering. Once all the possible merges are fitted, they are
|
|
compared, keeping the best, in the sense of the BIC-L.
|
|
If a model was already present it is also
|
|
compared and the best is chosen as the model for this round at $(Q_1, Q_2)$.\\
|
|
The procedure then repeats for the point at $(Q_1 - 1, Q_2)$ until it reaches
|
|
$(Q_{1,center} - depth, Q_2)$ from which it repeats from
|
|
$(Q_{1,center} - depth, Q_2 - 1)$. This repeats until computing the best model
|
|
for ($Q_{1,center} - depth, Q_{2,center} - depth$).
|
|
\textit{Note on the initialization:} The backward pass starts from
|
|
$(Q_{1,center} + depth, Q_{2,center} + depth)$, we know it was initialized at
|
|
least by the forward pass, no special case here.\\
|
|
|
|
At the end of the moving window pass, the model of max BIC-L is the new best
|
|
fit and the procedure can repeat until convergence.
|
|
|
|
\section{Networks clustering}
|
|
\label{sec:networks-clustering}
|
|
As in~\cite{chabert-liddellLearningCommonStructures2023} we use a recursive
|
|
algorithm to determine the best clustering of the given networks. The procedure
|
|
being the same, we will present it briefly and focus on adjustments.
|
|
|
|
When networks in a collection do not share the same mesoscale connectivity
|
|
structure we want to be able to partition them correctly. For this we perform a
|
|
clustering of networks.
|
|
|
|
The process of clustering a collection of networks involves discovering a
|
|
partition $\mathcal{G} = (\mathcal{M}_g)_{g=1,\dots,G}$ of $\{1,\dots, M\}$.
|
|
Given $\mathcal{G}$ we set the following model on $\bm{X}$:
|
|
|
|
\[
|
|
\forall g \in \{1,\dots, G\}, \forall m \in \mathcal{M}_g, X^m \sim \mathcal{F}\text{-}BiSBM(Q_1^g, Q_2^g, \bm{\pi^m, \rho^m,} \bm{\alpha}^g)
|
|
\]
|
|
|
|
And we defined the score of a given partition $\mathcal{G}$:
|
|
\[
|
|
Sc(\mathcal{G}) = \sum_{g=1}^{G} \max_{Q^g=1,\dots,Q_{\max}} \text{BIC-L}((X^m)_{m\in\mathcal{M}_g},Q_1^g, Q_2^g)
|
|
\]
|
|
Thus the score consists of the sum of the BIC-L of the sub-collections for the
|
|
partition $\mathcal{G}$.
|
|
|
|
\subsection{Dissimilarity between two networks}
|
|
\label{ssec:dissimilarity-between-two-networks}
|
|
The parameters for the dissimilarity are defined as follow:
|
|
\begin{align*}
|
|
\widetilde{n}_{qr}^m = \sum_{i=1}^{n_1^m} \sum_{j=1}^{n_2^m} \widehat{\tau}_{iq}^{1,m} \widehat{\tau}_{jr}^{2,m},
|
|
& & \widetilde{\alpha}_{qr}^m = \frac{\sum_{i=1}^{n_1^m} \sum_{j=1}^{n_2^m} \widehat{\tau}_{iq}^{1,m} \widehat{\tau}_{jr}^{2,m} X_{ij}^m}{\widetilde{n}_{qr}^m}, \\
|
|
\widetilde{\pi}_q^m = \frac{\sum_{i=1}^{n_1^m} \widehat{\tau}_{iq}^{1,m}}{n_1^m},
|
|
& & \widetilde{\rho}_r^m = \frac{\sum_{j=1}^{n_2^m} \widehat{\tau_{jr}}^{2,m}}{n_2^m}
|
|
\end{align*}
|
|
And the dissimilarity between any pair of networks $(m,m')\in\mathcal{M}^2$ is then:
|
|
\[
|
|
D_{\mathcal{M}}(m,m') = \sum_{q = 1}^{Q_1} \sum_{r = 1}^{Q_2} \max(\widetilde{\pi}_{q}^{m}, \widetilde{\pi}_{q}^{m'}) \left( \widetilde{\alpha}_{qr}^{m} - \widetilde{\alpha}_{qr}^{m'}\right)^{2} \max(\widetilde{\rho}_{r}^{m}, \widetilde{\rho}_{r}^{m'})
|
|
\]
|
|
|
|
\begin{figure}[H]
|
|
\centering
|
|
\begin{tikzpicture}
|
|
\tikzstyle{instruct}=[font=\small, text justified, rectangle,draw,fill=yellow!50]
|
|
\tikzstyle{first_col}=[rectangle, text justified, draw,fill=gray!50]
|
|
\tikzstyle{second_col}=[scale=0.55, circle, draw,fill=red!50]
|
|
\tikzstyle{test}=[font=\small, text justified, diamond, aspect=2.5,thick,
|
|
draw=blue,fill=yellow!50,text=blue]
|
|
\tikzstyle{es}=[font=\small, text justified, rectangle,draw,rounded corners=4pt,fill=cyanind!25]
|
|
|
|
\node[es] (liste) at (0,4) {Supply a collection to partition};
|
|
\node[instruct, text width=5cm, below = 0.45cm of liste] (1-collection) {Fit \emph{colBiSBM}};
|
|
\node[first_col, right = 0.5cm of 1-collection] (1-col-obj) {};
|
|
\node[instruct, text width=5cm, below = 0.45cm of 1-collection] (dissimi) {Compute a dissimilarity matrix over the collection};
|
|
\node[instruct, text width=5cm, below = 0.45cm of dissimi] (2-sous-collection) {Split the \emph{collection in 2 sub-collections} and fit the \emph{colBiSBM}};
|
|
\node[second_col, right = 0.25cm of 2-sous-collection] (1-sec-col-obj) {1};
|
|
\node[second_col, right = 0.25cm of 1-sec-col-obj] (1-sec-col-obj) {2};
|
|
\node[test,below = 0.45cm of 2-sous-collection, scale=0.7] (BICL-test) {$\sum_{i=1}^{2} (\text{BIC-L}(\tikz[baseline=-0.25cm]{\node[second_col] {i};} )) > \text{BIC-L}(\tikz[baseline=-0.25cm]{\node[first_col] {};})$?};
|
|
\node[es, right = 0.55cm of BICL-test] (sortie) {Output \tikz{\node[rectangle, draw, fill=gray!50, rounded corners=0pt] {};}};
|
|
\node[es, left = 0.45cm of dissimi, text width = 2cm] (recursion) {Loop over \tikz{\node[second_col] {1};} and \tikz{\node[second_col] {2};} };
|
|
|
|
\tikzstyle{suite}=[->,>=stealth,thick,rounded corners=4pt]
|
|
\draw[suite] (liste) -- (1-collection);
|
|
\draw[suite] (1-collection) -- (dissimi);
|
|
\draw[suite] (dissimi) -- (2-sous-collection);
|
|
\draw[suite] (2-sous-collection) -- (BICL-test);
|
|
\draw[suite] (BICL-test) -| node[near start, above, fill=none] {Yes} (recursion);
|
|
\draw[suite] (recursion.north) |- (1-collection.west);
|
|
\draw[suite] (BICL-test) -- node[near start, above, fill=none] {No} (sortie);
|
|
\end{tikzpicture}
|
|
\caption{Network clustering procedure}
|
|
\label{fig:netclustering-procedure}
|
|
\end{figure}
|
|
|
|
The above figure (\ref{fig:netclustering-procedure}) shows a condensed
|
|
explanation of the network clustering algorithm.
|
|
|
|
The idea is to adjust the \emph{colBiSBM} model over the full collection of $M$
|
|
networks and then compute the dissimilarity matrix between all networks of the
|
|
collection. We obtain the collection $\mathcal{G} = \{\mathcal{M}\}$ the
|
|
trivial partition in a unique group.
|
|
|
|
Then using the \emph{Kmeans} we split the collection in two sub-collections
|
|
with the dissimilarity matrix. The two sub-collections are fitted and we
|
|
compute the score of this new partition $\mathcal{G}^{*} = \{G_1, G_2\}$.
|
|
|
|
If $Sc(\mathcal{G}^{*}) > Sc(\mathcal{G})$ then we repeat the same procedure on
|
|
$G_1$ and $G_2$. Else we return $\mathcal{G}$.
|
|
|
|
We illustrate our capacity to perform a partition of a collection for all
|
|
colBiSBM models in \ref{sec:network-clustering-of-simulated-networks}.
|
|
|
|
\chapter{Simulation studies}\label{chap:simulation-studies}
|
|
|
|
The below simulations are meant to test the capacities of our models. We assess
|
|
the inference capacities of the algorithm and method, the model selection
|
|
performances and the clustering capacities.
|
|
|
|
\paragraph{Reproducibility} All the codes used to obtain data and to perform the analysis can be found on
|
|
the report repository at
|
|
\url{https://gitea.polarolouis.fr/polarolouis/rapport-CEI-MIA-2023}.
|
|
|
|
\include{Rcodes/simulation/inference_analyze}
|
|
\include{Rcodes/simulation/model_selection_analyze}
|
|
\include{Rcodes/simulation/netclustering_analyze}
|
|
|
|
\chapter{Applications}
|
|
\include{Rcodes/real_data/application_dore}
|
|
\include{Rcodes/real_data/CoOPLBM_completion_analyze}
|
|
|
|
\printbibliography
|
|
\listoffigures
|
|
\listoftables
|
|
\end{document} |