\documentclass[12pt,a4paper]{report}
|
|
|
|
%==== Preamble ====
% Packages
\usepackage[english]{babel} % the text is in English
\usepackage{a4} % page size
\usepackage[T1]{fontenc} % T1 font encoding
\usepackage[cyr]{aeguill} % vector fonts, French guillemets
\usepackage{epsfig} % image handling
\usepackage{amsmath,amsthm} % improved math mode
\usepackage{amsfonts,amssymb,bm, bbold}% symbols for sets, bold math
\usepackage{algorithm2e} % algorithms
|
|
% \usepackage{algpseudocode} % not loaded: its \While, \For and \If clash with algorithm2e, whose syntax the algorithms use
|
|
\usepackage{float} % figure placement
\usepackage{url} % URL handling
\usepackage[colorlinks,citecolor=blueind,urlcolor=blue,bookmarks=false,hypertexnames=true]{hyperref} % hyperlinks in the document
\usepackage{tocbibind} % adds the bibliography and the lists to the table of contents
\usepackage{tikz} % For graph plots
|
|
|
|
%% Bibliography
|
|
\usepackage[style=apa,citestyle=authoryear-comp]{biblatex}
|
|
\addbibresource{references.bib}
|
|
|
|
|
|
%% Tikz Related
|
|
\usetikzlibrary{arrows,shapes,positioning,shadows,trees,calc,backgrounds,automata}
|
|
|
|
|
|
|
|
\tikzset{
|
|
basic/.style = {draw, text width=3cm, font=\sffamily, rectangle},
|
|
root/.style = {basic, rounded corners=2pt, thin, align=center,
|
|
fill=green!30},
|
|
level 2/.style = {basic, rounded corners=6pt, thin,align=center, fill=green!60,
|
|
text width=8em},
|
|
level 3/.style = {basic, thin, align=left, fill=pink!60, text width=3.5cm}
|
|
}
|
|
|
|
|
|
% colors for the multilevel TikZ figures
|
|
\definecolor{redorg}{RGB}{215,48,39}
|
|
\definecolor{orangeorg}{RGB}{253,174,97}
|
|
|
|
\definecolor{blueind}{RGB}{69,117,233}
|
|
\definecolor{cyanind}{RGB}{116,173,209}
|
|
\definecolor{electricblue}{RGB}{125, 249, 255}
|
|
|
|
\definecolor{greenind}{RGB}{112,130,56}
|
|
|
|
\definecolor{burntorange}{RGB}{204, 85, 0}
|
|
\definecolor{goldenyellow}{RGB}{255, 192, 0}
|
|
\definecolor{peach}{RGB}{255, 229, 180}
|
|
|
|
\definecolor{gray}{RGB}{128,128,128}
|
|
|
|
% New commands
|
|
\newcommand{\Tau}{\mathcal{T}}
|
|
|
|
% title and author
\title{Internship report at the UMR MIA Paris-Saclay}
|
|
\author{Louis Lacoste}
|
|
|
|
\begin{document}
|
|
\maketitle
|
|
\tableofcontents
|
|
|
|
\chapter{Presentation of the UMR}
|
|
|
|
\chapter{Context}
|
|
|
|
\section{Usage and importance of bipartite graphs}
|
|
|
|
A bipartite graph is denoted $G = (U,V,E)$, where $U$ and $V$ are two disjoint
and independent sets of vertices and $E$ is the set of edges connecting vertices
of $U$ to vertices of $V$.
|
|
|
|
\begin{minipage}{0.5\linewidth}
|
|
\centering
|
|
Bipartite network\\
|
|
\begin{tikzpicture}[scale=.6]
|
|
\tikzstyle{every edge}=[-,>=stealth',shorten >=1pt,auto,draw,line width=1.5pt]
|
|
\tikzstyle{every state}=[draw, text=white,scale=0.95, transform shape]
|
|
\tikzstyle{every state}=[draw=none,text=white,scale=0.75, transform shape]
|
|
\tikzstyle{every node}=[fill=blueind]
|
|
|
|
\node[state, draw=black!50] (A1) at (0,5) {\textbf{R1}};
|
|
\node[state, draw=black!50] (A2) at (2.5,5) {\textbf{R2}};
|
|
\node[state, draw=black!50] (A3) at (5,5) {\textbf{R3}};
|
|
|
|
\tikzstyle{every node}=[fill=greenind, shape=rectangle]
|
|
\tikzstyle{every state}=[draw=none,text=white,scale=0.75, transform shape, shape=rectangle]
|
|
\node[state, draw=black!50] (B1) at (0,0) {\textbf{C1}};
|
|
\node[state, draw=black!50] (B2) at (1.25,0) {\textbf{C2}};
|
|
\node[state, draw=black!50] (B3) at (2.5,0) {\textbf{C3}};
|
|
\node[state, draw=black!50] (B4) at (3.75,0) {\textbf{C4}};
|
|
\node[state, draw=black!50] (B5) at (5,0) {\textbf{C5}};
|
|
\path (A1) edge [] (B1);
|
|
\path (A1) edge (B2);
|
|
\path (A1) edge (B3);
|
|
\path (A1) edge (B4);
|
|
\path (A2) edge (B3);
|
|
\path (A2) edge (B4);
|
|
\path (A3) edge (B5);
|
|
\path (A2) edge (B5);
|
|
\end{tikzpicture}
|
|
\end{minipage}
|
|
\begin{minipage}{0.5\linewidth}
|
|
\begin{center}
|
|
Incidence matrix
|
|
$B=\left(
|
|
\begin{array}{rrrrr}
|
|
1 & 1 & 1 & 1 & 0 \\
|
|
0 & 0 & 1 & 1 & 1 \\
|
|
0 & 0 & 0 & 0 & 1 \\
|
|
\end{array}\right)
|
|
$\\
|
|
\end{center}
|
|
\end{minipage}
|
|
|
|
|
|
|
|
This representation can describe various forms of interactions where two kinds
of ``actors'' interact. Those interactions can be binary or valued, and their
numeric representation is the incidence matrix, $B$ in the above example.\\
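
As an illustration, and assuming the convention (not stated explicitly in the
text) that rows index the nodes of $U$ and columns the nodes of $V$, the binary
incidence matrix can be written entry by entry as
\[
  B_{ij} =
  \begin{cases}
    1 & \text{if } (u_i, v_j) \in E,\\
    0 & \text{otherwise,}
  \end{cases}
  \qquad i = 1,\dots,|U|, \quad j = 1,\dots,|V|.
\]
In the example above, $B_{12} = 1$ because R1 is linked to C2, whereas
$B_{31} = 0$ because R3 and C1 do not interact. In the valued case, $B_{ij}$
would instead store the weight of the interaction.
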
|
|
|
|
Among the use cases of bipartite graphs is the Netflix Problem, a competition
organized by Netflix to improve its recommender system. The row nodes are the
movies and the column nodes are the users; the value at the intersection is the
rating given by user $j$ to movie $i$.\\
|
|
|
|
Another use is the representation of ecological interactions such as
plant-pollinator \parencite{ramos-jilibertoTopologicalChangeAndean2010}, bird-seed
dispersal, prey-predator or
host-parasite \parencite{kaszewska-gilasGlobalStudiesHostParasite2021}.
In the plant-pollinator case, for instance, the rows are pollinator species and
the columns are plant species, and each entry is binary if it records a
presence/absence or a count if it records an abundance.
|
|
|
|
Bipartite graphs are widely used in various fields of biology, among which the
previously cited ecological networks, but also in medicine with biomedical
networks, biomolecular networks or epidemiological
networks \parencite{pavlopoulosBipartiteGraphsSystems2018}.
|
|
|
|
|
|
Interesting results can arise when a tool widely used on one kind of
interaction is applied to another kind. Companies like Netflix use recommender
systems to recommend other products to consumers based on their previous
interactions.
In \cite{desjardins-proulxEcologicalInteractionsNetflix2017} the authors use the
\emph{K-nearest neighbour} (KNN) algorithm as a recommender to predict missing
prey for predators in a predator-prey network.
|
|
|
|
\section{Latent Block Model}
|
|
|
|
The Latent Block Model (LBM) introduced by \cite{govaertLatentBlockModel2010}
adapts the Stochastic Block Model (SBM)
\parencite{hollandStochasticBlockmodelsFirst1983,snijdersEstimationPredictionStochastic1997}
to bipartite graphs.
|
|
|
|
\begin{small}
|
|
Please note that we prefer the term ``BiSBM'' and will use both LBM and BiSBM to
designate the Stochastic Block Model applied to bipartite networks.
|
|
\end{small}
|
|
|
|
This model assumes that:
|
|
\begin{itemize}
|
|
\item Row nodes are members of row blocks and column nodes are members of
|
|
column blocks.
|
|
\item The connectivity of two individuals is determined by their block
|
|
memberships.
|
|
\item An interaction can only occur between a row and a column node.
|
|
\end{itemize}
|
|
|
|
\begin{figure}[H]
|
|
\centering
|
|
\begin{tikzpicture}[scale=.6]
|
|
\tikzstyle{every state}=[draw, text=white,scale=0.95, transform shape]
|
|
\tikzstyle{every state}=[draw=none,text=white,scale=0.75, transform shape]
|
|
\tikzset{edge_proba/.style={draw=white, fill=none, text=black}}
|
|
|
|
\tikzstyle{every node}=[fill=blueind]
|
|
\node[edge_proba] (pi1) at (1,5.7) {\textbf{$\pi_{{\color{blueind}\bullet}}$}};
|
|
\node[state, draw=black!50] (R11) at (0,5) {\textbf{R11}};
|
|
\node[state, draw=black!50] (R12) at (1,5) {\textbf{R12}};
|
|
\node[state, draw=black!50] (R13) at (2,5) {\textbf{R13}};
|
|
|
|
\tikzstyle{every node}=[fill=cyanind]
|
|
\node[edge_proba] (pi2) at (6.75,5.7) {\textbf{$\pi_{{\color{cyanind}\bullet}}$}};
|
|
\node[state, draw=black!50] (R21) at (6.25,5) {\textbf{R21}};
|
|
\node[state, draw=black!50] (R22) at (7.25,5) {\textbf{R22}};
|
|
|
|
\tikzstyle{every node}=[fill=electricblue]
|
|
\node[edge_proba] (pi3) at (10,5.7) {\textbf{$\pi_{{\color{electricblue}\bullet}}$}};
|
|
\node[state, draw=black!50] (R31) at (10,5) {\textbf{R31}};
|
|
|
|
\tikzstyle{every node}=[fill=burntorange, shape=rectangle]
|
|
\node[edge_proba] (pi3) at (0.5,-0.7) {\textbf{$\rho_{{\color{burntorange}\bullet}}$}};
|
|
\tikzstyle{every state}=[draw=none,text=white,scale=0.75, transform shape, shape=rectangle]
|
|
\node[state, draw=black!50] (B1) at (0,0) {\textbf{C11}};
|
|
\node[state, draw=black!50] (B2) at (1,0) {\textbf{C12}};
|
|
\tikzstyle{every node}=[fill=goldenyellow, shape=rectangle]
|
|
\node[edge_proba] (pi3) at (4,-0.7) {\textbf{$\rho_{{\color{goldenyellow}\bullet}}$}};
|
|
\node[state, draw=black!50] (B3) at (3.5,0) {\textbf{C21}};
|
|
\node[state, draw=black!50] (B4) at (4.5,0) {\textbf{C22}};
|
|
\tikzstyle{every node}=[fill=peach, shape=rectangle]
|
|
\node[edge_proba] (pi3) at (10,-0.7) {\textbf{$\rho_{{\color{peach}\bullet}}$}};
|
|
\node[state, draw=black!50] (B5) at (10,0) {\textbf{C31}};
|
|
|
|
\tikzstyle{every edge}=[-,>=stealth',shorten >=1pt,auto,draw,line width=1.5pt,draw opacity=0.2]
|
|
|
|
\path (R11) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[left, fill=none] {$\alpha_{{\color{blueind}\bullet}{\color{burntorange}\bullet}}$} (B1);
|
|
\path (R11) edge (B2);
|
|
\path (R11) edge (B3);
|
|
\path (R11) edge (B4);
|
|
|
|
\path (R12) edge [] (B1);
|
|
\path (R12) edge (B2);
|
|
\path (R12) edge (B3);
|
|
\path (R12) edge (B4);
|
|
|
|
\path (R13) edge [] (B1);
|
|
\path (R13) edge (B2);
|
|
\path (R13) edge (B3);
|
|
\path (R13) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, left, fill=none] {$\alpha_{{\color{blueind}\bullet}{\color{goldenyellow}\bullet}}$} (B4);
|
|
|
|
\path (R21) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, right, fill=none] {$\alpha_{{\color{cyanind}\bullet}{\color{goldenyellow}\bullet}}$} (B3);
|
|
\path (R21) edge (B4);
|
|
\path (R21) edge (B5);
|
|
|
|
\path (R22) edge (B3);
|
|
\path (R22) edge (B4);
|
|
\path (R22) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, left, fill=none] {$\alpha_{{\color{cyanind}\bullet}{\color{peach}\bullet}}$} (B5);
|
|
|
|
\path (R31) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, right, fill=none] {$\alpha_{{\color{electricblue}\bullet}{\color{peach}\bullet}}$} (B5);
|
|
|
|
\end{tikzpicture}
|
|
\caption{An LBM model visualization}
|
|
\label{fig:LBMvisu}
|
|
\end{figure}
|
|
|
|
Parameters
|
|
\begin{itemize}
|
|
\item $\mathcal{K}_1 = \{{\color{blueind}\bullet},{\color{cyanind}\bullet},{\color{electricblue}\bullet}\}$ blocks in rows
|
|
\item $\mathcal{K}_2 = \{{\color{burntorange}\bullet},{\color{goldenyellow}\bullet},{\color{peach}\bullet}\}$ blocks in columns
|
|
    \item $\pi_{\bullet} = \mathbb{P}(i\in\bullet)$ in rows and $\rho_{\bullet} = \mathbb{P}(j\in\bullet)$ in columns
    \item $\alpha_{{\color{blueind}\bullet}{\color{burntorange}\bullet}} = \mathbb{P}(i \leftrightarrow j \mid i \in {\color{blueind}\bullet}, j \in {\color{burntorange}\bullet})$: connectivity probability between two nodes, given their block memberships
|
|
\end{itemize}
|
|
|
|
In Figure~\ref{fig:LBMvisu}, the $\pi$ are the probabilities for a row node to
belong to the row block of the corresponding color, the $\rho$ are the
probabilities for a column node to belong to the column block of the
corresponding color, and the $\alpha$ are the connectivity parameters between
the row and column blocks.
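
With these parameters, the generative process can be sketched as follows. The
latent variables $Z_i$ and $W_j$ are names introduced here for the row and
column block memberships, and the binary (Bernoulli) case is assumed. For a
network with $n_1$ row nodes and $n_2$ column nodes,
\begin{align*}
  \mathbb{P}(Z_i = q) &= \pi_q, & q &\in \mathcal{K}_1, \; i = 1,\dots,n_1,\\
  \mathbb{P}(W_j = r) &= \rho_r, & r &\in \mathcal{K}_2, \; j = 1,\dots,n_2,\\
  X_{ij} \mid Z_i = q, W_j = r &\sim \mathcal{B}(\alpha_{qr}),
\end{align*}
where $\mathcal{B}$ denotes the Bernoulli distribution and the $X_{ij}$ are
drawn independently given the memberships.
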
|
|
|
|
This model can be used to easily generate bipartite graphs with complex and very
varied structures. Conversely, to determine the structure of a given network,
those parameters must be estimated from the data.
|
|
|
|
A common approach is to use a variational EM (VEM) algorithm
(proposed for SBM in \cite{daudinMixtureModelRandom2008} and for LBM in \cite{govaertEMAlgorithmBlock2005}):
the block memberships and the parameters are inferred by maximizing a lower
bound of the likelihood, and models are compared using this bound minus a
penalty.
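
As a sketch of the quantity being maximized, written with latent memberships
$Z$ (rows) and $W$ (columns) as assumed notation: for any distribution
$\mathcal{R}$ over $(Z, W)$,
\[
  \mathcal{J}(\mathcal{R}, \theta)
  = \mathbb{E}_{\mathcal{R}}\left[\log p(X, Z, W; \theta)\right] + \mathcal{H}(\mathcal{R})
  = \log p(X; \theta) - \mathrm{KL}\left(\mathcal{R} \,\|\, p(\cdot \mid X; \theta)\right)
  \leq \log p(X; \theta),
\]
where $\mathcal{H}$ is the entropy and $\theta = (\pi, \rho, \alpha)$. The VEM
algorithm restricts $\mathcal{R}$ to factorized distributions and alternates
between maximizing $\mathcal{J}$ in $\mathcal{R}$ (VE step) and in $\theta$
(M step).
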
|
|
|
|
\section{colSBM model, a joint model for a collection of networks}
|
|
|
|
The \emph{colSBM} model introduced by \cite{chabert-liddellLearningCommonStructures2023}
proposes an extension of the SBM model to collections of SBMs. A collection is a
set of networks whose nodes are neither shared nor linked across networks, and
whose interactions have the same valuation and are of the same type.

The model can retrieve the structure shared by a collection, indicate whether
networks should be grouped into a collection and, in a large pool of networks,
identify collections with common structures.

The next step after designing this collection model for unipartite networks was
to adapt it to the bipartite case.
|
|
|
|
\chapter{Adjustment of colSBM to the bipartite case: colBiSBM}
|
|
|
|
\section{Definition of the model}
|
|
Here are some common notations and conventions that we will use in the following
|
|
sections.
|
|
|
|
\subsection{A collection of i.i.d. bipartite SBMs}
|
|
|
|
As for \emph{colSBM}, this first model is the most constrained. It assumes
that all the networks are independent realizations of the same $Q_1$-$Q_2$-BiSBM
with identical parameters. The \emph{iid-colBiSBM} is defined as follows:
|
|
|
|
\begin{align}
|
|
\tag{\emph{iid-colBiSBM}}
|
|
    X^m \sim \mathcal{F}\text{-BiSBM}_{n_1,n_2} (Q_1, Q_2, \bm{\pi}, \bm{\rho}, \bm{\alpha}), \quad \forall m = 1, \dots, M.
|
|
\end{align}
|
|
|
|
|
|
\section{Variational Expectation step}
|
|
Fixed point formula for the Bernoulli distribution:
\begin{itemize}
    \item[-] \textit{iid}:
    \[ \bm{\tau}^{m,1} \propto {}^{t}\pi \odot \exp\left((\text{Mask}^{m} \odot A^{m})
    \bm{\tau}^{m,2}\, {}^{t}(\text{logit}(\alpha)) + \text{Mask}^{m}
    \bm{\tau}^{m,2}\, {}^{t}\log(\bm{1} - \alpha)\right) \]
    \[ \bm{\tau}^{m,2} \propto {}^{t}\rho \odot \exp\left({}^{t}(\text{Mask}^{m} \odot A^{m})
    \bm{\tau}^{m,1}\, \text{logit}(\alpha) + {}^{t}\text{Mask}^{m}
    \bm{\tau}^{m,1} \log(\bm{1} - \alpha)\right) \]
    \item[-] $\rho\pi$:
    \[ \bm{\tau}^{m,1} \propto {}^{t}\pi^{m} \odot \exp\left((\text{Mask}^{m} \odot A^{m})
    \bm{\tau}^{m,2}\, {}^{t}(\text{logit}(\alpha)) + \text{Mask}^{m}
    \bm{\tau}^{m,2}\, {}^{t}\log(\bm{1} - \alpha)\right) \]
    \[ \bm{\tau}^{m,2} \propto {}^{t}\rho^{m} \odot \exp\left({}^{t}(\text{Mask}^{m} \odot A^{m})
    \bm{\tau}^{m,1}\, \text{logit}(\alpha) + {}^{t}\text{Mask}^{m}
    \bm{\tau}^{m,1} \log(\bm{1} - \alpha)\right) \]
\end{itemize}

Here the product with ${}^{t}\pi$ (resp. ${}^{t}\rho$) is applied to every row,
each row of $\bm{\tau}^{m,1}$ and $\bm{\tau}^{m,2}$ is then normalized to sum to
one, and $\text{Mask}^{m}$ is the matrix containing $0$ where the corresponding
value is missing (NA) and $1$ otherwise.
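
Written entry by entry, and assuming the standard mean-field derivation for a
Bernoulli emission (the indices $i$, $j$, $q$, $r$ are introduced here for
illustration), the \textit{iid} update of the row memberships reads
\[
  \tau^{m,1}_{iq} \propto \pi_q \exp\left( \sum_{j} \sum_{r}
  \text{Mask}^{m}_{ij}\, \tau^{m,2}_{jr}
  \left[ A^{m}_{ij} \log \alpha_{qr} + (1 - A^{m}_{ij}) \log(1 - \alpha_{qr}) \right] \right),
  \qquad \sum_{q} \tau^{m,1}_{iq} = 1.
\]
Since $A \log\alpha + (1 - A)\log(1-\alpha) = A\,\text{logit}(\alpha) + \log(1 - \alpha)$,
the exponent splits into one term in $\text{logit}(\alpha)$ weighted by
$\text{Mask}^{m} \odot A^{m}$ and one term in $\log(1 - \alpha)$ weighted by
$\text{Mask}^{m}$. The update of $\bm{\tau}^{m,2}$ is symmetric, with $\rho_r$
in place of $\pi_q$.
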
|
|
|
|
\section{M step of the algorithm}
|
|
|
|
Incorporate the equations from \parencite{chabert-liddellLearningCommonStructures2023}
|
|
|
|
\section{Computation of the variational bound}
|
|
|
|
\section{Penalties}
|
|
|
|
\paragraph*{\textit{iid-colBiSBM}}
|
|
For the \textit{iid-colBiSBM} the penalties were modified as follows:
|
|
|
|
\begin{itemize}
|
|
\item For the $\pi$s and $\rho$s:
|
|
\[\text{pen}_{\pi}(Q_1) = (Q_1 - 1)\log(\sum_{m=1}^{M}n_{r}^{(m)})\]
|
|
\[\text{pen}_{\rho}(Q_2) = (Q_2 - 1)\log(\sum_{m=1}^{M}n_{c}^{(m)})\]
|
|
\item For the $\alpha$s :
|
|
\[\text{pen}_{\alpha}(Q_1, Q_2) = Q_1 \times Q_2 \log(N_M)\]
|
|
    with
|
|
\[ N_M = \sum_{m = 1}^{M} n_{r}^{(m)} \times n_{c}^{(m)} \]
|
|
\end{itemize}
|
|
And thus the $\text{BIC-L}$ formula is now:
|
|
\[ \text{BIC-L}(\bm{X},Q_1, Q_2) = \max_{\bm{\theta}} \mathcal{J} (\hat{\mathcal{R}}, \bm{\theta})
- \frac{1}{2} \left[\text{pen}_{\pi}(Q_1) + \text{pen}_{\rho}(Q_2) + \text{pen}_{\alpha}(Q_1, Q_2)\right]\]
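
As a worked example with hypothetical sizes (chosen only for illustration),
take $M = 2$ networks of dimensions $10 \times 20$ and $30 \times 40$, with
$Q_1 = 2$ and $Q_2 = 3$. Then $\sum_m n_r^{(m)} = 40$, $\sum_m n_c^{(m)} = 60$
and $N_M = 10 \times 20 + 30 \times 40 = 1400$, so that
\begin{align*}
  \text{pen}_{\pi}(2) &= (2 - 1)\log 40 \approx 3.69,\\
  \text{pen}_{\rho}(3) &= (3 - 1)\log 60 \approx 8.19,\\
  \text{pen}_{\alpha}(2, 3) &= 2 \times 3 \times \log 1400 \approx 43.47,
\end{align*}
and the total amount subtracted from the variational bound is
$\frac{1}{2}(3.69 + 8.19 + 43.47) \approx 27.67$.
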
|
|
|
|
\paragraph*{\textit{$\rho\pi$-colBiSBM}}
|
|
For the \textit{$\rho\pi$-colBiSBM} the penalties are the following:
|
|
|
|
\begin{itemize}
|
|
\item The support penalties are:
|
|
\[ \text{pen}_{S_1}(Q_1) = -2 \log p_{Q_1} (S_1) \]
|
|
\[ \text{pen}_{S_2}(Q_2) = -2 \log p_{Q_2} (S_2) \]
|
|
with
|
|
\[ \log p_{Q_1}(S_1) = - M \log(Q_1) - \sum_{m=1}^{M} \log {Q_1 \choose Q_1^{(m)}} \]
|
|
\[ \log p_{Q_2}(S_2) = - M \log(Q_2) - \sum_{m=1}^{M} \log {Q_2 \choose Q_2^{(m)}} \]
|
|
\item Penalties for the $\rho$s and $\pi$s:
|
|
\[ \text{pen}_{\pi}(Q_1, S_1) = \sum_{m=1}^{M} (Q_{1}^{(m)} - 1) \log n_{r}^{(m)} \]
|
|
\[ \text{pen}_{\rho}(Q_2, S_2) = \sum_{m=1}^{M} (Q_{2}^{(m)} - 1) \log n_{c}^{(m)} \]
|
|
\item Penalties for the $\alpha$s:
|
|
    \[ \text{pen}_{\alpha}(Q_1, Q_2, S_1, S_2) = \left(\sum_{q=1}^{Q_1} \sum_{r=1}^{Q_2} \mathbb{1}_{((S_1)'S_2)_{qr} > 0}\right) \log (N_M) \]
|
|
\end{itemize}
|
|
And the corresponding BIC-L formula:
|
|
\[
|
|
\begin{aligned}
|
|
\text{BIC-L}(\bm{X},Q_1, Q_2) =
|
|
\max_{S_1,S_2} [
|
|
& \max_{\theta_{S_1,S_2} \in \Theta_{S_1,S_2}} \mathcal{J}(\mathcal{\hat{R}},\theta_{S_1,S_2})\\
|
|
- \frac{1}{2} & (\text{pen}_{\pi}(Q_1, S_1) + \text{pen}_{\rho}(Q_2, S_2)\\
|
|
&+ \text{pen}_{\alpha}(Q_1, Q_2, S_1, S_2)\\
|
|
&+ \text{pen}_{S_1}(Q_1) + \text{pen}_{S_2}(Q_2))]\\
|
|
\end{aligned}
|
|
\]
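
As an illustration of the support penalty, interpret $Q_1^{(m)}$ as the number
of row blocks populated by network $m$ (an assumption, since the notation is
not defined in the text) and take the hypothetical values $M = 2$, $Q_1 = 3$,
$Q_1^{(1)} = 3$ and $Q_1^{(2)} = 2$. Then
\[
  \log p_{Q_1}(S_1) = -2\log 3 - \log\binom{3}{3} - \log\binom{3}{2}
  = -2\log 3 - 0 - \log 3 = -3\log 3,
\]
so that $\text{pen}_{S_1}(3) = 6 \log 3 \approx 6.59$. The row-membership
penalty only counts the populated blocks:
$\text{pen}_{\pi}(3, S_1) = 2\log n_r^{(1)} + \log n_r^{(2)}$.
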
|
|
|
|
\section{Latent space exploration and model selection}
|
|
To explore the two-dimensional latent space $(Q_1,Q_2)$
we use the following strategies.
|
|
|
|
\subsection{Model selection}
|
|
In the following steps, model selection relies on the BIC-L criterion: among
the proposed models, we choose the one that maximizes the BIC-L.
|
|
|
|
\subsection{Initialization and pairing of the models}
|
|
First, to combine the information from the $M$ networks, we fit a collection
model on each network separately at the two points $Q = (1, 2)$ and
$Q = (2, 1)$. Using the previously described VEM algorithm, we obtain for each
network its parameters ($\rho,\pi,\alpha$).
|
|
|
|
We then compute the marginal laws for each dimension, for each network. Then
|
|
we order the network blocks by the probabilities obtained in decreasing order.
|
|
\begin{itemize}
|
|
\item For the memberships on the columns:
|
|
    $\text{col order}_m = \text{order}\left(\pi_m \times \alpha_m\right)$
|
|
\item For the memberships on the rows:
|
|
    $\text{row order}_m = \text{order}\left(\rho_m \times {}^{t}(\alpha_m)\right)$
|
|
\end{itemize}
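
A small numerical illustration, with hypothetical parameter values and reading
$\times$ as a matrix product: let
\[
  \pi_m = (0.6,\ 0.4), \qquad \rho_m = (0.3,\ 0.7), \qquad
  \alpha_m = \begin{pmatrix} 0.8 & 0.1 \\ 0.2 & 0.5 \end{pmatrix}.
\]
Then $\pi_m \alpha_m = (0.56,\ 0.26)$, so the column blocks keep the order
$(1, 2)$, while $\rho_m\, {}^{t}\alpha_m = (0.31,\ 0.41)$, so the row blocks
are reordered as $(2, 1)$.
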
|
|
|
|
Using this order we relabel the memberships of the $M$ single-network fits.
We then use these $M$ relabeled memberships to fit a collection containing the
$M$ networks.
|
|
\subsection{Greedy exploration to find an estimation of the mode}
|
|
Starting from the previously fitted models for $Q = (1,2)$ and $Q = (2,1)$, we
perform a greedy exploration to find a first mode.

For a given $Q = (Q_1, Q_2)$, we compute all the possible memberships for the
neighboring points $\{(Q_1 + 1, Q_2),(Q_1, Q_2 + 1),(Q_1 - 1, Q_2),
(Q_1, Q_2 - 1)\}$ (those with at least one block in each dimension), fit
the corresponding models and choose the one that maximizes the BIC-L as the
next point from which to repeat the procedure. The procedure stops once the
BIC-L has failed to increase $2$ times in a row.
|
|
|
|
\begin{algorithm}[H]
|
|
\caption{Greedy Exploration for Mode Estimation}
|
|
\SetAlgoLined
|
|
\SetKwInOut{Input}{Input}
|
|
\SetKwInOut{Output}{Output}
|
|
|
|
\Input{Fitted models for $Q = (1,2)$ and $Q = (2,1)$}
|
|
\Output{Estimation of the mode using greedy exploration}
|
|
|
|
\BlankLine
|
|
    Initialize $Q = (1,2)$ as the starting point\;
    Initialize $\text{BIC-L}_{\text{max}}$ as the maximum achieved BIC-L value\;
    Initialize $consecutive\_count$ as 0\;
    \BlankLine
    \While{$consecutive\_count < 2$}{
        Compute possible memberships for $Q \in \{(Q_1 + 1, Q_2), (Q_1, Q_2 + 1), (Q_1 - 1, Q_2), (Q_1, Q_2 - 1)\}$\;
        Fit models with the computed memberships\;
        Choose the model with the maximum BIC-L as the next point\;
        \BlankLine
        \eIf{$\text{BIC-L} > \text{BIC-L}_{\text{max}}$}{
            $\text{BIC-L}_{\text{max}} \leftarrow \text{BIC-L}$\;
            $consecutive\_count \leftarrow 0$\;
        }{
            $consecutive\_count \leftarrow consecutive\_count + 1$\;
        }
        \BlankLine
        $Q \leftarrow$ Next selected point\;
    }
    \BlankLine
    \Return{Estimation of the mode using greedy exploration}
|
|
\end{algorithm}
|
|
|
|
Once this first estimation of the BIC-L mode has been found, we apply the moving
window around it.
|
|
\subsection{Moving window to update the block memberships and the BIC-L}
|
|
The \emph{moving window} is used to update the block memberships on rows and
|
|
columns and fit new models with those changes.
|
|
To define the window, we use a center point and a \emph{depth}, giving the
bottom left corner $(Q_{1,\text{center}} - \text{depth}, Q_{2,\text{center}} - \text{depth})$
and the top right corner
$(Q_{1,\text{center}} + \text{depth}, Q_{2,\text{center}} + \text{depth})$ of the
window. All the points in this square are updated and contribute to the update
of the others.
|
|
This procedure is repeated until convergence of the BIC-L.
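
For example, with a hypothetical center $(3, 4)$ and a depth of $1$, the
window spans $Q_1 \in \{2, 3, 4\}$ and $Q_2 \in \{3, 4, 5\}$, from the bottom
left corner $(2, 3)$ to the top right corner $(4, 5)$. More generally, a window
of depth $d$ contains at most $(2d + 1)^2$ points, fewer if it is truncated by
the constraints $Q_1 \geq 1$ and $Q_2 \geq 1$ (truncation is assumed here, it
is not spelled out in the text).
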
|
|
|
|
The procedure consists of two alternating steps:
|
|
\begin{itemize}
|
|
\item the \emph{forward pass}: repeatedly computing the possible splits to
|
|
fit the current model.
|
|
\item the \emph{backward pass}: computing the possible merges to fit the current model.
|
|
\end{itemize}
|
|
|
|
|
|
\begin{algorithm}[H]
|
|
\caption{Moving Window Procedure}
|
|
\SetAlgoLined
|
|
\SetKwInOut{Input}{Input}
|
|
\SetKwInOut{Output}{Output}
|
|
|
|
\Input{Center point $(Q_{1,\text{center}}, Q_{2,\text{center}})$, depth}
|
|
\Output{Best model with maximum BIC-L in the window}
|
|
|
|
\BlankLine
|
|
    Define bottom left corner $(Q_{1,\text{center}} - \text{depth}, Q_{2,\text{center}} - \text{depth})$\;
    Define top right corner $(Q_{1,\text{center}} + \text{depth}, Q_{2,\text{center}} + \text{depth})$\;
    \BlankLine
    \While{not converged}{
        \textbf{Forward pass:}

        \For{$Q_2$ from $Q_{2,\text{center}} - \text{depth}$ up to $Q_{2,\text{center}} + \text{depth}$}{
            \For{$Q_1$ from $Q_{1,\text{center}} - \text{depth}$ up to $Q_{1,\text{center}} + \text{depth}$}{
                Compute possible splits from predecessors $(Q_1 - 1, Q_2)$ and $(Q_1, Q_2 - 1)$\;
                Fit models with the block membership changes\;
                Compare and keep the best model based on BIC-L\;
            }
        }
        \BlankLine
        \textbf{Backward pass:}

        \For{$Q_2$ from $Q_{2,\text{center}} + \text{depth}$ down to $Q_{2,\text{center}} - \text{depth}$}{
            \For{$Q_1$ from $Q_{1,\text{center}} + \text{depth}$ down to $Q_{1,\text{center}} - \text{depth}$}{
                Compute possible merges from predecessors $(Q_1 + 1, Q_2)$ and $(Q_1, Q_2 + 1)$\;
                Fit models with the block membership changes\;
                Compare and keep the best model based on BIC-L\;
            }
        }
        \BlankLine
        Update the best model based on the maximum BIC-L\;
    }
    \BlankLine
    \Return{Best model with maximum BIC-L in the window}
|
|
\end{algorithm}
|
|
|
|
|
|
\paragraph*{Forward pass} For a model at $(Q_1, Q_2)$, the forward pass
computes the possible splits from the block memberships of its ``predecessors''.
The predecessors are the points to the left, $(Q_1 - 1, Q_2)$, and below,
$(Q_1, Q_2 - 1)$, of the current model (if they exist). To update the current
model, we take the block memberships of its predecessors and try to split one
of the blocks in two. The current model is then fitted using this clustering as
a starting clustering. Once all the possible splits are fitted, they are
compared and the best one, in the sense of the BIC-L, is kept. If a model was
already present it is also compared, and the best is chosen as the model for
this round at $(Q_1, Q_2)$.\\
The procedure then repeats for the point at $(Q_1 + 1, Q_2)$ until it reaches
$(Q_{1,\text{center}} + \text{depth}, Q_2)$, after which it restarts from
$(Q_{1,\text{center}} - \text{depth}, Q_2 + 1)$. This repeats until the best
model for $(Q_{1,\text{center}} + \text{depth}, Q_{2,\text{center}} + \text{depth})$
has been computed.
\textit{Note on the initialization:} The forward pass starts from the point
$(Q_{1,\text{center}} - \text{depth}, Q_{2,\text{center}} - \text{depth})$, which
has no predecessor inside the window, so this point needs to have at least one
model already fitted. In the best case, the greedy exploration will have visited
this point. If the point has not been visited, a model is fitted from a
spectral initialization (i.e. the block memberships are computed using a
spectral clustering). From this point on, each subsequent model has at least
one predecessor and the procedure can iterate.
|
|
|
|
\paragraph*{Backward pass} For a model at $(Q_1, Q_2)$, the backward pass
computes the possible merges from the block memberships of its ``predecessors''.
The predecessors are the points to the right, $(Q_1 + 1, Q_2)$, and on top,
$(Q_1, Q_2 + 1)$, of the current model (if they exist). To update the
current model, we take the block memberships of its predecessors and try to
merge two blocks into one. The current model is then fitted using this
clustering as a starting clustering. Once all the possible merges are fitted,
they are compared and the best one, in the sense of the BIC-L, is kept.
If a model was already present it is also
compared, and the best is chosen as the model for this round at $(Q_1, Q_2)$.\\
The procedure then repeats for the point at $(Q_1 - 1, Q_2)$ until it reaches
$(Q_{1,\text{center}} - \text{depth}, Q_2)$, after which it restarts from
$(Q_{1,\text{center}} + \text{depth}, Q_2 - 1)$. This repeats until the best
model for $(Q_{1,\text{center}} - \text{depth}, Q_{2,\text{center}} - \text{depth})$
has been computed.
\textit{Note on the initialization:} The backward pass starts from
$(Q_{1,\text{center}} + \text{depth}, Q_{2,\text{center}} + \text{depth})$. This
point has been fitted at least by the forward pass, so no special
initialization is needed.\\
|
|
|
|
At the end of the moving window pass, the model of max BIC-L is the new best
|
|
fit and the procedure can repeat until convergence.
|
|
|
|
\section{Networks clustering}
|
|
As in \textcite{chabert-liddellLearningCommonStructures2023}, we use a recursive
algorithm to determine the best clustering of the given networks. Since the
procedure is the same, only the technical modifications for the bipartite case
are explained below.
|
|
\subsection{Distance between two networks}
|
|
The distance is weighted using both $\pi$ and $\rho$:
|
|
\[
|
|
D_{\mathcal{M}}(m,m') = \sum_{q = 1}^{Q_1} \sum_{r = 1}^{Q_2} \max(\widetilde{\pi}_{q}^{m}, \widetilde{\pi}_{q}^{m'}) \left( \frac{\widetilde{\alpha}_{qr}^{m}}{\widehat{\delta}_{m}} - \frac{\widetilde{\alpha}_{qr}^{m'}}{\widehat{\delta}_{m'}}\right)^{2} \max(\widetilde{\rho}_{r}^{m}, \widetilde{\rho}_{r}^{m'})
|
|
\]
|
|
|
|
|
|
\printbibliography
|
|
\listoffigures
|
|
\listoftables
|
|
\end{document}