Refactored rapport.tex

Louis Lacoste 2023-06-26 22:09:00 +02:00
parent dd1ed631d3
commit 00c926bf1b
2 changed files with 155 additions and 145 deletions

@ -70,7 +70,7 @@
\chapter{Context} \chapter{Context}
\section{Usage and importance of bipartite graphs} \section{Usage and importance of bipartite graphs}
\label{sec:usage-and-importance-of-bipartite-graphs}
Bipartite graphs are denoted as $G = (U,V,E)$, with $U$ and $V$ two disjoint and Bipartite graphs are denoted as $G = (U,V,E)$, with $U$ and $V$ two disjoint and
independent sets of vertices and $E$ the set of edges connecting $U$ vertices to independent sets of vertices and $E$ the set of edges connecting $U$ vertices to
$V$ vertices. $V$ vertices.
@ -78,7 +78,7 @@ $V$ vertices.
\begin{minipage}{0.5\linewidth} \begin{minipage}{0.5\linewidth}
\centering \centering
Bipartite network\\ Bipartite network\\
\begin{tikzpicture}[scale=.6] \begin{tikzpicture}[scale=.6]
\tikzstyle{every edge}=[-,>=stealth',shorten >=1pt,auto,draw,line width=1.5pt] \tikzstyle{every edge}=[-,>=stealth',shorten >=1pt,auto,draw,line width=1.5pt]
\tikzstyle{every state}=[draw, text=black,scale=0.95, transform shape] \tikzstyle{every state}=[draw, text=black,scale=0.95, transform shape]
\tikzstyle{every state}=[draw=none,text=black,scale=0.75, transform shape] \tikzstyle{every state}=[draw=none,text=black,scale=0.75, transform shape]
@ -103,25 +103,25 @@ $V$ vertices.
\path (A2) edge (B4); \path (A2) edge (B4);
\path (A3) edge (B5); \path (A3) edge (B5);
\path (A2) edge (B5); \path (A2) edge (B5);
\end{tikzpicture} \end{tikzpicture}
\end{minipage} \end{minipage}
\begin{minipage}{0.5\linewidth} \begin{minipage}{0.5\linewidth}
\begin{center} \begin{center}
Incidence matrix Incidence matrix
$B=\left( $B=\left(
\begin{array}{rrrrr} \begin{array}{rrrrr}
1 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 \\
\end{array}\right) \end{array}\right)
$\\ $\\
\end{center} \end{center}
\end{minipage} \end{minipage}
This representation can be used to represent various forms of interactions where This representation can be used to represent various forms of interactions where
two kinds of ''actors'' interact. Those interactions can be binary or valued and two kinds of "actors" interact. Those interactions can be binary or valued and
a numeric representation is the incidence matrix, in the above example $B$.\\ a numeric representation is the incidence matrix, in the above example $B$.\\
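As an illustration of the incidence matrix described above, here is a minimal Python sketch that rebuilds $B$ from a list of edges. The function name `incidence_matrix` and the 0-based edge list are assumptions made for this example; the edge list was read off the matrix $B$ shown above, not taken from the report's code.

```python
import numpy as np


def incidence_matrix(n_rows, n_cols, edges):
    """Build the binary incidence matrix of a bipartite graph.

    `edges` holds (u, v) pairs with u indexing the row nodes (set U)
    and v indexing the column nodes (set V), both 0-based.
    """
    B = np.zeros((n_rows, n_cols), dtype=int)
    for u, v in edges:
        B[u, v] = 1  # an interaction between row node u and column node v
    return B


# Hypothetical edge list matching the example matrix B (0-based indices).
edges = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (1, 4), (2, 4)]
B = incidence_matrix(3, 5, edges)
```

For valued interactions, the same layout applies with the assignment storing the interaction value instead of 1.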
Among the use cases of bipartite graphs one can find the Netflix Problem, which Among the use cases of bipartite graphs one can find the Netflix Problem, which
@ -147,20 +147,20 @@ Some interesting results can arise when applying a tool widely used on a particu
kind of interactions to another kind of interactions. Companies like kind of interactions to another kind of interactions. Companies like
Netflix use recommender systems to recommend other products to consumers based Netflix use recommender systems to recommend other products to consumers based
on their previous interactions. on their previous interactions.
In \cite{desjardins-proulxEcologicalInteractionsNetflix2017} the authors use the In ~\cite{desjardins-proulxEcologicalInteractionsNetflix2017} the authors use the
\emph{K-nearest neighbour} (KNN) algorithm as a Recommender to predict missing \emph{K-nearest neighbour} (KNN) algorithm as a Recommender to predict missing
preys for predators in a predator-prey network. preys for predators in a predator-prey network.
\section{Latent Block Model} \section{Latent Block Model}
\label{sec:latent-block-model}
The Latent Block Model (LBM) introduced by \cite{govaertLatentBlockModel2010} The Latent Block Model (LBM) introduced by ~\cite{govaertLatentBlockModel2010}
adapts the Stochastic Block Model (SBM) adapts the Stochastic Block Model (SBM)
(\cite{hollandStochasticBlockmodelsFirst1983};\cite{snijdersEstimationPredictionStochastic1997}) (~\cite{hollandStochasticBlockmodelsFirst1983};~\cite{snijdersEstimationPredictionStochastic1997})
to bipartite graphs. to bipartite graphs.
\begin{small} \begin{small}
Please note that we prefer the term ''BiSBM'' and will use both LBM and BiSBM to Please note that we prefer the term "BiSBM" and will use both LBM and BiSBM to
designate the Stochastic Block Model applied to bipartite networks. designate the Stochastic Block Model applied to bipartite networks.
\end{small} \end{small}
This model supposes that: This model supposes that:
@ -236,8 +236,8 @@ This model supposes that:
\path (R31) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, right, fill=none] {$\alpha_{{\color{electricblue}\bullet}{\color{yellow}\bullet}}$} (B5); \path (R31) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, right, fill=none] {$\alpha_{{\color{electricblue}\bullet}{\color{yellow}\bullet}}$} (B5);
\end{tikzpicture} \end{tikzpicture}
\caption{An LBM model visualization} \caption{An LBM model visualization}
\label{fig:LBMvisu} \label{fig:LBMvisu}
\end{figure} \end{figure}
Parameters Parameters
@ -258,13 +258,13 @@ varied structures. But when trying to determine the structure of a given network
we need to find those parameters. we need to find those parameters.
For this a common approach is to use a VEM algorithm For this a common approach is to use a VEM algorithm
(proposed for SBM in \cite{daudinMixtureModelRandom2008} and for LBM in \cite{govaertEMAlgorithmBlock2005}) (proposed for SBM in ~\cite{daudinMixtureModelRandom2008} and for LBM in ~\cite{govaertEMAlgorithmBlock2005})
with which the groups and the required parameters can be inferred by maximizing a lower with which the groups and the required parameters can be inferred by maximizing a lower
bound of the likelihood minus a penalty. bound of the likelihood minus a penalty.
\section{colSBM model, a joint model for a collection of networks} \section{colSBM model, a joint model for a collection of networks}
\label{sec:colsbm-model-a-joint-model-for-a-collection-of-networks}
The \emph{colSBM} model introduced by \cite{chabert-liddellLearningCommonStructures2023} The \emph{colSBM} model introduced by ~\cite{chabert-liddellLearningCommonStructures2023}
proposes an extension of the SBM model to collections of SBMs. A collection is a proposes an extension of the SBM model to collections of SBMs. A collection is a
set of networks whose nodes are not shared or linked between different networks, set of networks whose nodes are not shared or linked between different networks,
and whose interactions have the same valuation and are of the same type. and whose interactions have the same valuation and are of the same type.
@ -279,11 +279,12 @@ it to the bipartite case.
\chapter{Adjustment of colSBM to the bipartite case: colBiSBM} \chapter{Adjustment of colSBM to the bipartite case: colBiSBM}
\section{Definition of the model} \section{Definition of the model}
\label{sec:definition-of-the-model}
Here are some common notations and conventions that we will use in the following Here are some common notations and conventions that we will use in the following
sections. sections.
\subsection{A collection of i.i.d Bipartite SBM} \subsection{A collection of i.i.d Bipartite SBM}
\label{ssec:a-collection-of-i-i-d-bipartite-sbm}
As for \emph{colSBM}, this first model is the most constrained. It assumes As for \emph{colSBM}, this first model is the most constrained. It assumes
that all the networks are independent realizations of the same $Q_1$-$Q_2$-BiSBM that all the networks are independent realizations of the same $Q_1$-$Q_2$-BiSBM
with identical parameters. The \emph{iid-colBiSBM} is defined as follows: with identical parameters. The \emph{iid-colBiSBM} is defined as follows:
@ -295,6 +296,7 @@ with identical parameters. The \emph{iid-colBiSBM} is defined as follows:
\section{Variational Expectation step} \section{Variational Expectation step}
\label{sec:variational-expectation-step}
Fixed point formula for the Bernoulli distribution: Fixed point formula for the Bernoulli distribution:
\begin{itemize} \begin{itemize}
\item[-] \textit{iid} : \item[-] \textit{iid} :
@ -317,13 +319,14 @@ with $\text{Mask}^{m}$ the matrix containing $0$ if the value is a NA and a 1
otherwise. otherwise.
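The mask matrix described here can be sketched in a few lines of Python. This is only an illustration, assuming the missing values (NA) of an incidence matrix are encoded as NaN in a NumPy array; the helper name `na_mask` is hypothetical.

```python
import numpy as np


def na_mask(X):
    """Return a matrix holding 0 where X is NA (NaN) and 1 otherwise."""
    X = np.asarray(X, dtype=float)
    return (~np.isnan(X)).astype(int)


# Toy incidence matrix with two unobserved dyads.
X = np.array([[1.0, np.nan, 0.0],
              [np.nan, 1.0, 1.0]])
mask = na_mask(X)
```

Multiplying a term element-wise by such a mask removes the contribution of the unobserved entries from a sum.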
\section{M step of the algorithm} \section{M step of the algorithm}
\label{sec:m-step-of-the-algorithm}
Incorporate the equations from \parencite{chabert-liddellLearningCommonStructures2023} Incorporate the equations from \parencite{chabert-liddellLearningCommonStructures2023}
\section{Computation of the variational bound} \section{Computation of the variational bound}
\label{sec:computation-of-the-variational-bound}
\section{Penalties} \section{Penalties}
\label{sec:penalties}
\paragraph*{\textit{iid-colBiSBM}} \paragraph*{\textit{iid-colBiSBM}}
For the \textit{iid-colBiSBM} the penalties were modified in the following way : For the \textit{iid-colBiSBM} the penalties were modified in the following way :
@ -338,7 +341,7 @@ For the \textit{iid-colBiSBM} the penalties were modified in the following way :
\end{itemize} \end{itemize}
And thus the $\text{BIC-L}$ formula is now: And thus the $\text{BIC-L}$ formula is now:
\[ \text{BIC-L}(\bm{X},Q_1, Q_2) = \max_{\theta} \mathcal{J} (\mathcal{\hat{R}}, \bm{\theta}) \[ \text{BIC-L}(\bm{X},Q_1, Q_2) = \max_{\theta} \mathcal{J} (\mathcal{\hat{R}}, \bm{\theta})
- \frac{1}{2} [\text{pen}_{\pi}(Q_1) + \text{pen}_{\rho}(Q_2) + \text{pen}_{\alpha}(Q_1, Q_2)]\] - \frac{1}{2} [\text{pen}_{\pi}(Q_1) + \text{pen}_{\rho}(Q_2) + \text{pen}_{\alpha}(Q_1, Q_2)]\]
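The arithmetic of this criterion can be sketched as follows. The function `bic_l` and its argument names are hypothetical: the maximized variational bound and the three penalty values are assumed to be computed elsewhere and passed in as plain numbers, and the figures in the usage line are purely illustrative.

```python
def bic_l(max_bound, pen_pi, pen_rho, pen_alpha):
    """BIC-L = maximized variational bound minus half the sum of the penalties."""
    return max_bound - 0.5 * (pen_pi + pen_rho + pen_alpha)


# Illustrative values only, not results from the report.
score = bic_l(max_bound=-100.0, pen_pi=4.0, pen_rho=6.0, pen_alpha=10.0)
```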
\paragraph*{\textit{$\rho\pi$-colBiSBM}} \paragraph*{\textit{$\rho\pi$-colBiSBM}}
For the \textit{$\rho\pi$-colBiSBM} the penalties are the following: For the \textit{$\rho\pi$-colBiSBM} the penalties are the following:
@ -361,23 +364,26 @@ And the corresponding BIC-L formula:
\begin{aligned} \begin{aligned}
\text{BIC-L}(\bm{X},Q_1, Q_2) = \text{BIC-L}(\bm{X},Q_1, Q_2) =
\max_{S_1,S_2} [ \max_{S_1,S_2} [
& \max_{\theta_{S_1,S_2} \in \Theta_{S_1,S_2}} \mathcal{J}(\mathcal{\hat{R}},\theta_{S_1,S_2})\\ & \max_{\theta_{S_1,S_2} \in \Theta_{S_1,S_2}} \mathcal{J}(\mathcal{\hat{R}},\theta_{S_1,S_2}) \\
- \frac{1}{2} & (\text{pen}_{\pi}(Q_1, S_1) + \text{pen}_{\rho}(Q_2, S_2)\\ - \frac{1}{2} & (\text{pen}_{\pi}(Q_1, S_1) + \text{pen}_{\rho}(Q_2, S_2) \\
&+ \text{pen}_{\alpha}(Q_1, Q_2, S_1, S_2)\\ & + \text{pen}_{\alpha}(Q_1, Q_2, S_1, S_2) \\
&+ \text{pen}_{S_1}(Q_1) + \text{pen}_{S_2}(Q_2))]\\ & + \text{pen}_{S_1}(Q_1) + \text{pen}_{S_2}(Q_2))] \\
\end{aligned} \end{aligned}
\] \]
\section{Latent space exploration and model selection} \section{Latent space exploration and model selection}
\label{sec:latent-space-exploration-and-model-selection}
To explore the two-dimensional latent space $(Q_1,Q_2)$ To explore the two-dimensional latent space $(Q_1,Q_2)$
we use the following strategies. we use the following strategies.
\subsection{Model selection} \subsection{Model selection}
\label{ssec:model-selection}
In the following steps, model selection relies on the BIC-L criterion: In the following steps, model selection relies on the BIC-L criterion:
among the proposed models, we choose the one that among the proposed models, we choose the one that
maximizes the BIC-L. maximizes the BIC-L.
\subsection{Initialization and pairing of the models} \subsection{Initialization and pairing of the models}
\label{ssec:initialization-and-pairing-of-the-models}
First to combine the information from the $M$ networks we fit a collection model First to combine the information from the $M$ networks we fit a collection model
for each network at the two points $Q = (1, 2)$ and $Q = (2, 1)$. Using the for each network at the two points $Q = (1, 2)$ and $Q = (2, 1)$. Using the
previously described VEM algorithm we obtain for each network its parameters previously described VEM algorithm we obtain for each network its parameters
@ -396,12 +402,13 @@ Using this order we relabel the memberships for the $M$ fitted collection of a
single network. single network.
Then we use the $M$ memberships to fit a collection containing the $M$ networks. Then we use the $M$ memberships to fit a collection containing the $M$ networks.
\subsection{Greedy exploration to find an estimation of the mode} \subsection{Greedy exploration to find an estimation of the mode}
\label{ssec:greedy-exploration-to-find-an-estimation-of-the-mode}
Using the previously fitted models for $Q = (1,2)$ and $Q = (2,1)$ we choose to Using the previously fitted models for $Q = (1,2)$ and $Q = (2,1)$ we choose to
perform a greedy exploration to find a first mode. perform a greedy exploration to find a first mode.
This means that for a given $Q = (Q_1, Q_2)$ we compute all the possible This means that for a given $Q = (Q_1, Q_2)$ we compute all the possible
memberships for the points $Q \in \{(Q_1 + 1, Q_2),(Q_1, Q_2 + 1),(Q_1 - 1, Q_2), memberships for the points $Q \in \{(Q_1 + 1, Q_2),(Q_1, Q_2 + 1),(Q_1 - 1, Q_2),
(Q_1, Q_2 - 1)\}$, fit (Q_1, Q_2 - 1)\}$, fit
the corresponding models and choose the one that maximizes the BIC-L as the the corresponding models and choose the one that maximizes the BIC-L as the
next point from which to repeat the procedure. We repeat the procedure until the next point from which to repeat the procedure. We repeat the procedure until the
BIC-L stops increasing $2$ times in a row. BIC-L stops increasing $2$ times in a row.
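The greedy exploration described above can be sketched in Python under a few stated assumptions. The callable `fit_bic_l(q1, q2)` is a hypothetical stand-in for fitting a model at $(Q_1, Q_2)$ and returning its BIC-L. The sketch starts from a single point, keeps $Q_1, Q_2 \geq 1$, and reads "stops increasing $2$ times in a row" as two consecutive steps without improving on the best BIC-L seen so far. The report's actual implementation may differ on each of these points.

```python
def greedy_mode_search(fit_bic_l, start, patience=2, max_steps=100):
    """Greedy walk over (Q1, Q2) towards a mode of the BIC-L.

    fit_bic_l: hypothetical callable (q1, q2) -> BIC-L of the fitted model.
    start:     starting point, e.g. (1, 2) or (2, 1).
    patience:  number of consecutive non-improving steps before stopping.
    """
    current = start
    best_point, best_score = start, fit_bic_l(*start)
    stalled = 0
    for _ in range(max_steps):
        q1, q2 = current
        # The four neighbours of the current point, restricted to Q1, Q2 >= 1.
        neighbours = [(q1 + 1, q2), (q1, q2 + 1), (q1 - 1, q2), (q1, q2 - 1)]
        neighbours = [p for p in neighbours if p[0] >= 1 and p[1] >= 1]
        scores = {p: fit_bic_l(*p) for p in neighbours}
        # Move to the neighbour with the highest BIC-L.
        current = max(scores, key=scores.get)
        if scores[current] > best_score:
            best_point, best_score = current, scores[current]
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:
                break
    return best_point, best_score


def toy_bic_l(q1, q2):
    """Toy score with a single mode at (3, 4); it replaces a real model fit."""
    return -((q1 - 3) ** 2 + (q2 - 4) ** 2)


mode, score = greedy_mode_search(toy_bic_l, start=(1, 2))
```

In practice each call to `fit_bic_l` would be an expensive model fit, so caching the scores of already visited points would be a natural refinement.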
@ -446,6 +453,7 @@ BIC-L stops increasing $2$ times in a row.
When this first estimate of the BIC-L mode has been found, we apply the moving When this first estimate of the BIC-L mode has been found, we apply the moving
window to it. window to it.
\subsection{Moving window to update the block memberships and the BIC-L} \subsection{Moving window to update the block memberships and the BIC-L}
\label{ssec:moving-window-to-update-the-block-memberships-and-the-bic-l}
The \emph{moving window} is used to update the block memberships on rows and The \emph{moving window} is used to update the block memberships on rows and
columns and fit new models with those changes. columns and fit new models with those changes.
To define the window, we use a center point and a \emph{depth}, giving us the To define the window, we use a center point and a \emph{depth}, giving us the
@ -550,11 +558,13 @@ At the end of the moving window pass, the model of max BIC-L is the new best
fit and the procedure can repeat until convergence. fit and the procedure can repeat until convergence.
\section{Networks clustering} \section{Networks clustering}
\label{sec:networks-clustering}
As in \parencite{chabert-liddellLearningCommonStructures2023} we use a recursive As in \parencite{chabert-liddellLearningCommonStructures2023} we use a recursive
algorithm to determine the best clustering of the given networks. The procedure algorithm to determine the best clustering of the given networks. The procedure
being the same, only the technical modifications for the bipartite case will be being the same, only the technical modifications for the bipartite case will be
explained below. explained below.
\subsection{Distance between two networks} \subsection{Distance between two networks}
\label{ssec:distance-between-two-networks}
The distance is weighted by $\pi$ and $\rho$. The distance is weighted by $\pi$ and $\rho$.
\[ \[
D_{\mathcal{M}}(m,m') = \sum_{q = 1}^{Q_1} \sum_{r = 1}^{Q_2} \max(\widetilde{\pi}_{q}^{m}, \widetilde{\pi}_{q}^{m'}) \left( \frac{\widetilde{\alpha}_{qr}^{m}}{\widehat{\delta}_{m}} - \frac{\widetilde{\alpha}_{qr}^{m'}}{\widehat{\delta}_{m'}}\right)^{2} \max(\widetilde{\rho}_{r}^{m}, \widetilde{\rho}_{r}^{m'}) D_{\mathcal{M}}(m,m') = \sum_{q = 1}^{Q_1} \sum_{r = 1}^{Q_2} \max(\widetilde{\pi}_{q}^{m}, \widetilde{\pi}_{q}^{m'}) \left( \frac{\widetilde{\alpha}_{qr}^{m}}{\widehat{\delta}_{m}} - \frac{\widetilde{\alpha}_{qr}^{m'}}{\widehat{\delta}_{m'}}\right)^{2} \max(\widetilde{\rho}_{r}^{m}, \widetilde{\rho}_{r}^{m'})