Refactored rapport.tex
parent
dd1ed631d3
commit
00c926bf1b
2 changed files with 155 additions and 145 deletions
BIN
rapport.pdf
Binary file not shown.
74
rapport.tex
|
|
@@ -70,7 +70,7 @@
|
||||||
\chapter{Context}
|
\chapter{Context}
|
||||||
|
|
||||||
\section{Usage and importance of bipartite graphs}
|
\section{Usage and importance of bipartite graphs}
|
||||||
|
\label{sec:usage-and-importance-of-bipartite-graphs}
|
||||||
Bipartite graphs, denoted as $G = (U,V,E)$ with $U$ and $V$ two disjoint and
|
Bipartite graphs, denoted as $G = (U,V,E)$ with $U$ and $V$ two disjoint and
|
||||||
independent sets of vertices and $E$ the set of edges connecting $U$ vertices to
|
independent sets of vertices and $E$ the set of edges connecting $U$ vertices to
|
||||||
$V$ vertices.
|
$V$ vertices.
|
||||||
|
|
@@ -78,7 +78,7 @@ $V$ vertices.
|
||||||
\begin{minipage}{0.5\linewidth}
|
\begin{minipage}{0.5\linewidth}
|
||||||
\centering
|
\centering
|
||||||
Bipartite network\\
|
Bipartite network\\
|
||||||
\begin{tikzpicture}[scale=.6]
|
\begin{tikzpicture}[scale=.6]
|
||||||
\tikzstyle{every edge}=[-,>=stealth',shorten >=1pt,auto,draw,line width=1.5pt]
|
\tikzstyle{every edge}=[-,>=stealth',shorten >=1pt,auto,draw,line width=1.5pt]
|
||||||
\tikzstyle{every state}=[draw, text=black,scale=0.95, transform shape]
|
\tikzstyle{every state}=[draw, text=black,scale=0.95, transform shape]
|
||||||
\tikzstyle{every state}=[draw=none,text=black,scale=0.75, transform shape]
|
\tikzstyle{every state}=[draw=none,text=black,scale=0.75, transform shape]
|
||||||
|
|
@@ -103,25 +103,25 @@ $V$ vertices.
|
||||||
\path (A2) edge (B4);
|
\path (A2) edge (B4);
|
||||||
\path (A3) edge (B5);
|
\path (A3) edge (B5);
|
||||||
\path (A2) edge (B5);
|
\path (A2) edge (B5);
|
||||||
\end{tikzpicture}
|
\end{tikzpicture}
|
||||||
\end{minipage}
|
\end{minipage}
|
||||||
\begin{minipage}{0.5\linewidth}
|
\begin{minipage}{0.5\linewidth}
|
||||||
\begin{center}
|
\begin{center}
|
||||||
Incidence matrix
|
Incidence matrix
|
||||||
$B=\left(
|
$B=\left(
|
||||||
\begin{array}{rrrrr}
|
\begin{array}{rrrrr}
|
||||||
1 & 1 & 1 & 1 & 0 \\
|
1 & 1 & 1 & 1 & 0 \\
|
||||||
0 & 0 & 1 & 1 & 1 \\
|
0 & 0 & 1 & 1 & 1 \\
|
||||||
0 & 0 & 0 & 0 & 1 \\
|
0 & 0 & 0 & 0 & 1 \\
|
||||||
\end{array}\right)
|
\end{array}\right)
|
||||||
$\\
|
$\\
|
||||||
\end{center}
|
\end{center}
|
||||||
\end{minipage}
|
\end{minipage}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
This representation can be used to represent various forms of interactions where
|
This representation can be used to represent various forms of interactions where
|
||||||
two kinds of ''actors'' interact. Those interactions can be binary or valued and
|
two kinds of "actors" interact. Those interactions can be binary or valued and
|
||||||
a numeric representation is the incidence matrix, in the above example $B$.\\
|
a numeric representation is the incidence matrix, in the above example $B$.\\
|
||||||
|
|
||||||
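As a rough illustration of how the incidence matrix relates to the edge list, the sketch below rebuilds $B$ from a list of edges. The edge list is read off the matrix shown above; the vertex names A1 and B1 to B3 are assumed by analogy with the A2, A3, B4 and B5 nodes visible in the figure code, and the helper name is hypothetical.

```python
# Edge list read off the incidence matrix B above.
# Rows are the U vertices (A1..A3), columns are the V vertices (B1..B5).
# Vertex names are assumptions made for this illustration.
edges = [
    ("A1", "B1"), ("A1", "B2"), ("A1", "B3"), ("A1", "B4"),
    ("A2", "B3"), ("A2", "B4"), ("A2", "B5"),
    ("A3", "B5"),
]


def incidence_matrix(edges, rows, cols):
    """Return the binary incidence matrix as a list of row lists."""
    row_index = {u: i for i, u in enumerate(rows)}
    col_index = {v: j for j, v in enumerate(cols)}
    matrix = [[0] * len(cols) for _ in rows]
    for u, v in edges:
        # A 1 in cell (u, v) records an interaction between u and v.
        matrix[row_index[u]][col_index[v]] = 1
    return matrix


U = ["A1", "A2", "A3"]
V = ["B1", "B2", "B3", "B4", "B5"]
B = incidence_matrix(edges, U, V)
```

For valued interactions the same construction would store the interaction value instead of 1.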
Among the use cases of bipartite graphs one can find the Netflix Problem, which
|
Among the use cases of bipartite graphs one can find the Netflix Problem, which
|
||||||
|
|
@@ -147,20 +147,20 @@ Some interesting results can arise when applying a tool widely used on a particu
|
||||||
kind of interactions is used on another kind of interactions. Companies like
|
kind of interactions is used on another kind of interactions. Companies like
|
||||||
Netflix use recommender systems to recommend another product to consumers based
|
Netflix use recommender systems to recommend another product to consumers based
|
||||||
on their previous interactions.
|
on their previous interactions.
|
||||||
In \cite{desjardins-proulxEcologicalInteractionsNetflix2017} the authors use the
|
In ~\cite{desjardins-proulxEcologicalInteractionsNetflix2017} the authors use the
|
||||||
\emph{K-nearest neighbour} (KNN) algorithm as a Recommender to predict missing
|
\emph{K-nearest neighbour} (KNN) algorithm as a Recommender to predict missing
|
||||||
preys for predators in a predator-prey network.
|
preys for predators in a predator-prey network.
|
||||||
|
|
||||||
\section{Latent Block Model}
|
\section{Latent Block Model}
|
||||||
|
\label{sec:latent-block-model}
|
||||||
The Latent Block Model (LBM) introduced by \cite{govaertLatentBlockModel2010}
|
The Latent Block Model (LBM) introduced by ~\cite{govaertLatentBlockModel2010}
|
||||||
adapts the Stochastic Block Model (SBM)
|
adapts the Stochastic Block Model (SBM)
|
||||||
(\cite{hollandStochasticBlockmodelsFirst1983};\cite{snijdersEstimationPredictionStochastic1997})
|
(~\cite{hollandStochasticBlockmodelsFirst1983};~\cite{snijdersEstimationPredictionStochastic1997})
|
||||||
to bipartite graphs.
|
to bipartite graphs.
|
||||||
|
|
||||||
\begin{small}
|
\begin{small}
|
||||||
Please note that we prefer the term ''BiSBM'' and will use both LBM and BiSBM to
|
Please note that we prefer the term "BiSBM" and will use both LBM and BiSBM to
|
||||||
designate the Stochastic Block Model applied to bipartite networks.
|
designate the Stochastic Block Model applied to bipartite networks.
|
||||||
\end{small}
|
\end{small}
|
||||||
|
|
||||||
This model supposes that:
|
This model supposes that:
|
||||||
|
|
@@ -236,8 +236,8 @@ This model supposes that:
|
||||||
\path (R31) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, right, fill=none] {$\alpha_{{\color{electricblue}\bullet}{\color{yellow}\bullet}}$} (B5);
|
\path (R31) edge[-,>=stealth',shorten >=1pt,auto,draw=gray,line width=1.5pt, fill=gray, opacity=1] node[midway, right, fill=none] {$\alpha_{{\color{electricblue}\bullet}{\color{yellow}\bullet}}$} (B5);
|
||||||
|
|
||||||
\end{tikzpicture}
|
\end{tikzpicture}
|
||||||
\caption{An LBM model visualization}
|
\caption{An LBM model visualization}
|
||||||
\label{fig:LBMvisu}
|
\label{fig:LBMvisu}
|
||||||
\end{figure}
|
\end{figure}
|
||||||
|
|
||||||
Parameters
|
Parameters
|
||||||
|
|
@@ -258,13 +258,13 @@ varied structures. But when trying to determine the structure of a given network
|
||||||
we need to find those parameters.
|
we need to find those parameters.
|
||||||
|
|
||||||
For this a common approach is to use a VEM algorithm
|
For this a common approach is to use a VEM algorithm
|
||||||
(proposed for SBM in \cite{daudinMixtureModelRandom2008} and for LBM in \cite{govaertEMAlgorithmBlock2005})
|
(proposed for SBM in ~\cite{daudinMixtureModelRandom2008} and for LBM in ~\cite{govaertEMAlgorithmBlock2005})
|
||||||
with which those groups and the required parameters can be inferred by maximizing a lower
|
with which those groups and the required parameters can be inferred by maximizing a lower
|
||||||
bound of the likelihood minus a penalty.
|
bound of the likelihood minus a penalty.
|
||||||
|
|
||||||
\section{colSBM model, a joint model for a collection of networks}
|
\section{colSBM model, a joint model for a collection of networks}
|
||||||
|
\label{sec:colsbm-model-a-joint-model-for-a-collection-of-networks}
|
||||||
The \emph{colSBM} model introduced by \cite{chabert-liddellLearningCommonStructures2023}
|
The \emph{colSBM} model introduced by ~\cite{chabert-liddellLearningCommonStructures2023}
|
||||||
proposes an extension of the SBM model to collections of SBMs. A collection is a
|
proposes an extension of the SBM model to collections of SBMs. A collection is a
|
||||||
set of networks whose nodes are not common or linked between different networks,
|
set of networks whose nodes are not common or linked between different networks,
|
||||||
the interactions have the same valuations and are of the same type.
|
the interactions have the same valuations and are of the same type.
|
||||||
|
|
@@ -279,11 +279,12 @@ it to the bipartite case.
|
||||||
\chapter{Adjustment of colSBM to the bipartite case: colBiSBM}
|
\chapter{Adjustment of colSBM to the bipartite case: colBiSBM}
|
||||||
|
|
||||||
\section{Definition of the model}
|
\section{Definition of the model}
|
||||||
|
\label{sec:definition-of-the-model}
|
||||||
Here are some common notations and conventions that we will use in the following
|
Here are some common notations and conventions that we will use in the following
|
||||||
sections.
|
sections.
|
||||||
|
|
||||||
\subsection{A collection of i.i.d Bipartite SBM}
|
\subsection{A collection of i.i.d Bipartite SBM}
|
||||||
|
\label{ssec:a-collection-of-i-i-d-bipartite-sbm}
|
||||||
As for \emph{colSBM} this first model is the most constrained. It assumes
|
As for \emph{colSBM} this first model is the most constrained. It assumes
|
||||||
that all the networks are independent realizations of the same $Q_1$-$Q_2$-BiSBM
|
that all the networks are independent realizations of the same $Q_1$-$Q_2$-BiSBM
|
||||||
with identical parameters. The \emph{iid-colBiSBM} is defined as follows:
|
with identical parameters. The \emph{iid-colBiSBM} is defined as follows:
|
||||||
|
|
@@ -295,6 +296,7 @@ with identical parameters. The \emph{iid-colBiSBM} is defined as follows:
|
||||||
|
|
||||||
|
|
||||||
\section{Variational Expectation step}
|
\section{Variational Expectation step}
|
||||||
|
\label{sec:variational-expectation-step}
|
||||||
Fixed point formula for the Bernoulli distribution:
|
Fixed point formula for the Bernoulli distribution:
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item[-] \textit{iid} :
|
\item[-] \textit{iid} :
|
||||||
|
|
@@ -317,13 +319,14 @@ with $\text{Mask}^{m}$ the matrix containing $0$ if the value is a NA and a 1
|
||||||
otherwise.
|
otherwise.
|
||||||
|
|
||||||
\section{M step of the algorithm}
|
\section{M step of the algorithm}
|
||||||
|
\label{sec:m-step-of-the-algorithm}
|
||||||
Incorporate the equations from \parencite{chabert-liddellLearningCommonStructures2023}
|
Incorporate the equations from \parencite{chabert-liddellLearningCommonStructures2023}
|
||||||
|
|
||||||
\section{Computation of the variational bound}
|
\section{Computation of the variational bound}
|
||||||
|
\label{sec:computation-of-the-variational-bound}
|
||||||
|
|
||||||
\section{Penalties}
|
\section{Penalties}
|
||||||
|
\label{sec:penalties}
|
||||||
\paragraph*{\textit{iid-colBiSBM}}
|
\paragraph*{\textit{iid-colBiSBM}}
|
||||||
For the \textit{iid-colBiSBM} the penalties were modified in the following way :
|
For the \textit{iid-colBiSBM} the penalties were modified in the following way :
|
||||||
|
|
||||||
|
|
@@ -338,7 +341,7 @@ For the \textit{iid-colBiSBM} the penalties were modified in the following way :
|
||||||
\end{itemize}
|
\end{itemize}
|
||||||
And thus the $\text{BIC-L}$ formula is now:
|
And thus the $\text{BIC-L}$ formula is now:
|
||||||
\[ \text{BIC-L}(\bm{X},Q_1, Q_2) = \max_{\theta} \mathcal{J} (\mathcal{\hat{R}}, \bm{\theta})
|
\[ \text{BIC-L}(\bm{X},Q_1, Q_2) = \max_{\theta} \mathcal{J} (\mathcal{\hat{R}}, \bm{\theta})
|
||||||
- \frac{1}{2} [\text{pen}_{\pi}(Q_1) + \text{pen}_{\rho}(Q_2) + \text{pen}_{\alpha}(Q_1, Q_2)]\]
|
- \frac{1}{2} [\text{pen}_{\pi}(Q_1) + \text{pen}_{\rho}(Q_2) + \text{pen}_{\alpha}(Q_1, Q_2)]\]
|
||||||
|
|
||||||
\paragraph*{\textit{$\rho\pi$-colBiSBM}}
|
\paragraph*{\textit{$\rho\pi$-colBiSBM}}
|
||||||
For the \textit{$\rho\pi$-colBiSBM} the penalties are the following:
|
For the \textit{$\rho\pi$-colBiSBM} the penalties are the following:
|
||||||
|
|
@@ -361,23 +364,26 @@ And the corresponding BIC-L formula:
|
||||||
\begin{aligned}
|
\begin{aligned}
|
||||||
\text{BIC-L}(\bm{X},Q_1, Q_2) =
|
\text{BIC-L}(\bm{X},Q_1, Q_2) =
|
||||||
\max_{S_1,S_2} [
|
\max_{S_1,S_2} [
|
||||||
& \max_{\theta_{S_1,S_2} \in \Theta_{S_1,S_2}} \mathcal{J}(\mathcal{\hat{R}},\theta_{S_1,S_2})\\
|
& \max_{\theta_{S_1,S_2} \in \Theta_{S_1,S_2}} \mathcal{J}(\mathcal{\hat{R}},\theta_{S_1,S_2}) \\
|
||||||
- \frac{1}{2} & (\text{pen}_{\pi}(Q_1, S_1) + \text{pen}_{\rho}(Q_2, S_2)\\
|
- \frac{1}{2} & (\text{pen}_{\pi}(Q_1, S_1) + \text{pen}_{\rho}(Q_2, S_2) \\
|
||||||
&+ \text{pen}_{\alpha}(Q_1, Q_2, S_1, S_2)\\
|
& + \text{pen}_{\alpha}(Q_1, Q_2, S_1, S_2) \\
|
||||||
&+ \text{pen}_{S_1}(Q_1) + \text{pen}_{S_2}(Q_2))]\\
|
& + \text{pen}_{S_1}(Q_1) + \text{pen}_{S_2}(Q_2))] \\
|
||||||
\end{aligned}
|
\end{aligned}
|
||||||
\]
|
\]
|
||||||
|
|
||||||
\section{Latent space exploration and model selection}
|
\section{Latent space exploration and model selection}
|
||||||
|
\label{sec:latent-space-exploration-and-model-selection}
|
||||||
In order to explore the bi-dimensional latent space $(Q_1,Q_2)$
|
In order to explore the bi-dimensional latent space $(Q_1,Q_2)$
|
||||||
we use the following strategies.
|
we use the following strategies.
|
||||||
|
|
||||||
\subsection{Model selection}
|
\subsection{Model selection}
|
||||||
|
\label{ssec:model-selection}
|
||||||
In the following steps the model selection consists of using the BIC-L
|
In the following steps the model selection consists of using the BIC-L
|
||||||
criterion to select the model. We choose among the proposed models the one that
|
criterion to select the model. We choose among the proposed models the one that
|
||||||
maximizes the BIC-L.
|
maximizes the BIC-L.
|
||||||
|
|
||||||
\subsection{Initialization and pairing of the models}
|
\subsection{Initialization and pairing of the models}
|
||||||
|
\label{ssec:initialization-and-pairing-of-the-models}
|
||||||
First, to combine the information from the $M$ networks, we fit a collection model
|
First, to combine the information from the $M$ networks, we fit a collection model
|
||||||
for each network at the two points $Q = (1, 2)$ and $Q = (2, 1)$. Using the
|
for each network at the two points $Q = (1, 2)$ and $Q = (2, 1)$. Using the
|
||||||
previously described VEM algorithm we obtain for each network its parameters
|
previously described VEM algorithm we obtain for each network its parameters
|
||||||
|
|
@@ -396,12 +402,13 @@ Using this order we relabel the memberships for the $M$ fitted collection of a
|
||||||
single network.
|
single network.
|
||||||
Then we use the $M$ memberships to fit a collection containing the $M$ networks.
|
Then we use the $M$ memberships to fit a collection containing the $M$ networks.
|
||||||
\subsection{Greedy exploration to find an estimation of the mode}
|
\subsection{Greedy exploration to find an estimation of the mode}
|
||||||
|
\label{ssec:greedy-exploration-to-find-an-estimation-of-the-mode}
|
||||||
Using the previously fitted models for $Q = (1,2)$ and $Q = (2,1)$ we choose to
|
Using the previously fitted models for $Q = (1,2)$ and $Q = (2,1)$ we choose to
|
||||||
perform a greedy exploration to find a first mode.
|
perform a greedy exploration to find a first mode.
|
||||||
|
|
||||||
Meaning that for a given $Q = (Q_1, Q_2)$ we will compute all the possible
|
Meaning that for a given $Q = (Q_1, Q_2)$ we will compute all the possible
|
||||||
memberships for the points $Q \in \{(Q_1 + 1, Q_2),(Q_1, Q_2 + 1),(Q_1 - 1, Q_2),
|
memberships for the points $Q \in \{(Q_1 + 1, Q_2),(Q_1, Q_2 + 1),(Q_1 - 1, Q_2),
|
||||||
(Q_1, Q_2 - 1)\}$, fit
|
(Q_1, Q_2 - 1)\}$, fit
|
||||||
the corresponding models and choose the one that maximizes the BIC-L as the
|
the corresponding models and choose the one that maximizes the BIC-L as the
|
||||||
next point from which to repeat the procedure. We repeat the procedure until the
|
next point from which to repeat the procedure. We repeat the procedure until the
|
||||||
BIC-L stops increasing $2$ times in a row.
|
BIC-L stops increasing $2$ times in a row.
|
||||||
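The greedy exploration described above can be sketched as follows. This is only an illustration under stated assumptions: the callable named fit\_bicl is hypothetical and stands for "compute the memberships at a point, fit the model and return its BIC-L", the "stops increasing" test is read as "does not improve on the best BIC-L seen so far", and the toy criterion at the end is invented for the example.

```python
def greedy_mode_search(fit_bicl, start, patience=2):
    """Greedy walk over (Q1, Q2) towards a mode of the BIC-L.

    fit_bicl is a hypothetical callable taking a point (Q1, Q2) and
    returning the BIC-L of the model fitted at that point. The walk stops
    once the BIC-L has failed to increase `patience` times in a row.
    """
    cache = {}

    def score(q):
        # Fit each point at most once.
        if q not in cache:
            cache[q] = fit_bicl(q)
        return cache[q]

    current = start
    best, best_score = start, score(start)
    misses = 0
    while misses < patience:
        q1, q2 = current
        candidates = [(q1 + 1, q2), (q1, q2 + 1), (q1 - 1, q2), (q1, q2 - 1)]
        # Keep only points with at least one block in each dimension.
        candidates = [q for q in candidates if min(q) >= 1]
        # Move to the neighbour with the highest BIC-L.
        current = max(candidates, key=score)
        if score(current) > best_score:
            best, best_score = current, score(current)
            misses = 0
        else:
            misses += 1
    return best, best_score


def toy_bicl(q):
    """Toy criterion with a single mode at (3, 2); not a real BIC-L."""
    return -((q[0] - 3) ** 2 + (q[1] - 2) ** 2)


# Start from the better of the two initial points (1, 2) and (2, 1).
start = max([(1, 2), (2, 1)], key=toy_bicl)
mode, mode_score = greedy_mode_search(toy_bicl, start)
```

With the toy criterion the walk starts at $(2,1)$ and ends at $(3,2)$. In practice each call to the fitting function is expensive, which is why fitted points are cached.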
|
|
@@ -446,6 +453,7 @@ BIC-L stops increasing $2$ times in a row.
|
||||||
When this first estimation of the BIC-L mode has been found we apply the moving
|
When this first estimation of the BIC-L mode has been found we apply the moving
|
||||||
window on it.
|
window on it.
|
||||||
\subsection{Moving window to update the block memberships and the BIC-L}
|
\subsection{Moving window to update the block memberships and the BIC-L}
|
||||||
|
\label{ssec:moving-window-to-update-the-block-memberships-and-the-bic-l}
|
||||||
The \emph{moving window} is used to update the block memberships on rows and
|
The \emph{moving window} is used to update the block memberships on rows and
|
||||||
columns and fit new models with those changes.
|
columns and fit new models with those changes.
|
||||||
To define the window, we use a center point and a \emph{depth}, giving us the
|
To define the window, we use a center point and a \emph{depth}, giving us the
|
||||||
|
|
@@ -550,11 +558,13 @@ At the end of the moving window pass, the model of max BIC-L is the new best
|
||||||
fit and the procedure can repeat until convergence.
|
fit and the procedure can repeat until convergence.
|
||||||
|
|
||||||
\section{Networks clustering}
|
\section{Networks clustering}
|
||||||
|
\label{sec:networks-clustering}
|
||||||
As in \parencite{chabert-liddellLearningCommonStructures2023} we use a recursive
|
As in \parencite{chabert-liddellLearningCommonStructures2023} we use a recursive
|
||||||
algorithm to determine the best clustering of the given networks. The procedure
|
algorithm to determine the best clustering of the given networks. The procedure
|
||||||
being the same, only the technical modifications for the bipartite case will be
|
being the same, only the technical modifications for the bipartite case will be
|
||||||
explained below.
|
explained below.
|
||||||
\subsection{Distance between two networks}
|
\subsection{Distance between two networks}
|
||||||
|
\label{ssec:distance-between-two-networks}
|
||||||
The distance weights use $\pi$ and $\rho$.
|
The distance weights use $\pi$ and $\rho$.
|
||||||
\[
|
\[
|
||||||
D_{\mathcal{M}}(m,m') = \sum_{q = 1}^{Q_1} \sum_{r = 1}^{Q_2} \max(\widetilde{\pi}_{q}^{m}, \widetilde{\pi}_{q}^{m'}) \left( \frac{\widetilde{\alpha}_{qr}^{m}}{\widehat{\delta}_{m}} - \frac{\widetilde{\alpha}_{qr}^{m'}}{\widehat{\delta}_{m'}}\right)^{2} \max(\widetilde{\rho}_{r}^{m}, \widetilde{\rho}_{r}^{m'})
|
D_{\mathcal{M}}(m,m') = \sum_{q = 1}^{Q_1} \sum_{r = 1}^{Q_2} \max(\widetilde{\pi}_{q}^{m}, \widetilde{\pi}_{q}^{m'}) \left( \frac{\widetilde{\alpha}_{qr}^{m}}{\widehat{\delta}_{m}} - \frac{\widetilde{\alpha}_{qr}^{m'}}{\widehat{\delta}_{m'}}\right)^{2} \max(\widetilde{\rho}_{r}^{m}, \widetilde{\rho}_{r}^{m'})
|
||||||
|
|
|
||||||
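To make the double sum in $D_{\mathcal{M}}(m,m')$ concrete, here is a minimal sketch that evaluates it term by term. The function and argument names are assumptions chosen for this illustration, the inputs are treated as plain numbers and nested lists, and the numeric values in the usage lines are invented.

```python
def network_distance(pi_m, rho_m, alpha_m, delta_m,
                     pi_n, rho_n, alpha_n, delta_n):
    """Evaluate D(m, m') for two networks m and m' (called n here).

    pi_* and rho_* are lists of length Q1 and Q2, alpha_* are Q1 x Q2
    nested lists, and delta_* are the scalars dividing alpha in the formula.
    """
    total = 0.0
    for q in range(len(pi_m)):
        for r in range(len(rho_m)):
            # Squared difference of the rescaled connectivity parameters.
            diff = alpha_m[q][r] / delta_m - alpha_n[q][r] / delta_n
            # Weighted by the larger block proportion on each side.
            total += max(pi_m[q], pi_n[q]) * diff ** 2 * max(rho_m[r], rho_n[r])
    return total


# Invented values with Q1 = 2 row blocks and Q2 = 1 column block.
d_same = network_distance([0.5, 0.5], [1.0], [[0.5], [0.25]], 1.0,
                          [0.5, 0.5], [1.0], [[0.5], [0.25]], 1.0)
d_diff = network_distance([0.5, 0.5], [1.0], [[0.5], [0.25]], 1.0,
                          [0.25, 0.75], [1.0], [[0.25], [0.25]], 1.0)
```

Two networks with identical rescaled parameters are at distance zero, and the result does not depend on the order in which the two networks are given.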