report: text modification

Louis Lacoste 2024-07-11 23:52:59 +02:00
parent e412e00151
commit cd793e3094
2 changed files with 38 additions and 21 deletions

@ -81,15 +81,15 @@ interactions, the rows are pollinator species and the columns are plant species,
and each intersection holds a value: binary for presence/absence data, or a
count for abundance data.
Bipartite graphs are widely used in biology, in various fields, among which the
previously cited ecological networks, but also in medicine with biomedical
networks, biomolecular networks or epidemiological networks.
Bipartite graphs are widely used in biology in general, in various fields, among
which the previously cited ecological networks, but also in medicine with
biomedical networks, biomolecular networks or epidemiological networks.
\parencite{pavlopoulosBipartiteGraphsSystems2018}
Some interesting results can arise when a tool widely used on one kind of
interaction is applied to another kind of interaction.
Companies like Netflix use recommender system, to recommend another product to
consumers based on their previous interactions. In
Companies like Netflix or Amazon use recommender systems to recommend other
products to consumers based on their previous interactions.
In~\cite{desjardins-proulxEcologicalInteractionsNetflix2017}, the authors use the
\emph{K-nearest neighbour} (KNN) algorithm as a recommender to predict missing
prey for predators in a predator-prey network.
@ -101,9 +101,9 @@ adapts the Stochastic Block Model (SBM)
\parencite{hollandStochasticBlockmodelsFirst1983, snijdersEstimationPredictionStochastic1997}
to bipartite graphs.
\begin{small}
\textit{Note :}\begin{small}
Please note that we prefer the term ``BiSBM'' and will use both LBM and BiSBM to
designate the Stochastic Block model applied on bipartite networks.
designate the Stochastic Block Model applied on bipartite networks.
\end{small}
This model supposes that:

@ -112,14 +112,14 @@ the same problems as~\cite{chabert-liddellLearningCommonStructures2024a} and
adapt the support $S$ they define for the $\pi$-colSBM to the bipartite case by
having $S^1$ of size $M\times Q_1$ the support for the rows and $S^2$ of size
$M\times Q_2$ the support for the columns. Thus
$S^1_{mq} = \mathbb{1}_{\pi^m_q > 0}$ and
$S^2_{mr} = \mathbb{1}_{\rho^m_r > 0}$. In this case, $S^2 = \bm{1}$, because
$S^1_{mq} = \mathbbb{1}_{\pi^m_q > 0}$ and
$S^2_{mr} = \mathbbb{1}_{\rho^m_r > 0}$. In this case, $S^2 = \bm{1}$, because
there is no freedom on the column dimension.
For a given number of blocks $Q_1$, $Q_2$ and matrix $S^1$ ($S^2$ being in this
case the matrix full of ones), the number of parameters is:
\begin{equation*}
\text{NP}(\pi\text{-}colBiSBM) = \sum_{m=1}^{M}\Bigg( \sum_{q=1}^{Q_1} S^1_{mq} - 1 \Bigg) + (Q_2 - 1) + \sum_{\substack{q=1,\dots,Q_1 \\ r=1,\dots,Q_2}} \mathbb{1}_{{(S^{1\prime}S^2)}_{qr}>0}
\text{NP}(\pi\text{-}colBiSBM) = \sum_{m=1}^{M}\Bigg( \sum_{q=1}^{Q_1} S^1_{mq} - 1 \Bigg) + (Q_2 - 1) + \sum_{\substack{q=1,\dots,Q_1 \\ r=1,\dots,Q_2}} \mathbbb{1}_{{(S^{1\prime}S^2)}_{qr}>0}
\end{equation*}
The first term corresponds to the non-null block proportions in each network.
The third quantity accounts for the fact that some blocks may never be
@ -147,7 +147,7 @@ the column dimension.
For a given number of blocks $Q_1$, $Q_2$ and matrix $S^2$ ($S^1$ being in this
case the matrix full of ones), the number of parameters is:
\begin{equation*}
\text{NP}(\rho\text{-}colBiSBM) = (Q_1 - 1) + \sum_{m=1}^{M}\Bigg( \sum_{r=1}^{Q_2} S^2_{mr} - 1 \Bigg) + \sum_{\substack{q=1,\dots,Q_1 \\ r=1,\dots,Q_2}} \mathbb{1}_{{(S^{1\prime}S^2)}_{qr}>0}
\text{NP}(\rho\text{-}colBiSBM) = (Q_1 - 1) + \sum_{m=1}^{M}\Bigg( \sum_{r=1}^{Q_2} S^2_{mr} - 1 \Bigg) + \sum_{\substack{q=1,\dots,Q_1 \\ r=1,\dots,Q_2}} \mathbbb{1}_{{(S^{1\prime}S^2)}_{qr}>0}
\end{equation*}
$\pi\rho$-colBiSBM model still assumes that the networks share a common connectivity
@ -165,7 +165,7 @@ $\rho^m_r \in \left[ 0,1 \right], \sum_{r=1}^{Q_2} \rho^m_r = 1 $.
For a given number of blocks $Q_1$, $Q_2$ and matrices $S^1$, $S^2$, the number
of parameters is:
\begin{equation*}
\text{NP}(\pi\rho\text{-}colBiSBM) = \sum_{m=1}^{M}\Bigg( \sum_{q=1}^{Q_1} S^1_{mq} - 1 \Bigg) + \sum_{m=1}^{M}\Bigg( \sum_{r=1}^{Q_2} S^2_{mr} - 1 \Bigg) + \sum_{\substack{q=1,\dots,Q_1 \\ r=1,\dots,Q_2}} \mathbb{1}_{{(S^{1\prime}S^2)}_{qr}>0}
\text{NP}(\pi\rho\text{-}colBiSBM) = \sum_{m=1}^{M}\Bigg( \sum_{q=1}^{Q_1} S^1_{mq} - 1 \Bigg) + \sum_{m=1}^{M}\Bigg( \sum_{r=1}^{Q_2} S^2_{mr} - 1 \Bigg) + \sum_{\substack{q=1,\dots,Q_1 \\ r=1,\dots,Q_2}} \mathbbb{1}_{{(S^{1\prime}S^2)}_{qr}>0}
\end{equation*}
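As an illustration, with hypothetical supports that are assumed for this example
only and are not taken from the data of this report, consider $M = 2$ networks
with $Q_1 = Q_2 = 2$ and
\[
  S^1 = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
  S^2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad
  S^{1\prime}S^2 = \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}.
\]
The row-proportion term is $(2 - 1) + (1 - 1) = 1$, the column-proportion term
is $(1 - 1) + (2 - 1) = 1$, and three entries of $S^{1\prime}S^2$ are positive,
so that
\[
  \text{NP}(\pi\rho\text{-}colBiSBM) = 1 + 1 + 3 = 5.
\]
The pair $(q, r) = (2, 2)$ is not counted because, in this example, row block
$2$ is only present in the first network while column block $2$ is only present
in the second one: no network contains both.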
\section{Variational estimation of the parameters}\label{sec:variational-estimation-of-the-parameters}
@ -289,6 +289,10 @@ all networks over the number of possible interactions:
% Adapt BIC-L, exploration method because it is a challenge
% 1 BIC-L 2 model exploration
% Cite the conclusion of St Clair's article, discussion on bipartite
Section~\ref{sec:variational-estimation-of-the-parameters} explains how we
estimate the parameters of the model for a \emph{fixed} number of blocks
$Q_1$ and $Q_2$. As these are generally unknown, we need to explore the
latent space to find the \emph{best} values.
As discussed in~\cite{chabert-liddellLearningCommonStructures2024a}, the
algorithmic aspect becomes complex when dealing with the bipartite case. Due to
the size of the latent space being $\mathbb{N}^2$, conducting a complete
@ -299,8 +303,14 @@ challenge involved making significant choices, which are outlined below.
The procedures below are implemented in the \emph{colSBM} package, available at
\url{https://github.com/Chabert-Liddell/colSBM}.
\subsection{The BIC-L criterion for model selection}
\subsection{The \emph{Bayesian Information Criterion like} (BIC-L) criterion for model selection}
\label{ssec:the-bic-l-criterion-for-model-selection}
To select the best number of blocks, we need a criterion that measures the
adequacy between our model and the data. The ELBO might seem a good
criterion at first but, like the likelihood, it increases with the complexity
of the model. A good criterion should therefore make a \emph{trade-off} between
fit to the data and model complexity.
The Integrated Classified Likelihood (ICL) is a well-established tool in the SBM
and LBM domains for selecting the appropriate number of blocks. It was
introduced by~\cite{biernackiAssessingMixtureModel2000,
@ -322,8 +332,9 @@ well-separated blocks by imposing a penalty on the entropy of node grouping.
However, the objective of our study extends beyond grouping nodes into coherent
blocks. We also aim to assess the similarity of connectivity patterns across
different networks. Consequently, we aim to permit models that offer more
flexible node grouping without penalizing entropy. This leads us to formulate a
BIC-like criterion in the following manner:
flexible node grouping without penalizing entropy.
This leads us to formulate a BIC-like criterion in the following manner:
\[
\text{BIC-L} = \max_{\bm{\theta}} \mathbb{E}_{\widehat{\mathcal{R}}} [\ell(\bm{X,Z,W;\theta})] + \mathcal{H(\widehat{R})} - \frac{1}{2}\text{pen} = \max_{\bm{\theta}} \mathcal{J(\widehat{R}, \bm{\theta})} - \frac{1}{2}\text{pen}
@ -364,7 +375,7 @@ propose.
\log n_{2}^{m}. \]
Penalties for the $\bm\alpha$
\[ \text{pen}_{\alpha}(Q_1, Q_2, S_1, S_2) = (\sum_{q=1}^{Q_1}
\sum_{r=1}^{Q_2} \mathbb{1}_{(S_1)'S_2 > 0}) \log (N_M). \]
\sum_{r=1}^{Q_2} \mathbbb{1}_{(S_1)'S_2 > 0}) \log (N_M). \]
And the corresponding BIC-L formula,
\[
\begin{aligned}
@ -380,11 +391,16 @@ propose.
\subsection{Initialization and pairing of the models}
\label{ssec:initialization-and-pairing-of-the-models}
First to combine the information from the $M$ networks we fit a collection model
The row (resp. column) block memberships are the labels of row (resp. column)
nodes corresponding to the group to which they were assigned based on their
connection patterns. This adds another layer of complexity to the model
selection as we need to find the best $Q_1, Q_2$ and the best memberships for
each vertex.
First, to combine the information from the $M$ networks, we fit an LBM
for each network at the two points $Q = (1, 2)$ and $Q = (2, 1)$. Using the
previously described VEM algorithm, we obtain the parameters
($\bm{\rho,\pi,\alpha}$) of each network.
We then compute the marginal distributions for each dimension of each network,
and order the blocks of each network by decreasing marginal probability.
@ -395,10 +411,10 @@ For the memberships on the rows: $row~order_m = order\left(\rho_m \times
~^{t}(\alpha_m)\right)$.
Using this order we relabel the memberships for the $M$ fitted collections of a
single network. Then we use the $M$ memberships to fit a collection containing
single network.
We then use the $M$ memberships to fit a collection containing
the $M$ networks.
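To illustrate the relabelling on the rows, here is a small example with made-up
values, assumed for the sake of the illustration only (including the choice of a
fit with $Q = (2, 2)$):
\[
  \rho_m = \begin{pmatrix} 0.5 & 0.5 \end{pmatrix}, \qquad
  \alpha_m = \begin{pmatrix} 0.1 & 0.2 \\ 0.6 & 0.4 \end{pmatrix}, \qquad
  \rho_m \times {}^{t}(\alpha_m) = \begin{pmatrix} 0.15 & 0.5 \end{pmatrix}.
\]
Sorting these marginal probabilities in decreasing order places row block $2$
first, so row blocks $1$ and $2$ would exchange their labels for this network.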
\subsection{Greedy exploration to find an estimation of the mode}\label{ssec:greedy-exploration-to-find-an-estimation-of-the-mode}
Using the previously fitted models for $Q = (1,2)$ and $Q = (2,1)$ we choose to
perform a greedy exploration to find a first mode.
@ -408,7 +424,7 @@ memberships for the points $Q \in \{(Q_1 + 1, Q_2),(Q_1, Q_2 + 1),(Q_1 - 1,
maximizes the BIC-L as the next point from which to repeat the procedure. We
repeat the procedure until the BIC-L stops increasing $2$ times in a row.
\begin{algorithm}[t]
\begin{algorithm}[H]
\caption{Greedy Exploration for Mode Estimation}
\SetAlgoLined
\SetKwInOut{Input}{Input}
@ -447,6 +463,7 @@ repeat the procedure until the BIC-L stops increasing $2$ times in a row.
Once this first estimate of the BIC-L mode has been found, we apply the moving
window to it.
\subsection{Moving window to update the block memberships and the BIC-L}
\label{ssec:moving-window-to-update-the-block-memberships-and-the-bic-l}
The \emph{moving window} is used to update the block memberships on rows and