rapport : changing margins, explaining ARI negatives, adding NA robustness sims,

Louis Lacoste 2024-07-18 17:01:51 +02:00
parent 7775298a00
commit cc77bcc7fc
6 changed files with 5989 additions and 4 deletions

@ -70,8 +70,12 @@ use the following indicators:
their true values. ($Q_1 = 4$ and $Q_2 = 4$)
\item Finally, we assess the quality of the node grouping by computing the
Adjusted Rand Index (ARI) \parencite{hubertComparingPartitions1985}: ARI = 0
for a random grouping, ARI = 1 for a
perfect match between groupings\footnote{Note that although the Rand
Index (RI) only takes values between 0 and 1, the ARI can be
negative when the RI is lower than its expected value. This
indicates structured discordance between the groupings, i.e., they
agree less than expected by chance.}.
For each network, for the
$\pi\text{-}colBiSBM$, $\rho\text{-}colBiSBM$,
$\pi\rho\text{-}colBiSBM$ we compare the inferred block memberships to
the real ones by computing the mean of the ARI per axis over the two

@ -36,11 +36,15 @@ less with the other communities. And $\bm{\alpha}^{nested}$ represents a common
structure detected in ecology with generalist and specialist species and a
\enquote{nested} structure.
Each collection contains two networks ($M=2$), of sizes $n^{m=1}_1 =
n^{m=1}_2 = 40$ and
$n^{m=2}_1 = n^{m=2}_2 = 120$. One collection is generated for each $colBiSBM$
model, and the nodes' block memberships (i.e., the row and column blocks they
belong to) are saved.
Per $colBiSBM$ model, 10 collections are generated and their results are
averaged.
In network $m=1$ (i.e., the smaller one), a proportion
$p_{\texttt{NA}}$ of the edges have their values replaced by \texttt{NA}; the
\enquote{forgotten} values are stored.
@ -57,6 +61,39 @@ and we store the same predictions.
\emph{Area Under the Curve} (AUC) for predicted versus real link values and the
ARI for predicted versus real block memberships.
For the comparison, we subtract the metric obtained with the LBM from the one
obtained with $colBiSBM$ and denote the difference $\Delta\mbox{metric}$.
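Written explicitly, and assuming the sign convention implied by the sentence
above (collection model minus LBM, with the subscripts used in the figure
captions), this difference reads
\[
\Delta\mbox{metric} = \mbox{metric}_{colBiSBM} - \mbox{metric}_{LBM},
\qquad \mbox{metric} \in \{\mbox{AUC}, \mbox{ARI}\},
\]
so that $\Delta\mbox{metric} > 0$ means that $colBiSBM$ outperforms the LBM
and $\Delta\mbox{metric} = 0$ means that both perform equally.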
\begin{figure}[ht]
\centering
\input{../tikz/simulations/na_robustness/auc-model}
\caption{$\Delta\mbox{AUC}$ as a function of $p_{\texttt{NA}}$. The dashed red
lines mark the value 0, at which
$\mbox{AUC}_{LBM} = \mbox{AUC}_{colBiSBM}$.}
\label{fig:auc-plot}
\end{figure}
\begin{figure}[ht]
\centering
\input{../tikz/simulations/na_robustness/ari-dim-model}
\caption{$\Delta\mbox{ARI}$ as a function of $p_{\texttt{NA}}$. The dashed red
lines mark the value 0, at which
$\mbox{ARI}_{LBM} = \mbox{ARI}_{colBiSBM}$.}
\label{fig:ari-dim-plot-na}
\end{figure}
\paragraph{Results}
In figure~\ref{fig:auc-plot}, one can see that, overall, the nested structure
seems to benefit most from the collection model, with generally
slightly higher $\Delta$AUC than the modular one.
In general, however, for $\epsilon\in[0.1,0.7]$ there are no clear
differences between LBM and colBiSBM regarding link prediction. The collection
model seems to be most effective for $\epsilon\in[0.8,0.9]$.
For the ARI, figure~\ref{fig:ari-dim-plot-na} suggests that the collection model
does at least as well as the LBM and improves the recovery of node memberships
for the modular structure starting from $\epsilon = 0.7$. Again, the nested
structure benefits from the collection model for smaller $\epsilon$ values, but
these increases in $\Delta$ARI are also smaller than those observed for the
modular structure.
\clearpage

Binary file not shown.

@ -19,6 +19,7 @@
\usepackage[citecolor=blueind,urlcolor=blueps,bookmarks=false,hypertexnames=true]{hyperref} % for hyperlinks in the document
\usepackage{tocbibind} % To get entries for the table of contents and bibliography
\usepackage{geometry}
\geometry{bmargin=25mm}
\usepackage{tikz} % For graph plots
\usepackage[outline]{contour}

File diff suppressed because it is too large

File diff suppressed because it is too large