\addtocounter{customchapter}{1}
\chapter{Conclusions and future work}
\label{chap:conclusions-and-future-work}
\section{Conclusion}
\label{sec:conclusion}
\subsection{Difficulties encountered}
\label{ssec:difficulties-encountered}
\paragraph{Seed dependence} Applying our clustering to the data
from~\cite{doreRelativeEffectsAnthropogenic2021} gave interesting results, but
on closer inspection the clustering of such a large collection ($M=123$) was
not fully reproducible. The result depends strongly on the seed of the random
generator and, since collections cannot be merged back\footnote{
Computing the distance requires $\bm{\alpha}$ of the same size, which means
that the networks must have been fitted together in the same collection.},
the clustering dendrograms do not stabilize on large collections.
This currently prevents us from clustering large collections.
\paragraph{Large penalties with free mixture models}
While testing the clustering with the different models, we observed that
the $\pi$, $\rho$ and $\pi\rho$ models, which have more block membership
parameters, tend to give smaller BIC-L criterion values than the \emph{iid}
model while reaching a higher Evidence Lower Bound (ELBO).
This arises because the penalties on the block memberships and supports
increase significantly and exceed both the gain in ELBO and the reduction in
the number of connectivity parameters.
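
As a schematic reading of this trade-off, assume that the criterion has the
usual form of a lower bound minus penalty terms. The symbols
$\mathrm{pen}_{\mathrm{mem}}$, $\mathrm{pen}_{\mathrm{sup}}$ and
$\mathrm{pen}_{\mathrm{con}}$ (penalties on block memberships, supports and
connectivity parameters) are introduced here for illustration only and do not
reproduce the exact expressions of the criterion:
\[
  \mbox{BIC-L} \;=\; \mathrm{ELBO}
  \;-\; \bigl(\mathrm{pen}_{\mathrm{mem}} + \mathrm{pen}_{\mathrm{sup}}
  + \mathrm{pen}_{\mathrm{con}}\bigr).
\]
Writing $\Delta$ for the difference between a free mixture model and the
\emph{iid} model, the free mixture model would then be preferred only if
\[
  \Delta\mathrm{ELBO} \;-\; \Delta\mathrm{pen}_{\mathrm{con}}
  \;>\; \Delta\mathrm{pen}_{\mathrm{mem}} + \Delta\mathrm{pen}_{\mathrm{sup}},
\]
where $\Delta\mathrm{pen}_{\mathrm{con}} \le 0$ when fewer connectivity
parameters are needed. Under this reading, the behaviour described above
corresponds to this inequality failing.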
\section{Future work}
\label{sec:future-work}
\paragraph{Fixing seed dependence}
We are currently investigating the procedure and the code to determine whether
seed dependence can be reduced or eliminated.
\paragraph{Identifiability}
As stated in section~\ref{sec:model-identifiability}, identifiability is
currently established only for the \emph{iid}-colBiSBM. We will work on
establishing it for the $\pi$, $\rho$ and $\pi\rho$ models, which are the most
challenging in this respect.
\paragraph{Finding a trade-off between \emph{iid} and $\pi\rho$}
One way to tackle the large penalties of the $\pi$, $\rho$ and
$\pi\rho$ models could be to assume that the block memberships
of network $m$ are themselves realizations of random variables,
thus introducing a kind of mixed-effects model. This might provide a
self-penalization that preserves the flexibility intended in these models.
\paragraph{Comparison to other graph clustering methods}
Recent work has compared
\texttt{colSBM}~\parencite{chabert-liddellLearningCommonStructures2024a} and
\texttt{graphclust}~\parencite{rebafkaModelbasedClusteringMultiple2023}, assessing various
capabilities of the models with a particular focus on network clustering.
We will reproduce and adapt the analysis to test other simulation settings that
were not considered in this work.
\section*{Thank you for reading this work}