\addtocounter{customchapter}{1}
\chapter{Conclusions and future work}
\label{chap:conclusions-and-future-work}

\section{Conclusion}
\label{sec:conclusion}

\subsection{State of work at the end of the internship}
At the end of this internship we have:
\begin{itemize}
    \item A model capable of finding structure in collections of bipartite networks. It makes it possible to bring together networks whose grouping may not have seemed obvious.
    \item A clustering method that partitions a collection into sub-collections of networks with similar structures.
    \item An \texttt{R} package implementing all the methods described.
\end{itemize}

\subsection{Difficulties encountered}
\label{ssec:difficulties-encountered}

\paragraph{Local optima}
When applying our clustering to the data from~\cite{doreRelativeEffectsAnthropogenic2021}, we obtained interesting results. On closer inspection, however, we noticed that the clustering of such a large collection ($M=123$) was not fully reproducible. The result depends strongly on the seed of the random number generator and, since collections cannot be merged back once they have been split, the clustering dendrograms do not stabilize on large collections. This currently prevents us from clustering large collections. We suspect that our model selection procedure gets stuck in local optima of the BIC-L.

\paragraph{Large penalties with free mixture models}
While testing the clustering with the different models, we observed that the $\pi$, $\rho$ and $\pi\rho$ models, with their larger number of block membership parameters, tend to give smaller BIC-L values than the \emph{iid} model while reaching a higher Evidence Lower Bound (ELBO). This arises because the penalties on the block memberships and supports increase significantly and outweigh both the gain in the ELBO and the reduction in connectivity parameters.
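
The observation on penalties can be read schematically as follows. This is only a sketch: it assumes the criterion has the usual penalized form, and the symbols $\mathcal{J}$ (the ELBO), $\mathrm{pen}$ (the total penalty) and $\mathcal{M}$ (one of the $\pi$, $\rho$ or $\pi\rho$ models) are notation introduced here for illustration. Writing
\[
    \mathrm{BIC\mbox{-}L}(\mathcal{M}) = \mathcal{J}(\mathcal{M}) - \mathrm{pen}(\mathcal{M}),
\]
the comparison with the \emph{iid} model becomes
\[
    \mathrm{BIC\mbox{-}L}(\mathcal{M}) - \mathrm{BIC\mbox{-}L}(\mathit{iid})
    = \underbrace{\mathcal{J}(\mathcal{M}) - \mathcal{J}(\mathit{iid})}_{\Delta\mathcal{J}}
    - \underbrace{\bigl(\mathrm{pen}(\mathcal{M}) - \mathrm{pen}(\mathit{iid})\bigr)}_{\Delta\mathrm{pen}} .
\]
Under this reading, a free mixture model is preferred only when $\Delta\mathcal{J} > \Delta\mathrm{pen}$. The behaviour described above corresponds to the case $0 < \Delta\mathcal{J} < \Delta\mathrm{pen}$: the ELBO improves, but not enough to offset the additional penalty.
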
\section{Future work}
\label{sec:future-work}

\paragraph{Fixing local optima}
We are currently reviewing the procedure and its implementation to determine whether the seed dependence can be reduced or removed, and whether doing so would allow the procedure to escape the local optima.

\paragraph{Identifiability}
As stated in Section~\ref{sec:model-identifiability}, identifiability is currently established only for the \emph{iid}-colBiSBM. We will work on establishing it for the $\pi$, $\rho$ and $\pi\rho$ models, which are the most challenging in this respect.

\paragraph{Finding a trade-off between \emph{iid} and $\pi\rho$}
One idea to tackle the large penalties of the $\pi$, $\rho$ and $\pi\rho$ models is to assume that the block memberships for network $m$ are themselves realizations of random variables, thereby introducing a kind of mixed-effects model. This may provide a form of self-penalization while keeping the flexibility these models are intended to offer.

\paragraph{More ecological applications}
We have leads for applying the models to interesting cases, including:
\begin{itemize}
    \item Networks spread along an altitude gradient, which could be accounted for in the dissimilarity measure, for instance.
    \item Collections of different types of interactions (plant--seed disperser, host--parasite, \dots).
\end{itemize}

\paragraph{Turning this work into an article}
In the coming months, we will work on writing an article about the method.

\paragraph{Comparison to other graph clustering methods}
Recent work has compared \texttt{colSBM}~\parencite{chabert-liddellLearningCommonStructures2024a} and \texttt{graphclust}~\parencite{rebafkaModelbasedClusteringMultiple2023}, assessing various capabilities of the models with a particular focus on network clustering. We will reproduce and adapt this analysis to test other simulation settings that were not considered in that comparison.