rapport-CEI-MIA-2023/Rcodes/real_data/CoOPLBM_completion_analyze.tex

141 lines
4.9 KiB
TeX

% Options for packages loaded elsewhere
\PassOptionsToPackage{unicode}{hyperref}
\PassOptionsToPackage{hyphens}{url}
%
\documentclass[
]{article}
\usepackage{lmodern}
\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{textcomp} % provide euro and other symbols
\else % if luatex or xetex
\usepackage{unicode-math}
\defaultfontfeatures{Scale=MatchLowercase}
\defaultfontfeatures[\rmfamily]{Ligatures=TeX,Scale=1}
\fi
% Use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\IfFileExists{microtype.sty}{% use microtype if available
\usepackage[]{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\makeatletter
\@ifundefined{KOMAClassName}{% if non-KOMA class
\IfFileExists{parskip.sty}{%
\usepackage{parskip}
}{% else
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}}
}{% if KOMA class
\KOMAoptions{parskip=half}}
\makeatother
\usepackage{xcolor}
\IfFileExists{xurl.sty}{\usepackage{xurl}}{} % add URL line breaks if available
\IfFileExists{bookmark.sty}{\usepackage{bookmark}}{\usepackage{hyperref}}
\hypersetup{
pdftitle={Netclustering analysis with the CoOPLBM completion},
hidelinks,
pdfcreator={LaTeX via pandoc}}
\urlstyle{same} % disable monospaced font for URLs
\usepackage[margin=1in]{geometry}
\usepackage{graphicx}
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
\def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
\makeatother
% Scale images if necessary, so that they will not overflow the page
% margins by default, and it is still possible to overwrite the defaults
% using explicit options in \includegraphics[width, height, ...]{}
\setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
% Set default figure placement to htbp
\makeatletter
\def\fps@figure{htbp}
\makeatother
\setlength{\emergencystretch}{3em} % prevent overfull lines
\providecommand{\tightlist}{%
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{-\maxdimen} % remove section numbering
\newlength{\cslhangindent}
\setlength{\cslhangindent}{1.5em}
\newenvironment{cslreferences}%
{\setlength{\parindent}{0pt}%
\everypar{\setlength{\hangindent}{\cslhangindent}}\ignorespaces}%
{\par}
\title{Netclustering analysis with the CoOPLBM completion}
\author{}
\date{\vspace{-2.5em}}
\begin{document}
\maketitle
\hypertarget{context-of-this-analysis}{%
\section{Context of this analysis}\label{context-of-this-analysis}}
After performing a netclustering on the raw data, we will see if the
detect structure resulting in the clustering comes from the sampling
effort. To test this we will use the CoOPLBM model by Anakok et al.
(2022) to complete the data.
The CoOPLBM model assumes that the observed incidence matrix \(R\) is an
element-wise product of an \(M\) matrix following an LBM and an \(N\)
matrix which elements follow Poisson distributions independent on \(M\).
The model gives us the \(\hat{M}\) matrix, which elements are:
\begin{itemize}
\tightlist
\item
1 if the interaction was observed
\item
a probability, that there should be an interaction but it wasn't
observed
\end{itemize}
This \emph{completed matrix} can be used in different manners to be fed
to the colSBM model.
\hypertarget{threshold-based-completions}{%
\section{Threshold based
completions}\label{threshold-based-completions}}
With the thresholds, the infered incidence matrix obtained by CoOPLBM is
used to generate a completed incidence matrix by the following procedure
: \[X_{ij} = \begin{cases}
1 & \text{if the value is over the threshold} \\
0 & \text{else} \\
\end{cases}\]
\hypertarget{completed-threshold}{%
\subsection{0.5 completed threshold}\label{completed-threshold}}
Here, the completion threshold is set to \(0.5\).
\hypertarget{ari-of-networks-clustering-0.5-threshold-vs-raw-data}{%
\subsubsection{ARI of networks clustering: 0.5 threshold vs raw
data}\label{ari-of-networks-clustering-0.5-threshold-vs-raw-data}}
\hypertarget{sample-based-completions}{%
\section{Sample based completions}\label{sample-based-completions}}
The \(M\) matrix is used to sample a new \(X\) matrix which elements are
the realisation of Bernoulli distributions of probability \(M_{i,j}\).
\[\mathbb{P}(X_{i,j} = 1) = M_{i,j} \]
\hypertarget{references}{%
\section*{References}\label{references}}
\addcontentsline{toc}{section}{References}
\hypertarget{refs}{}
\begin{cslreferences}
\leavevmode\hypertarget{ref-anakokDisentanglingStructureEcological2022}{}%
Anakok, Emre, Pierre Barbillon, Colin Fontaine, and Elisa Thebault.
2022. ``Disentangling the Structure of Ecological Bipartite Networks
from Observation Processes.'' arXiv.
\url{http://arxiv.org/abs/2211.16364}.
\end{cslreferences}
\end{document}