% Commit 71d645db authored by Anders Henriksen
% !TeX spellcheck = en_US
\documentclass[11pt, fleqn]{article}
\usepackage{bm}
\usepackage{float}
\usepackage[caption = false]{subfig}
\subsection{Observation Ranking using different Scoring Methods}
\textit{Rank the observations in terms of leave-one-out Gaussian Kernel Density, KNN Density and KNN Average Relative Density}
The leave-one-out Gaussian kernel density of an observation $\mathbf{x}_i$ is calculated with the following expression, where each of the remaining $N-1$ observations contributes one Gaussian kernel,
\begin{equation}\label{key}
p(\mathbf{x}_i)=\frac{1}{N-1}\sum_{\substack{n=1 \\ n \neq i}}^{N} \mathcal{N}\left(\mathbf{x}_i \,\middle|\, \mathbf{x}_{n}, \sigma^{2} \mathbf{I}\right)
\end{equation}
Kernel density estimation approximates the probability density function of a random variable in a non-parametric way. For the Spotify dataset, each kernel is a multivariate normal distribution because the dataset contains several features; the estimate is thus equivalent to a Gaussian mixture with one component centred at every observation. The fitted density is then evaluated at each song in order to calculate its individual density score. An outlier in this model has a low density score, meaning the probability that the song belongs to any dense region of the data is low. The songs with the lowest density scores are illustrated in the bar chart below.
\begin{figure}[H]
\centering
\includegraphics[width=\linewidth]{out_KDE}
\end{figure}
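The leave-one-out score above can be sketched in a few lines of NumPy; this is a minimal illustration on synthetic data, not the code used to produce the figure, and the bandwidth $\sigma=1$ is an arbitrary choice:

```python
import numpy as np

def loo_gaussian_kde_scores(X, sigma=1.0):
    """Leave-one-out Gaussian KDE: for each point, average the Gaussian
    kernels centred at every *other* point."""
    N, D = X.shape
    # Pairwise squared Euclidean distances between all observations
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma**2)) / ((2 * np.pi * sigma**2) ** (D / 2))
    np.fill_diagonal(K, 0.0)          # exclude the point itself
    return K.sum(axis=1) / (N - 1)

# Toy data: a tight cluster plus one far-away outlier
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(50, 2)), [[10.0, 10.0]]])
scores = loo_gaussian_kde_scores(X, sigma=1.0)
ranking = np.argsort(scores)          # lowest density first
print(ranking[0])                     # the planted outlier (index 50) ranks first
```

Sorting the scores in ascending order and plotting the first few as a bar chart reproduces the kind of figure shown above.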
The KNN density estimation detects which observations deviate from normal behavior. First, the density is calculated as the inverse of the average distance to the $K$ nearest neighbours through the following expression,
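For reference, a sketch of the standard textbook definitions (the original expression is not reproduced in this excerpt, so this is an assumption; $N_K(\mathbf{x}_i)$ denotes the $K$ nearest neighbours of $\mathbf{x}_i$ and $d$ the Euclidean distance):
\begin{equation}
\mathrm{density}(\mathbf{x}_i, K) = \left( \frac{1}{K} \sum_{\mathbf{x}' \in N_K(\mathbf{x}_i)} d(\mathbf{x}_i, \mathbf{x}') \right)^{-1},
\qquad
\mathrm{ard}(\mathbf{x}_i, K) = \frac{\mathrm{density}(\mathbf{x}_i, K)}{\frac{1}{K} \sum_{\mathbf{x}_j \in N_K(\mathbf{x}_i)} \mathrm{density}(\mathbf{x}_j, K)}
\end{equation}
where the average relative density compares a point's density with that of its neighbours, so values well below one indicate outliers.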
\begin{figure}[H]
	\centering
	% ......
\end{figure}