% Commit 71d645db authored by Anders Henriksen
% !TeX spellcheck = en_US
\documentclass[11pt, fleqn]{article}
\usepackage{bm}
\usepackage{float}
\usepackage[caption = false]{subfig}
\subsection{Observation Ranking using different Scoring Methods}
\textit{Rank the observations in terms of leave-one-out Gaussian Kernel Density, KNN Density and KNN Average Relative Density}
The leave-one-out Gaussian kernel density of an observation $\mathbf{x}_i$ is calculated with the following expression, where each of the remaining $N-1$ observations contributes one Gaussian kernel,
\begin{equation}\label{key}
p(\mathbf{x}_i)=\frac{1}{N-1}\sum_{\substack{n=1 \\ n \neq i}}^{N} \mathcal{N}\left(\mathbf{x}_i \,\middle|\, \mathbf{x}_{n}, \sigma^{2} \mathbf{I}\right)
\end{equation}
Kernel density estimation approximates the probability density function of a random variable in a non-parametric way. For the Spotify dataset, each kernel is a multivariate normal distribution because the dataset contains several features; the estimate is thus equivalent to a Gaussian mixture with one component centred at every observation. The fitted density is then evaluated at each song in order to calculate its individual density score. An outlier in this model has a low density score, meaning the probability that the song belongs to any dense region of the data is low. The songs with the lowest density scores are illustrated in the bar chart below.
\begin{figure}[H]
\centering
\includegraphics[width=\linewidth]{out_KDE}
\end{figure}
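The leave-one-out score above can be sketched in a few lines of NumPy; this is a minimal illustration on synthetic data, not the code used to produce the figure, and the bandwidth $\sigma=1$ is an arbitrary choice:

```python
import numpy as np

def loo_gaussian_kde_scores(X, sigma=1.0):
    """Leave-one-out Gaussian KDE: for each point, average the Gaussian
    kernels centred at every *other* point."""
    N, D = X.shape
    # Pairwise squared Euclidean distances between all observations
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma**2)) / ((2 * np.pi * sigma**2) ** (D / 2))
    np.fill_diagonal(K, 0.0)          # exclude the point itself
    return K.sum(axis=1) / (N - 1)

# Toy data: a tight cluster plus one far-away outlier
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(50, 2)), [[10.0, 10.0]]])
scores = loo_gaussian_kde_scores(X, sigma=1.0)
ranking = np.argsort(scores)          # lowest density first
print(ranking[0])                     # the planted outlier (index 50) ranks first
```

Sorting the scores in ascending order and plotting the first few as a bar chart reproduces the kind of figure shown above.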
The KNN density estimation detects which observations deviate from normal behavior. First, the density is calculated as the inverse of the average distance to the $K$ nearest neighbours through the following expression,
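For reference, a sketch of the standard textbook definitions (the original expression is not reproduced in this excerpt, so this is an assumption; $N_K(\mathbf{x}_i)$ denotes the $K$ nearest neighbours of $\mathbf{x}_i$ and $d$ the Euclidean distance):
\begin{equation}
\mathrm{density}(\mathbf{x}_i, K) = \left( \frac{1}{K} \sum_{\mathbf{x}' \in N_K(\mathbf{x}_i)} d(\mathbf{x}_i, \mathbf{x}') \right)^{-1},
\qquad
\mathrm{ard}(\mathbf{x}_i, K) = \frac{\mathrm{density}(\mathbf{x}_i, K)}{\frac{1}{K} \sum_{\mathbf{x}_j \in N_K(\mathbf{x}_i)} \mathrm{density}(\mathbf{x}_j, K)}
\end{equation}
where the average relative density compares a point's density with that of its neighbours, so values well below one indicate outliers.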
\begin{figure}[H]
	\centering
	% ......
\end{figure}