@@ -90,7 +90,10 @@ To interpret this optimal GMM with \(K=9\), the nine cluster centers \(\mu_{1..9
\centering
\includegraphics[width=\linewidth]{clusterfuck23}
\caption{The data and the clusters plotted on the second and third principal components. Along the third principal component, cluster 6 and cluster 8 correspond somewhat to principal component directions.}
\end{figure}
\end{figure}\noindent
In the first plot, most of the data gathers in a dense, raisin-bun-like blob in the middle, showing that many of the clusters capture differences that are too small to be visible in these two principal components. The fourth cluster does, however, correspond to high values of the first principal component, which in the first report was linked to high acousticness, low energy and minor key. An upward direction along the second principal component can also be linked with the third cluster and with songs in the fourth class, corresponding to high energy.
The second plot, which uses the second and third principal components, is included to gain more information, but it is also tightly grouped around $(0,0)$. The sixth cluster can, however, be linked to low values of the third principal component, which was linked with short, loud, high-energy songs with little speech. This cluster seems to correspond somewhat to the fourth class: the songs with the highest tempo.
...
...
@@ -117,23 +120,23 @@ To evaluate if the cluterings are similar to the premade clusterings of the temp
\caption{The table shows the similarity scores between the model clusters and the target clusters.}
\label{simtab}
\end{table}\noindent
To see if the Hierarchical clustering is similar to the Gaussian mixture model the same similarity measures are calculated for these two clusters. The similarity score of the Hierarchical clusters and GMM clusters measured as:
To see if the hierarchical clustering is similar to the Gaussian mixture model, the same similarity measures are calculated between the clusterings of the two models. The similarity scores of the hierarchical clusters and the GMM clusters are measured as:
\begin{align*}
&\mathrm{Rand \ Index}: \quad 0.7241 &&
\mathrm{Jaccard}: \quad 0.1460 &&&
\mathrm{NMI}: \quad 0.2302
\end{align*}
This suggests that the clusters of the GMM and hierarchical models are mutually more similar than they are to the target clusters. Overall, the similarity scores in table \ref{simtab} show that the models found do not describe the target clustering of tempo very well. This does not necessarily imply that the GMM and hierarchical clustering models do not cluster the data well, but rather that clustering by tempo does not explain the data well.
This suggests that the clusters of the GMM and hierarchical models are mutually more similar than they are to the target clusters. Overall, the similarity scores in table \ref{simtab} show that the models found do not describe the target clustering of tempo well. This does not necessarily imply that the GMM and hierarchical clustering models do not cluster the data well, but rather that clustering by tempo does not explain the data well.
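\noindent As a reminder of what the three scores measure, the following is a sketch based on the usual pair-counting and information-theoretic definitions. The symbols $N$, $S$, $D$, $Z$ and $Q$ are introduced here for illustration only, and the normalisation of the NMI is an assumption, since several variants are in common use and the one used to produce the numbers above is not stated. Let $N$ be the number of songs, $S$ the number of pairs of songs placed in the same cluster by both clusterings, and $D$ the number of pairs placed in different clusters by both clusterings. With $Z$ and $Q$ denoting the two cluster assignments, $I(Z;Q)$ their mutual information and $H(\cdot)$ the entropy,
\begin{align*}
\mathrm{Rand \ Index} &= \frac{S + D}{\tfrac{1}{2}N(N-1)}, &
\mathrm{Jaccard} &= \frac{S}{\tfrac{1}{2}N(N-1) - D}, &
\mathrm{NMI} &= \frac{I(Z;Q)}{\sqrt{H(Z)\,H(Q)}}.
\end{align*}
Under these definitions the Rand Index rewards pairs that both clusterings separate, while the Jaccard score ignores them. When both clusterings contain many clusters, most pairs are separated by both, so $D$ is large; this would be consistent with a fairly high Rand Index appearing together with a low Jaccard score.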
%Celebratory behavior is therefore more applicable to monkeys than human robots.
%RAND BOI:0.7241369982135971
%Jaccard_boi: 0.1460234998195718
%NMI: 0.23017110206217814
\section{Outlier Detection/Anomaly Detection}
\section{Outlier Detection}
\subsection{Observation Ranking using different Scoring Methods}
\subsection{Ranking songs by typicality}
%\textit{Rank the observations in terms of leave-one-out Gaussian Kernel Density, KNN Density %and KNN Average Relative Density}
The leave-one-out Gaussian Kernel Density estimation is calculated with the following expression,