Samardziska, Anastasija and Martinovska Bande, Cveta (2020) Comparison of clustering algorithms for thyroid database. Balkan Journal of Applied Mathematics and Informatics, 3 (1). pp. 73-84.
Abstract
This paper proposes a methodology for analyzing, visualizing and clustering
data on patients with different symptoms from a thyroid database. In previous work, the thyroid data were
analyzed using the WITT algorithm. That clustering method correctly formed the clusters for the control group
and the hypothyroid patients but failed to cluster the hyperthyroid patients. In this paper we analyze the data
using several algorithms: K-means, hierarchical clustering, the EM algorithm, DBSCAN and Cobweb.
The goal is to measure how closely the clusters produced match the class labels, and so
to determine which algorithms give better results. Classification-oriented measures are used to validate
the clustering results. We also propose several preprocessing steps to overcome the large
amount of noise and the unbalanced classes in the given data set.
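
To illustrate what matching clusters against class labels can look like, the sketch below computes purity, one common classification-oriented measure. The abstract does not say which measures the paper uses, so purity is an assumption chosen only as an example. The function name `purity` and the toy cluster assignments and labels are invented for illustration and are not taken from the thyroid database.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, class_labels):
    """Fraction of points whose class is the majority class of their cluster."""
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one data point is required")

    # Count how often each class label occurs inside each cluster.
    members = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        members[cluster][label] += 1

    # Each cluster contributes the size of its largest class.
    majority_total = sum(max(counts.values()) for counts in members.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data (not from the paper): three clusters, three classes.
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
labels = ["control", "control", "hypo", "hypo", "hypo",
          "hyper", "hyper", "control"]

print(purity(clusters, labels))
```

A purity of 1.0 means every cluster contains a single class. Purity tends to rise as the number of clusters grows, so it is usually read alongside other measures such as entropy or the F-measure when algorithms that produce different numbers of clusters are compared.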
Item Type: | Article |
---|---|
Subjects: | Natural sciences > Computer and information sciences |
Divisions: | Faculty of Computer Science |
Depositing User: | Cveta Martinovska Bande |
Date Deposited: | 02 Oct 2020 09:45 |
Last Modified: | 02 Oct 2020 09:45 |
URI: | https://eprints.ugd.edu.mk/id/eprint/24499 |