Comparison of clustering algorithms for thyroid database

Samardziska, Anastasija and Martinovska Bande, Cveta (2020) Comparison of clustering algorithms for thyroid database. Balkan Journal of Applied Mathematics and Informatics, 3 (1). pp. 73-84.

Abstract

This paper proposes a methodology for analyzing, visualizing and clustering
data of patients with different symptoms from a thyroid database. In previous work, the thyroid data were
analyzed using the WITT algorithm. That clustering method correctly formed the clusters for the control group
and the hypothyroid patients but failed to cluster the hyperthyroid patients. In this paper we analyze the data
using several algorithms: K-means, hierarchical clustering, the EM algorithm, DBSCAN and the Cobweb algorithm.
The goal is to measure how closely the clusters produced match the class labels, and thereby to
determine which algorithms give better results. Classification-oriented measures are used to validate
the clustering results. We propose several preprocessing steps to overcome the problems caused by the large
amount of noise and the unbalanced classes in the given data set.
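
As a rough illustration of what matching clusters against class labels can look like, the sketch below computes two common classification-oriented measures, purity and size-weighted entropy, from a list of cluster assignments and a list of known labels. The abstract does not say which measures the paper uses, so the choice of these two is an assumption, and the function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def contingency(cluster_ids, class_labels):
    """Count, for each cluster, how many members carry each class label."""
    table = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        table[cluster][label] += 1
    return table


def purity(cluster_ids, class_labels):
    """Fraction of points that belong to the majority class of their cluster."""
    table = contingency(cluster_ids, class_labels)
    majority_total = sum(max(counts.values()) for counts in table.values())
    return majority_total / len(class_labels)


def weighted_entropy(cluster_ids, class_labels):
    """Size-weighted average of the class entropy (in bits) inside each cluster."""
    table = contingency(cluster_ids, class_labels)
    n = len(class_labels)
    total = 0.0
    for counts in table.values():
        size = sum(counts.values())
        h = -sum((c / size) * log2(c / size) for c in counts.values())
        total += (size / n) * h
    return total


# Hypothetical toy data: labels and cluster assignments are invented for illustration
# and are not taken from the thyroid database.
labels = ["normal", "normal", "normal", "hypo", "hypo", "hyper"]
clusters = [0, 0, 0, 1, 1, 0]
print(purity(clusters, labels))
print(weighted_entropy(clusters, labels))
```

Higher purity and lower entropy both indicate clusters that line up more closely with the class labels. In the toy data the single "hyper" point is absorbed into the "normal" cluster, which is the kind of mismatch such measures are meant to expose.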

Item Type: Article
Subjects: Natural sciences > Computer and information sciences
Divisions: Faculty of Computer Science
Depositing User: Cveta Martinovska Bande
Date Deposited: 02 Oct 2020 09:45
Last Modified: 02 Oct 2020 09:45
URI: https://eprints.ugd.edu.mk/id/eprint/24499
