Association and Classification Data Mining Algorithms Comparison over Medical Datasets

Bruno Fernandes Chimieski, Rubem Dutra Ribeiro Fagundes


Objectives: To compare Data Mining algorithms for Classification and Association tasks over medical datasets on dermatology, vertebral column and breast cancer patients, and to determine which algorithm performs best on each dataset. Methods: The classification algorithms were run over these datasets and compared using the precision, F-measure, ROC curve and Kappa performance metrics. For the association task, the Apriori algorithm was run to obtain a significant number of rules with confidence above 90%. Results: For diagnostic prediction on the breast cancer and dermatology datasets, the best classification algorithm was BayesNet; for the vertebral column dataset, it was the Logistic Model Tree. For the association task, 100 knowledge rules with confidence above 90% were extracted for the breast cancer and dermatology datasets, while 18 rules with the same confidence were found for the vertebral column dataset. Conclusion: The comparison showed that Data Mining algorithms can be used to support medical decision-making with good precision.
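As an illustration of the metrics named in the Methods, the following minimal Python sketch computes precision, F-measure and Cohen's Kappa from a binary confusion matrix, and the confidence of a single association rule over a list of transactions. It is not the authors' implementation: the function names, the confusion-matrix counts and the toy transactions are hypothetical and serve only to show how each quantity is defined. The ROC curve is omitted because it requires per-instance scores rather than counts.

```python
from typing import FrozenSet, List


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (F1)."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Agreement between predictions and labels, corrected for chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of both classes.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


def rule_confidence(
    transactions: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> float:
    """Confidence of antecedent -> consequent: support(A and C) / support(A)."""
    covered = [t for t in transactions if antecedent <= t]
    if not covered:
        return 0.0
    return sum(1 for t in covered if consequent <= t) / len(covered)


# Hypothetical counts and transactions, for illustration only.
TP, FP, FN, TN = 40, 10, 5, 45
TOY_TRANSACTIONS = [
    frozenset({"a", "b"}),
    frozenset({"a", "b"}),
    frozenset({"a"}),
    frozenset({"b"}),
]

if __name__ == "__main__":
    print("precision:", precision(TP, FP))
    print("F-measure:", f_measure(TP, FP, FN))
    print("kappa:", cohen_kappa(TP, FP, FN, TN))
    conf = rule_confidence(TOY_TRANSACTIONS, frozenset({"a"}), frozenset({"b"}))
    # A rule would pass the 90% threshold used in the study only if conf > 0.9.
    print("confidence(a -> b):", conf, "kept:", conf > 0.9)
```

Under this definition, a rule is retained at the 90% threshold only when more than nine in ten of the transactions that match its antecedent also match its consequent.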


Keywords: Data Mining; Classification; Association


Journal of Health Informatics - ISSN 2175-4411
Rua Tenente Gomes Ribeiro, 57 - sala 33 CEP 04038-040 São Paulo - SP - Brasil
Tel./Fax: + 55 11 3791 3343 - E-mail: