Visual data mining techniques for classification of diabetic patients
- 1 February 2013
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 1070-1075
- https://doi.org/10.1109/iadcc.2013.6514375
Abstract
Clustering is a data mining technique for finding important patterns in large, unorganized data collections. The likelihood approach to clustering is often used by researchers for classification because it is simple and easy to implement. It uses the Expectation-Maximization (EM) algorithm for sampling. The main focus of this research was the classification of diabetic patients. Patients were classified by applying data mining techniques to medical data from the Pima Indian Diabetes (PID) data set. The research was based on three techniques: the EM algorithm, h-means+ clustering, and a Genetic Algorithm (GA). These techniques were used to form clusters of patients with similar symptoms. Analysis of the results showed that h-means+ and the technique based on a double-crossover genetic process performed better in the comparison. The simulation tests for the three classification models were run in the WEKA software tool. The hypothesis that diabetes cases follow similar patterns in the PID data and in local hospital data was tested and supported, with a correlation coefficient of 0.96 between the two data sets. About 35% of the 768 test samples were found to indicate the presence of diabetes.
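As an illustration of the likelihood-based clustering the abstract refers to, the following is a minimal sketch of EM for a two-component, one-dimensional Gaussian mixture. It is not the authors' implementation (the abstract reports that the tests were run in WEKA); the function names, the initialization, and the synthetic readings standing in for a single clinical attribute are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns (weights, means, variances, responsibilities).
    """
    # Initialise the means at the extremes of the data (an arbitrary choice).
    means = [min(data), max(data)]
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    resp = []

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])

        # M-step: re-estimate weights, means and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # guard against a collapsing component

    return weights, means, variances, resp


# Synthetic stand-in data, NOT the PID data set: two groups of readings.
random.seed(0)
data = [random.gauss(100.0, 10.0) for _ in range(200)]
data += [random.gauss(160.0, 15.0) for _ in range(100)]

weights, means, variances, resp = em_two_gaussians(data)
# Hard cluster assignment: pick the component with the larger posterior.
labels = [0 if r[0] >= r[1] else 1 for r in resp]
print("weights:", [round(w, 2) for w in weights])
print("means:", [round(m, 1) for m in means])
```

Each sample receives a soft membership in both clusters, and a hard label is taken only at the end. A multivariate version over all PID attributes would follow the same E-step and M-step structure with covariance matrices in place of scalar variances.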