Diabetes Data Analysis and Prediction Model Discovery Using RapidMiner

1 December 2008

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 3 (21531447), 96-99
https://doi.org/10.1109/fgcn.2008.226

Abstract

Data mining techniques have been extensively applied in bioinformatics to analyze biomedical data. In this paper, we choose the Rapid-I's RapidMiner as our tool to analyze a Pima Indians Diabetes Data Set, which collects the information of patients with and without developing diabetes. The discussion follows the data mining process. The focus will be on the data preprocessing, including attribute identification and selection, outlier removal, data normalization and numerical discretization, visual data analysis, hidden relationships discovery, and a diabetes prediction model construction.
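The preprocessing steps named in the abstract (outlier removal, normalization, numerical discretization) can be illustrated with a minimal sketch. The paper performs these steps in RapidMiner; the plain-Python version below is not the authors' workflow. The function names, the z-score rule, the thresholds, and the sample values are all assumptions chosen for illustration and are not taken from the Pima Indians Diabetes Data Set.

```python
from statistics import mean, pstdev


def remove_outliers(values, z_max=3.0):
    """Drop values whose absolute z-score exceeds z_max (an assumed rule)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= z_max]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins=3):
    """Map each value to the index of one of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Made-up readings for one numeric attribute; 999 plays the role of an outlier.
readings = [85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 999]

cleaned = remove_outliers(readings, z_max=2.5)
normalized = min_max_normalize(cleaned)
bins = equal_width_bins(cleaned, n_bins=3)

print(cleaned)
print(normalized)
print(bins)
```

Outliers are removed first in this sketch because a single extreme value would otherwise compress every other value toward 0 under min-max scaling and distort the equal-width bin boundaries.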

This publication has 2 references indexed in Scilit:

Data Mining Methods and Models
Published by Wiley, 2005
An Exploratory Technique for Investigating Large Quantities of Categorical Data
Journal of the Royal Statistical Society Series C: Applied Statistics, 1980

Cited by 48 articles