Regularized F‐Measure Maximization for Feature Selection and Classification
Open Access
- 1 January 2009
- journal article
- research article
- Published by Wiley in BioMed Research International
- Vol. 2009 (1), 617946
- https://doi.org/10.1155/2009/617946
Abstract
Receiver Operating Characteristic (ROC) analysis is a common tool for assessing the performance of various classification methods. It has gained much popularity in medicine and other fields, including the study of biological markers and diagnostic tests. This is due in particular to the fact that, in real-world problems, misclassification costs are not known, and thus the ROC curve and related utility functions such as the F-measure can be more meaningful performance measures. The F-measure combines recall and precision into a single global measure. In this paper, we propose a novel method based on regularized F-measure maximization. The proposed method assigns different costs to positive and negative samples and performs feature selection and prediction simultaneously with an L1 penalty. This method is especially useful when the data set is highly unbalanced, or when the labels for negative (positive) samples are missing. Our experiments with benchmark, methylation, and high-dimensional microarray data show that, in limited experiments, the performance of the proposed algorithm is better than or equivalent to that of other popular classifiers.
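As a point of reference for how the F-measure combines recall and precision, the following is a minimal sketch of the standard F-beta definition computed from binary labels. It is not the regularized maximization method proposed in the paper; the function name `f_measure`, the 0/1 label encoding, and the `beta` parameter are assumptions made for illustration.

```python
def f_measure(y_true, y_pred, beta=1.0):
    """Standard F-beta score for binary labels (1 = positive, 0 = negative).

    Illustrative helper only; name and signature are hypothetical.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Precision: fraction of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of actual positives that are recovered.
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Unbalanced toy data: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
all_negative = [0] * 10                          # trivial classifier
one_hit = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]         # 1 TP, 1 FP, 1 FN

score_trivial = f_measure(y_true, all_negative)
score_one_hit = f_measure(y_true, one_hit)
```

On this toy data the trivial all-negative classifier is 80% accurate yet has an F-measure of zero, which illustrates why the F-measure can be more informative than accuracy on highly unbalanced data.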