Machine Learning Techniques Accurately Classify Microbial Communities by Bacterial Vaginosis Characteristics
Open Access
- 3 February 2014
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 9 (2), e87830
- https://doi.org/10.1371/journal.pone.0087830
Abstract
Microbial communities are important to human health. Bacterial vaginosis (BV) is a disease associated with the vaginal microbiome. While the causes of BV are unknown, the microbial community in the vagina appears to play a role. We use three machine-learning techniques to classify microbial communities into BV categories: genetic programming (GP), random forests (RF), and logistic regression (LR). We evaluate the classification accuracy of each technique on two different datasets. We then deconstruct the classification models to identify important features of the microbial community. We find that the classification models produced by these techniques obtain accuracies above 90% for Nugent score BV and above 80% for Amsel criteria BV. While the classification models identify largely different sets of important features, the shared features often agree with past research.
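As a rough illustration of what evaluating classification accuracy can look like for one of the named techniques, the sketch below fits a small logistic regression and scores it with k-fold cross-validation. It is not the authors' implementation. The abstract does not state the evaluation protocol, so the use of k-fold cross-validation is an assumption, as are the synthetic two-taxon data, the hand-rolled gradient-descent trainer, and every function name (train_logistic, cross_validated_accuracy, make_synthetic_samples).

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(model, x):
    """Return 1 (BV-like, in this toy setup) when the predicted probability is >= 0.5."""
    w, b = model
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def cross_validated_accuracy(X, y, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool the results."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = train_logistic([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in fold)
    return correct / len(X)


def make_synthetic_samples(n=100, seed=1):
    """Hypothetical data: relative abundance of two taxa per sample.

    Label 0 samples are dominated by taxon A, label 1 samples by taxon B.
    This is made-up data for illustration, not data from the study.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        a = rng.uniform(0.6, 0.95) if label == 0 else rng.uniform(0.05, 0.4)
        X.append([a, 1.0 - a])
        y.append(label)
    return X, y


X, y = make_synthetic_samples()
accuracy = cross_validated_accuracy(X, y)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Because every sample is scored only by a model that never saw it during training, the pooled accuracy estimates performance on unseen communities rather than fit to the training data. The fitted weights can also be inspected afterwards, which is one simple way a model can be deconstructed to see which features drive the classification.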