Application of the Random Forest model for chlorophyll-a forecasts in fresh and brackish water bodies in Japan, using multivariate long-term databases

7 November 2017

journal article
research article
Published by IWA Publishing in Journal of Hydroinformatics

Vol. 20 (1), 206-220
https://doi.org/10.2166/hydro.2017.010

Abstract

There is a growing worldwide need to predict algal blooms in lakes and reservoirs in order to better manage water quality. We applied the random forest model, a machine learning algorithm, with a sliding window strategy to forecast chlorophyll-a concentrations in the freshwater Urayama Reservoir and the saline waters of Lake Shinji. Both water bodies are situated in Japan and have historical records containing more than ten years of data. The random forest model allowed us to forecast trends in the chlorophyll-a time series of these two water bodies. For the reservoir, we used data from two sampling stations separately. We found that the best model settings, namely the min-leaf value and whether predictors were pre-selected, differed between stations in the same reservoir. The best-performing lead time and the prediction accuracy also varied between the two stations. For the lake, the best combination of min-leaf and predictor pre-selection differed from that of the reservoir. Finally, the most influential water quality parameters for the random forest model in the two water bodies were identified as BOD, COD, pH, and TN/TP.
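The abstract does not spell out how the sliding window is built, so the following is only a minimal sketch of one common reading: a rolling-origin scheme in which each forecast is trained on the most recent `window` predictor/target pairs whose targets have already been observed. The function name `sliding_window_splits` and the parameters `window` and `lead` are hypothetical and do not come from the paper.

```python
from typing import Iterator, List, Tuple

Pair = Tuple[int, int]  # (predictor_index, target_index)


def sliding_window_splits(
    n_samples: int, window: int, lead: int
) -> Iterator[Tuple[List[Pair], Pair]]:
    """Yield (training pairs, forecast pair) for a rolling-origin forecast.

    Every pair links predictors observed at index i to a chlorophyll-a
    target observed at index i + lead. Assumption: samples are equally
    spaced in time, so lead is counted in sampling intervals.
    """
    if window < 1 or lead < 1:
        raise ValueError("window and lead must be positive integers")
    # t is the index at which the forecast is issued.
    for t in range(window + lead - 1, n_samples - lead):
        first = t - lead - window + 1
        # Only pairs whose target is already known at time t are used.
        train = [(i, i + lead) for i in range(first, t - lead + 1)]
        yield train, (t, t + lead)


if __name__ == "__main__":
    for train, forecast in sliding_window_splits(n_samples=8, window=3, lead=2):
        print("train on", train, "-> forecast", forecast)
```

Each set of training pairs could then be passed to any random forest regressor. In scikit-learn, for example, `RandomForestRegressor` exposes a `min_samples_leaf` argument that plays a role similar to the min-leaf setting mentioned above, although the abstract does not say which software the authors used.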

