Abstract
There is a growing worldwide need to predict algal blooms in lakes and reservoirs in order to better manage water quality. We applied random forest, a machine learning algorithm, with a sliding window strategy to forecast chlorophyll-a concentrations in the freshwater Urayama Reservoir and the saline waters of Lake Shinji. Both water bodies are located in Japan and have historical water records spanning more than ten years. The random forest model allowed us to forecast trends in the chlorophyll-a time series of both water bodies. For the reservoir, we modeled the data from two sampling stations separately. The best model settings, namely the min-leaf value and whether predictors were pre-selected, differed between stations within the same reservoir, as did the best achievable lead time and prediction accuracy. For the lake, the best combination of min-leaf value and predictor pre-selection differed from that found for the reservoir. Finally, the most influential predictors for the random forest model in the two water bodies were identified as biochemical oxygen demand (BOD), chemical oxygen demand (COD), pH, and the ratio of total nitrogen to total phosphorus (TN/TP).
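As an illustration of the sliding window strategy mentioned above, the following minimal sketch generates training windows and forecast targets for a given lead time. It is not the authors' implementation: the function name `sliding_windows`, the parameters `window` and `lead`, and the one-step advance of the window are all assumptions made for illustration, since the abstract does not specify the windowing scheme. If a library such as scikit-learn were used for the model itself, the min-leaf setting would roughly correspond to the `min_samples_leaf` parameter of `RandomForestRegressor`.

```python
from typing import Iterator, List, Tuple


def sliding_windows(
    n_samples: int, window: int, lead: int
) -> Iterator[Tuple[List[int], int]]:
    """Yield (train_indices, target_index) pairs for a time series.

    Hypothetical scheme: each model is fitted on `window` consecutive
    samples and forecasts the sample `lead` steps after the window ends.
    """
    if window < 1 or lead < 1:
        raise ValueError("window and lead must be positive")
    start = 0
    # Stop once the forecast target would fall outside the series.
    while start + window + lead - 1 < n_samples:
        train = list(range(start, start + window))
        target = start + window + lead - 1
        yield train, target
        start += 1  # assumed: the window advances one sample at a time


# Six samples, a window of three, and a lead time of two samples.
splits = list(sliding_windows(n_samples=6, window=3, lead=2))
```

In this toy case the generator yields two splits: samples 0 to 2 forecast sample 4, and samples 1 to 3 forecast sample 5. A longer lead time leaves fewer usable windows, which is one reason lead time and accuracy trade off against each other.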