Identifying Health Status in Grazing Dairy Cows from Milk Mid-Infrared Spectroscopy by Using Machine Learning Methods

Abstract
The early detection of health problems in dairy cattle is crucial to reduce economic losses. Mid-infrared (MIR) spectrometry is already used in routine tests to determine the composition of cow milk, which makes it a potential tool for detecting diseases at an early stage. Partial least squares discriminant analysis (PLS-DA) has been widely applied to identify illnesses such as lameness from MIR spectrometry data. However, this method suffers from some limitations. In this study, a series of machine learning techniques (random forest, support vector machine, neural network (NN), convolutional neural network and ensemble models) were used to test the feasibility of identifying cow sickness from the MIR spectra of 1909 milk samples from Holstein-Friesian, Jersey and crossbred cows under grazing conditions. PLS-DA was also performed for comparison. Sick cow records were matched to milk samples using a time window of 21 days before to 7 days after the sample was analysed. NN showed a sensitivity of 61.74%, a specificity of 97% and a positive predictive value (PPV) of nearly 60%. Although the sensitivity of PLS-DA (65.6%) was slightly higher than that of NN, its specificity and PPV were lower (79.59% and 15.25%, respectively). This indicates that, by using NN, it is possible to identify a health problem with a reasonable level of accuracy.
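The three metrics reported above can be computed from a binary confusion matrix. The sketch below is a minimal illustration, not the study's evaluation code: the function names and the example counts are hypothetical and are not taken from the study's data. It also shows the standard relation between PPV, sensitivity, specificity and prevalence, which is one reason a classifier with lower specificity can have a much lower PPV when sick cows are rare.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and PPV from confusion-matrix counts.

    tp: sick cows predicted sick      fn: sick cows predicted healthy
    fp: healthy cows predicted sick   tn: healthy cows predicted healthy
    """
    sensitivity = tp / (tp + fn)   # share of sick cows that are detected
    specificity = tn / (tn + fp)   # share of healthy cows correctly cleared
    ppv = tp / (tp + fp)           # share of "sick" alerts that are correct
    return {"sensitivity": sensitivity, "specificity": specificity, "ppv": ppv}


def ppv_from_rates(sensitivity, specificity, prevalence):
    """PPV via Bayes' rule, given the prevalence of sick cows in the sample."""
    true_alerts = sensitivity * prevalence
    false_alerts = (1.0 - specificity) * (1.0 - prevalence)
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical counts chosen only for illustration (NOT the study's data).
metrics = classification_metrics(tp=30, fp=20, tn=180, fn=20)
prevalence = (30 + 20) / (30 + 20 + 180 + 20)  # sick cows / all cows
ppv_check = ppv_from_rates(metrics["sensitivity"], metrics["specificity"], prevalence)
print(metrics, ppv_check)
```

Both routes give the same PPV for the same data, so the second function can be used to explore how PPV would change at a different assumed prevalence.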