Statistical Learning Procedures for Monitoring Regulatory Compliance: an Application to Fisheries Data
- 25 January 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Journal of the Royal Statistical Society Series A: Statistics in Society
- Vol. 170 (3), 671-689
- https://doi.org/10.1111/j.1467-985x.2006.00460.x
Abstract
Summary: As a special case of statistical learning, ensemble methods are well suited for the analysis of opportunistically collected data that involve many weak and sometimes specialized predictors, especially when subject-matter knowledge favours inductive approaches. We analyse data on the incidental mortality of dolphins in the purse-seine fishery for tuna in the eastern Pacific Ocean. The goal is to identify those rare purse-seine sets for which incidental mortality would be expected but none was reported. The ensemble method random forests is used to classify sets according to whether mortality was (response 1) or was not (response 0) reported. To identify questionable reporting practice, we construct ‘residuals’ as the difference between the categorical response (0,1) and the proportion of trees in the forest that classify a given set as having mortality. Two uses of these residuals to identify suspicious data are illustrated. This approach shows promise as a means of identifying suspect data gathered for environmental monitoring.
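The residual construction described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the per-tree votes are hypothetical toy values rather than output from a fitted forest, the function names `vote_residuals` and `flag_suspect` are invented for illustration, and the cut-off of -0.75 is an arbitrary assumption. In practice the votes might come from out-of-bag predictions of a random forest, which is also an assumption here.

```python
from typing import List


def vote_residuals(responses: List[int], tree_votes: List[List[int]]) -> List[float]:
    """Residual = reported response (0 or 1) minus the share of trees voting 'mortality'."""
    residuals = []
    for y, votes in zip(responses, tree_votes):
        proportion = sum(votes) / len(votes)  # fraction of trees predicting class 1
        residuals.append(y - proportion)
    return residuals


def flag_suspect(responses: List[int], residuals: List[float],
                 threshold: float = -0.75) -> List[int]:
    """Indices of sets with no reported mortality but a strongly negative residual.

    The threshold is a hypothetical choice, not a value taken from the article.
    """
    return [i for i, (y, r) in enumerate(zip(responses, residuals))
            if y == 0 and r <= threshold]


# Hypothetical toy data: four purse-seine sets, four trees each.
responses = [1, 0, 0, 1]          # 1 = mortality reported, 0 = none reported
tree_votes = [
    [1, 1, 1, 0],                 # most trees agree with the report
    [0, 0, 1, 0],                 # most trees agree with the report
    [1, 1, 1, 1],                 # every tree expects mortality, none reported
    [0, 1, 0, 0],                 # mortality reported although few trees expect it
]

residuals = vote_residuals(responses, tree_votes)
suspect = flag_suspect(responses, residuals)
print(residuals)
print(suspect)
```

A residual near -1 marks a set with no reported mortality that nearly every tree classifies as having mortality, which is the kind of record the summary calls suspicious.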