Poisson point process models solve the “pseudo-absence problem” for presence-only data in ecology
Open Access
- 1 September 2010
- journal article
- Published by Institute of Mathematical Statistics in The Annals of Applied Statistics
- Vol. 4 (3), 1383-1402
- https://doi.org/10.1214/10-aoas331
Abstract
Presence-only data, point locations where a species has been recorded as being present, are often used in modeling the distribution of a species as a function of a set of explanatory variables, whether to map species occurrence, to understand its association with the environment, or to predict its response to environmental change. Currently, ecologists most commonly analyze presence-only data by adding randomly chosen “pseudo-absences” to the data such that it can be analyzed using logistic regression, an approach which has weaknesses in model specification, in interpretation, and in implementation. To address these issues, we propose Poisson point process modeling of the intensity of presences. We also derive a link between the proposed approach and logistic regression. Specifically, we show that as the number of pseudo-absences increases (in a regular or uniform random arrangement), logistic regression slope parameters and their standard errors converge to those of the corresponding Poisson point process model. We discuss the practical implications of these results. In particular, point process modeling offers a framework for choice of the number and location of pseudo-absences, both of which are currently chosen by ad hoc and sometimes ineffective methods in ecology, a point which we illustrate by example.
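The convergence result stated in the abstract can be illustrated numerically. The sketch below is not the authors' code. It assumes a one-dimensional study region [0, 1], a single covariate equal to location, a log-linear intensity with hypothetical parameters `TRUE_B0` and `TRUE_B1`, pseudo-absences on a regular grid, and a hand-written Newton solver. It fits a Poisson point process model by midpoint-rule quadrature and compares its slope with logistic regression slopes as the number of pseudo-absences grows.

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_B0, TRUE_B1 = 5.0, 2.0  # hypothetical log-linear intensity on [0, 1]

def simulate_presences():
    """Draw an inhomogeneous Poisson process on [0, 1] by thinning."""
    lam_max = np.exp(TRUE_B0 + TRUE_B1)  # intensity is largest at s = 1
    s = rng.uniform(0.0, 1.0, rng.poisson(lam_max))
    keep = rng.uniform(0.0, lam_max, s.size) < np.exp(TRUE_B0 + TRUE_B1 * s)
    return s[keep]

def design(s):
    """Design matrix with an intercept and the covariate x(s) = s."""
    return np.column_stack([np.ones_like(s), s])

def newton(beta, grad_info, iters=100, tol=1e-10):
    """Newton-Raphson given a function returning (score, information)."""
    for _ in range(iters):
        g, info = grad_info(beta)
        step = np.linalg.solve(info, g)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def fit_ppm(pres, n_quad=200_000):
    """Maximise sum_i eta(s_i) - integral of exp(eta), midpoint quadrature."""
    q = (np.arange(n_quad) + 0.5) / n_quad
    w = 1.0 / n_quad  # equal quadrature weights on a region of length 1
    Xp, Xq = design(pres), design(q)

    def grad_info(beta):
        mu = w * np.exp(Xq @ beta)
        return Xp.sum(axis=0) - Xq.T @ mu, (Xq * mu[:, None]).T @ Xq

    # Start from the homogeneous fit: intensity = number of points / length.
    return newton(np.array([np.log(pres.size), 0.0]), grad_info)

def fit_logistic(pres, n_pseudo):
    """Logistic regression of presences (1) against gridded pseudo-absences (0)."""
    q = (np.arange(n_pseudo) + 0.5) / n_pseudo
    X = design(np.concatenate([pres, q]))
    y = np.concatenate([np.ones(pres.size), np.zeros(n_pseudo)])

    def grad_info(beta):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        return X.T @ (y - p), (X * (p * (1.0 - p))[:, None]).T @ X

    return newton(np.zeros(2), grad_info)

pres = simulate_presences()
ppm_slope = fit_ppm(pres)[1]
sizes = [10, 100, 1_000, 10_000, 100_000]
gaps = [abs(fit_logistic(pres, m)[1] - ppm_slope) for m in sizes]
for m, gap in zip(sizes, gaps):
    print(f"{m:>7} pseudo-absences: |logistic slope - PPM slope| = {gap:.5f}")
```

Only the slopes are compared here. The logistic intercept depends on the number of pseudo-absences and is not directly comparable with the point process intercept. A regular grid is used so that the comparison is deterministic given the presences; the abstract notes that a uniform random arrangement gives the same limit.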