1 Introduction
Amphibians are declining locally and globally for reasons that act at local and global scales [1–4]. Both the evidence for the declines and our understanding of the demography, population dynamics, and distributional changes that underlie amphibian population declines are largely based on incomplete count data that are used as index values for demography, population size, or distributions. Unfortunately, valid inference based on index values is only possible under the restrictive assumption that the detection probability is constant. Often it is impossible to detect all individuals, populations, or species; that is, the probability of detection, p, is <1 and possibly variable among observers, time periods, and other important factors. A simple formula can be used to describe the relationship between an index (C) and the true value (N):
\[ C = Np \]
I will use the simple formula C=Np to explain why index values are generally unreliable for understanding the demography, dynamics, distributions, and declines of amphibians. This is the first goal of this paper. The second goal is to discuss how to estimate the probability of detection and how to adjust count data for detection probabilities that are less than one and variable. The subject is not novel; it has been discussed for many years in the wildlife management literature (I cite only a few key references [7,8,10,13–16] in the hope that they serve as an entry point to the relevant literature). Nevertheless, amphibian ecologists seem to be largely unaware of the pitfalls of unadjusted count data (but see [17–20]), and therefore it seems worthwhile to discuss the issue. Some examples may serve to illustrate how pervasive the use of unadjusted count data is in amphibian ecology. Houlahan et al. [4] compiled 936 time series of amphibian populations, of which roughly 95% were based on unadjusted count data (index values). The demography of amphibians is also largely based on unadjusted count data (return rates; [21,22]). To my knowledge, there is only one study of the distribution of amphibians that takes the probability of detection into account [23].
2 Why are unadjusted count data of limited value?
I will provide some examples of unadjusted count data for demography, population size, and distributions to introduce the problem. The first use of unadjusted count data is to “estimate” demographic parameters such as survival probabilities. For example, a researcher may mark 65 frogs in the first year of a study and recapture 12 marked frogs in the second year. Then, an often used “minimum estimate” of survival, also known as the return rate, is 12/65=0.184.
Unadjusted count data are often used to describe the size of a population by, e.g., counting the number of frogs detected on a transect, counting the number of frogs heard calling at a pond, or the number of newts counted in a pond at night. This is then an index of population size.
Enumerating the number of populations in an area is also an example of the use of count data that are not adjusted for the detection probability, p. One may survey a large area and detect frogs in 7 out of the 34 wetlands visited. The belief is then that this reflects the true distribution. In all three examples (demography, population size, and distribution) there is an obvious problem that is nevertheless often overlooked. Some individuals or populations may have been missed; that is, the probability of detection, p, is unknown, very likely <1, and variable among sampling occasions. Thus, count data that are not adjusted for the probability of detection can provide no more than a minimum estimate of unknown quality. The quality is unknown because it is not known how many individuals or populations were missed (p can take any value between 0 and 1). In the example given above the return rate was 0.184. The true value for the survival probability could be anywhere between 0.184 and 1. Such a rough measure of “survival” is clearly not useful for population viability analyses [24], sensitivity analyses [25], or for other purposes, such as the assessment of the effect of toe clipping on anuran survival (field studies of the latter always analyse return rates rather than survival probabilities; e.g., [26]).
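The width of this range can be made concrete with a short sketch. It assumes, as is common in capture–recapture theory but not stated above, that the expected return rate is the product of the survival probability and the recapture probability p. The function name and the values of p are hypothetical and serve only as an illustration.

```python
def implied_survival(marked, recaptured, p):
    """Survival probability implied by a return rate if the recapture probability were p.

    Illustrative assumption: return rate = survival * p.
    """
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    return_rate = recaptured / marked
    if p < return_rate:
        # A recapture probability below the return rate would imply survival > 1.
        raise ValueError("p cannot be smaller than the return rate")
    return return_rate / p


marked, recaptured = 65, 12  # numbers from the example in the text
for p in (1.0, 0.8, 0.5, 0.3, 0.2):  # hypothetical recapture probabilities
    survival = implied_survival(marked, recaptured, p)
    print(f"p = {p:.1f} -> implied survival = {survival:.3f}")
```

The same return rate is compatible with very different survival probabilities, depending entirely on a value of p that the unadjusted count does not reveal.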
The problem gets even worse when one attempts to compare two unadjusted counts (indices). For example, one may wish to compare past and current population size or past and current distribution of a species. As shown above, a count C is the product of the true value of a population parameter N and a detection probability, p. A comparison of two unadjusted counts (the estimation of spatial variation or a temporal trend), C1 and C2, is therefore given by
\[ \frac{C_1}{C_2} = \frac{N_1 p_1}{N_2 p_2} \]
The ratio of the counts equals the ratio of the true values, N1/N2, only if p1=p2.
Count data that are not adjusted for p are problematic. N and variation in N is of biological interest, but p and variation in p is sampling variation that is of no biological interest (except in special cases when the probability of detection is related to the probability of breeding [29]). Variation in p can generate variation in C when there is actually no variation in N. This makes the interpretation of unadjusted count data difficult or impossible and sampling variation can mask ecological patterns or generate spurious relationships [11,21,30–32]. Disentangling sampling and biological variation is essential to distinguishing between natural and anthropogenic population fluctuations [33].
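A minimal numerical sketch may help to show how variation in p alone produces an apparent trend. It uses only the relationship C=Np, taking the expected count as N times p. All numbers and names are hypothetical.

```python
def expected_count(n, p):
    """Expected count C when each of N individuals is detected with probability p."""
    return n * p


n_past, n_now = 200, 200  # hypothetical: the population has not changed
p_past, p_now = 0.6, 0.3  # hypothetical: detection probability has halved

c_past = expected_count(n_past, p_past)
c_now = expected_count(n_now, p_now)

true_ratio = n_now / n_past   # the quantity of biological interest
index_ratio = c_now / c_past  # what the unadjusted counts suggest

print(f"true change in N:     {true_ratio:.2f}")
print(f"apparent change in C: {index_ratio:.2f}")
```

In this sketch the counts suggest that the population has halved although N is unchanged. The reverse can also happen: a real decline in N can be hidden by a simultaneous increase in p.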
3 Adjusting count data for the probability of detection
No ecologist assumes that all individuals in a population can be caught or that all populations of a species in an area can be found. There are essentially two ways of dealing with this problem. Some ecologists would assume that the resulting bias is small and ignore the problem. The other common solution to variable p<1 is the use of standard methods (e.g., [34]). The often tacit goal of standardization is to make p constant in time and space, such that C becomes a reliable index of N.
Standard methods should be used whenever possible. However, standard methods often cannot guarantee that p is constant in both time and space. Even when effort, for example, is constant and standardized, there are many other variables, such as weather, that affect p and cannot be standardized [10,13]. Indeed, the very focus of a study, e.g. the size of the population under study, may affect p when calling activity or some other behaviour is a function of density. In the natterjack toad, Bufo calamita, for example, there are calling males and satellite males, and the decision to call depends on density [35]. Thus, anybody who uses standard methods should provide some evidence that p is indeed constant. Even books that describe standard methods [34] usually do not provide such evidence. Even when p is constant, and C is assumed to be a reliable index of N, it remains to be shown that there is a positive and linear relationship between C and N; that is, the index has to be calibrated [7]. Often there is no clear relationship between C and N [12].
In my opinion, capture–recapture and distance sampling methods are the only reliable methods for the analysis of demography, population dynamics, and distribution of amphibians because they explicitly deal with detection probabilities that are variable and <1. Capture–recapture methods are available for the estimation of demographic rates, population sizes, species diversity, and the analysis of distributions [16,23,36–41]. Capture–recapture and distance sampling methods estimate p and this estimate, p̂, is then used to obtain an estimate of N, N̂ (i.e., p̂ is used to adjust C):
\[ \hat{N} = \frac{C}{\hat{p}} \]
I will use the Lincoln estimator to illustrate the principle of how the probability of detection is estimated and the count data are adjusted. The Lincoln estimator can be used to estimate the size of a closed population and the number of species in a closed community [7]. As with all capture–recapture methods, a population or site must be visited more than once. The Lincoln estimator requires that a site is visited twice. Suppose that two observers each visited the same site once. The first observer recorded species A and B. The second observer recorded species B and C. Both observers recorded two species, and it is clear that there are at least three species and that both observers missed at least one species. The Lincoln estimator can be used to estimate the true number of species at the site. The number of species detected in the first visit is n1=2, and the number of species detected during the second visit is n2=2. The number of species seen during both visits is m=1. The total number of species is estimated as
\[ \hat{N} = \frac{n_1 n_2}{m} = \frac{2 \times 2}{1} = 4 \]
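The calculation for the two-observer example can be sketched in a few lines of Python. The function name is hypothetical. The sketch uses the basic Lincoln estimator, the product of n1 and n2 divided by m, without any small-sample correction, and it also returns m/n2 as the estimated detection probability of the first visit.

```python
def lincoln_estimate(first_visit, second_visit):
    """Basic Lincoln estimate of the total number of species (or individuals).

    Returns (n_hat, p_hat), where n_hat = n1 * n2 / m and p_hat = m / n2
    is the estimated detection probability during the first visit.
    """
    first, second = set(first_visit), set(second_visit)
    n1, n2 = len(first), len(second)
    m = len(first & second)  # detected on both visits
    if m == 0:
        raise ValueError("no overlap between visits; the estimator is undefined")
    n_hat = n1 * n2 / m
    p_hat = m / n2
    return n_hat, p_hat


# The example from the text: observer 1 records A and B, observer 2 records B and C.
n_hat, p_hat = lincoln_estimate({"A", "B"}, {"B", "C"})
print(f"estimated number of species: {n_hat}")
print(f"estimated detection probability (first visit): {p_hat}")
```

For the example, the estimate is four species with an estimated detection probability of 0.5. Dividing the first count, n1=2, by this estimated detection probability gives the same four species, which is the adjustment of a count by its estimated detection probability.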
Capture–recapture methods are sometimes considered not useful because they are too time-consuming and cost too much (e.g., [42]; see [9] for suggestions on how to design cost-effective monitoring programs). When individuals have to be marked in order to estimate capture probabilities, it may be easier and cheaper to collect unadjusted count data (but see [18,43]), but such data are also of lesser value for both research and conservation. For example, it may not even be possible to assign populations to size classes (e.g., small, medium, large; [18,44]). However, in the case of estimating the number of species at a site, a capture–recapture analysis does not cost more, because sites are usually visited more than once. Thus, the raw data for a capture–recapture analysis are collected anyway. The same is probably true when the distribution of a species in an area is the focus of the study [23]. With repeated visits to several sites it is also possible to infer the absence of a species from a site [45].
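One simple way to think about inferring absence from repeated visits is sketched below. It is an illustration under strong assumptions, not necessarily the method of [45]: the per-visit detection probability p is taken as known and constant, and visits are treated as independent, so that a species that is present is missed on all k visits with probability (1 − p)^k. The function names and the threshold alpha are hypothetical.

```python
def prob_missed(p, visits):
    """Probability that a species that is present goes undetected on every visit."""
    return (1 - p) ** visits


def visits_needed(p, alpha=0.05):
    """Smallest number of visits for which prob_missed(p, visits) <= alpha."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in (0, 1)")
    visits = 1
    while prob_missed(p, visits) > alpha:
        visits += 1
    return visits


for p in (0.8, 0.5, 0.2):  # hypothetical per-visit detection probabilities
    k = visits_needed(p)
    print(f"p = {p:.1f}: {k} visits, P(missed on all) = {prob_missed(p, k):.3f}")
```

The lower the detection probability, the more visits without a detection are needed before absence becomes a reasonable conclusion.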
I would also like to briefly discuss another, often overlooked advantage of capture–recapture methods beyond the adjustment for p and the provision of standard errors. Unadjusted count data obtained using different standard methods are sometimes only weakly correlated. Hyde and Simons [19,20] used several standard methods to count terrestrial salamanders. The correlations between the counts obtained using different methods were low (the largest correlation coefficient was +0.43). Thus, with unadjusted counts one always has to use the same standard method, e.g. bottle traps to capture newts. With capture–recapture methods, there is no such constraint. In one year, one may use bottle traps to capture newts, and in the next year one may use dipnets. Capture–recapture estimates of population parameters will be the same for both methods (assuming that the methods have no sampling bias).
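The following sketch illustrates why the choice of method matters for raw counts but not for adjusted estimates. It assumes a fixed population size, two methods with different detection probabilities, counts equal to their expected values, and estimates of p that equal the true values. All of these numbers are hypothetical, and the adjustment is simply the count divided by the estimated detection probability.

```python
def adjusted_estimate(count, p_hat):
    """Estimate of N obtained by dividing a count by the estimated detection probability."""
    if not 0 < p_hat <= 1:
        raise ValueError("p_hat must lie in (0, 1]")
    return count / p_hat


true_n = 150  # hypothetical population size, identical in both years
methods = {"bottle traps": 0.40, "dipnets": 0.16}  # hypothetical detection probabilities

for name, p in methods.items():
    count = true_n * p  # expected count; real counts would vary around this value
    estimate = adjusted_estimate(count, p)  # assumes p is estimated without error
    print(f"{name}: count = {count:.0f}, adjusted estimate = {estimate:.0f}")
```

The raw counts differ by more than a factor of two between the methods, whereas the adjusted estimates agree. In practice the estimate of p carries sampling error, so the two estimates would agree only within their standard errors.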
4 Discussion
Count data that are not adjusted for the probability of detection are usually poor data and they should be replaced with capture–recapture or distance sampling estimates (and their associated standard errors). Unadjusted count data are not unique to amphibian population studies. The famous cycles of the Canadian fur bearers (lynx, hare, muskrat, etc.) are based on unadjusted counts [46,47], most fisheries data are counts [48], and a vast majority of the bird population data are counts that are not adjusted [15]. Whereas we can extract a lot of information from unadjusted count data (see the review by Bjørnstad and Grenfell [48]), dealing with sampling variation (variation in p) is a major challenge for the future [27,48]. Unadjusted count data can generate variation when there is actually no variation and this variation can hide ecological relationships or generate spurious ones [11,21,30–32] whereas capture–recapture estimators account for sampling variation and are unbiased if assumptions are met [36].
Amphibian population data are usually unadjusted count data and this could be one of the reasons why we do not understand global amphibian declines very well. Clearly, some species, such as the golden toad Bufo periglenes, are extinct rather than underground and undetected for decades [49]. This conclusion seems safe, even though it is based on unadjusted count data. This does not mean that unadjusted count data can be recommended because for other declining species we would like to detect trends in distribution, population size, or demographic rates before they go extinct. For example, several recent studies suggest that variation in terrestrial juvenile survival has the largest effect of all life history stages on amphibian population growth rate [25,50,51]. Changes in terrestrial juvenile survival that may lead to population declines could be subtle: a 15% reduction in terrestrial survival could reduce population growth rate by 19% whereas a 46% reduction in embryo survival is required to lead to a similar decline in population growth rate [25]. Subtle changes are unlikely to be detected using count data that are not adjusted for the probability of detection. Estimates of population parameters based on capture–recapture analyses are sufficiently sensitive (e.g., [24]) such that we can predict population declines and conservation action can hopefully reverse the negative trends.
Acknowledgements
I thank D.R. Anderson, M. Kéry, and R.D. Semlitsch for comments on the manuscript and the Schweizerischer Nationalfonds for support (grant no. 31-55426.98 to J. Van Buskirk).