Human demography and epidemiology cover much common ground: the occurrence and causes of diseases, with the possibility of a fatal outcome being of particular importance in demography, and reproduction, with its pathological aspects being of particular importance in epidemiology. The methods used in these two disciplines also have a number of points in common: the use of life tables (in the broad sense of the word), standardisation, multivariate analysis models (e.g., logistic regression) and considerations of selection and interaction phenomena. However, these two disciplines ignored each other for many years. In some cases, this was simply a question of vocabulary: demographers carry out epidemiological studies without realising it and epidemiologists are demographers but do not know it. Sometimes, the two disciplines pose the same questions, but provide different answers: in this case, each has much to gain by comparing the two approaches. In other situations, demographers and epidemiologists take genuinely different viewpoints, justified by different historical traditions and cultures. Here, we examine the state of play in the domain of fertility.
1 Studies of fertility; the demographer's approach
Ever since the first studies of populations (the term demography only appeared much later, in the middle of the 19th century [1]), fertility¹ has occupied an important position in the field. One of the essential aims of such studies has always been to estimate present and future population growth rates. The population growth rate at a given time is calculated as the difference between the birth and death rates (annual numbers of births and deaths divided by the size of the population), plus or minus the net migration rate (net number of migrants divided by the size of the population, the sign depending on whether there has been more immigration or emigration). The future growth rate depends, obviously, on fertility conditions and changes in death rates from year to year, but previous changes continue to make their effects felt for very long periods. Alfred Lotka, in the 1920s and 1930s, was the first to define the relationships between fertility, mortality, growth rate, and age structure of the population, in certain conditions of stability and in the absence of migration [2,3]. He showed, in particular, that the intrinsic growth rate (also known as the Lotka rate) depends only on current fertility and mortality rates. This intrinsic growth rate is the level towards which the population will converge if there is no further change in fertility and mortality rates, regardless of the initial age structure of the population.
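The dependence of the intrinsic growth rate on fertility and mortality alone can be illustrated with a small numerical sketch. It assumes the standard discrete form of the Euler–Lotka equation, which is not written out in this article: the intrinsic rate r is the value for which the sum over ages x of phi(x)·exp(−r·x) equals 1, where phi(x) is the net maternity at age x (probability of surviving to age x multiplied by the rate of daughters born at that age). The function name, the bisection bounds and the schedule below are hypothetical and serve only as an illustration.

```python
import math

def intrinsic_growth_rate(ages, net_maternity, lo=-0.1, hi=0.1, tol=1e-10):
    """Solve sum(phi_x * exp(-r * x)) = 1 for r by bisection.

    ages          -- ages x at which daughters are born (assumed inputs)
    net_maternity -- phi_x = survival to age x times daughters per woman at x
    lo, hi        -- assumed bracket for r (per year)
    """
    def excess(r):
        return sum(phi * math.exp(-r * x) for x, phi in zip(ages, net_maternity)) - 1.0

    # excess(r) decreases as r increases, so a root needs excess(lo) >= 0 >= excess(hi)
    if excess(lo) < 0 or excess(hi) > 0:
        raise ValueError("root not bracketed; widen lo/hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical schedule: 1.1 daughters per woman in total, net of mortality
ages = [22, 26, 30, 34]
phi = [0.20, 0.40, 0.35, 0.15]
r = intrinsic_growth_rate(ages, phi)
print(f"intrinsic growth rate: {100 * r:.2f}% per year")
```

The initial age structure of the population does not appear anywhere in the calculation, which is the point made by Lotka: only the fertility and mortality schedules enter.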
Unlike death, which is a single, certain event in every individual's life, fertility is a repeatable (it is possible to have one, two, or even 10 children) and optional (some individuals have no children) event. It is also possible to ask individuals about their wishes concerning fertility and about the means (contraception, abortion, or fertility treatments) they use to control their fertility; it is, however, not possible to ask individuals about the causes of their deaths after they have died and it is entirely pointless to ask them about this subject beforehand – almost everyone hopes to delay this event for as long as possible. This may well account for the greater importance given to fertility than to mortality by demographers. The study of fertility begins with a strictly demographic approach, involving the description of the arrival of children as a function of parental (maternal or paternal) age, marital status, number of children already born and the time elapsed since the last birth; this constitutes the basis of demographic analysis, a speciality of the French School (Pressat [4]; Henry [5]). Such analysis may be complemented by observations (through specific surveys) of the contraceptive methods used and the aspirations of couples or individuals: number of children desired, desired spacing, and ideal age for beginning one's reproductive life. These studies can also be pursued by demonstrating differences according to diverse socio-economic characteristics (e.g., level of education, profession, economic status), these differences having possible effects on all the above-listed variables.
The question of how to link current fertility measurements to future population growth was rapidly posed. Alfred Lotka provided the response in the case of constant fertility and mortality rates. His intrinsic growth rate – the limit value for growth rate – reflects the growth potential of the population, regardless of its current effective growth rate. In France, for example, the natural growth rate, excluding migration, was in 1975. In the same year, the Lotka rate was . The comparison of mortality and fertility rates for that year showed that there were too few births to compensate for the number of deaths in the absence of a change in behaviour. This also happens if the age structure of the population in a year includes a large proportion of women of reproductive age. However, the situation becomes more complicated if one of the parameters is not constant, particularly if the variable parameter is fertility. The effects of variable fertility are particularly strong because not only can the final number of children change, but so can the parental age at which those children are born, as the reproductive process may extend over several tens of years within a single generation. Let us return for a moment to a ‘stable’ population, in the sense meant by Lotka (constant fertility and mortality rates), in which fertility rates are such that 100 women from one generation are replaced by 140 women arriving at maternal age in the next. Let us assume that the mean age at first delivery is 28 years. The (female) population would therefore increase by 40% every 28 years, giving a growth rate of 1.2% per year. If we now assume that the mean age at first delivery is 34 years rather than 28, the growth rate falls to 40% every 34 years, or 1.0% per year. Thus, even if the number of children per parent (lineage) in successive generations is unchanged, the population growth rate still varies with the timing of births.²
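The arithmetic of this example can be checked with a short sketch, assuming simple compound growth between generations. The function name is ours; the figures (a replacement ratio of 1.4 and mean ages of 28 and 34 years) are those of the text.

```python
def annual_growth_rate(replacement_ratio, generation_length_years):
    """Annualised growth rate when each generation is `replacement_ratio`
    times the size of the previous one and generations are
    `generation_length_years` apart (compound growth)."""
    return replacement_ratio ** (1.0 / generation_length_years) - 1.0

for mean_age in (28, 34):
    rate = annual_growth_rate(1.4, mean_age)
    print(f"mean age {mean_age}: {100 * rate:.1f}% per year")
# mean age 28: 1.2% per year
# mean age 34: 1.0% per year
```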
Demographers, like politicians, are often impatient. They cannot wait for a generation to finish constituting its lineage (towards the age of 50 for women) to measure fertility or, more precisely, its ‘intensity’. They therefore prefer to make use of annual data, to which all generations of childbearing age contribute (women between the ages of 15 and 49 years). It is easy to construct, for year A, a measure of fertility intensity equivalent to the final lineage of a generation: this requires the simple summation of fertility rates by age for year A. The result is expressed as a number of children per woman and is known as the total fertility rate (TFR). Each of the 35 one-year age groups concerned contributes its behaviour for a given year of age, and the result does not therefore reflect the overall behaviour of any one of these real cohorts. Thus, when we follow changes in this indicator over successive years, the results obtained may differ considerably from the final lineages of the generations reproducing during the corresponding period. The real problem is that when the TFR decreases, we do not know whether this decrease results from a decline in final fertility over successive generations, or whether it simply reflects a change in the timing of births (as is the case, for example, with the current tendency to have children later in France).
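The construction of the TFR can be sketched in a few lines. The age-specific rates below are entirely hypothetical (a triangular schedule peaking at age 28) and the function name is ours; only the principle, a sum over the 35 single years of age from 15 to 49 for one calendar year, comes from the text.

```python
def total_fertility_rate(rates_by_age):
    """Sum of age-specific fertility rates (births per woman per year of age)
    observed in a single calendar year, for ages 15 to 49."""
    return sum(rates_by_age[age] for age in range(15, 50))

# Hypothetical rates for year A: a triangular schedule peaking at age 28
rates_year_a = {age: max(0.0, 0.14 - 0.01 * abs(age - 28)) for age in range(15, 50)}
print(f"TFR for year A: {total_fertility_rate(rates_year_a):.2f} children per woman")
```

Each rate in the sum comes from a different birth cohort, which is why the result describes a fictitious cohort rather than any real one.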
A simple example will make this clearer. Let us assume that, in year A, all the women of childbearing age decide to avoid becoming pregnant, for whatever reason. All the indicators of fertility for year A (birth rate, TFR, fertility rates by age, etc.), with a lag time of nine months to take into account the mean duration of pregnancy, will therefore be zero. Nonetheless, in a context of low fertility, in which couples wish to have two children on average, for example, these couples have time to make up for the lost year and to end up with about the number of children they wish to have. The final lineage of successive generations may therefore bear no trace of the very strong disturbance registered in year A. It is highly unlikely that such an extreme situation would ever occur, but we have seen, in recent years, sudden changes in fertility due to an abrupt, temporary change in the behaviour of couples. In Japan, 1966 was the astrological year ‘Hinoeuma’ or ‘Fire Horse’, a conjunction occurring once every 60 years. Girls born under this sign are reputed to be very difficult to marry as they will ‘destroy their husband’ in one way or another. As it was difficult to control the sex of the children to be born, many Japanese couples preferred not to have children in that year. The fertility rate fell to 1.58 children per woman, from 2.14 in 1965. It rose again, to 2.23, in 1967. This represents a temporary decrease of 26%, greater than has ever been observed outside periods of serious political or social crisis [6]. For similar reasons, but in the opposite direction, a spectacular increase in fertility was observed in the Chinese population of Korea during a year of the Dragon [7]. In both examples, the final numbers of children of the generations bear no trace of these abrupt variations in the fertility calendar.
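The thought experiment of year A can be reproduced with a deliberately crude sketch. We assume, purely for illustration, that every woman plans exactly one birth at age 25 and one at age 28, that all cohorts have the same size, and that any birth planned for year A is made up one year later; the nine-month lag is ignored. All names and values are hypothetical.

```python
from collections import defaultdict

PLANNED_AGES = (25, 28)  # hypothetical: every woman plans one birth at each age

def simulate(year_a, first_cohort, last_cohort):
    """Return (period_tfr, completed_fertility) when all births planned for
    `year_a` are postponed by one year. Cohorts are indexed by birth year."""
    period_tfr = defaultdict(float)           # calendar year -> children per woman
    completed_fertility = defaultdict(float)  # birth cohort -> children per woman
    for cohort in range(first_cohort, last_cohort + 1):
        for age in PLANNED_AGES:
            year = cohort + age
            if year == year_a:
                year += 1                     # birth made up one year later
            period_tfr[year] += 1.0
            completed_fertility[cohort] += 1.0
    return dict(period_tfr), dict(completed_fertility)

period, cohorts = simulate(year_a=2000, first_cohort=1960, last_cohort=1990)
for year in (1999, 2000, 2001, 2002):
    print(year, period.get(year, 0.0))
print(set(cohorts.values()))
```

In this toy setting the period indicator drops to zero in year A and doubles the following year, whereas every cohort ends with exactly two children: a pure tempo effect with no quantum effect.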
Considerations of tempo effects (these effects relating to the timing of births) and quantum effects (relating to fertility intensity) are particularly important to demographers, due to the consequences of these effects for future population growth. By contrast, these issues are of little or no interest to epidemiologists.
2 Studies of fecundity
As fecundity refers to the ability to reproduce, it is necessarily much more difficult to measure than fertility, which is calculated from the number of births observed. It was again in the 1920s that a demographer and statistician, Corrado Gini, developed the concept of fecundability, defined as the likelihood of conceiving during a normal menstrual cycle with sexual relations and no contraception [8,9]. Gini looked for explanations for the decreasing fertility and natural growth in European populations observed at the time, considering that it might result from a decrease in the reproductive capacities of these populations for biological reasons. This notion – fecundability – is important, but insufficient to describe fecundity fully. We will return to this concept in the next section.
From the 1950s onwards, efforts were made to analyse the process of human reproduction in its entirety, by searching for and identifying its principal components. Biological data, conditioning the ability to achieve the desired level of fertility, were identified as important. In addition to fecundability, as defined above, scientists wanted to determine the age at which couples become definitively sterile. This issue is of particular importance today, when couples are increasingly delaying having children and run the risk of not being able to have the children they wish to have as they are no longer fertile. It is difficult to estimate the proportion of couples who are definitively sterile (except in cases of medical sterilization) as a function of the age of the woman. In most cases, estimates are indirect, based on populations not practising birth control, such as historical populations. The degree to which these estimates can be applied to current populations is debatable, and there are two major arguments, one for and the other against. General health conditions have greatly improved over time, and the risks of infection after delivery, for example, are now much lower than they once were. Thus, some of the pathological causes of sterility (other than those linked simply to the ageing process) have become less important, decreasing the risk of early sterility. In contrast, one of the major causes of pathological sterility, sexually transmitted diseases, has certainly become more widespread, due to the greater sexual freedom of both men and women today. In practice, the proportion of couples remaining childless at the age of 50 years, having married at an age of 20 to 25 years, was low in French populations under the ‘Ancien Régime’ (17th and 18th centuries: 3 to 5% [10]), and is currently even lower in populations of countries that have maintained a high fertility rate, particularly in Africa and Asia (often less than 3% [11,12]). 
Such low rates suggest that poor hygiene conditions do not always result in high sterility rates.
It is also possible to estimate infertility from data obtained in surveys in which women are asked about the exact conditions of their exposure to the risk of an unwanted pregnancy and their desire to have children. We should first rule out the possibility of asking people directly whether they consider themselves sterile. As sterility is not a disease and, in most cases, has no visible symptoms, very few couples would be in a position to declare themselves sterile. Even the menopause is only a very late indicator of the end of a woman's reproductive life, and is becoming increasingly difficult to detect due to hormonal treatments administered in the perimenopausal period. Only couples actively seeking to have children are in a position to report possible difficulties, and these difficulties may relate to a state of subfecundity rather than total sterility.
This is why we sometimes construct indicators based on the information available about the degree of exposure to the risk of pregnancy in the years preceding a survey. We thus measure perceived infertility, as declared by the women and men interviewed, in the same way that epidemiologists evaluate the proportion of the population ‘in good health’ or construct indicators of life expectancy without disability [13]. A series of surveys giving this type of indicator is available for two countries: France and the United States. In France, for example, representative samples of women have been questioned about whether it had taken longer than they would have liked to conceive (but they managed to conceive eventually) or whether they had tried to conceive but had failed. Three surveys carried out by INED in 1978, 1988, and 1994 can be considered. Surprisingly, the data show almost no effect of age on difficulties that could not be overcome (Table 1) and a decrease after the age of 30 years for difficulties that were overcome (Table 2). We might have expected women interviewed at 40 to 44 years of age to declare more such problems than women interviewed at the age of 30 to 34 years do. In fact, most of the declared problems seemed to have been encountered early during the constitution of the family, at the same age in all cohorts. Conversely, we observe an increase in declared problems over successive surveys: given the short interval separating these surveys, this pattern of change is unlikely to be real. We have shown that couples have become increasingly impatient over the last 20 years, accepting failure and delayed conception less and less willingly [14]. As a result, it is the youngest women (from the most recent generations) who declare the most problems.
Table 1. Proportions (%) of women declaring at least one failure to conceive (per 100 women trying to conceive), as a function of age at the time of interview
Age at survey | 1978 | 1988 | 1994 |
25–29 years | 3.1 | 6.2 | 9.4 |
30–34 years | 4.6 | 6.1 | 13.9 |
35–39 years | 1.8 | 6.3 | 12.7 |
40–44 years | 4.6 | 6.4 | 11.2 |
45–49 years | – | 6.1 | – |
25–44 years | 3.6 | 6.3 | 11.9 |
Table 2. Proportions (%) of women declaring problems conceiving (per 100 women trying to conceive), as a function of age at the time of interview
Age at survey | 1978 | 1988 | 1994 |
25–29 years | 14.7 | 29.4 | 14.5 |
30–34 years | 16.2 | 26.8 | 26.2 |
35–39 years | 14.4 | 24.9 | 31.4 |
40–44 years | 12.9 | 18.4 | 20.1 |
45–49 years | – | 14.2 | – |
25–44 years | 14.6 | 24.8 | 23.3 |
In the United States, data from the NSFG (National Survey of Family Growth) surveys have been used to calculate the number of ‘infertile couples’ [15]. The infertile couples identified were married, neither partner had undergone medical sterilisation, and they had used no method of contraception over the last 12 months but had not conceived. The tendency of this indicator to decrease over several decades has caused some astonishment and triggered a recent debate [16]: although calculated more objectively than the other indicator described above, it is nonetheless based on declarations by individuals concerning their behaviour, calling its pertinence into question.
3 Fecundability and time to pregnancy
The notion of fecundability has been used by demographers since the term was first coined in the 1920s. It is directly linked to the concept of time to pregnancy (TTP), which is generally used to measure it. Ideally, the first month of exposure to the risk of pregnancy should be sufficient to measure the mean fecundability of the group concerned, but several factors render such estimates unreliable. Firstly, as demography generally deals only with live births, demographers initially limited their estimates to effective fecundability – the probability of a pregnancy leading to a live birth. Given the duration of pregnancy, with most births occurring in the ninth month, it is at this time point that effective fecundability is estimated. In reality, variability in the duration of pregnancy leads to a slight underestimation of the true rate of conception.
It is also very difficult to constitute a cohort in which the first month of exposure to the risk of pregnancy can be identified with certainty. Fecundability has often been estimated from the start of marriage, assuming that there are almost no conceptions before marriage and that exposure to the risk of pregnancy begins at the time of marriage. In practice, some of the newlyweds may have had sexual relations before the marriage, exposing them to the risk of premarital conception. Such premarital conceptions would result in births in the first few months of the marriage and would reduce the number of births in the ninth month. It is possible to correct the estimate obtained by ignoring all pregnancies resulting from premarital conception, but only at the expense of a probable bias: the most fertile of the couples having sexual intercourse before marriage are the most likely to conceive first. By excluding these couples, we increase the risk of underestimating the fecundability of the group.
These issues were first discussed by Christopher Tietze, an American doctor and demographer, in two pioneering articles published in 1950 and 1956 [17,18]. In these articles, Tietze reported (mostly previously unpublished) data from surveys carried out in the United States between 1942 and 1953. He described the distribution of time periods over which women had conceived a live-born child since the start of exposure to the risk of pregnancy (marriage or cessation of contraception). These distributions presented a frequently observed distortion: there were too many conceptions in the first few months with respect to the number in subsequent months, suggesting that the couples tended to declare that they had “not had to wait at all”, even if the time taken to conceive was actually one or two months (unless these conceptions were actually accidental, but were subsequently declared as planned). Although these articles broke new ground in the way in which the conditions for analysing TTP were discussed, they did not deal with any of the factors potentially influencing TTP, not even maternal age. Instead, Tietze stated: “Little is known about the interrelationships of time required for conception and such factors as age, duration of marriage, number of children borne, and so forth” [18 (p. 94)].
We therefore prefer to base estimates of fecundability on the entire distribution of births (or conceptions). Analysis of this distribution also made it possible to demonstrate an essential point: all populations are heterogeneous in terms of the likelihood of conception. The probability of conception decreases month after month, at such a rate that it cannot be due to a decrease in reproductive capacity or in the frequency of sexual relations. The only plausible explanation is that the observed decrease in the likelihood of conception is due to the gradual selection of the least fertile couples, as fertile couples conceive first. This observation is far from surprising, particularly for epidemiologists investigating the risk factors responsible for interindividual variation in TTP.
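The selection mechanism described above can be illustrated with a minimal sketch in which the population is, hypothetically, a mixture of groups of couples, each with its own constant monthly fecundability. Nothing changes within any group over time; the observed monthly probability of conception among couples still waiting nevertheless falls, simply because the more fertile couples leave the pool first. The group shares and fecundability values are assumptions chosen for illustration.

```python
def monthly_conception_rates(groups, months):
    """Observed probability of conception in each month among couples who
    have not yet conceived.

    groups -- list of (initial share, constant monthly fecundability)
    """
    remaining = [share for share, _ in groups]
    rates = []
    for _ in range(months):
        waiting = sum(remaining)
        conceptions = sum(r * p for r, (_, p) in zip(remaining, groups))
        rates.append(conceptions / waiting)
        # Couples who did not conceive this month stay in the pool
        remaining = [r * (1.0 - p) for r, (_, p) in zip(remaining, groups)]
    return rates

# Hypothetical mixture: half the couples at 0.40 per month, half at 0.10
mixed = monthly_conception_rates([(0.5, 0.40), (0.5, 0.10)], months=12)
print([round(rate, 3) for rate in mixed])

# A homogeneous population with the same initial mean shows no decline
flat = monthly_conception_rates([(1.0, 0.25)], months=12)
print([round(rate, 3) for rate in flat])
```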
4 The two points of view
Both demographers and epidemiologists work at the scale of populations or particular groups. However, their approaches differ in several key ways. Firstly, demographers are principally interested in measuring the level of a particular variable, such as mean fecundability, in a population, and in the distribution of values for this variable around the mean. They consider these measures interesting in themselves, providing useful information about the populations studied. These measures are also required as input data for more general models, such as those for analysing the entire reproductive history of a generation. These data would also be of potential value to epidemiologists, if epidemiologists did not have the unfortunate tendency to consider ‘descriptive’ epidemiology as making little real contribution to scientific progress. This is why the first good measures of the likelihood of conception in a given month, like those for spontaneous abortion or becoming sterile, were developed largely by demographers.
Epidemiologists prefer to focus on the factors accounting for differences between individuals or populations. Indeed, certain epidemiological methods cannot be used to calculate mean levels: instead, they provide only an indication of the difference between groups (relative risk and odds ratio). Why is it that demographers, who are perfectly aware of this variability and seek to estimate its magnitude, do not also consider the causes of this variation? The most likely reason is that the data they routinely handle provide no information whatsoever about these factors. When we speak of fertility ‘factors’, demographers think about age (maternal and paternal), parity, duration of marriage and time since last pregnancy. These variables are generally available to demographers but, with the exception of age, are not considered ‘explanatory’ by epidemiologists. Instead, epidemiologists try to consider particular types of exposure or medical histories that might provide a biological explanation for the observed variation.
For epidemiologists, it is also important for the variables of interest, such as exposure, to have some meaning at the individual level. The aim is to be able to tell a couple with well-defined characteristics (e.g., age) by how much TTP is likely to be increased if, for example, the man smokes or if the woman has had a gynaecological infection. Demographers do not have such concerns: they can therefore use indirect estimation methods to evaluate indicators that cannot be measured at individual level. Fecundability provides a good example. We cannot determine the fecundability of a given couple (it would require a large set of TTP data for the couple, which is unrealistic if the couple only wants two or three children), but it is possible to estimate the distribution of fecundability in large groups of couples. The age at which definitive sterility is acquired is another example. It is not possible to predict this for a given couple and there is no biological indicator demonstrating whether definitive sterility has been reached, but we can construct a distribution of the age at which definitive sterility is reached in particular populations. This is almost certainly why all other disciplines largely ignore the notion of “the age of acquisition of definitive sterility” routinely used by demographers studying reproduction.
It is almost certainly also the reason why TTP is the principal working tool of epidemiologists in the domain of reproduction, whereas, starting from the same measure, demographers prefer to consider the level of fecundability, a notion that appears pointlessly ‘reconstructed’ in the eyes of their epidemiologist colleagues.
If we come back to the history of measurements of fecundability, we see that the paths of demographers and epidemiologists diverged in the 1980s. It was an Italian demographer, C. Gini, who defined TTP and first drew attention to this concept [8,9]. In the 1920s and 1930s, Raymond Pearl [19,20], an American doctor and biologist, had sought to identify the various components of fertility, with the aim of distinguishing between ‘natural’ and ‘controlled’ fertility. The Pearl index measures the likelihood of conception by combining diverse durations of exposure for a group of couples. Pearl had the merit of creating links between biologists, demographers, and epidemiologists. The work of C. Tietze, particularly in his articles published in 1950 and 1956, subsequently also became a common reference for these different groups of scientists. In 1964, the American demographers R. Potter and M. Parker [21], as well as M. Sheps [22] and the Frenchman L. Henry [23], suggested simultaneously that the variability in fecundability between couples could be modelled using a Pearson-I (or Beta) distribution. The rationale behind this proposal was largely practical, as this function is particularly suitable for progressive selection processes. If the parameters of the initial distribution are $a$ and $b$, after $n$ months in the absence of conception, the distribution of fecundability among couples who have not yet conceived is still of the beta type, with the parameters $a$ and $b + n$. The mean, in particular, is very easy to calculate, initially as $a/(a+b)$ and then after $n$ months as $a/(a+b+n)$. The same function has also been used for estimating the heterogeneity of intra-uterine mortality [24].³
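The practical convenience of the beta distribution can be seen in a short sketch. It relies on the standard property that a Beta(a, b) distribution of fecundability, restricted to couples with n months without conception, becomes a Beta(a, b + n) distribution with mean a/(a + b + n). The parameter values (a = 3, b = 9, giving an initial mean of 0.25) and the function names are hypothetical and are not estimates from any of the studies cited.

```python
def mean_fecundability(a, b, months_without_conception=0):
    """Mean of the Beta(a, b + n) distribution describing couples who have
    not conceived after n months, starting from Beta(a, b)."""
    return a / (a + b + months_without_conception)

def proportion_still_waiting(a, b, months):
    """Proportion of couples with no conception after `months` cycles:
    the product of (1 - mean fecundability) over successive months."""
    proportion = 1.0
    for n in range(months):
        proportion *= 1.0 - mean_fecundability(a, b, n)
    return proportion

a, b = 3.0, 9.0  # hypothetical parameters
for n in (0, 6, 12, 24):
    print(n, round(mean_fecundability(a, b, n), 3),
          round(proportion_still_waiting(a, b, n), 3))
```

With these assumed values, the mean fecundability of couples still waiting after 12 months is half its initial level, without any change in the fecundability of individual couples.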
These initial articles were followed by a large body of methodological work, e.g., [10,25–30]. Models have been developed more recently for the simultaneous estimation of fecundability and sterility [31–34] and of fecundability and its variation with age [35].
However, objections to this approach have been raised, based on the argument that the beta function is not appropriate, for at least two reasons. Firstly, W. James [36] suggested that the beta function is inappropriate when the frequency and distribution of sexual relations throughout the cycle are included in the calculation of fecundability. An entire area of demographic research on reproduction [37–39] has been devoted to the integration of such data, which are probably a major determinant of fecundability (in the absence of data, we generally assume that couples have ‘normal’ activity, with no further details). Some demographers and epidemiologists have also explored this path as a means of breaking down the process leading to conception into successive steps: ovulation, exposure of the egg to the possibility of fertilisation, abortion, or implantation [40–42]. Secondly, Wood and Weinstein [43] suggested that the use of the beta function to describe fecundability might fail to account correctly for certain biological factors that might explain differences in fecundity. However, this did not stop Wood concluding a few years later that no better approach had yet been found [10].
Consequently, doctors and epidemiologists preferred to use Kaplan–Meier estimates, with no specific hypotheses. During the 1960s and 1970s, many studies aimed to measure pregnancy rates after fertility treatment. A large number of articles on this subject were published in specialist journals, but no particular attention was paid to the methodology used. In the 1980s, questions were raised about the selection bias involved when studying couples undergoing treatment and the need to consider spontaneous pregnancy rates in the absence of or during treatment. These considerations led researchers to question the validity of the success rates published for ovarian stimulation methods [44,45]. Attention then gradually moved towards studies of untreated populations, to identify factors controlling the differences in fertility between couples, with the aid of survival curves. The statistical basis of these approaches was well established [46], so debate focused principally on data quality [47]: recall bias [48], selection bias [49,50], and sampling methods [51,52]. The question of the role of the age of the woman and the man is still relevant (the article by Schwartz and Mayaux, in 1982 [53], triggered vigorous debate), but the study of environmental factors is becoming increasingly important, with the article by Baird et al. often considered a reference in this area [54–56].
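As an illustration of this distribution-free approach, the following sketch computes a Kaplan–Meier curve for TTP data by hand, without any statistical library. The event is conception and the curve gives the estimated proportion of couples not yet pregnant after each month; couples lost to follow-up or still trying at the end of observation are treated as censored. The function name and the six observations are hypothetical.

```python
def kaplan_meier(durations, conceived):
    """Kaplan-Meier estimate of the proportion of couples not yet pregnant.

    durations -- months of follow-up for each couple
    conceived -- True if follow-up ended with a conception, False if censored
    Returns a list of (month, estimated proportion still waiting).
    """
    event_months = sorted({d for d, e in zip(durations, conceived) if e})
    still_waiting = 1.0
    curve = []
    for month in event_months:
        # Couples under observation and not yet pregnant at the start of `month`
        at_risk = sum(1 for d in durations if d >= month)
        events = sum(1 for d, e in zip(durations, conceived) if d == month and e)
        still_waiting *= 1.0 - events / at_risk
        curve.append((month, still_waiting))
    return curve

# Hypothetical follow-up of six couples (months, conceived?)
durations = [1, 2, 2, 3, 6, 6]
conceived = [True, True, False, True, True, False]
for month, proportion in kaplan_meier(durations, conceived):
    print(month, round(proportion, 3))
```

No distribution of fecundability is assumed here, which is what distinguishes this approach from the beta model; the price is that the curve describes the group as a whole and says nothing directly about the underlying heterogeneity.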
This interest in environmental factors has been triggered by fears about a fall in reproductive capacity, particularly among men, due to environmental degradation. It is very difficult to demonstrate trends over time, due to the uncertainties involved in measurement and variability in observation methods; consequently, the creation of ‘observatories’ using robust methods and repeat surveys has recently been recommended [51,57,58]. Fecundity is thus becoming an increasingly important area in epidemiology.
The good news is that demographers and epidemiologists are now moving closer together.
1 It should be noted that, for demographers, fertility concerns the observation of (live) births, as registered by the administration, for example, whereas fecundity describes an ability to procreate, which varies from individual to individual. It should also be noted that the French translate fertility as fécondité and fecundity as fertilité. We will consider this distinction further below.
2 More precisely, we assume that survival rates at 34 years in the second generation are identical to survival rates at 28 years in the first generation, which is plausible. In any case, the difference between the survival rates at these two ages is very small.
3 In fact, L. Henry had suggested using this function in a previous article (1961). In the appendix of the article in 1964, he proposed using the same mathematical law to estimate the dispersal of intrauterine mortality data, by analysing the risk of miscarriage in successive pregnancies. We generalised this approach in 1976.