1 Introduction
In the introduction to “The origin of species”, published in 1859, Charles Darwin described at length his observations on the domestication of both plants and animals and how these inspired, to some extent, his theory of natural selection. It is particularly noteworthy that, although he was not aware of the contemporary work of G. Mendel on the laws of heredity, C. Darwin observed that the ancestral traits of domesticates often reappear in the progenies of crosses between distinct breeds. This observation led him to propose that domesticates originated from their wild relatives through human selection. Approximately 150 years later, this theory has undergone considerable changes, in particular in view of the spectacular advances in the field of genetics. Nevertheless, there is no doubt that it has contributed to a large extent to the development of modern agriculture and to the sustainability of global food security. Over the past ten years, the development of new technologies for the study of plant and animal genomes has yielded considerable amounts of DNA sequence data. For several organisms, the complete genome sequence has been made publicly available. In plants, this is the case for Arabidopsis thaliana [1], rice [2], poplar [3], and grapevine [4]. The completion of the rice genomic sequence, in particular, has been considered a milestone in agricultural research because it gave access for the first time to the complete gene repertoire of a crop species. For many other economically important cultivated plants, increasing amounts of genomic resources are available; these facilitate the development of molecular markers that can be used to construct dense molecular maps and to discover the genetic factors controlling important agronomic traits. All these recent advances in plant genomics have improved our knowledge of the process of plant domestication. 
On the one hand, several genes involved in the phenotypic changes associated with the selection of cultivated types have been characterized [5]. On the other hand, the development of molecular markers has made it possible to trace the origin of their domestication [6]. This includes both the identification of the wild gene pool from which the proto-farmers selected the first cultivars and the geographical origin of the(se) event(s). In this article, we will give an overview of the recent advances that molecular genetics and genomics have made possible in the study of plant domestication. We will essentially focus on the genetic bases of domestication traits, mainly for cereals, since most of the data available concern this group of crops. We will then review the most recent reports concerning the exploitation of molecular data for the study of the origin of rice.
2 Unravelling the genetic bases of the domestication syndrome: The case of cereals
The domestication syndrome is defined as a suite of traits that allow the farmer to grow the crop in the field [7]. In the case of cereals, these traits include reduced seed dispersal (the loss of seed shattering at maturity being considered as a key trait enabling the harvest of the crop), changes in plant architecture (reduced number of tillers, both primary and secondary, reduction of the seed envelope), increase in seed size and loss of dormancy. During the past ten years, several studies have focused on the characterization of the genetic factors involved in the domestication syndrome in cereals. In maize, early studies showed that only a few loci are involved in the dramatic changes that differentiate the cultivated form from its wild progenitor, teosinte [5]. Teosinte plants harbor many tillers, both basal and axillary, that give them a bushy appearance. Each of these tillers ends with an inflorescence bearing few seeds at maturity. In addition, teosinte seeds are completely covered with a thick glume, the cupulate fruitcase, the stony nature of which makes the processing of the kernel difficult. These morphological differences between the cultivated and the wild forms, mainly in the architecture of the plant and the inflorescences, are such that botanists initially placed teosinte in a different genus from maize (Euchlaena mexicana for the wild vs Zea mays for the cultivated form). It was then established that maize and teosinte are interfertile and thus should be given the same species name (i.e. Z. mays). Two genetic factors involved in the morphological differentiation associated with maize domestication, first identified as quantitative trait loci (QTLs) having a major effect in progenies of maize × teosinte crosses, have recently been cloned and characterized at the molecular level. Tb1 (teosinte branched1) controls the development of axillary branches in teosinte. 
A mutation at this locus led to the suppression of the development of axillary meristems, thus causing a switch from the “bushy” phenotype of teosinte to the maize phenotype, i.e. with the development of only basal tillers [8]. The second locus, Tga1 (teosinte glume architecture), controls the seed shape. A mutation at this locus is responsible for the loss of development of the glume in maize, making the processing of the seed possible [9]. This gene belongs to the SBP-domain family of transcriptional regulators. Although the causal relationships between the sequence variation at these two loci and the switch from the wild to the cultivated phenotype have not yet been definitively established, these studies clearly show that only a few genes may have been the target of selection during the process of domestication [10]. It should be mentioned that other QTLs with minor effects have been identified for the domestication traits in this crop. Whether these genetic factors were selected after domestication or concomitantly with it remains unclear. In any case, the combined effect of the Tb1 and Tga1 mutations on the wild form could alone account for a clear domesticated phenotype that would have allowed proto-farmers to grow and harvest the crop for their consumption. In addition to the maize studies, several recent reports have demonstrated that only a few genetic factors are involved in the domestication syndrome of other cereals. This is the case for the shattering genes qSH1 [11] and sh4 [12] in rice. These two genes are transcription regulators involved in the formation of an abscission layer in the rachis of the inflorescence. In the wild plant, the abscission layer is composed of thin-walled cells that weaken the panicle at maturity, thus causing the shattering of the seeds. Mutation of the genes involved in the formation of the abscission layer therefore prevents shattering and the loss of the grains before harvest. 
These recent studies provide, for the first time, molecular proof of the effect of human selection on a plant during its domestication several thousand years ago and a possible validation of the theory proposed by Darwin. In addition, these studies show that only a few loci may have been the target of selection during the process of domestication and that, consequently, wild and domesticated forms differ at the molecular level for only a few genes. Given that plant genomes, like those of other higher eukaryotes, harbour some 30 000–40 000 genes, one would expect these differences to represent a very small percentage of the overall genome. One consequence of this important result is that most of the molecular markers developed for any crop species should be useful in unravelling the history of the crop and, in particular, the origin of its domestication.
3 The theory of selection and modern plant breeding
Although domestication is undoubtedly the crucial step in the birth of agriculture, the diversity of cultivated plants that is found today (and was also observed by Darwin during his studies) can be regarded as the result of the breeding efforts of generations of farmers. In this regard, one should particularly emphasize that the selection of crops has been continuous in human history since the late neolithic [13]. Moreover, the development of genetics over the last 100 years and, more recently, advances in molecular biology and genomics have put the selection of crops in a totally new perspective, although the action of the plant breeder is based on the same concept as at the beginning of agriculture, i.e. the selection of the most favorable phenotypes. Human selection first switched from an unconscious action (e.g. the harvest of seeds from non-shattering plants increased the frequency of the non-shattering alleles in the cultivated population) to a conscious action, applying the concepts of genetics even before these were formulated by G. Mendel in 1866 [14]. It is indeed noteworthy that C. Darwin, in the introduction to “The origin of species”, describes the observations made on the offspring of various crosses between pigeon breeds (particularly in relation to the domestication of the bird) with the same detail and pertinence as G. Mendel did with garden peas. This conscious selection by the “modern” plant breeder targets one or a few genes that control the favorable traits, among the thousands that constitute a plant genome, and therefore requires a good knowledge of the genetic bases of the traits to be improved. The advances made in plant molecular biology make possible the identification of the genetic factors controlling most of the agronomically important traits, as mentioned previously in the case of the domestication genes of maize. 
In addition, progress in genetic mapping has provided breeders with efficient molecular markers that allow them to target a specific genomic region harbouring one or several of these genetic factors during a breeding scheme. In this regard, several innovative breeding approaches have been developed in order to exploit the reservoir of genetic diversity represented by the wild relatives of most of the major crops: because only a few genetic factors are responsible for the phenotypic switch between the wild and the cultivated forms, one could predict that introgressions from the wild progenitor into the cultivated plant should not affect the agronomic performance of the crop and may even, in some cases, contribute significantly to its improvement. This has been demonstrated in tomato [15] and rice [16]. The major consequence of this conceptual change is that selection (in the case of plant breeding) now acts on the genotype directly. More recently, the advances made in plant biotechnology have enabled breeders to go one step further in the selection of favorable genes by engineering them in vitro and introducing them into the plant by transformation. The new phenotypes arising from such biotechnological approaches are still the products of human selection, but the target of this selection process, i.e. the sequence of DNA, is now physically distinct from the plant itself. More importantly, these new alleles are not selected from individuals within the population, but created in vitro by man. One could therefore argue that this double shift in paradigm, from the population level to the molecular level and from selection to creation, has made the neo-Darwinian approach to plant breeding obsolete. 
This is not exactly the case yet, because the “classical” approach of cultivar development, exploiting the existing diversity of the cultivated species and of their wild relatives, is still predominantly applied by breeders throughout the world to face the ever-changing challenge of maintaining food security.
4 Understanding the origin of rice domestication using molecular data
Most of the economically important crops were domesticated during the late neolithic, i.e. 8000–10 000 years ago. This is the case for all the cereals, including rice. There are at least five centers of origin for cereals, spread over three continents: the Fertile Crescent in the Middle East for the Triticeae (wheat, barley and oat); Central America for maize; West Africa for pearl millet, sorghum and rice (the African species Oryza glaberrima); South Asia for rice (the Indica type of the Asian species Oryza sativa) and East Asia for rice (the Japonica type of O. sativa) and foxtail millet [17]. The origin of rice, i.e. the places where the domestication of the crop occurred, has long been debated and is still being discussed [18]. In this section, we will show how the molecular data currently available for the species have helped to clarify, to some extent, this important question.
Rice is cultivated on all five continents. It is one of the mainstays of global food security and the basic source of carbohydrates for billions of people in Asia, Africa and South America. Because of its success as a staple food crop worldwide, it is difficult to establish the geographical origin of its domestication event(s) [19]. There are two cultivated species of rice: Oryza sativa, referred to as the Asian rice (although it is cultivated worldwide), and Oryza glaberrima, exclusively found in Africa. It is known that these two species originated from two distinct domestication events. Oryza glaberrima was domesticated in West Africa, probably in the central delta region of the Niger river. The closest wild relative of O. glaberrima is O. barthii, a species endemic to West Africa that is often found as a weed in rice fields.
Although it is clearly established that the closest wild relative of the Asian species O. sativa is O. rufipogon, a species found exclusively throughout Asia, the actual origin of the domestication of the crop is not as well known as in the case of the African species. The main debate concerns the number of domestication centers. Today, there are two main cultivar types of O. sativa: the Japonica and the Indica types. The existence of these two types is very ancient, because both were described during the Han dynasty (206 BC – 220 AD, [20]). Several authors have thus proposed that these two forms may have originated from two distinct domestication events. However, the most ancient traces of rice cultivation have been found only in the Yangtze river basin in China [21], although there are a number of more recent archaeological records of rice cultivation throughout East Asia, including Korea and Taiwan. Moreover, archaeological evidence of ancient rice cultivation in South Asia (south of the Himalayas) is scarcer, because the tropical climate prevents good conservation of such ancient material. Consequently, if only archaeological records were taken into account, one could argue that there is only one center of domestication of rice in Asia, probably located in the Yangtze river basin. Other archaeological records also suggest an early spread of rice cultivation throughout Asia from this center of origin. If this were the case, then the two cultivar types, Japonica and Indica, would have differentiated after this unique domestication event, i.e. within the last 10 000 years.
There are therefore two contrasting hypotheses on the origin of Asian rice. The single domestication hypothesis posits that the crop was domesticated only once, in China, and that it differentiated through selection into the two main cultivar types known today, i.e. Japonica and Indica. The alternative hypothesis, hereafter referred to as the double domestication hypothesis, posits that the Japonica and Indica types originated from two distinct domestication events. Since traditional Japonica varieties are predominantly found in East Asia, while Indica varieties are predominantly found in South Asia, these two centers of domestication would be located north of the Himalayas for the Japonica type and south of the Himalayas for the Indica type.
Several studies on the diversity of rice (both the cultivated forms and the wild relatives) using molecular markers have been published. Glaszmann [22] conducted a comprehensive survey of O. sativa using isozyme markers. This study clearly showed the genetic differentiation between the Indica and Japonica types at the molecular level. This result has since been confirmed with many other molecular markers, such as RFLP [23], AFLP [24] and SSR [25]. It should be emphasized, however, that this genetic differentiation into two distinct gene pools is not in conflict with the single domestication hypothesis, because it could be the result of strong selection for the two distinct plant types after domestication. The main question remains unanswered: when did the radiation of the gene pool into Indica and Japonica occur?
Recently, some studies on the timing of the differentiation of the two types, based on the analysis of transposable elements, have made it possible to establish that the Indica and Japonica types arose from two distinct domestication events. Transposable elements are mobile genomic sequences found in most living organisms. They can be classified into two major classes (I and II). Class I elements, commonly called retrotransposons, transpose in the form of an mRNA, via a copy and paste mechanism. Class II elements, the transposons, transpose in the form of DNA and thus via a cut and paste mechanism. Class I elements are of particular interest as genetic markers for the study of the history of a crop. First, they insert randomly into the genome. Second, their insertion is not reversible (an element can only transpose by first being transcribed into an mRNA, which is in turn reverse-transcribed into DNA and can reinsert at another location in the genome, leaving the original copy in place). As a consequence, if two accessions harbor the same element at an orthologous position, one can conclude that they derive from a common ancestor which also carried the same element. Third, a particular type of retrotransposon, the LTR-retrotransposon, can be used to date the radiation events in a particular evolutionary lineage: upon insertion into the genome, the two LTRs (long terminal repeats) flanking the retrotransposon are strictly identical in sequence. These two sequences then diverge from each other over time. The level of divergence between the two LTRs of a given element can therefore be converted into an insertion date by applying a substitution rate (the concept of the molecular clock). This approach was named genomic palaeontology [26]. It was applied to rice in order to tentatively date the radiation between the Japonica and Indica gene pools. The authors estimated this date at 200 000 years, which is undoubtedly prior to the domestication (10 000 years ago). 
A similar study by Ma et al. [27] led to a similar conclusion, although their estimated date for the radiation between the gene pools was 400 000 years (because the authors used a different rate for the molecular clock). These two studies clearly show that there are at least two domestication centers for O. sativa in Asia.
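The dating logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in either cited study: it assumes the simple relation T = K / (2r), where K is the proportion of differing sites between the two aligned LTRs of one element and r is a substitution rate per site per year. The function names, the toy sequences and the two rate values are hypothetical and chosen only to show how the choice of rate changes the estimated date.

```python
def ltr_divergence(ltr5: str, ltr3: str) -> float:
    """Proportion of differing sites between two aligned LTR sequences.

    Sites where either sequence has an alignment gap ('-') are ignored.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(ltr5.upper(), ltr3.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites in the alignment")
    mismatches = sum(1 for a, b in pairs if a != b)
    return mismatches / len(pairs)


def insertion_age(divergence: float, rate: float) -> float:
    """Estimated years since insertion, assuming T = K / (2 * r).

    The factor 2 reflects that both LTRs accumulate substitutions
    independently after the insertion.
    """
    if rate <= 0:
        raise ValueError("substitution rate must be positive")
    return divergence / (2.0 * rate)


if __name__ == "__main__":
    # Toy data (hypothetical): a 100 bp LTR and a copy with one substitution.
    five_prime = "ACGTACGTAC" * 10
    three_prime = "T" + five_prime[1:]
    k = ltr_divergence(five_prime, three_prime)

    # Two illustrative rates (assumptions, substitutions per site per year).
    for rate in (1e-8, 2e-8):
        print(f"rate={rate:g}  estimated age={insertion_age(k, rate):,.0f} years")
```

Under this relation the estimated age is inversely proportional to the assumed rate, so halving the rate doubles the date for the same observed divergence. This is the kind of effect that can separate two estimates obtained with different clock rates.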
The next step in the study of rice domestication is to unravel the location of these domestication centers. This could be achieved by looking for LTR-retrotransposon insertions that are shared between the cultivated form and its wild relative, for both the Indica and Japonica types. Some preliminary studies have shown that several insertions are shared between the Japonica type varieties and some Chinese accessions of the wild relative O. rufipogon, which is in agreement with the hypothesis of a center of domestication in the Yangtze river basin (Ishii, pers. comm.). In the case of the Indica gene pool, the first screen of O. rufipogon accessions from the southern hills of the Himalayas did not provide conclusive results. Several authors have suggested that the domestication of Indica-type rice was a diffuse process extending from Nepal to Thailand. If this is the case, then it would be difficult or even impossible to locate the first event of domestication in this region of Asia.
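The comparison proposed above amounts to intersecting sets of insertion sites. The following Python sketch illustrates the idea under the assumption that each accession has already been scored for the presence of insertions at orthologous positions. All accession names and insertion identifiers are invented for illustration and do not come from any dataset.

```python
from typing import Dict, List, Set, Tuple


def shared_insertions(
    cultivar_sites: Set[str], wild_sites: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """For each wild accession, return the insertion sites it shares with the cultivar group."""
    return {name: cultivar_sites & sites for name, sites in wild_sites.items()}


def rank_by_shared(shared: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Wild accessions ordered by decreasing number of shared insertions (ties by name)."""
    return sorted(
        ((name, len(sites)) for name, sites in shared.items()),
        key=lambda item: (-item[1], item[0]),
    )


if __name__ == "__main__":
    # Hypothetical presence data: insertion identifiers found in a cultivar group
    # and in two wild accessions.
    cultivar = {"ins01", "ins02", "ins03"}
    wild = {
        "wild_A": {"ins01", "ins02", "ins09"},
        "wild_B": {"ins07"},
    }
    for name, count in rank_by_shared(shared_insertions(cultivar, wild)):
        print(name, count)
```

Because a retrotransposon insertion is not reversible, a wild accession sharing many insertions with a cultivar group is a candidate close relative of its progenitor population. An accession sharing none is uninformative rather than evidence against a given region.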
5 Conclusion
A hundred and fifty years after the publication of “The origin of species”, in which Charles Darwin set out his theory of natural selection, that theory has developed into the synthetic theory of evolution, which is now widely accepted by the scientific community. In plant breeding, this theory has been applied over the last century to produce high-performing varieties of most crops. Today, biotechnologies have brought considerable changes in the way new cultivars can be developed. These involve the engineering in vitro of new alleles or genes. The modern plant breeder, heir of the first farmer who selected the domesticated forms 10 000 years ago, will have to face the challenge of maintaining global food security by integrating these new concepts with the more classical approaches of breeding, while preserving biodiversity, which remains the key factor for the sustainability of agrosystems. Finally, the increasing amounts of genomic resources made available by the sequencing consortia will in the future generate more information that could help us understand the history of our crops, of our agriculture and, in some ways, of mankind.