Dangerous liaisons: human genetic adaptation to infectious agents

The study of the demographic and adaptive history of Homo sapiens has entered its golden age with the advent of genome-wide approaches. Analyses of genome diversity across human populations have allowed us to better understand the ways in which our species rapidly dispersed around the world, how our ancestors admixed with archaic, now-extinct hominins, and the effects of natural selection on the diversity of the human genome. This work has, in turn, increased our understanding of the genetic mechanisms by which humans have adapted to the wide range of environments they have encountered. These studies, combined with functional genomics approaches, have helped to identify genes and biological functions of key importance for host survival against pathogens and involved in the phenotypic variability of our species, including the risk of developing infectious, autoimmune and inflammatory diseases.


Introduction
Over the last decade, with the advent of whole-genome approaches and the ability to sequence DNA from fossil remains, we have learned a great deal about the demographic and migratory history of our species. According to genomic studies, anatomically modern humans appeared in Africa at least 200,000 years ago, then dispersed out of Africa about 40,000-80,000 years ago and rapidly expanded to Southeast Asia, Australia, Europe and East Asia [1]. Humans would then have reached more distant destinations, such as the Americas about 15,000-35,000 years ago, and the distant islands of Oceania, where they would have settled more recently, only over the last 1000-4000 years.
During their migrations across the globe, humans have been confronted with a wide range of climatic, nutritional and pathogenic conditions to which they have had to adapt [2][3][4]. Various studies have evaluated how natural selection has affected the diversity of the human genome and have consistently identified genes associated with immune functions and host defense against pathogens as recurrent targets of natural selection [4][5][6][7][8][9]. Dissecting the legacy of selection within the genome has been crucial for the identification of genes responsible for the great morphological and physiological diversity observed across human populations and for a better understanding of the genetic architecture of adaptive phenotypes [2,3,10].
Surprisingly, a growing body of evidence seems to indicate that admixture between our species and ancient hominids, such as the Neanderthals or Denisovans, has introduced advantageous immune variants into the genome of modern humans [11]. Thus, the study of the action of natural selection on immune genes and the search for the effects of genetic variants on transcriptional responses to immune activation, through the mapping of expression quantitative trait loci (eQTLs), have become highly informative approaches to identify key immune mechanisms of host defense and to better understand the different factors, genetic or environmental, underlying the risk of developing infectious, autoimmune or inflammatory diseases [4,12,13,14].

Natural selection and human adaptation to environmental pressures
Although natural selection acts on phenotypes, it is mainly the underlying genetic factors that are inherited, so genotypes are influenced by natural selection. Natural selection is subdivided into cases in which an allele is disadvantaged (purifying selection) or favoured (positive selection). Purifying selection removes deleterious alleles from the population and is the most common form of selection at the genome-wide scale [10,15]. Positive selection acts on advantageous mutations according to different evolutionary models. According to the classic sweep model, selection acts on a newly emerged advantageous mutation that will increase in frequency, whereas selection on standing variation involves positive selection of an allele that is already present in the population [10,15]. Polygenic adaptation, on the other hand, involves the simultaneous selection of variants over a large number of loci, each of which makes a small contribution to better adaptation [15]. Finally, adaptation can also be achieved by balancing selection, which preserves functional diversity through heterozygous advantage, frequency-dependent selection or pleiotropy [16]. Genome-wide scans of natural selection have identified several hundreds of genomic regions as candidates for positive selection, based on a number of molecular signatures such as the degree of population differentiation or haplotype homozygosity. Among the most iconic cases of positive selection are the lactase gene (LCT), responsible for lactose tolerance in adults [17]; genes associated with variation in skin pigmentation [18]; genes involved in human adaptation to different habitats, from the tropical forest to the hypoxic conditions of high-altitude regions [19][20][21][22]; and genes associated with immune response and resistance to infectious diseases [4][5][6][7][8][9][23].
Signals of local adaptation have also been found in populations exposed to extreme physiological conditions, such as the extreme cold of Arctic regions or repeated apnea (breath-hold diving) [24].
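The contrast between a classic sweep and selection on standing variation can be illustrated with a minimal deterministic sketch. It is not taken from the studies cited here: the function name, population size, selection coefficient and starting frequencies are hypothetical values chosen for illustration, and genetic drift, which strongly affects the fate of rare alleles in real populations, is ignored.

```python
def generations_to_frequency(p0, s, target=0.99, max_generations=100_000):
    """Count generations for an advantageous allele to rise from p0 to target.

    Assumes deterministic genic selection: carriers of the allele have
    relative fitness 1 + s, and drift is ignored (a simplifying assumption).
    """
    p = p0
    for generation in range(max_generations):
        if p >= target:
            return generation
        # Standard one-locus recursion for genic selection.
        p = p * (1 + s) / (1 + s * p)
    raise ValueError("target frequency not reached")


N = 10_000   # hypothetical diploid population size
s = 0.01     # hypothetical selection coefficient

# Classic sweep: the allele starts as a single new copy, frequency 1/(2N).
new_mutation = generations_to_frequency(1 / (2 * N), s)

# Standing variation: the allele is already segregating at 5% (assumed).
standing = generations_to_frequency(0.05, s)

print(new_mutation, standing)
```

Under these assumed parameters, the allele that is already present as standing variation reaches high frequency in substantially fewer generations than the newly emerged mutation, which is one reason why the two models are expected to leave different molecular signatures.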

Human immunology in the light of evolutionary genetics
Humans and microbes have long had a double-edged relationship. The two complement each other, as in the case of the microbiota, but some microorganisms are pathogenic and cause infectious diseases. Ancient diseases, such as malaria or tuberculosis, and more recent ones, such as the Black Death or the Spanish flu, have killed millions of people throughout history [6,7,25,26]. Mortality rates from infection remained very high until the late 19th and early 20th centuries, when hygiene conditions improved and vaccines and antibiotics began to appear. Infectious diseases were thus a major cause of mortality for a long time, and remain so for populations without access to therapeutics and vaccines, exerting strong selection pressure. John B. S. Haldane and Anthony Allison were the first to establish a causal link between infectious diseases and natural selection, suggesting that red blood cell disorders, such as thalassemia or sickle cell anemia, protect against malaria [27,28].

Essential and redundant immune functions
The study of the molecular signatures of natural selection in host genomic regions involved in the immune response, or of the absence of such signatures (neutrality), makes it possible to assess the biological relevance of genes in natura and to determine whether they are essential, redundant or adaptable. Genes evolving under strong purifying selection are those for which functional variability cannot be tolerated. These genes thus encode products with essential and non-redundant functions, and their mutations can lead to severe diseases [29]. This is the case for innate immunity genes encoding endosomal Toll-like receptors (TLRs) [30], most NALP members of the NOD-like receptor family [31], IFN-gamma [32] and many others [4,9]. Mutations in some of these genes, or in the immune pathways they trigger (e.g., TLR3-TRIF, TIR-MYD88, IFN-gamma, NALP3), have indeed been associated with severe phenotypes, such as herpes simplex virus 1 (HSV-1) encephalitis, pyogenic bacterial infections, Mendelian susceptibility to mycobacterial diseases or inflammatory diseases (for a complete review see [12]). Strong or complete redundancy can also be inferred from genetic data: genes that tolerate a complete loss of function are likely to have nonessential, redundant functions. Such cases of complete loss of gene function, or "human knockouts", have been reported for genes such as IFNA10, IFNE, MBL2 and TLR5, whose loss-of-function variants can reach very high frequencies in the general population, reflecting their high redundancy [9]. In rare cases, loss of function may even be beneficial, with variants increasing in frequency in the population owing to positive selection. This is the case for the CASP12, DARC and FUT2 genes, whose loss-of-function variants confer protection against sepsis, Plasmodium vivax malaria and norovirus infections, respectively [33,34].

Genetic adaptation to local pathogens
Over the past 60,000 years, as human populations dispersed around the globe, they had to adapt to the local pathogens they encountered. The list of genes and immune functions known to be subject to positive selection, following the classic sweep model, is constantly growing, with some signals being supported by functional evidence or epidemiological data. Cases of such local adaptation involve variants in genes associated with resistance to malaria in Africa and Asia (e.g., DARC, G6PD, GYPA, GYPB, GYPE) and Trypanosoma infection in Africa (APOL1), reduced NF-kB-mediated inflammatory responses in Europe (TLR10-TLR1-TLR6) and Africa (TLR5), type-III IFN-mediated antiviral responses in Europe and Asia, Vibrio cholerae infection in Bangladesh (NF-kB signaling pathway genes), and immunity to Lassa virus infection in Africa (LARGE and IL21) [4,6,9,35,36,37].
The immunology literature also provides examples of balancing selection. A classic case is the extraordinarily high diversity of the major histocompatibility complex in vertebrates (MHC, or HLA in humans), whose genetic variability has been conserved since remote ancestors in primates, but also in other mammals, birds and fish [38]. Likewise, it has recently been shown that the ABO blood group is a trans-species polymorphism, identical-by-descent in different primate species, including humans and gibbons [39]. Several other immune functions involving genes encoding membrane glycoproteins, such as GYPE, and innate immunity proteins, such as IGFBP7, show strong signals of trans-species polymorphism between humans and chimpanzees [40]. These studies collectively show that, although balancing selection remains a rare selective regime, it has shaped the diversity of a large number of genes associated with immune functions and host-pathogen interactions [16].

The role of admixture in genetic adaptation to pathogens
There is increasing evidence that genetic variants that are advantageous for host survival can also be acquired from other populations or species through admixture. This phenomenon is increasingly regarded as a source of adaptive alleles in several plant and animal species [41]. However, in humans, we have only recently begun to understand the extent to which admixture can be a form of genetic adaptation, whether with other hominids, such as Neanderthals, or between different human populations.

The adaptive nature of introgression from archaic hominids
Thanks to the ability to sequence ancient DNA from fossil remains, we now know that non-African populations carry about 1-3% of Neanderthal genetic material in their genomes, while some Oceanian populations carry up to 5% of Denisovan ancestry [42]. There is increasing evidence that, in most cases, archaic introgression has been counter-selected, leading to a depletion of archaic ancestry in specific genomic regions [43,44]. However, high levels of archaic ancestry in other regions of the human genome indicate either tolerance of archaic variants (variants evolving neutrally) or positive selection of these variants (archaic variants that improve modern human adaptation after admixture). Cases of adaptive introgression have been reported for genes associated with morphology, metabolism and adaptation to temperature, altitude, sunlight and infectious agents [45,46]. The first link between archaic introgression and immunity was documented for the HLA region [47], some haplotypes of which seem to have been acquired by admixture with Neanderthals or Denisovans. Since this pioneering work, several studies, including ours, have reported immune-related genes for which archaic introgression has been beneficial [29,43,45,48,49,50,51,52,53]. Some pertinent examples include the OAS1-3, STAT2, TLR6-1-10, and TNFAIP3 genes. Some archaic haplotypes can reach very high frequencies in today's populations, such as OAS1-3 in Europe (∼36%), TLR6-1-10 in East Asia (∼39%) or TNFAIP3 in Melanesians (∼60%) [45,46]. In addition, Neanderthal alleles found in individuals of European origin have been associated with the regulation of gene expression in macrophages and monocytes, particularly in the context of antiviral responses [50,52]. These studies collectively suggest that admixture of Homo sapiens with ancient hominids has been a source of advantageous alleles for host defense against pathogens, viruses in particular [49].

The evolutionary benefits of admixture between human populations
Unlike archaic introgression, the role of admixture between modern human populations in genetic adaptation, known as adaptive admixture, remains largely unexplored. It is within this framework that African populations have received special attention.
The history of the African continent has been marked by admixture events between different populations at different times [54]. We have shown, for example, that during their dispersal through the rainforest, Bantu-speaking farmers encountered populations of rainforest hunter-gatherers known as "pygmies", with whom they admixed and from whom they acquired beneficial alleles in the HLA region [55]. Furthermore, the adaptive nature of the Duffy-null allele, which is of African origin and protects against Plasmodium vivax malaria, is supported by other studies reporting high levels of African ancestry among admixed populations of Pakistan and Madagascar, where vivax malaria is endemic [56][57][58].
In the context of adaptive admixture, we have recently revisited the evolutionary history of the HbS mutation of the hemoglobin gene, a classic example of balancing selection in humans: heterozygotes are less vulnerable to malaria infection even though homozygotes develop sickle cell anemia [27]. We have shown that the HbS mutation first appeared in Africa among the ancestors of Bantu-speaking farmers more than 20,000 years ago, and was acquired by rainforest hunter-gatherers over the last 6000 years through adaptive admixture with farmers [59]. These findings suggest that the ancestors of today's farming populations may have been exposed to malaria earlier than previously thought [60]. All of these studies provide proof of concept that host adaptation to infectious agents can be accelerated by admixture, whether with archaic humans or between modern human populations.
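The heterozygote advantage that maintains an allele such as HbS can be made concrete with a textbook one-locus model. The sketch below is purely illustrative and is not derived from the studies cited here: the fitness costs `s` (for homozygotes lacking the protective allele) and `t` (for homozygotes carrying two copies) are hypothetical, as is the function name. Under this model, the allele is expected to settle at an intermediate frequency s / (s + t) instead of being lost or fixed.

```python
def next_frequency(q, s, t):
    """One generation of viability selection on the frequency q of allele S.

    Assumed relative fitnesses (hypothetical values, random mating):
    AA = 1 - s, AS = 1, SS = 1 - t.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1.0 - s) + 2.0 * p * q + q * q * (1.0 - t)
    # Frequency of S among surviving gametes: half of the AS survivors plus SS survivors.
    return (p * q + q * q * (1.0 - t)) / mean_fitness


s, t = 0.1, 0.8   # hypothetical fitness costs for AA and SS homozygotes
q = 0.001         # the protective allele starts rare (assumed)

for _ in range(1000):
    q = next_frequency(q, s, t)

expected_equilibrium = s / (s + t)
print(round(q, 4), round(expected_equilibrium, 4))
```

In this simplified setting the simulated frequency converges to the analytical equilibrium, which shows how a variant that is deleterious in homozygotes can nonetheless persist at appreciable frequency where the heterozygote is protected.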

Variability of immune phenotypes between individuals and populations
Although population genetics studies have greatly increased our understanding of the key functions of the immune system from an evolutionary perspective, the links between genetic diversity, whether neutral or selected, and immune phenotypes remain to be explored in detail. Genome-wide association studies (GWAS) have identified genetic factors associated with the risk of developing infectious, autoimmune and inflammatory diseases [33,61]. Furthermore, these studies have shown that a non-negligible fraction of the genetic variants associated with immune phenotypes are located in regulatory regions [62,63]. It is in this context that the mapping of expression quantitative trait loci (or eQTLs) has proved to be of great value in recent years, allowing links to be established between genetic variability, intermediate phenotypes, such as gene expression, and ultimate phenotypes, such as infection [13].

Immune response and regulation of gene expression
Various studies, including ours, have mapped eQTLs in the context of immunity, measuring, in particular, gene expression levels in the presence of ligands activating innate immune pathways, such as TLRs or IFNs, or infectious agents, such as influenza A virus, Mycobacterium tuberculosis or Listeria monocytogenes [50,52,64,65,66,67,68,69,70,71]. These studies have identified hundreds of genetic variants associated with gene expression variability, acting in cis or trans, in a way that is dependent on the cell context and the immune/infectious ligand used, indicating gene-environment interactions. Moreover, some of these regulatory variants have been previously associated by GWAS with several diseases, suggesting that the context-dependent variability of gene expression may explain, at least partially, our differences in susceptibility to infectious or autoimmune diseases [72]. There is also growing evidence that natural selection has contributed to the differences in immune responses observed between human populations, with immunity-associated eQTLs enriched in positive selection signals [50,52,67,73,74]. This has allowed us to identify immune mechanisms that may have conferred a selective advantage to some specific human populations; for example, reduced inflammation associated with activation of the TLR1 pathway in Europeans or other immune phenotypes [52,75]. Furthermore, high levels of Neanderthal ancestry have been detected among regulatory variants of gene expression in monocytes and macrophages from populations of European origin [50,52], which may have further contributed to the diversification of immune responses across human populations, particularly targeting antiviral functions [49,52].
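The idea of a context-dependent eQTL can be sketched as a simple regression of expression on allele dosage, repeated in two conditions. This is a toy illustration under stated assumptions, not the pipeline used in the cited studies: the data are invented, the function name is hypothetical, and real analyses additionally model covariates, population structure and multiple testing.

```python
def ols_slope(genotypes, expression):
    """Least-squares slope of expression on allele dosage (0, 1 or 2 copies)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    covariance = sum((g - mean_g) * (e - mean_e)
                     for g, e in zip(genotypes, expression))
    variance = sum((g - mean_g) ** 2 for g in genotypes)
    return covariance / variance


# Hypothetical toy data: nine individuals, three per genotype class.
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 2]
noise = [-0.1, 0.0, 0.1] * 3

# Resting cells: expression does not depend on genotype (assumed).
resting = [5.0 + n for n in noise]

# Stimulated cells: each copy of the allele adds 0.8 expression units (assumed).
stimulated = [5.0 + 0.8 * g + n for g, n in zip(genotypes, noise)]

slope_resting = ols_slope(genotypes, resting)
slope_stimulated = ols_slope(genotypes, stimulated)

# The difference in slopes is a simple measure of the genotype-by-stimulation effect.
response_effect = slope_stimulated - slope_resting
print(slope_resting, slope_stimulated, response_effect)
```

In this constructed example the variant has no detectable effect at rest but a clear effect after stimulation, which is the pattern that a genetic effect restricted to a given immune context would produce.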

Factors shaping the variability of the immune response
Apart from genetic factors, immune responses to infection can be shaped by intrinsic, non-genetic factors. A recent study of transcriptional responses to bacterial, viral, and fungal infections, using whole-blood samples from 1000 individuals stratified by age and sex, revealed that about 10% of the variance in the immune response is explained by host genetic factors, while sex and age account for about 5% of the total variance [69]. Using the same cohort of 1000 individuals, we have also found that the variability in whole blood of the different cell subtypes involved in adaptive immunity is mainly under the control of non-genetic factors, such as age, sex, cytomegalovirus infection or smoking, while the heterogeneity in cell subtypes of the innate response is primarily due to the variability in host genetics [76]. Further studies are nevertheless necessary to better quantify the relative contribution of genetic factors, age and sex, nutritional regimes, chronic infections or even socio-economic conditions to the variability of the immune response observed between individuals and populations. These studies will help us to better define the thresholds beyond which the natural variability of the immune response becomes dysfunctional and thus leads to the development of human diseases [77].

Conclusions and perspectives
Understanding the contribution of natural selection to the diversity of immune response genes in humans has become a complementary and indispensable approach to immunological, clinical and epidemiological genetic studies [8]. Evolutionary-based approaches have thus made it possible to identify immune functions that are essential in host defense against pathogens and to distinguish them from other functions with a higher level of immunological redundancy. It has also been shown that human populations have been able to acquire beneficial immune variability through admixture with other hominids, such as Neanderthals or Denisovans, and between modern human populations. Nevertheless, many links between population and evolutionary genetics, on the one hand, and human immunology, on the other hand, remain to be explored. We need, for example, to better understand how natural selection at different time scales has influenced immune phenotypes and disease risk, and, especially, how selection in the past may lead, following environmental changes, to current maladaptation and thus to pathologies of the immune system. Finally, most of the studies carried out to date have focused on the variability of the immune system in cosmopolitan populations, mainly of European ancestry. Systems immunology studies, integrating multiple demographic, physiological, genetic and epigenetic variables, in other human populations around the world exposed to different environments, may help us to better determine the respective contribution of the different factors that shape the diversity of the immune system [77]. In doing so, these studies will also allow us to better delineate the factors involved in the current differences, observed between individuals and populations, in terms of risk of infectious, inflammatory or autoimmune diseases.