Chapeau UK Biobank! A revolution for integrated research on humans and large-scale data sharing


It is obvious, but it sometimes bears repeating: researchers need data to generate and test novel hypotheses. The higher the quality of the dataset, the more robust the scientific results and predictions. And the more widely the data is shared, the more verifiable and reproducible the results. In human research, many initiatives have been launched to produce large-scale data, ranging from national registries that include the entire population to cohorts of varying sizes. In 2006, the UK Biobank biomedical database and resource was created, and it has revolutionized research on several levels, particularly because the data is accessible to the scientific and medical communities [1].
The initial budget of UK Biobank was £62 million, funded by the UK government and the Wellcome Trust. The objective was to collect data on 500,000 participants aged 40-69 years over four years and then follow them for at least 30 years. The database was opened to researchers in March 2012, hosting demographic, medical, psychological and socio-economic data (from interviews or online questionnaires), biological samples (e.g., urine and blood), genetic data and imaging data. Ten years later, more than 28,000 researchers had access to this data.
Why is UK Biobank revolutionizing research? Unlike national registries, where data access is limited, UK Biobank data can be accessed by researchers from all countries once their research project has been approved and the IT security of the systems that will host the requested data has been verified. Although genetic data is considered sensitive, UK Biobank has obtained informed consent from participants to share it. Periodic updates are sent to all registered researchers asking them to remove participants who no longer wish to share their data.
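To make the withdrawal step concrete, here is a minimal sketch of how a research group might apply such an update to a local copy of the data. It assumes the update arrives as a plain list with one participant identifier per row; the field name `participant_id`, the helper names and the toy records are hypothetical and are not taken from UK Biobank documentation.

```python
import csv
import io


def load_withdrawn_ids(handle):
    """Read one participant ID per row from a withdrawal list (assumed layout)."""
    return {row[0].strip() for row in csv.reader(handle) if row and row[0].strip()}


def drop_withdrawn(records, withdrawn_ids, id_field="participant_id"):
    """Return only the records whose ID is not in the withdrawal list."""
    return [rec for rec in records if str(rec[id_field]) not in withdrawn_ids]


# Toy data standing in for a local copy of the dataset (not real UK Biobank data).
records = [
    {"participant_id": "1001", "age": 52},
    {"participant_id": "1002", "age": 61},
    {"participant_id": "1003", "age": 47},
]
# Hypothetical withdrawal list: participant 1002 no longer wishes to share data.
withdrawal_list = io.StringIO("1002\n")

withdrawn = load_withdrawn_ids(withdrawal_list)
kept = drop_withdrawn(records, withdrawn)
print([rec["participant_id"] for rec in kept])
```

In practice the same filter would be rerun on every derived table each time a new list is received, so that withdrawn participants do not persist in intermediate files.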
More than five petabytes of genetic data are available, including >500,000 microarrays, >400,000 whole-exome and >200,000 whole-genome sequences [2]. These data allow researchers to explore the genetic contribution to many traits and diseases. Among the major results, this resource has been key to estimating the heritability of more than 4000 phenotypic traits; the estimates are available online from the Benjamin Neale laboratory (https://nealelab.github.io/UKBB_ldsc/index.html). Genetic variations have been identified as risk factors, and predictive models have been proposed for diseases such as diabetes, cardiovascular disease and cancer [3]. You can check whether a gene is associated with a specific trait at https://azphewas.com/.
Epidemiological and genetic data are also linked to imaging data. The goal is to have magnetic resonance images (MRI) of the brain, heart, and abdomen for 100,000 participants (50,000 are already available). The combined analysis of genotyping and brain MRI data has made it possible, among other things, to estimate the heritability of interindividual differences in brain anatomy [4,5].
Regarding COVID-19, all registered researchers received messages informing them of new data collected in relation to COVID-19 and encouraging them to submit research projects on the pandemic. More than 740 researchers responded, and more than 148 papers were published. A specific UK Biobank study also collected new data on 20,000 volunteers who were either original UK Biobank participants or their children or grandchildren over the age of 18. A total of 6.6% of participants had already been infected by May/June 2020, and this rate increased to 8.8% by the end of November 2020. This study was one of the first to show that antibodies produced following natural infection can protect most people from further infection for at least 6 months (the UK Biobank SARS-CoV-2 Serology Study report is available here: https://www.ukbiobank.ac.uk/media/x0nd5sul/ukb_serologystudy_report_revised_6months_jan21.pdf).
I have only scratched the surface of the findings and the possibilities offered by UK Biobank. I have not mentioned all the richness and reliability of the data, nor the next objectives, such as linkage with other registries (death, cancer, ...). I also omitted the ability to share the results of research projects directly via the UK Biobank portal, and the description of the limitations and representativeness biases of UK Biobank, which are well documented [6]. All this information is available on the UK Biobank website https://www.ukbiobank.ac.uk/. Again, for scientists to work, there is a need for large amounts of high-quality, accessible data. There is an urgent need to support such initiatives in other countries to replicate the UK Biobank results, to increase the diversity of people studied and to detect associations that are specific to countries with different health systems.

Conflicts of interest
The author has no conflict of interest to declare.