Global study reveals deep Eurocentric bias in human gene maps
Hundreds of potential new genes have been overlooked due to Eurocentric gene maps, leaving crucial disease clues for non-European populations unseen.
December 4, 2025

Major gaps exist in human gene maps because they were mainly based on DNA from individuals of European ancestry, a new study finds.

Researchers identified thousands of previously missing transcripts, RNA molecules that carry a gene’s instructions, in populations from Africa, Asia, and the Americas, according to a study published on Wednesday in Nature Communications.

Many of these newly found transcripts are linked to genes already known to affect diseases and traits that vary greatly by ancestry, including lupus, rheumatoid arthritis, asthma, and cholesterol metabolism.

The findings suggest that differences in disease prevalence and severity among populations may arise partly because the same gene can produce different RNA transcripts, and thus different proteins, through mechanisms such as alternative splicing.

Until now, these ancestry-specific molecular variations have been essentially invisible in the standard reference maps, concealing potentially critical clues about genetic risk and disease biology.

“Gene maps are used by scientists every day, but we’ve been leaving out huge sections of the world’s population. This study shows, for the first time, how much we’ve been missing,” says first author Pau Clavell-Revelles of the Barcelona Supercomputing Center (BSC) and Centre for Genomic Regulation (CRG).

The first complete draft of the human genome, unveiled in 2001, was a monumental breakthrough, yet it was only a starting point. 

The raw sequence of three billion letters told scientists nothing about where genes actually begin and end, how many there are, or how a single gene can generate multiple protein variants through alternative splicing, a cellular editing process that cuts and reassembles RNA instructions.

To solve this, researchers created gene annotation maps: comprehensive catalogues that pinpoint the location of every gene and list all the RNA transcripts it can produce. 

Initiatives like GENCODE transformed the cryptic DNA sequence into a usable roadmap, enabling scientists to identify disease-causing regions and understand how genetic differences among individuals translate into real-world health outcomes.

Yet these annotation maps carried a built-in flaw from the start. 

While any two people share 99.9 percent of their DNA, the remaining 0.1 percent carries the signature of deep human history: populations that separated tens of thousands of years ago and adapted to different climates, diets, diseases, and chance events.

Because the reference genome and most of the early annotation efforts relied heavily on samples from people of European descent, the distinctive genetic features that evolved in African, Asian, Oceanian, and American populations were largely left out of the final maps.

As a result, a significant portion of human molecular biology, the way cells actually switch genes on and off in different populations, has remained undocumented, effectively hidden from view and from medical research.

“Most gene sequencing so far has come from European individuals, so the reference catalogues we rely on may be missing genes or transcripts that exist only in non-European populations,” says Dr Roderic Guigo, senior co-author of the study and researcher at the Centre for Genomic Regulation and University Pompeu Fabra in Barcelona. 

“If a genetic variant falls in one of these missing genes, we assume it has no biological effect. In some cases, that assumption may simply be wrong,” he adds.


‘Tip of the iceberg’

To fill the gaps in current gene maps, researchers turned to long-read RNA sequencing, a technology that captures entire transcripts in one piece, something older short-read methods could never reliably do. 

They examined blood cells from 43 individuals representing eight diverse populations (Yoruba and Luhya from Africa, Mbuti from the Congo, Han Chinese, Indian Telugu, Peruvians, Ashkenazi Jewish, and Utah residents of European descent), all of whom are part of the well-characterised 1000 Genomes Project.

The results were striking: the team discovered more than 41,000 previously undocumented transcripts absent from the standard GENCODE annotations. Among those arising from known protein-coding genes, 41 percent are predicted to produce novel protein isoforms. 

They also identified 773 transcripts that appear to originate from entirely new gene locations that had not been catalogued before, and 2,267 transcripts that are unique to a single ancestral population, most of them completely unknown in non-European groups.

The researchers say that their work has just begun: “We firmly believe that any findings that we made here are really just the tip of the iceberg,” says Dr Fairlie Reese, postdoctoral researcher at the BSC.

The researchers acknowledge that much more remains to be done, though their initial efforts point towards a more complete map of human biology.

“We hope our study serves as a foundation and an invitation for the global scientific community to contribute data, methods, and diverse populations,” says Dr Marta Mele, senior co-author of the study and Group Leader at the BSC.

“Only through a collective effort will we achieve a truly complete and inclusive map of human biology, which is essential for fair and accurate genomic medicine.”


SOURCE:TRT World