The significance of inclusion in clinical trials and medical research databases
Before you treat a man with a condition,
Know that not all cures can heal all people.
For the chemistry that works on one patient,
May not work for the next,
Because even medicine has its own
Conditions.
From THE MAXIMS OF MEDICINE by Suzy Kassem
Our digital reality has taught us that one must be represented in the data for one’s existence to be recognized and considered. When such data are the building blocks for developing cures, therapies, and wellness products, representation in them carries consequences for one’s health prospects. Accordingly, the absence of certain groups from the clinical data and health datasets used for medical research means that research participants are neither representative nor diverse. This is known to have medical and social effects on individuals and communities alike.
The diversity of populations in developed countries (where most medical research is conducted), which came with global migration and the resulting demographic changes, is not faithfully reflected in the composition of participants in clinical trials and biomedical databases. To date, the majority of participants in clinical trials and medical databases are Caucasians, mostly males of European descent. It is estimated that 78% of the genetic and genomic information available today originates from this population, although Europeans and their descendants make up barely 16% of the world population.
In the United States, a dominant arena of medical research, there is a wide variety of racial and ethnic minority groups. These populations include Blacks/African Americans, Hispanics/Latinos, Alaska Natives (Inuit), Asians, American Indians, and Native Hawaiians/Pacific Islanders. Although these groups together make up 39% of the US population, their participation rate in clinical trials ranges between 2% and 16%. Latinos, for example, constitute about 18% of the US population, yet their representation in clinical studies stands at a mere 1%. Women, too, are famously underrepresented in clinical trials, after being intentionally excluded in the past, allegedly “to ensure homogeneity of treatment effect and reduce potential maternal-fetal liability.”
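To make these gaps concrete, the percentages quoted above can be expressed as a representation ratio: a group's share of research data divided by its share of the relevant population. The following is a minimal sketch and not a method taken from the cited studies. The function name `representation_ratio` and the labels are hypothetical, and the inputs are simply the figures stated in the text.

```python
def representation_ratio(share_in_data: float, share_in_population: float) -> float:
    """Return a group's share of research data divided by its population share.

    A value of 1.0 means proportional representation, a value below 1.0
    means underrepresentation, and a value above 1.0 means overrepresentation.
    """
    if share_in_population <= 0:
        raise ValueError("share_in_population must be positive")
    return share_in_data / share_in_population


# Percentages quoted in the text; the labels are illustrative only.
figures = {
    "European descent (genomic data vs. world population)": (78.0, 16.0),
    "Latinos (clinical studies vs. US population)": (1.0, 18.0),
}

for label, (in_data, in_population) in figures.items():
    ratio = representation_ratio(in_data, in_population)
    print(f"{label}: {ratio:.2f}")
```

Read this way, people of European descent appear in genomic data at roughly five times their share of the world population, while Latinos appear in US clinical studies at roughly one-eighteenth of their share of the US population.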
An interesting study published in February 2021 examined the inclusion rate of underrepresented groups in vaccine clinical trials in the United States between 2011 and 2020. The study found that while Caucasians were overrepresented (78%) in these studies, the following populations were underrepresented: Black or African American individuals (10.5%), American Indian or Alaska Native individuals (0.4%), Hispanic or Latino individuals (11.6%), Asians (5.7%), and people over the age of 65 (12%). Surprisingly, it was also found that women were slightly overrepresented (56%) relative to their proportion in the population.
As a result, whole populations (races, indigenous peoples, and diverse ethnic communities) are absent from studies: data about them are not collected, biological samples are not taken from them, their genomes are not sequenced, and their distinctive biological and genetic characteristics are not studied. Consequently, no conclusions are drawn about these underrepresented, marginalized populations, which makes the drugs, treatments, and medical technologies developed mainly from studies in Caucasians inappropriate, inaccurate, or somewhat irrelevant for them.
The COVID-19 pandemic has illustrated this clearly. Medical data have shown disparities in the impact of COVID-19 among racially and ethnically marginalized groups: ethnic minorities were found to be disproportionately hit, with higher complication and mortality rates. Despite these disturbing data, a study published in The Lancet in June 2020 showed that out of 1,518 COVID-19 studies documented in the National Institutes of Health (NIH) clinical trial registry, only six collected data on ethnicity. This is somewhat surprising, given that recent years have seen a growing awareness of the critical importance of participant inclusion and diversity in clinical trials, in a way that faithfully reflects real-world populations, as will be shown below.
Inclusion in clinical trials and health databases used for medical research is both a personal and a group interest, ensuring the applicability of research products for the individuals and populations represented in them. This is particularly significant for preventive medicine and precision medicine, where a better understanding of population-specific genomic variability is required. It is also crucial for artificial intelligence-based medical technologies, trained and operating on big health data to create models for prediction and diagnosis of medical conditions.
Increasing diversity by providing adequate, equal representation and, depending on the need and circumstances, differential representation (for example, to promote research on rare/orphan diseases) also has indirect societal benefits, such as the promotion of equality between populations, communities, and patient groups.
Conversely, non-representation and non-diversity in research carry negative personal and social consequences. Routine exclusion of populations from medical research and health databases may, first and foremost, prevent a faithful picture of drugs’ safety and efficacy from being obtained. It can also cause biases that present a distorted and unreliable picture of people’s health, through misleading generalizations of research findings (typically based, as mentioned, on male participants of European descent) to populations that are underrepresented in research (e.g., women, African Americans, and Asians). In addition, missing out on the benefits of medical advances naturally creates new health and social disparities between population groups and perpetuates existing ones.
The reasons for the underrepresentation of minority groups in clinical trials and health databases are complex. They are mostly the product of inherent social inequities, which have led researchers and physicians to disregard these populations, whether unintentionally or as part of research design. Another cause of underrepresentation is a collective hesitancy among these very populations to participate in clinical trials, for historical reasons. In the past, participation in such trials routinely involved discomfort, burden, and sometimes risk. It was therefore easier for researchers to enroll underprivileged populations from marginalized ethnic groups. These populations (mostly Blacks), who participated in studies either knowingly or without their knowledge and informed consent, occasionally suffered both physical and autonomy-related harms as a result of their participation. This has led to a multi-generational distrust of researchers and the medical system among these populations, resulting in a low tendency to respond to recruitment for medical research.
Since then, however, both the field of medical research and the principles of research ethics have undergone significant changes. Many medical studies can now be conducted by way of data analysis and, as such, do not involve inconvenience or significant risks for participants, save for the potential harms of stigmatization and risks related to privacy and medical confidentiality (which can be mitigated and minimized by employing various privacy protection tools and measures). At the same time, a rich fabric of ethical rules for interventional experiments and (non-interventional) health data research has developed and continues to evolve, restricting researchers in their actions and ensuring better protection of the rights of participants and data subjects, respectively.
However, it is not only the experience of medical research that has changed over the past few decades, but also the way research and participation in it are perceived. Whereas in the past participation in research was considered risky, exploitative, or burdensome, today participation, particularly in low-risk studies conducted on big health data or on samples in biobanks, is perceived as a right. Some would go as far as seeing it as a personal responsibility or a moral imperative (in the name of social solidarity and in view of the great social value of medical developments). This paradigm change has also transformed research participants from passive partakers into active initiators. (Take, for example, patient-driven medical research, a manifestation of ‘participatory citizenship’, in under-researched areas such as orphan diseases.)
Returning to the reasons for under- or non-representation, we can identify additional, subjective barriers to equal inclusion in clinical trials and health databases. These include language barriers, common among people from immigrant communities; low health literacy; personal and communal values, cultural perceptions, and religious beliefs; and the poor access of remote and low-income populations to research projects and sites.
Another significant cause of underrepresentation is a data quality problem: because of poor documentation, the information collected about participants’ ethnicity is often incomplete, inconsistent, or inaccurate. This makes it difficult to extrapolate from existing non-representative data to those populations in a way that will meaningfully advance our understanding of their unique health characteristics. For example, in 2020 the UK Health Data Research Alliance recognized the importance of ensuring the coding quality of ethnicity and other protected characteristics, and of including all population groups in the data collected, to ensure that research outcomes are representative, inclusive, and bias-free.
Recognition of the problems and injustices that non-inclusion in health and clinical data creates, along with an overdue awareness of the factors that led to it, has prompted a variety of proactive initiatives and policy measures aimed at rectifying the situation. These are some of them:
- The Food and Drug Administration (FDA) Guidelines for researchers and medical product sponsors, issued in late 2020, encourage inclusivity in medical product development. Calling for the application of inclusive trial design and enrollment practices, the guidelines offer various methodologies for increasing enrollment of underrepresented populations in clinical trials.
- The NIH All of Us Research Program, which seeks to promote the inclusion of a diverse and representative population in medical research. The goal of the program is to gather and analyze data from over one million people living in the United States, from different races, ethnicities, age groups, and geographical areas, in an effort to build the most diverse health database in history. The stated values underlying the program are transparency, diversity and inclusion, and protection of participants’ data and privacy. Enrollment in the program began in 2018 and is expected to take about ten years. Notably, the program is a further policy step by the NIH, following the enactment of the NIH Revitalization Act in 1993, which requires the inclusion of women and ethnic minorities in NIH-funded clinical trials, and the 2017 NIH policies and guidelines published in accordance with the act.
- The Three Million African Genomes (3MAG) project, launched in Africa to sequence the genomes of three million individuals from selected populations across the continent, covering a variety of regional, ethnolinguistic, and other groups. The project is designed to reveal new genetic and genomic information that is unique to populations and ethnic groups in Africa. The promise of this ambitious goal is already evident: a small-scale genomic sequencing of 426 people from 50 different ethnolinguistic groups in Africa recently revealed over 3 million previously unknown genetic variants. The project is expected to take about a decade, at an annual cost of about US$450 million.
- The GenomeAsia 100K project, launched in Singapore in 2016, which aims to sequence and analyze the genomes of 100,000 Asian individuals, including groups of patients from the fields of oncology, neurology, autoimmune diseases, diabetes, cardiovascular diseases, and rare hereditary diseases. Reportedly, although people of Asian descent constitute over 40% of the world’s population, until recently their share of genomic sequences was only 6%. The project is set to correct this distortion by closing the knowledge gap regarding the distinctive genetic characteristics of the Asian population. It will do so by creating reference genomes for Asians and identifying rare and frequent alleles associated with this population. This will accelerate medical advances and precision medicine specific to the Asian population. Only a few years into the project, an international group of scientists found that while northern Europe has a single ancestral lineage, Asia has at least ten such lineages.
- The Pharmaceutical Research and Manufacturers of America (PhRMA) Principles on Conduct of Clinical Trials, issued in 2020, which aim to build trust with the African-American community in view of the injustices it suffered in past clinical trials; remove barriers to clinical trial enrollment; use real-world data to improve information about a variety of populations; and increase diversity and inclusion in clinical trials.
Most of these schemes seem to be aimed at correcting the reality of (non-)inclusion in clinical trials. Presumably, the rationale for improved inclusion and diversity applies even more strongly to health databases used for research, given the relative ease of inclusion in them and the low risk and minimal burden associated with it.
In conclusion, the aforementioned steps, complemented by a campaign educating underrepresented populations about the significance of inclusion in health and medical research, can put us on the verge of change. Such change is expected to bring with it a tsunami of population-specific health information – the product of affirmative action-type initiatives, and a boost in public awareness.