AI-Based Next-Generation-Phenotyping for Rare Disease Diagnosis

by Adele Ruder (contributor), Alexander Hustinx (contributor), Louise von Stechow (contributor)

Disclaimer: All opinions expressed by Contributors are their own and do not represent those of their employers, or BiopharmaTrend.com.
Contributors are fully responsible for ensuring they own any required copyright for any content they submit to BiopharmaTrend.com. This website and its owners shall not be liable for information and content submitted for publication by Contributors, or for its accuracy.

Rare diseases, which affect more than 300 million people worldwide, are often difficult to diagnose. Many patients wait five years or more for a correct diagnosis and are misdiagnosed at least once along this diagnostic odyssey. In addition to being a significant burden to patients’ well-being, low diagnostic rates also impact research and development in the rare disease space and challenge the success of marketed orphan drugs.

Artificial Intelligence (AI) can help to address these challenges on various levels. One particularly interesting approach is Next-Generation-Phenotyping (NGP), in which AI is used to recognize disease-specific phenotypic patterns, such as distinct facial features associated with rare diseases, in order to predict a diagnosis and inform genetic testing. NGP tools can thus benefit patients, physicians, and drug developers by enhancing the speed and accuracy of diagnosis, deepening disease understanding, and accelerating recruitment for clinical trials.

This article marks the launch of a new monthly column by Dr. Louise von Stechow on how emerging technologies—artificial intelligence, gene therapies, next-generation phenotyping, and more—are helping address challenges in diagnosing, treating, and managing rare diseases.

Today's piece is co-authored by Alexander Hustinx, a PhD candidate and deep learning engineer, and Dr. Adele Ruder, a biomedical scientist and medical science liaison—both based at the Institute for Genomic Statistics and Bioinformatics (IGSB), University of Bonn, and contributors to the GestaltMatcher and Bone2Gene initiatives.

NGP for Rare Disease Diagnosis 

Around 80% of the more than 6,000 rare diseases have a genetic origin, and many of them manifest in distinct phenotypic patterns. For example, it is estimated that up to 40% of rare disease patients have distinct facial features. Similarly, unique skeletal patterns are often associated with rare skeletal disorders, of which over 700 distinct diseases have been classified.

For a long time, physicians had to rely on visual cues alone, from looking at patients’ faces, body morphology, skeletal radiographs or other medical images to translate the observations of distinct phenotypes into potential diagnoses. While experienced dysmorphologists develop an expert eye for identifying certain rare genetic disorders based on subtle cues, the overwhelming number of rare diseases, their individual rarity and inter-individual variation make it hard for most human clinicians to identify distinguishing patterns.  

NGP tools can now help bridge those gaps by using AI-based analysis of medical images to detect patterns less subjectively than humans can, helping clinicians reach diagnoses faster and with higher accuracy. While NGP originally focused mainly on facial recognition (FR) technologies, academic groups and biotech companies are now also applying it to other types of medical images, such as skeletal radiographs and retinal images.

However, FR remains the most advanced of the NGP applications for rare disease diagnosis. The clinical utility of FR for diagnosing rare diseases was shown as early as 2014; more recently, advanced deep learning methods have increased the accuracy and applicability of FR tools.

FR-based analysis platforms include, for example, US-based FDNA’s Face2Gene platform and app (used in over 10,000 medical centers to analyze over 550,000 cases worldwide by April 2025, as listed on the company’s homepage); Australian FaceMatch, which directly targets parents of children with rare diseases and adults with rare diseases; UK-based Cliniface, which focuses on 3D patient photos; and German GestaltMatcher, which is available as a platform and an app for clinicians.

Other NGP applications go beyond facial recognition and focus on other medical images or on the integration of different data sources. For example, Bone2Gene AI, funded by the German Federal Ministry of Education and Research, aims to identify the unique patterns in radiographic images linked to various bone disorders and to support clinicians in the diagnostic process. Other academic applications include PhenoScore, an open-source AI framework that combines FR with Human Phenotype Ontology (HPO) data to quantify phenotypic similarity and to identify known and novel phenotypic subgroups linked to specific genetic variants.

Clinical Use of NGP Tools – The Example of GestaltMatcher AI

Generally, for rare disease diagnosis, FR tools compare a photo of the patient's face to a dataset of cases to find patterns within the facial features that can lead to a potential diagnosis and point toward genetic tests. For example, GestaltMatcher AI is available as an app, which allows clinicians to analyze patient photos directly from a phone or tablet. The tool then provides a scored list of possible diagnoses that helps guide the selection and interpretation of genetic tests.

GestaltMatcher also provides open-source code via a GitHub repository, as well as the GestaltMatcher Database, which contains curated case reports from more than 13,200 patients with different ethnic backgrounds, along with other types of medical images such as MRIs, fundoscopies, and images of other body parts of patients with molecularly confirmed diagnoses.

The underlying AI model of GestaltMatcher produces vectors, which are high-dimensional representations of faces. Thus, patients are represented as points in a high-dimensional space, in which those with similar facial features are positioned closer together (Figure 1). Using this clustering approach, GestaltMatcher can identify features associated with known and even unknown syndromes—the latter revealed in clusters of patients with similar but undocumented facial features. This approach facilitates the discovery of syndromes that were not present in the training data. 

Figure 1
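
The idea of patients as points in a space, with similar faces lying closer together, can be illustrated with a small sketch. This is not GestaltMatcher's actual code: the function names, the toy 3-dimensional vectors, and the labels `syndrome_A` to `syndrome_C` are hypothetical, and cosine similarity with a best-match-per-syndrome rule is assumed here as one plausible way to turn distances into a scored list.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_syndromes(query, gallery):
    """Return (label, score) pairs sorted by the best similarity between
    the query embedding and any gallery embedding with that label."""
    best = {}
    for label, embedding in gallery:
        sim = cosine_similarity(query, embedding)
        if label not in best or sim > best[label]:
            best[label] = sim
    return sorted(best.items(), key=lambda item: item[1], reverse=True)

# Toy embeddings, invented for illustration only.
# Real face embeddings are much higher-dimensional.
gallery = [
    ("syndrome_A", [1.0, 0.0, 0.0]),
    ("syndrome_A", [0.9, 0.1, 0.0]),
    ("syndrome_B", [0.0, 1.0, 0.0]),
    ("syndrome_C", [0.0, 0.0, 1.0]),
]
query = [0.8, 0.2, 0.0]  # hypothetical embedding of a new patient photo

ranking = rank_syndromes(query, gallery)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

Because the ranking depends only on distances to existing cases, a new syndrome could in principle be added to the gallery without retraining the encoder, which is consistent with the article's point about syndromes absent from the training data.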

After their seminal publication, the GestaltMatcher team replaced the core model with a modern face recognition network (iResNet with ArcFace) and tested various datasets for better transfer learning. They also introduced ensemble models to boost accuracy for diagnosing unseen ultra-rare disorders. In addition, they developed GestaltGAN in an attempt to increase privacy and explainability of decision-making. GestaltGAN generates synthetic patient portraits based on the shared facial features of real patients with the same disorder. In a Generative Adversarial Network (GAN), a generator creates a picture while a discriminator decides whether the picture is real or synthetic. GestaltGAN was trained with over 3,200 medical images from the GestaltMatcher Database. Because the generated faces cannot be traced back to an individual, this approach helps protect patient privacy (Figure 2).

Figure 2
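
The tug-of-war inside a GAN can be made concrete with the standard textbook training losses. The sketch below is generic and is not taken from GestaltGAN, whose actual objective may differ. The discriminator scores are invented numbers standing for the probability that an image is real, and the non-saturating form of the generator loss is assumed.

```python
import math

def bce(prediction, target):
    """Binary cross-entropy for a single probability in (0, 1)."""
    return -(target * math.log(prediction)
             + (1 - target) * math.log(1 - prediction))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants scores near 1 for real images
    and near 0 for synthetic ones."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: the generator wants the
    discriminator to score its synthetic images near 1."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (probability that an image is real).
d_real = [0.9, 0.8]      # scores on real photos
early_fake = [0.1, 0.2]  # synthetic images early in training: easily spotted
later_fake = [0.6, 0.7]  # synthetic images later: harder to tell apart

print("generator loss, early:", round(generator_loss(early_fake), 3))
print("generator loss, later:", round(generator_loss(later_fake), 3))
print("discriminator loss, early:", round(discriminator_loss(d_real, early_fake), 3))
print("discriminator loss, later:", round(discriminator_loss(d_real, later_fake), 3))
```

As the synthetic images become more convincing, the generator's loss falls while the discriminator's loss rises. Training alternates updates to the two networks so that each improves against the other.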

Challenges for Wider Clinical Adoption 

While studies show that synergy between NGP and genetic testing can increase diagnostic rates for rare diseases, several factors might challenge the wider clinical adoption of NGP tools.

The training of AI algorithms generally relies on large and diverse datasets to make accurate predictions. However, for many rare diseases, there is simply not enough data available. Moreover, most training datasets primarily consist of patients from Western populations. This will likely impact the performance of NGP tools, especially for FR-based methods applied in underrepresented groups, as shown in studies that highlighted racial and gender bias of FR algorithms. Capturing data from diverse populations will be an important factor for making FR tools more widely applicable. 
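
One common way to make such performance gaps visible is to report accuracy separately for each population group rather than as a single overall number. The sketch below assumes a hypothetical evaluation set in which every case carries a group label, the confirmed diagnosis, and the tool's ranked predictions. All names and values are invented for illustration and do not come from any of the tools discussed here.

```python
from collections import defaultdict

def top_k_accuracy_by_group(cases, k=3):
    """Fraction of cases per group whose confirmed diagnosis
    appears among the top-k ranked predictions."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for case in cases:
        totals[case["group"]] += 1
        if case["true_label"] in case["ranked_predictions"][:k]:
            hits[case["group"]] += 1
    return {group: hits[group] / totals[group] for group in totals}

# Toy evaluation cases (hypothetical, not real results).
cases = [
    {"group": "group_1", "true_label": "A", "ranked_predictions": ["A", "B", "C"]},
    {"group": "group_1", "true_label": "B", "ranked_predictions": ["B", "A", "C"]},
    {"group": "group_2", "true_label": "A", "ranked_predictions": ["B", "C", "A"]},
    {"group": "group_2", "true_label": "C", "ranked_predictions": ["C", "A", "B"]},
]

accuracy = top_k_accuracy_by_group(cases, k=1)
for group, value in accuracy.items():
    print(f"{group}: top-1 accuracy = {value:.2f}")
```

A large gap between groups in such a breakdown would point to the kind of dataset imbalance described above.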

Beyond the question of bias, the use of FR-based AI in healthcare also raises questions regarding data ownership, privacy, and the role of AI in decision making. Under the EU AI Act, FR technology will likely be classified as a high-risk AI system, particularly when it influences clinical decision-making and impacts patient outcomes. Developers of systems classified as high-risk have to comply with requirements for risk and safety assessments, transparent data governance (especially for biometric data), human oversight, and conformity assessments before market deployment. How this new regulation will influence the use of NGP tools in clinical diagnosis within the EU, as compared to the rest of the world, remains to be seen. It seems likely, however, that it will pose hurdles, especially for smaller companies trying to develop NGP-based medical devices.

This uncertainty around regulatory frameworks and data privacy will likely mean that physicians and policy makers in Europe adopt a more careful approach when it comes to implementing NGP tools for rare disease diagnosis. Training programs that build trust in the accuracy and data security of NGP solutions, as well as ease of use, will be key to driving further adoption. Adoption could be further enhanced by integrating FR-based tools into wider screening programs, such as routine pediatric checkups.

A key role in driving innovation might fall to biotech companies. As drug makers in the rare disease space, biotech companies depend on diagnosed patients for clinical trials and for marketed products, and they can help drive adoption of AI-based tools by the centers and healthcare professionals (HCPs) with whom they collaborate.

References

https://www.thelancet.com/journals/langlo/article/PIIS2214-109X(24)00056-1/fulltext
https://www.nature.com/articles/s41431-024-01604-z
https://pubmed.ncbi.nlm.nih.gov/19627523/ 
https://www.researchsquare.com/article/rs-2110140/v1 
https://elifesciences.org/articles/02020 
https://www.face2gene.com/ 
https://facematch.org.au/home 
https://cliniface.org/ 
https://www.gestaltmatcher.org/ 
https://bone2gene.org/ 
https://www.nature.com/articles/s41591-018-0279-0 
https://www.nature.com/articles/s10038-019-0619-z
https://www.nature.com/articles/s41588-021-01010-x
https://ieeexplore.ieee.org/document/10030218 
https://www.nature.com/articles/s41431-025-01787-z

https://doi.org/10.1007/s00112-024-02118-0
https://pubmed.ncbi.nlm.nih.gov/36779427/ 
https://www.nature.com/articles/s41588-023-01469-w 
https://www.medrxiv.org/content/10.1101/2023.06.06.23290887v4 
https://www.nature.com/articles/s41599-024-02894-w
https://www.nature.com/articles/d41586-022-03050-7 
https://www.nature.com/articles/s41746-024-01232-3 