Machine Learning Models for the Diagnosis of Dry Eyes Using Real-World Clinical Data

Published Online: May 28th 2024 touchREVIEWS in Ophthalmology. 2024;18(1):46-48
Authors: Tamer N Jarada, Karl Stonecipher, Olivia Perez, Ahmed R Al-Ghoul


Dry eye disease (DED) is a common ocular condition marked by discomfort and tear film instability. Machine learning (ML) techniques are increasingly recognized for their potential to transform healthcare, particularly in ophthalmology.


This study aims to develop ML models for predicting the severity and type of DED using demographic and clinical data. Real-world clinical data collected over 18 months served as the basis for creating two diagnostic models. The data set comprised 1,313 well-structured samples, each annotated by domain experts, showcasing multiple demographic and clinical features. A correlation feature selection technique was used to eliminate redundant and irrelevant features. Through stratified 10-fold cross-validation, two support vector machine models – dry eye severity model (SM) and dry eye type model (TM) – were developed to predict the severity and type of DED.


The SM achieved moderate performance in predicting the severity of dry eye cases, with an area under the receiver operating characteristic curve (AUC-ROC) of 0.79 and an area under the precision-recall curve (AUC-PR) of 0.61. The TM was effective in predicting different dry eye types, yielding an AUC-ROC of 0.91 and an AUC-PR of 0.83. We also verified the robustness of both SM and TM by comparing their performance with nine baseline ML methods. Both SM and TM consistently outperformed the baseline methods in terms of AUC-ROC and AUC-PR.


The potential application of these models lies in improving health outcomes and offering early alerts to potentially prevent the progression of DED.


Keywords: dry eye, dry eye disease, dry eye type, dry eye severity, machine learning, clinical diagnostics, real-world data


Dry eye disease (DED) is a prevalent ocular condition characterized by discomfort, visual disturbances and tear film instability due to insufficient tear production or rapid tear evaporation.1 With an increasing global incidence (up to 50%) and a significant impact on quality of life, there is a growing need for accurate and efficient diagnostic methods to aid in its timely identification and management.2 Traditional diagnostic approaches rely on subjective assessments and clinical evaluations, often leading to variability and suboptimal accuracy.

In recent years, machine learning (ML) techniques have emerged as potential tools in revolutionizing healthcare, including in ophthalmology.3 ML models offer the possibility of improving accuracy, objectivity and speed in medical diagnosis by leveraging extensive real-world clinical data.4 These algorithms analyze complex patterns and relationships within data sets, extracting valuable insights not readily apparent to human clinicians. This potential could significantly benefit the diagnosis and management of DED.

Recent studies have showcased the potential of ML models in ophthalmology. For instance, Ting et al. developed a deep learning system for the diagnosis of diabetic retinopathy.3 While not specific to dry eye, this study illustrated how ML models can effectively use real-world clinical data to classify and diagnose complex eye conditions. Leveraging a large data set of retinal images, the authors achieved an AUC-ROC of 0.936, underscoring the power of ML techniques in medical image analysis.

In this study, we explore the application of ML models for the diagnosis of DED using real-world clinical data. We examine various approaches and methodologies used by researchers to harness the potential of ML in identifying and classifying DED. The insights gained from this study have the potential to contribute to the evolution of the diagnosis of dry eyes, providing more accurate and efficient methods that may lead to improved patient outcomes and an enhanced quality of life.


The study aims to enhance diagnostics in DED by specifically examining the type and severity of the disease using ML in a typical clinical setting. The objectives include the use of real-world clinical data to create ML models for assessing the type and severity of DED, with a focus on making the models straightforward for clinicians to implement in practice.

Real-world clinical data collected over 18 months were used to develop two ML diagnostic models for the severity and type of dry eyes. The data set from a Canadian clinic comprised 1,313 well-structured samples, each annotated by domain experts, showcasing numerous features.


Real-world data were collected in Canadian clinics for over 28 months, from December 1, 2018 to March 31, 2021, and were then anonymized. These data were used to develop two ML diagnostic models for assessing the severity and type of DED. The data set consisted of 1,313 patient samples, each carefully annotated by domain specialists, showcasing various features. The use of such extensive real-world clinical data aims to enhance the generalizability and relevance of the models in practical settings.

To optimize the performance of the ML models, we applied a correlation feature selection technique.5 This method systematically identified and eliminated redundant and irrelevant features, refining the input variables to focus on the most pertinent aspects for accurate diagnosis. This step streamlined the models and improved their interpretability.
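As an illustration of the idea behind this step, the sketch below shows one simple form of correlation-based filtering: features weakly correlated with the target are treated as irrelevant, and features nearly collinear with an already selected feature are treated as redundant. It is a minimal sketch, not the procedure used in the study; the feature names, the data layout and both thresholds are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)

def select_features(columns, target, min_relevance=0.1, max_redundancy=0.9):
    """Return the feature names kept by a simple correlation filter.

    columns: dict mapping feature name -> list of values (assumed layout).
    Both thresholds are illustrative assumptions, not values from the study.
    """
    # Relevance: absolute correlation between each feature and the target.
    relevance = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    kept = []
    for name in ranked:
        if relevance[name] < min_relevance:
            continue  # irrelevant: barely related to the target
        # Redundant: nearly collinear with a feature that is already kept.
        if any(abs(pearson(columns[name], columns[k])) > max_redundancy for k in kept):
            continue
        kept.append(name)
    return kept

# Toy data with hypothetical feature names.
target = [0, 0, 1, 1, 2, 2, 3, 3]
columns = {
    "f_signal": [1, 2, 3, 4, 5, 6, 7, 8],
    "f_duplicate": [2, 4, 6, 8, 10, 12, 14, 16],  # exact rescaling of f_signal
    "f_noise": [1, -1, -1, 1, 1, -1, -1, 1],      # uncorrelated with the target
}
selected = select_features(columns, target)
print(selected)
```

In this toy example only `f_signal` survives: `f_duplicate` is dropped as redundant and `f_noise` as irrelevant.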

To assess the performance and robustness of the models comprehensively, we used a stratified 10-fold cross-validation methodology.6 This technique divides the data set into subsets, maintaining the distribution of the original data to minimize bias and generate a more reliable assessment of the predictive capabilities of the models.
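The splitting logic can be sketched as follows. This is a minimal, standard-library illustration of stratification under stated assumptions, not the code used in the study; the class names, class counts and the `stratified_k_fold` helper are hypothetical.

```python
import random
from collections import defaultdict

def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds mirror the class mix."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, test_idx

# Toy labels: 70 "mild" and 30 "severe" samples (hypothetical classes and counts).
labels = ["mild"] * 70 + ["severe"] * 30
splits = list(stratified_k_fold(labels, k=10))
for train_idx, test_idx in splits:
    n_severe = sum(labels[j] == "severe" for j in test_idx)
    print(len(test_idx), n_severe)
```

Each sample appears in exactly one test fold, and every test fold keeps the 70:30 class ratio of the full toy data set. When class counts are not divisible by `k`, this simple round-robin gives the earlier folds one extra sample per class.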

Among the various ML techniques, support vector machines (SVMs) demonstrated proficiency in handling the intricacies of the data set.7 SVMs excel in categorizing the data into distinct groups, making them well-suited for discerning different levels of severity and types of dry eyes.

We developed and fine-tuned two distinct SVM models, the dry eye severity model (SM) and the dry eye type model (TM). We used the p-value as a statistical measure to validate our hypothesis against the observed data and ensure that our proposed models achieve sufficient statistical power. The SM quantifies the severity of DED, offering insights into its progression and intensity. The TM classifies dry eye types, aiding clinicians in making more precise treatment decisions. Importantly, both models use the same set of features for predictions, ensuring consistency and comparability between severity and type assessments. This uniformity in feature selection enhances the applicability of the models and simplifies their integration into clinical practice.

This study was performed in accordance with the Helsinki Declaration of 1964 and its later amendments. Written informed consent was received from the patients to participate in this study and for the publication of this article. This article does not contain any identifying information about the participants.


The SM demonstrated satisfactory performance in predicting the severity of dry eyes in various cases, with an area under the receiver operating characteristic (AUC-ROC) curve of 0.79 (p-value = 1.41e−05) (Figure 1) and an area under the precision-recall (AUC-PR) curve of 0.61 (p-value = 1.34e−03) (Figure 2). These metrics offer a comprehensive evaluation of the capability of the models to differentiate between different severity levels and effectively identify true-positive cases while managing false positives.
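For readers less familiar with these two metrics, the sketch below computes them from scratch on toy data. It is an illustrative sketch only: the labels and scores are invented, the helper names are hypothetical, AUC-PR is approximated by average precision (which assumes no tied scores), and micro-averaging is shown as pooling one-vs-rest labels and scores across classes.

```python
def roc_auc(y_true, scores):
    """AUC-ROC: probability that a positive outranks a negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos

def micro_average(metric, y_true, score_rows, classes):
    """Pool one-vs-rest labels and scores over all classes, then apply metric."""
    flat_y, flat_s = [], []
    for label, row in zip(y_true, score_rows):
        for c, s in zip(classes, row):
            flat_y.append(1 if label == c else 0)
            flat_s.append(s)
    return metric(flat_y, flat_s)

# Binary toy example (invented values).
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc_roc = roc_auc(y_true, scores)           # 7 of 9 positive-negative pairs ranked correctly
auc_pr = average_precision(y_true, scores)  # (1/1 + 2/3 + 3/4) / 3

# Multi-class toy example: one score per class for each sample (invented values).
classes = ["type_a", "type_b"]
micro_roc = micro_average(roc_auc, ["type_a", "type_b"], [[0.8, 0.2], [0.3, 0.7]], classes)
print(round(auc_roc, 3), round(auc_pr, 3), micro_roc)
```

Because AUC-PR depends on how common the positive class is, it is usually read alongside AUC-ROC rather than in place of it, particularly when classes are imbalanced.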

Figure 1: The area under the receiver operating characteristic curve per class of the dry eye severity model

AUC-ROC = area under the receiver operating characteristic curve.

Figure 2: The precision-recall micro-averaged curve per class of the dry eye severity model

On the other hand, the TM showed reliable predictive capabilities in distinguishing various dry eye types, as indicated by an AUC-ROC value of 0.91 (p-value = 1.28e−04) (Figure 3) and an AUC-PR value of 0.83 (p-value = 1.37e−04) (Figure 4). These values suggest the effectiveness of the model in accurately identifying different dry eye classifications. The notable metrics highlight the strength and capacity of the models to capture true-positive cases while minimizing false positives.

Figure 3: The area under the receiver operating characteristic curve per class of the dry eye type model

AUC-ROC = area under the receiver operating characteristic curve.

Figure 4: The precision-recall micro-averaged curve per class of the dry eye type model

The robustness and performance of both the SM and TM were confirmed through a thorough comparison with nine baseline ML methods. These baseline methods included various traditional algorithms commonly used in classification tasks. The results of the comparative analysis emphasized the consistent performance of the SM and TM. Across both AUC-ROC and AUC-PR metrics, both models consistently outperformed all baseline methods, indicating their effectiveness in terms of predictive accuracy, sensitivity and specificity.


The AUC-ROC and AUC-PR values achieved by the SM and TM, coupled with their consistent outperformance over baseline methods, emphasize their usefulness in accurately diagnosing the severity and type of dry eyes. These findings suggest the potential of ML techniques to contribute to ophthalmology and refine diagnostic procedures.

In the realm of the diagnosis of dry eyes, the ML models introduced in this study serve as a validated prediction framework. Their ability to predict both severity and type of dry eye reflects their robustness and potential impact. By providing clinicians with a tool for reliably anticipating dry eye conditions, these models have the potential to improve health outcomes. Their predictive capabilities may play a role in initiating timely interventions, potentially preventing the progression of DED.

Beyond diagnostic accuracy, these models could proactively contribute to patient care by offering early alerts. This could enable healthcare professionals to implement preventive measures or tailor treatment strategies promptly, potentially mitigating the discomfort and impairment associated with dry eyes and enhancing the quality of life of patients.

While already demonstrating predictive capabilities, the trajectory of these ML models is set for further refinement and augmentation. The integration of future data, encompassing a broader array of clinical scenarios and patient profiles, has the potential to enhance these models further. As these models continue to learn from diverse data sets, their precision and reliability are expected to improve, allowing them to adapt to the nuanced complexities inherent in DED.

To implement the findings of this study in clinical practice, we have taken a proactive step by seamlessly integrating the SM and TM with the widely adopted CSI Dry Eye Software (CSI Dry Eye Inc., Calgary, Alberta, Canada). This strategic integration enables a streamlined incorporation of our ML models into the existing workflows of clinicians globally who rely on the CSI Dry Eye Software. By leveraging this established platform, we aim to facilitate the accessibility and adoption of our predictive models, ensuring that clinicians worldwide can readily incorporate these models into their routine diagnostic practices. This collaborative approach not only enhances the practical applicability of our models but also fosters a more efficient and widespread integration of advanced diagnostic capabilities into real-world clinical settings. As a result, clinicians using the CSI Dry Eye Software can now benefit from the enhanced predictive accuracy and tailored insights provided by our SM and TM, contributing to more precise and personalized management of DED for patients worldwide.


By using ML and incorporating real-world clinical data, the models presented in this study contribute to the field of dry eye diagnosis and highlight the potential of ML to support medical decision-making processes. The validation of these ML models as the initial framework for predicting the severity and type of dry eyes marks a significant advancement. With an ongoing focus on improvement through the integration of future data, these models have the potential to bring changes to the landscape of dry eye diagnosis and treatment.

Article Information:

Tamer N Jarada is the Data Science Lead at CSI Dry Eye Inc. Karl Stonecipher does research, speaks and consults for CSI Dry Eye. Ahmed R Al-Ghoul is the Founder and CEO of CSI Dry Eye Inc. Olivia Perez has no financial or non-financial relationships or activities to declare in relation to this article.

Compliance With Ethics

This article is a retrospective study and is not subject to Health Research Ethics Board of Alberta (HREBA) approval. It was conducted in accordance with the principles outlined in the Helsinki Declaration of 1964 and its later amendments. Written informed consent to participate in the study and for the publication of this article was obtained from all study participants, or their parent/guardian or next of kin if the participant was deceased or unable to provide consent, and the article does not contain any identifying information about the participants.

Review Process

Double-blind peer review.


The named authors meet the International Committee of Medical Journal Editors (ICMJE) criteria for authorship of this manuscript, take responsibility for the integrity of the work as a whole, and have given final approval for the version to be published.


Karl Stonecipher, 1002 N. Church Street, Suite 101, Greensboro, NC


No funding was received in the publication of this article.


This article is freely accessible at © Touch Medical Media 2024.

Data Availability

The data sets generated and analyzed during the current study are available from the corresponding author on reasonable request.




1. Wolffsohn JS, Arita R, Chalmers R, et al. TFOS DEWS II diagnostic methodology report. Ocul Surf. 2017;15:539–74. DOI: 10.1016/j.jtos.2017.05.001.

2. Storås AM, Strümke I, Riegler MA, et al. Artificial intelligence in dry eye disease. Ocul Surf. 2022;23:74–86. DOI: 10.1016/j.jtos.2021.11.004.

3. Ting DSW, Cheung CY-L, Lim G, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA. 2017;318:2211–23. DOI: 10.1001/jama.2017.18152.

4. Burlina PM, Joshi N, Pekala M, et al. Automated grading of age-related macular degeneration from color fundus images using deep convolutional neural networks. JAMA Ophthalmol. 2017;135:1170–6. DOI: 10.1001/jamaophthalmol.2017.3782.

5. Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res. 2003;3:1157–82.

6. Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. International Joint Conference on Artificial Intelligence. 1995;14:1137–45.

7. Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20:273–97. DOI: 10.1007/BF00994018.
