1,721,021 research outputs found
Untargeted Lipidomic Biomarkers for Liver Cancer Diagnosis: A Tree-Based Machine Learning Model Enhanced by Explainable Artificial Intelligence
Background and Objectives: Liver cancer ranks among the leading causes of cancer-related mortality, necessitating the development of novel diagnostic methods. Deregulated lipid metabolism, a hallmark of hepatocarcinogenesis, offers compelling prospects for biomarker identification. This study aims to employ explainable artificial intelligence (XAI) to identify lipidomic biomarkers for liver cancer and to develop a robust predictive model for early diagnosis. Materials and Methods: This study included 219 patients diagnosed with liver cancer and 219 healthy controls. Serum samples underwent untargeted lipidomic analysis with LC-QTOF-MS. Lipidomic data underwent univariate and multivariate analyses, including fold change (FC), t-tests, PLS-DA, and Elastic Network feature selection, to identify significant biomarker candidate lipids. Machine learning models (AdaBoost, Random Forest, Gradient Boosting) were developed and evaluated utilizing these biomarkers to differentiate liver cancer. The AUC metric was employed to identify the optimal predictive model, whereas SHAP was utilized to achieve interpretability of the model's predictive decisions. Results: Notable alterations in lipid profiles were observed: decreased sphingomyelins (SM d39:2, SM d41:2) and increased fatty acids (FA 14:1, FA 22:2) and phosphatidylcholines (PC 34:1, PC 32:1). AdaBoost exhibited a superior classification performance, achieving an AUC of 0.875. SHAP identified PC 40:4 as the most efficacious lipid for model predictions. The SM d41:2 and SM d36:3 lipids were specifically associated with an increased risk of low-onset cancer and elevated levels of the PC 40:4 lipid. Conclusions: This study demonstrates that untargeted lipidomics, in conjunction with explainable artificial intelligence (XAI) and machine learning, may effectively identify biomarkers for the early detection of liver cancer. The results suggest that alterations in lipid metabolism are crucial to the progression of liver cancer and provide valuable insights for incorporating lipidomics into precision oncology
Hybrid Explainable Artificial Intelligence Models for Targeted Metabolomics Analysis of Diabetic Retinopathy
Background: Diabetic retinopathy (DR) is a prevalent microvascular complication of diabetes mellitus, and early detection is crucial for effective management. Metabolomics profiling has emerged as a promising approach for identifying potential biomarkers associated with DR progression. This study aimed to develop a hybrid explainable artificial intelligence (XAI) model for targeted metabolomics analysis of patients with DR, utilizing a focused approach to identify specific metabolites exhibiting varying concentrations among individuals without DR (NDR), those with non-proliferative DR (NPDR), and individuals with proliferative DR (PDR) who have type 2 diabetes mellitus (T2DM). Methods: A total of 317 T2DM patients, including 143 NDR, 123 NPDR, and 51 PDR cases, were included in the study. Serum samples underwent targeted metabolomics analysis using liquid chromatography and mass spectrometry. Several machine learning models, including Support Vector Machines (SVC), Random Forest (RF), Decision Tree (DT), Logistic Regression (LR), and Multilayer Perceptrons (MLP), were implemented as solo models and in a two-stage ensemble hybrid approach. The models were trained and validated using 10-fold cross-validation. SHapley Additive exPlanations (SHAP) were employed to interpret the contributions of each feature to the model predictions. Statistical analyses were conducted using the Shapiro-Wilk test for normality, the Kruskal-Wallis H test for group differences, and the Mann-Whitney U test with Bonferroni correction for post-hoc comparisons. Results: The hybrid SVC + MLP model achieved the highest performance, with an accuracy of 89.58%, a precision of 87.18%, an F1-score of 88.20%, and an F-beta score of 87.55%. SHAP analysis revealed that glucose, glycine, and age were consistently important features across all DR classes, while creatinine and various phosphatidylcholines exhibited higher importance in the PDR class, suggesting their potential as biomarkers for severe DR. Conclusion: The hybrid XAI models, particularly the SVC + MLP ensemble, demonstrated superior performance in predicting DR progression compared to solo models. The application of SHAP facilitates the interpretation of feature importance, providing valuable insights into the metabolic and physiological markers associated with different stages of DR. These findings highlight the potential of hybrid XAI models combined with explainable techniques for early detection, targeted interventions, and personalized treatment strategies in DR management
Precision Enhanced Bioactivity Prediction of Tyrosine Kinase Inhibitors by Integrating Deep Learning and Molecular Fingerprints Towards Cost-Effective and Targeted Cancer Therapy
Background and Objective: Dysregulated tyrosine kinase signaling is a central driver of tumorigenesis, metastasis, and therapeutic resistance. While tyrosine kinase inhibitors (TKIs) have revolutionized targeted cancer treatment, identifying compounds with optimal bioactivity remains a critical bottleneck. This study presents a robust machine learning framework—leveraging deep artificial neural networks (dANNs), convolutional neural networks (CNNs), and structural molecular fingerprints—to accurately predict TKI bioactivity, ultimately accelerating the preclinical phase of drug development. Methods: A curated dataset of 28,314 small molecules from the ChEMBL database targeting 11 tyrosine kinases was analyzed. Using Morgan fingerprints and physicochemical descriptors (e.g., molecular weight, LogP, hydrogen bonding), ten supervised models, including dANN, SVM, CatBoost, and CNN, were trained and optimized through a randomized hyperparameter search. Model performance was evaluated using F1-score, ROC–AUC, precision–recall curves, and log loss. Results: SVM achieved the highest F1-score (87.9%) and accuracy (85.1%), while dANNs yielded the lowest log loss (0.25096), indicating superior probabilistic reliability. CatBoost excelled in ROC–AUC and precision–recall metrics. The integration of Morgan fingerprints significantly improved bioactivity prediction across all models by enhancing structural feature recognition. Conclusions: This work highlights the transformative role of machine learning—particularly dANNs and SVM—in rational drug discovery. By enabling accurate bioactivity prediction, our model pipeline can effectively reduce experimental burden, optimize compound selection, and support personalized cancer treatment design. The proposed framework advances kinase inhibitor screening pipelines and provides a scalable foundation for translational applications in precision oncology. By enabling early identification of bioactive compounds with favorable pharmacological profiles, the results of this study may support more efficient candidate selection for clinical drug development, particularly in regards to cancer therapy and kinase-associated disorders
Proposed Comprehensive Methodology Integrated with Explainable Artificial Intelligence for Prediction of Possible Biomarkers in Metabolomics Panel of Plasma Samples for Breast Cancer Detection
Aim: Breast cancer (BC) is the most common type of cancer in women, accounting for more than 30% of new female cancers each year. Although various treatments are available for BC, most cancer-related deaths are due to incurable metastases. Therefore, the early diagnosis and treatment of BC are crucial before metastasis. Mammography and ultrasonography are primarily used in the clinic for the initial identification and staging of BC; these methods are useful for general screening but have limitations in terms of sensitivity and specificity. Omics-based biomarkers, like metabolomics, can make early diagnosis much more accurate, make tracking the disease’s progression more accurate, and help make personalized treatment plans that are tailored to each tumor’s specific molecular profile. Metabolomics technology is a feasible and comprehensive method for early disease detection and biomarker identification at the molecular level. This research aimed to establish an interpretable predictive artificial intelligence (AI) model using plasma-based metabolomics panel data to identify potential biomarkers that distinguish BC individuals from healthy controls. Methods: A cohort of 138 BC patients and 76 healthy controls were studied. Plasma metabolites were examined using LC-TOFMS and GC-TOFMS techniques. Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and Random Forest (RF) were evaluated using performance metrics such as Receiver Operating Characteristic-Area Under the Curve (ROC AUC), accuracy, sensitivity, specificity, and F1 score. ROC and Precision-Recall (PR) curves were generated for comparative analysis. The SHapley Additive Descriptions (SHAP) analysis evaluated the optimal prediction model for interpretability. Results: The RF algorithm showed improved accuracy (0.963 ± 0.043) and sensitivity (0.977 ± 0.051); however, LightGBM achieved the highest ROC AUC (0.983 ± 0.028). RF also achieved the best Precision-Recall Area under the Curve (PR AUC) at 0.989. SHAP search found glycerophosphocholine and pentosidine as the most significant discriminatory metabolites. Uracil, glutamine, and butyrylcarnitine were also among the significant metabolites. Conclusions: Metabolomics biomarkers and an explainable AI (XAI)-based prediction model showed significant diagnostic accuracy and sensitivity in the detection of BC. The proposed XAI system using interpretable metabolite data can serve as a clinical decision support tool to improve early diagnosis processes
Explainable Boosting Machines Identify Key Metabolomic Biomarkers in Rheumatoid Arthritis
Background and Objectives: Rheumatoid arthritis (RA) is a chronic autoimmune disease characterised by joint inflammation and pain. Metabolomics approaches, which are high-throughput profiling of small molecule metabolites in plasma or serum in RA patients, have so far provided biomarker discovery in the literature for clinical subgroups, risk factors, and predictors of treatment response using classical statistical approaches or machine learning models. Despite these recent developments, an explainable artificial intelligence (XAI)-based methodology has not been used to identify RA metabolomic biomarkers and distinguish patients with RA. This study constructed a XAI-based EBM model using global plasma metabolomics profiling to identify metabolites predictive of RA patients and to develop a classification model that can distinguish RA patients from healthy controls. Materials and Methods: Global plasma metabolomics data were analysed from RA patients (49 samples) and healthy individuals (10 samples). SMOTE technique was used for class imbalance in data preprocessing. EBM, LightGBM, and AdaBoost algorithms were applied to generate a discriminatory model between RA and controls. Comprehensive performance metrics were calculated, and the interpretability of the optimal model was assessed using global and local feature descriptions. Results: A total of 59 samples were analysed, 49 from RA patients, and 10 from healthy subjects. The EBM generated better results than LightGBM and AdaBoost by attaining an AUC of 0.901 (95% CI: 0.847–0.955) with 87.8% sensitivity which helps prevent false negative early RA diagnosis. The primary biomarkers EBM-based XAI identified were N-acetyleucine, pyruvic acid, and glycerol-3-phosphate. EBM global explanation analysis indicated that elevated pyruvic acid levels were significantly correlated with RA, whereas N-acetyleucine exhibited a nonlinear relationship, implying possible protective effects at specific concentrations. Conclusions: This study underscores the promise of XAI and evidence-based medicine methodology in developing biomarkers for RA through metabolomics. The discovered metabolites offer significant insights into RA pathophysiology and may function as diagnostic biomarkers or therapeutic targets. Incorporating EBM methodologies integrated with XAI improves model transparency and increases the therapeutic applicability of predictive models for RA diagnosis/management. Furthermore, the transparent structure of the EBM model empowers clinicians to understand and verify the reasoning behind each prediction, thereby fostering trust in AI-assisted decision-making and facilitating the integration of metabolomic insights into routine clinical practice
Platelet Metabolites as Candidate Biomarkers in Sepsis Diagnosis and Management Using the Proposed Explainable Artificial Intelligence Approach
Background: Sepsis is characterized by an atypical immune response to infection and is a dangerous health problem leading to significant mortality. Current diagnostic methods exhibit insufficient sensitivity and specificity and require the discovery of precise biomarkers for the early diagnosis and treatment of sepsis. Platelets, known for their hemostatic abilities, also play an important role in immunological responses. This study aims to develop a model integrating machine learning and explainable artificial intelligence (XAI) to identify novel platelet metabolomics markers of sepsis. Methods: A total of 39 participants, 25 diagnosed with sepsis and 14 control subjects, were included in the study. The profiles of platelet metabolites were analyzed using quantitative 1H-nuclear magnetic resonance (NMR) technology. Data were processed using the synthetic minority oversampling method (SMOTE)-Tomek to address the issue of class imbalance. In addition, missing data were filled using a technique based on random forests. Three machine learning models, namely extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and kernel tree boosting (KTBoost), were used for sepsis prediction. The models were validated using cross-validation. Clinical annotations of the optimal sepsis prediction model were analyzed using SHapley Additive exPlanations (SHAP), an XAI technique. Results: The results showed that the KTBoost model (0.900 accuracy and 0.943 AUC) achieved better performance than the other models in sepsis diagnosis. SHAP results revealed that metabolites such as carnitine, glutamate, and myo-inositol are important biomarkers in sepsis prediction and intuitively explained the prediction decisions of the model. Conclusion: Platelet metabolites identified by the KTBoost model and XAI have significant potential for the early diagnosis and monitoring of sepsis and improving patient outcomes
Metabolomics Biomarker Discovery to Optimize Hepatocellular Carcinoma Diagnosis: Methodology Integrating AutoML and Explainable Artificial Intelligence
Background: This study aims to assess the efficacy of combining automated machine learning (AutoML) and explainable artificial intelligence (XAI) in identifying metabolomic biomarkers that can differentiate between hepatocellular carcinoma (HCC) and liver cirrhosis in patients with hepatitis C virus (HCV) infection. Methods: We investigated publicly accessible data encompassing HCC patients and cirrhotic controls. The TPOT tool, which is an AutoML tool, was used to optimize the preparation of features and data, as well as to select the most suitable machine learning model. The TreeSHAP approach, which is a type of XAI, was used to interpret the model by assessing each metabolite's individual contribution to the categorization process. Results: TPOT had superior performance in distinguishing between HCC and cirrhosis compared to other AutoML approaches AutoSKlearn and H2O AutoML, in addition to traditional machine learning models such as random forest, support vector machine, and k-nearest neighbor. The TPOT technique attained an AUC value of 0.81, showcasing superior accuracy, sensitivity, and specificity in comparison to the other models. Key metabolites, including L-valine, glycine, and DL-isoleucine, were identified as essential by TPOT and subsequently verified by TreeSHAP analysis. TreeSHAP provided a comprehensive explanation of the contribution of these metabolites to the model's predictions, thereby increasing the interpretability and dependability of the results. This thorough assessment highlights the strength and reliability of the AutoML framework in the development of clinical biomarkers. Conclusions: This study shows that AutoML and XAI can be used together to create metabolomic biomarkers that are specific to HCC. The exceptional performance of TPOT in comparison to traditional models highlights its capacity to identify biomarkers. Furthermore, TreeSHAP boosted model transparency by highlighting the relevance of certain metabolites. This comprehensive method has the potential to enhance the identification of biomarkers and generate precise, easily understandable, AI-driven solutions for diagnosing HCC
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation
counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings
are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that
only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into
account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer that rather than express opinions propels discourses; an interviewer that is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent form the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship
- …
