1,720,973 research outputs found
Genetic polymorphisms in inflammatory genes and pancreatic cancer risk: a two-phase study on more than 14 000 individuals
There is overwhelming evidence that inflammation plays a key role in the pathogenesis of cancer and its progression. Inflammation is regulated through a complex network of genes, and polymorphic variants in these genes have been found to be associated with the risk of various human cancers, alone or in combination with environmental variables. Despite this, little is known about the genetic variability of genes that regulate inflammation and the risk of pancreatic ductal adenocarcinoma (PDAC). We performed a two-phase association study considering the genetic variability of 76 genes that are key players in the inflammatory response. We analysed tagging single nucleotide polymorphisms (SNPs) and regulatory SNPs in 7207 PDAC cases and 7063 controls and observed several associations with PDAC risk. The most significant association was between carriers of the A allele of the CCL4-rs1719217 polymorphism, which has also been reported to be associated with the expression level of the CCL4 gene, and an increased risk of developing PDAC (odds ratio = 1.12, 95% confidence interval = 1.06-1.18, P = 3.34 × 10⁻⁵). This association remained significant after correction for multiple testing, highlighting the importance of using potentially functional SNPs in order to discover more genetic variants associated with PDAC risk
Explainable machine learning identifies a polygenic risk score as a key predictor of pancreatic cancer risk in the UK Biobank
Background: Predicting the risk of developing pancreatic ductal adenocarcinoma (PDAC) is of paramount importance, given its high mortality rate. Current PDAC risk prediction models rely on a limited number of variables, do not include genetics, and have a modest accuracy. Aim: This study aimed to develop an interpretable PDAC risk prediction model, based on machine learning (ML). Methods: Five ML models (Adaptive Boosting, eXtreme Gradient Boosting, CatBoost, Deep Forest and Random Forest) built on 56 exposome variables and a polygenic risk score (PRS) were tested in 654 PDAC cases and 1,308 controls of the UK Biobank. Additionally, SHapley Additive exPlanation (SHAP) and Global model Interpretation via the Recursive Partitioning (Girp) were employed to explain the models. Results: All models provided similar performance, but based on recall the best was CatBoost (77.10 %). SHAP highlighted age and the PRS as primary contributors across all models. Girp developed rules to discern cases from controls, identifying age, PRS, and pancreatitis in most of the rules. Conclusion: The predictive models tested have exhibited good performance, indicating their potential application in the clinical field in the near future, with the PRS playing a key role in identifying high-risk individuals as demonstrated by the explainers
Regression and machine learning approaches identify potential risk factors for glioblastoma multiforme
Glioblastoma multiforme is a lethal disease, with a 5-year survival rate of <10%. The identification of risk factors for glioblastoma multiforme is essential for the understanding of this disease and could facilitate more effective stratification of high-risk individuals. However, our current knowledge of glioblastoma multiforme risk factors is limited. Given the complexity and heterogeneity of the disease, traditional epidemiological approaches may be insufficient to study risk factors for glioblastoma multiforme. The combination of traditional approaches with machine learning models could prove effective in identifying relevant factors for glioblastoma multiforme risk. In this study, we developed glioblastoma multiforme risk models in the UK Biobank cohort using 576 glioblastoma multiforme cases and 302 602 controls. First, 369 exposures were tested with traditional regression models in a case–control study and significant associations were identified. Subsequently, significant features were filtered based on their completion rate and correlation. The selected exposures were then used to develop two machine learning models: a support vector machine and a multi-layer perceptron. To address the imbalance within the subpopulation, two controls per case with full data were selected, resulting in 442 glioblastoma multiforme cases and 884 controls being analysed with the machine learning models. Relevant factors for glioblastoma multiforme risk were identified by explaining the results of the two models with Shapley Additive explanations. Traditional regression methods identified 38 significant associations between environmental exposures and glioblastoma multiforme risk under the Bonferroni threshold (P < 1.35 × 10⁻⁴). Subsequent filtration resulted in the selection of 12 exposures, which were then analysed with age, sex and a polygenic score using the two machine learning models. The support vector machine and the multi-layer perceptron demonstrated good sensitivity (0.91 and 0.82, respectively). In addition to age and genetics, Shapley Additive explanations demonstrated significant contributions of insulin-like growth factor 1 blood levels and right-hand grip strength to the predictions made by the models, with the latter effect potentially being confounded by endogenous testosterone levels. The integration of machine learning with traditional models has the potential to enhance the identification of risk factors for glioblastoma multiforme
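The Bonferroni threshold quoted in the abstract above can be reproduced with a short calculation. This is a minimal sketch that assumes a family-wise significance level of 0.05, which the abstract does not state; the function name is illustrative only.

```python
# Assumption: family-wise alpha of 0.05 (not stated in the abstract).
ALPHA = 0.05
# Number of exposures tested with regression models, as given in the abstract.
N_EXPOSURES = 369


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_EXPOSURES)
print(f"{threshold:.3e}")
```

Under that assumption, the result is about 1.355 × 10⁻⁴, which is consistent with the reported threshold of 1.35 × 10⁻⁴ if the figure was truncated rather than rounded.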
Artificial intelligence to predict cancer risk, are we there yet? A comprehensive review across cancer types
Cancer remains the second leading cause of death worldwide, representing a substantial challenge to global health. Although traditional risk prediction models have played a crucial role in the epidemiology of several cancer types, they have limitations, especially in their ability to process complex and multidimensional data. In contrast, artificial intelligence (AI) approaches represent a promising solution to overcome this limitation. AI techniques have the potential to identify complex patterns and relationships in data that traditional methods might overlook, making them especially useful for handling the large and heterogeneous datasets analysed in cancer research. This review first examines the current state of the art of AI techniques, highlighting their differences and suitability for various data types. It then offers a comprehensive analysis of the literature, focusing on the application of AI approaches in nineteen cancer types (bladder cancer, breast cancer, cervical cancer, colorectal cancer, endometrial cancer, esophageal cancer, gastric cancer, gynaecological cancers, head and neck cancer, haematological cancers, kidney cancer, liver cancer, lung cancer, melanoma, ovarian cancer, pancreatic cancer, prostate cancer, thyroid cancer and overall cancer), evaluating the models, metrics, and exposure variables used. Finally, the review discusses the application of AI in clinical practice, along with an assessment of its potential limitations and future directions
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
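The two counting schemes compared in the abstract above can be illustrated with a small sketch. The data, the function name `cocitation_counts`, and the `max_authors` parameter are all hypothetical, and the sketch makes one simplifying assumption that the abstract does not address: co-authors of the same cited work are also counted as co-cited.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often each pair of authors is cited together.

    citing_papers: one reference list per citing paper; each reference is
        an ordered list of the cited work's author names.
    max_authors: number of leading authors credited per cited work
        (1 = first-author counting, 5 = first-five-authors counting).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers, each citing two works.
papers = [
    [["Smith", "Lee"], ["Garcia", "Chen"]],
    [["Smith", "Lee"], ["Chen"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

In this toy data, first-author counting never credits Lee, so no pair involving Lee appears in `first_author`, whereas `first_five` records the Lee and Chen pair in both citing papers.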
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
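To make the comparison between the Pearson correlation and the cosine concrete, the sketch below computes both for two hypothetical cocitation profiles. It illustrates one well-known difference between the measures: appending positions where both profiles are zero leaves the cosine unchanged but shifts the Pearson correlation. The abstract does not say whether this is the authors' argument, so treat it as an assumption; all names and numbers are illustrative.

```python
import math


def cosine(x, y):
    """Cosine similarity between two equal-length numeric profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if norm == 0:
        raise ValueError("cosine is undefined for an all-zero profile")
    return dot / norm


def pearson(x, y):
    """Pearson correlation, computed as the cosine of mean-centred profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation counts of two authors with two other authors.
a, b = [3, 1], [1, 3]
# Same profiles after adding two authors with whom neither is ever cocited.
a_padded, b_padded = a + [0, 0], b + [0, 0]

print(cosine(a, b), cosine(a_padded, b_padded))
print(pearson(a, b), pearson(a_padded, b_padded))
```

In this example the cosine stays at 0.6 in both cases, while the Pearson correlation moves from -1 to about 0.33 once the shared zeros are added.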
Dispelling the Myths Behind First-author Citation Counts
We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods
