University of Szeged
SZTE Doktori Értekezések Repozitórium (SZTE Repository of Dissertations)
Periodontitis: Causal Links and Treatment in the Context of Diabetes, Smoking, and In-Stent Restenosis
Enhancing Missense Variant Classification with AlphaFold2-Generated Mutant Structures
Genetic variants, particularly missense variants, play a significant role in human disease, contributing to the development of both monogenic disorders and cancer. Typically, these mutations can impair protein function by disrupting protein stability. Therefore, given their immense impact on human health, their identification and classification are a priority in clinical diagnostics and personalized medicine.
Recent advancements in next-generation sequencing have significantly reduced sequencing costs. This has accelerated its integration into routine clinical diagnostics, resulting in the identification of thousands of missense variants, many of which have unknown impacts on protein function. These variants, known as variants of uncertain significance (VUS), present a major diagnostic challenge for medical professionals and patients.
To address the increasing number of reported VUS, numerous in-silico methods have been developed over the past decade. However, their clinical application remains limited as they are currently accepted only as supportive evidence. Given their potential to swiftly prioritize and classify thousands of variants, continuous research is conducted to improve their predictive performance.
Structural information has long been considered a valuable resource that could enhance the performance of these predictive models, but its integration has been largely hindered by the limited availability of protein structures. The development of AlphaFold2 has made it possible to access the structures of thousands of proteins, creating new opportunities to incorporate structural information into variant classification.
In this thesis, we present a comprehensive overview of the performance of some of the most widely used variant effect predictors and explore the efficiency of structural features derived from mutated protein structures in improving missense variant classification.
The first part of the study evaluated the performance of ten variant effect predictors (PROVEAN, META-SNP, SIFT, PolyPhen-2 HumDiv, PolyPhen-2 HumVar, SNPs&GO, PredictSNP, PhD-SNP, PANTHER-PSEP, and PMut) using general and gene-specific datasets. We have demonstrated that the performance of the benchmarked variant effect predictors varies considerably across datasets, with some exhibiting gene-specific behavior. When analyzed within the framework of guidelines for missense variant classification, this performance influences the outcome of the computational analysis. Based on a set of criteria, we have determined the best-performing variant effect predictors for BRCA1 and BRCA2, which we recommend for variant prioritization and classification in these two genes. Additionally, we have highlighted the impact of type 1 circularity on the selection of the best-performing variant effect predictors, noting that failure to account for it can alter their ranking.
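Type 1 circularity is generally understood as overlap between the variants used to train a predictor and the variants used to benchmark it. The following is a minimal Python sketch of one way to guard against it, assuming each variant can be identified by a (gene, protein change) pair. The function name `remove_training_overlap` and the placeholder variants are hypothetical and are not taken from the thesis.

```python
def remove_training_overlap(benchmark, training):
    """Return benchmark variants that do not appear in the training set.

    Both arguments are iterables of hashable variant identifiers,
    here assumed to be (gene, protein_change) tuples.
    """
    training_keys = set(training)  # set gives O(1) membership checks
    return [variant for variant in benchmark if variant not in training_keys]


# Placeholder identifiers, for illustration only.
training_set = [("GENE1", "p.A10V"), ("GENE2", "p.G5R")]
benchmark_set = [("GENE1", "p.A10V"), ("GENE1", "p.L20P"), ("GENE3", "p.K7E")]

clean_benchmark = remove_training_overlap(benchmark_set, training_set)
print(clean_benchmark)
```

Ranking predictors on the filtered benchmark rather than the raw one is one way to see whether the ordering changes once the overlap is removed.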
In the second part of this study, we developed a large-scale protein structure prediction pipeline, along with a high-performance computing-optimized job submission strategy, to predict the structures of 77,713 proteins, including 65,612 variant models and 12,101 wild-type structures. With 70.1% of the generated structures predicted with high confidence, we present the largest collection of mutated protein structures to date. These structures may serve as a foundation for future studies in variant classification, such as feature development and engineering, as well as in structural bioinformatics, including studies on protein-protein interactions.
Lastly, in the third part, we shifted our focus to exploring the potential of the generated mutated protein structures to enhance missense variant classification. By capturing differences between the wild-type and mutated (pathogenic and benign) structures, five distinct features were developed: Alpha carbon distance (Cα-Dist), Alpha carbon delta pLDDT score (Cα-ΔpLDDT), Delta SASA Normalized (ΔSASA Normalized), Miyazawa-Jernigan Potential of the mutant (MJ-Mutant), and dRMS Local. These features were used to train three machine learning models: SIESTA, SIESTA-Str, and SIESTA-Seq. We have shown that while structure-derived features alone did not outperform sequence-based information, they have the potential to play a complementary role, as evidenced by the improved performance of SIESTA, which integrates both structural and sequence-based features.
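The abstract does not give exact definitions for these features. As a rough illustration only, the sketch below shows one plausible reading of the two alpha carbon features: the Euclidean distance between the wild-type and mutant Cα atoms at the mutated position, and the change in pLDDT at that residue. It assumes the two structures are already superimposed and that coordinates (in Å) and pLDDT values have been extracted beforehand; the function names and example values are hypothetical.

```python
import math


def ca_distance(wt_ca_xyz, mut_ca_xyz):
    """Euclidean distance between two C-alpha coordinates (x, y, z)."""
    return math.dist(wt_ca_xyz, mut_ca_xyz)


def ca_delta_plddt(wt_plddt, mut_plddt):
    """Change in per-residue pLDDT, computed here as mutant minus wild type."""
    return mut_plddt - wt_plddt


# Made-up values for a single mutated residue, for illustration only.
distance = ca_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
delta = ca_delta_plddt(wt_plddt=90.0, mut_plddt=72.5)
print(distance, delta)
```

The sign convention for the pLDDT difference is an assumption; the thesis may define it the other way round.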
These findings, along with the extended collection of mutated protein structures, lay the groundwork for future research in variant classification, with the potential to improve the classification and reclassification of variants of uncertain significance.
Dispute Settlement Systems of International Investment Law: Analyses of the Systems and Reform Proposals
Investigation and Practical Application of the Size Distribution Spectrum of Artificially Generated and Atmospheric Nanoparticles
The origin and composition of the "Forgotten people": genetic analysis of the Sarmatian-period population of the Carpathian Basin
The western part of the Carpathian Basin was integrated into the Roman Empire at the beginning of the 1st century. The territories east of the Danube, however, were occupied by Sarmatian nomads, who, according to historical and archaeological sources, migrated from the Pontic-Caspian Steppes. They remained key players in the political landscape of the region until the arrival of the Huns in the 4th century, after which they vanished from the historical records. The large number of archaeological findings left behind by the Sarmatians indicates a considerable population size during their occupation, with a possible enduring impact on the population history of the region.
The genetic composition of the Sarmatian groups living on the Russian Steppe has been relatively well documented, while the Sarmatians living on the Great Hungarian Plain are understudied, and their genetic origin and relations to their neighbours are still to be investigated. To fill this gap, we generated whole genome shotgun sequences from 118 individuals dated to the Sarmatian Period of the Carpathian Basin. We also sequenced 17 Sarmatian individuals excavated outside the Carpathian Basin in Romania to assess the genetic changes the Sarmatians may have undergone during their alleged westward migration. Additionally, we analysed 21 new genomes from the 4th-5th century of the Carpathian Basin to investigate the likely population changes caused by the arrival of the Huns and the possible survival of the Sarmatian populations after this period.
Our data show that the Sarmatians of the Carpathian Basin display clear genetic connections to the Sarmatians of the Volga-Ural region in both classical population genetic and IBD analyses. However, their steppe-related genetic affinity is strongly depleted compared to the Sarmatians excavated in Romania. Furthermore, the individuals dated to the 4th-5th century do not form a distinct population, but rather fit into the genetic landscape defined by the previous periods. Finally, the individuals from the Sarmatian Period (the Sarmatians from Romania included) show strong genetic connectedness based on their IBD sharing, indicating a lively and mobile population which projects its genealogies well into the subsequent Avar and even Hungarian Conquest Periods.
Absolute beat-to-beat variability and instability parameters of ECG intervals predict ischemia-induced ventricular fibrillation
Background: ECG interval measurement is possible during arrhythmias. Beat-to-beat variability and instability (BVI) of ECG intervals measured irrespective of rhythm (absolute BVI) predict drug-induced torsades de pointes (TdP) more accurately than the same variables derived exclusively during sinus rhythm (sinus BVI) in rabbits. We have tested whether this approach predicts another stochastic arrhythmia event, ventricular fibrillation (VF), in a different pathophysiological setting.
Methods and Results: Langendorff perfused rat hearts were subjected to regional ischemia for 15 min. Absolute BVI parameters were derived from ECG intervals measured in 40 consecutive ventricular complexes (irrespective of the rhythm) immediately preceding VF onset and compared with values in time-matched ECGs in hearts that did not express VF. Increased frequency of non-sinus beats and ‘R on T’ arrhythmic beats, shortened mean RR and electrical diastolic intervals, and increased BVI of cycle length and repolarization were associated with VF occurrence. Absolute BVI parameters that quantify variability of repolarization (e.g. ‘short-term variability’ of QT interval) had the best predictive power with very high sensitivity and specificity. In contrast, VF was not predicted by any BVI parameter derived exclusively from sinus rhythm.
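The abstract does not state how 'short-term variability' was computed. One definition commonly used in the repolarization literature is the mean absolute difference between consecutive intervals divided by √2. The Python sketch below follows that definition under the assumption that the divisor counts the successive differences; the function name and the interval values are hypothetical and do not come from the study.

```python
import math


def short_term_variability(intervals_ms):
    """Mean absolute successive difference of the intervals, divided by sqrt(2).

    intervals_ms: sequence of consecutive interval durations (e.g. QT) in ms.
    """
    if len(intervals_ms) < 2:
        raise ValueError("at least two intervals are required")
    # Absolute difference between each beat and the one before it.
    diffs = [abs(b - a) for a, b in zip(intervals_ms, intervals_ms[1:])]
    return sum(diffs) / (len(diffs) * math.sqrt(2))


# Made-up QT intervals, for illustration only.
stable = short_term_variability([100.0, 100.0, 100.0])
alternating = short_term_variability([100.0, 102.0, 100.0])
print(stable, alternating)
```

Because the calculation only needs consecutive interval values, it can be applied to any run of ventricular complexes regardless of rhythm, which is the idea behind the absolute BVI approach described above.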
Conclusions: The novel absolute BVI parameters that predicted TdP liability in rabbits also predict VF liability during regional ischemia in rat hearts, indicating a diagnostic and mechanistic congruence. Repolarization inhomogeneity appears to play a pivotal role in ischemic VF induction, since absolute BVI parameters that quantify repolarization variability had outstanding predictive power.