
    Diabetes Prediction Using the Missforest Imputation Technique and LightGBM Classification

    Diabetes mellitus is one of the most common chronic diseases, with a globally increasing prevalence. It is caused by metabolic disorders that affect blood glucose levels and, if not treated early, can lead to serious complications such as stroke, kidney failure, blindness, and even death. This research develops a diabetes risk prediction model based on binary classification using the LightGBM algorithm combined with the Missforest imputation technique to handle missing data. The dataset used is the publicly available Pima Indian dataset from Kaggle. The pre-processing stages include missing-value imputation, outlier handling using Isolation Forest, and an 80:20 data split. Model evaluation shows an accuracy of 91.84% and a ROC AUC of 0.9614. BMI was found to be the most influential factor in the prediction, followed by DiabetesPedigreeFunction and Glucose. Keywords: diabetes mellitus, data mining, classification, LightGBM, missforest

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, when the same number of top-ranked authors is selected and analyzed, this picture represents fewer specialties in the research field being studied than the one produced through traditional first-author co-citation counting. Reasons for these effects are discussed.
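    The two counting schemes compared in this abstract can be illustrated with a minimal sketch. The data layout, the function name `cocitation_counts`, and the toy papers are assumptions made for illustration, not the study's implementation. In particular, this sketch also counts co-authors of the same cited work as co-cited with each other, which is a design choice the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often pairs of authors are cited together.

    citing_papers: one reference list per citing paper; each reference
                   is an ordered list of author names.
    max_authors:   how many leading authors of each cited work to credit
                   (1 = first-author counting, 5 = first five authors).
    """
    counts = Counter()
    for references in citing_papers:
        # Collect credited authors as a set so that each author is
        # counted at most once per citing paper.
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        # Every unordered pair of credited authors is one co-citation.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers, each citing two works.
papers = [
    [["A", "B"], ["C", "D"]],
    [["A", "B"], ["D", "C"]],
]
first_only = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

    With first-author counting, authors B is never seen and C and D are never paired; crediting the first five authors links all four authors, which is the kind of denser picture the abstract describes.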

    Improving Speech Emotion Recognition in Indonesian

    Human-computer interaction is a growing phenomenon, driven by the increasing use of computers in human social activities. Humans interact with one another through emotions in order to understand each other, and emotions are often conveyed through the way people speak. Much research on speech emotion recognition has been conducted. Nevertheless, efforts to improve it continue, especially regarding the corpus, which is one of the factors that keeps recognition accuracy from being optimal, particularly because of imbalanced data. This study aims to improve the performance of emotion recognition across five emotion classes: happiness, anger, sadness, contentment, and neutral. We employed boosting algorithms, used CNN and RNN for comparison, and applied SMOTE to the corpus. In the experiments, accuracy on the test data reached 65% with a 22050 Hz sampling rate, MFCCs, and SMOTE oversampling. Keywords: data imbalance, boosting algorithms, CNN, RNN, SMOTE

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who propels discourse rather than expressing opinions, an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

    A BiLSTM-Based Approach For Speech Emotion Recognition In Conversational Indonesian Audio using SMOTE

    Speech Emotion Recognition (SER) identifies human emotions through voice signal analysis, focusing on pitch, intonation, and tempo. This study determines the optimal sampling rate of 48,000 Hz, following the Nyquist-Shannon theorem, ensuring accurate signal reconstruction. Audio features are extracted using Mel-Frequency Cepstral Coefficients (MFCC) to capture frequency and rhythm changes in temporal signals. To address data imbalance, Synthetic Minority Over-sampling Technique (SMOTE) generates synthetic data for the minority class, enabling more balanced model training. A One-vs-All (OvA) approach is applied in emotion classification, constructing separate models for each emotion to enhance detection. The model is trained using Bidirectional Long Short-Term Memory (BiLSTM), capturing contextual information from both directions, improving understanding of complex speech patterns. To optimize the model, Nadam (Nesterov-accelerated Adaptive Moment Estimation) is used to accelerate convergence and stabilize weight updates. Bagging (Bootstrap Aggregating) techniques are implemented to reduce overfitting and improve prediction accuracy. The results show that this combination of techniques achieves 78% accuracy in classifying voice emotions, contributing significantly to improving emotion detection systems, especially for under-resourced languages.
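    The SMOTE step mentioned in this abstract can be sketched as follows. This is a simplified illustration of the core interpolation idea only, not the study's code: the function name `smote_sample`, the neighbour count `k`, and the toy feature vectors are all assumptions, and a real pipeline would more likely use an existing library implementation.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic minority-class point by interpolation.

    minority: list of feature vectors (lists of floats) from the minority class.
    k:        number of nearest neighbours to choose from (hypothetical default).
    rng:      optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to base.
    others = [p for p in minority if p is not base]
    neighbours = sorted(
        others, key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p))
    )[:k]
    neighbour = rng.choice(neighbours)
    # Place the new point at a random position on the segment base -> neighbour.
    gap = rng.random()
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# Hypothetical 2-D feature vectors standing in for MFCC features.
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0]]
point = smote_sample(minority, k=2, rng=random.Random(42))
```

    Because each synthetic point lies on a line segment between two real minority samples, it stays inside the region those samples already occupy.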

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance. Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
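    For readers unfamiliar with the two measures named in this abstract, the sketch below computes the Pearson correlation and the cosine for a pair of cocitation profiles. The profiles are made-up numbers, and the zero-padding comparison illustrates one property often raised in this methodological debate; the abstract itself does not state which argument the authors make, so this should not be read as their reasoning.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation, computed as the cosine of mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation profiles of two authors with three other authors.
profile_a = [4, 2, 1]
profile_b = [2, 4, 1]

# The same profiles after adding three authors cocited with neither.
zeros = [0, 0, 0]
cos_before = cosine(profile_a, profile_b)
cos_after = cosine(profile_a + zeros, profile_b + zeros)
r_before = pearson(profile_a, profile_b)
r_after = pearson(profile_a + zeros, profile_b + zeros)
```

    In this toy case the cosine is the same before and after padding, while the Pearson correlation rises noticeably, because the added zeros shift the means used for centring.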

    INDONESIAN-SUNDANESE MACHINE TRANSLATION USING RECURRENT NEURAL NETWORKS

    Translation is a process in which one language is converted into another. Previous research performed translation using the Phrase-based Statistical Machine Translation (PSMT) approach. This study builds an Indonesian-to-Sundanese translator. The stages begin with text preprocessing and Word2Vec word embedding, and the approach used is Neural Machine Translation (NMT) with an Encoder-Decoder architecture containing a Recurrent Neural Network (RNN). Testing yielded an optimal value of 99.17% with GRU. The model using Attention obtained 99.94%. Among the optimizers, Adam gave the best result at 99.35%, and the best BLEU score was 92.63% with a brevity penalty of 0.929. The translation machine produces suitable predictions from Indonesian to Sundanese when the input sentence matches the corpus, and less suitable translations when the input sentence differs from the corpus.
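    The brevity penalty reported alongside the BLEU score can be made concrete with a short sketch. It assumes the standard BLEU definition, in which the penalty is 1 when the candidate is at least as long as the reference and exp(1 - r/c) otherwise; the abstract does not say which BLEU implementation was used, and the function name is illustrative.

```python
import math


def brevity_penalty(candidate_len, reference_len):
    """Standard BLEU brevity penalty for a candidate and reference length.

    Returns 1.0 when the candidate is at least as long as the reference,
    and exp(1 - reference_len / candidate_len) when it is shorter.
    """
    if candidate_len == 0:
        return 0.0  # an empty candidate earns no credit
    if candidate_len >= reference_len:
        return 1.0
    return math.exp(1 - reference_len / candidate_len)


# A candidate that is 9 tokens long against a 10-token reference.
bp_short = brevity_penalty(9, 10)
# A candidate longer than the reference is not penalised.
bp_long = brevity_penalty(12, 10)
```

    Under this definition, a brevity penalty of 0.929 would correspond to output roughly 93% as long as the reference text.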

    Dispelling the Myths Behind First-author Citation Counts

    We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how much author rankings resulting from different citation counting methods actually differ from each other, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.

    Author Index

    No full text
    Not provided