Affect-oriented fake news detection using machine learning
Among media platforms, online social media plays an important role in sharing news and information along with user opinion. This quick propagation and accumulation of information forms a data deluge in which it is very hard to trust every piece of information, even though each appears to be very realistic.
Toward exploring fairness in visual transformer based natural and GAN image detection systems
Image forensics research has recently seen many advances towards computational models capable of accurately distinguishing natural images captured by cameras from GAN-generated images. However, it is also important to ensure that these computational models are fair and do not produce biased outcomes that could eventually harm certain societal groups or cause serious security threats. Exploring fairness in image forensic algorithms is an initial step towards mitigating these biases. This study explores bias in visual transformer based image forensic algorithms that classify natural and GAN images, since visual transformers have recently become widely used in image classification tasks, including in image forensics. The study procures bias evaluation corpora to analyze bias in gender, racial, affective, and intersectional domains using a wide set of individual and pairwise bias evaluation measures. Since robustness against image compression is an important factor in forensic tasks, the study also analyzes the impact of image compression on model bias, following a two-phase evaluation in which the experiments are carried out in uncompressed and compressed settings. The study identifies bias in the visual transformer based models distinguishing natural and GAN images, and also observes that image compression affects model bias, predominantly amplifying bias in predictions of the GAN class.
Cross-domain sentiment analysis on social media interactions using senti-lexicon based hybrid features
Analyzing sentiment information in social media interactions is a rapidly growing research area. Several studies in the literature model sentiment using linguistics, generic word counts, and contextual information, including the presence of punctuation, elongated words, emoticons, etc. In this paper, we experiment on the effectiveness of lexicon information, in combination with other information, for analyzing sentiment in social interactions. The objective of this study is to verify experimentally how senti-lexicons can contribute to modeling sentiment information, even in cross-domain sentiment analysis. The paper explores the effectiveness of several feature vectors, including generic Bag of Words (BoW) features, linguistic features (N-Gram and Part-of-Speech (POS)), and lexicon features (number of positive and negative words). Beyond these traditional features, we generate hybrid features by combining the lexicon features with the BoW and linguistic features. We conduct sentiment classification experiments using supervised models: Linear SVC (L-SVC), Multi-Layer Perceptron (MLP), Multinomial Naïve Bayes (MNB), and Decision Tree (DT). The experiments are conducted on three different sentiment document datasets: the Amazon food review dataset, a student opinion tweet dataset, and the Large Movie Review Dataset v1.0. We also verify the efficacy of these features in cross-domain sentiment analysis. Experiments show that hybridizing the BoW, linguistic N-Gram, and POS features with lexicon features improves the accuracy of sentiment classification, even for cross-domain sentiment analysis.
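The hybrid feature idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two word sets are tiny hypothetical lexicons, the tokenizer is a simplification, and the function names (`tokenize`, `build_vocabulary`, `hybrid_features`) are assumptions introduced here. It shows only the BoW-plus-lexicon combination, where two lexicon counts are appended to a bag-of-words count vector.

```python
from collections import Counter

# Hypothetical toy lexicons; a real study would load a published senti-lexicon.
POSITIVE = {"good", "great", "love", "tasty"}
NEGATIVE = {"bad", "awful", "hate", "bland"}

PUNCTUATION = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (raw.strip(PUNCTUATION) for raw in text.lower().split())
    return [tok for tok in tokens if tok]


def build_vocabulary(documents):
    """Sorted list of every distinct token seen in the documents."""
    return sorted({tok for doc in documents for tok in tokenize(doc)})


def hybrid_features(text, vocabulary):
    """BoW counts over the vocabulary, followed by [n_positive, n_negative]."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    bow = [counts[word] for word in vocabulary]
    n_pos = sum(1 for tok in tokens if tok in POSITIVE)
    n_neg = sum(1 for tok in tokens if tok in NEGATIVE)
    return bow + [n_pos, n_neg]


docs = ["Great food, I love it!", "Awful service and bland food."]
vocab = build_vocabulary(docs)
vectors = [hybrid_features(doc, vocab) for doc in docs]
```

Each resulting vector could then be passed to any of the supervised classifiers named in the abstract. Because the two lexicon counts do not depend on the vocabulary, they keep the same meaning when the vocabulary changes between domains, which is one plausible reason they might help in the cross-domain setting.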
Indexing and retrieval of Malayalam news videos based on word image matching
News videos store a huge amount of information and serve as historical archives. The amount of news data is growing rapidly and unpredictably, so indexing news videos is a tedious job. Manual indexing, though effective, is slow and the most expensive option for a massive volume of data. Content Based Indexing and Retrieval (CBIR) is a solution to this problem. The textual modality based on ticker texts is powerful enough to represent a news video, since it highlights all the topics in a news bulletin. Searching and retrieval from Malayalam news videos are challenging due to the lack of effective tools for automatic content based indexing and retrieval that analyze the semantics of news videos in a massive database. The ticker texts are extracted automatically using mathematical morphology and region clustering, and indexing and retrieval based on text or word image matching is implemented. Different methods, namely Dynamic Time Warping (DTW), Exclusive-OR (XOR), and Correlation, are used for word image matching. The Discrete Cosine Transform (DCT) and Normalized Vertical Projection Profile (NVPP) features are found to give better results.
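To make the word image matching step concrete, here is a minimal sketch of Dynamic Time Warping applied to vertical projection profiles of binary word images. Pairing DTW with the projection profile, normalising the profile by image height, and using absolute difference as the local cost are all assumptions made for illustration; the abstract does not specify these details, and the function names are hypothetical.

```python
def vertical_projection_profile(image):
    """Column sums of a binary image (list of rows), divided by image height.

    Normalising by height is an assumption of this sketch.
    """
    height = len(image)
    width = len(image[0])
    return [sum(row[col] for row in image) / height for col in range(width)]


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# Toy binary "word images": word_b is a horizontally stretched word_a.
word_a = [[0, 1, 1, 0],
          [0, 1, 0, 0]]
word_b = [[0, 1, 1, 1, 0],
          [0, 1, 1, 0, 0]]
word_c = [[1, 0, 0, 1],
          [1, 0, 0, 1]]

profile_a = vertical_projection_profile(word_a)
profile_b = vertical_projection_profile(word_b)
profile_c = vertical_projection_profile(word_c)
```

In this toy case `dtw_distance(profile_a, profile_b)` is zero because DTW absorbs the horizontal stretch, while `dtw_distance(profile_a, profile_c)` is positive. Tolerance to width variation is the usual motivation for using DTW over a fixed column-by-column comparison such as XOR.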
Leveraging heterogeneous data for fake news detection
Nowadays, plenty of social media platforms are available for exchanging information rapidly. Such rapid propagation and accumulation of information forms a deluge in which it is hard to trust every piece of information, even though each appears to be very realistic. In this context, characterizing and recognizing misinformation, especially fake news, is an important computational task. News fabrication mostly happens through the textual and visual content of a news article. People spreading fake news intentionally modify news content with partially true information, or use fully manipulated information, newly fabricated stories, etc., which could mislead others. Fake news characterization and detection are computational studies that aim to eliminate the creation and propagation of highly malicious news. Methods for identifying fake news generation and spread draw on textual and visual content features and on temporal and propagation patterns of the network, using traditional and deep neural computation. This chapter discusses methods that leverage heterogeneous data to curb fake news generation and propagation. We present an extensive review of state-of-the-art fake news detection systems across different modalities, emphasizing content-based approaches covering the text and image modalities, and briefly discuss network, temporal, and knowledge base approaches. The study also discusses the available datasets in this area, the open issues, and future directions of research.
Online news popularity prediction before publication: effect of readability, emotion, psycholinguistics features
The development of the World Wide Web, with easy access to massive information sources anywhere and anytime, has led more people to rely on online news media rather than print media. This has driven rapid growth of the online news industry and substantial competitive pressure. In this work, we propose a set of hybrid features for online news popularity prediction before publication. Two categories of features are extracted from news articles: conventional features comprising metadata, temporal, contextual, and embedding vector features, and enhanced features comprising readability, emotion, and psycholinguistics features. Apart from analyzing the effectiveness of conventional and enhanced features, we combine them into a set of hybrid features. We curate an Indian news dataset consisting of news articles from the most rated Indian news websites for the study, and also contribute the dataset for future research. Evaluations are performed over the Indian news dataset (IND) and compared with the performance over the benchmark Mashable dataset using various supervised machine learning models. Our results indicate that the proposed hybrid of enhanced and conventional features is highly effective for online news popularity prediction before publication.
Temperature prediction using machine learning approaches
Weather prediction is one of the most important research areas due to its applicability to real-world domains such as meteorology and agricultural studies. We propose a method for temperature prediction using three machine learning models, Multiple Linear Regression (MLR), Artificial Neural Network (ANN), and Support Vector Machine (SVM), through a comparative analysis on weather data collected from Central Kerala during the period 2007 to 2015. The experimental results are evaluated using Mean Error (ME), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Correlation Coefficient (CC). The error metrics and the CC show that MLR is a more precise model for temperature prediction than ANN and SVM.
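The four evaluation measures named above can be computed as in the following sketch. The sign convention for ME (predicted minus actual), the use of Pearson's coefficient for CC, and the sample temperatures are assumptions for illustration only; they are not taken from the paper.

```python
import math


def mean_error(actual, predicted):
    """Average signed error (predicted - actual); positive means over-prediction."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average magnitude of the error, ignoring its sign."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared error; penalises large errors more."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Made-up temperatures in degrees Celsius, for illustration only.
actual = [30.0, 32.0, 31.0, 29.0]
predicted = [31.0, 31.0, 32.0, 28.0]

me = mean_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
cc = correlation_coefficient(actual, predicted)
```

On this toy data the signed errors cancel, so ME is 0 while MAE and RMSE are both 1. This is why ME is usually read as a bias indicator alongside the other measures rather than as an accuracy measure on its own.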
Distinguishing natural and computer generated images using multi-colorspace fused EfficientNet
Work on distinguishing natural images from photo-realistic computer generated ones typically addresses either natural images versus computer graphics or natural images versus GAN images, one at a time. But in a real-world image forensic scenario, it is essential to consider all categories of image generation, since in most cases the generation method is unknown. To the best of our knowledge, we are the first to approach the problem of distinguishing natural images from photo-realistic computer generated images as a three-class classification task over natural, computer graphics, and GAN images. For this task, we propose a Multi-Colorspace fused EfficientNet model that fuses three EfficientNet networks in parallel, following a transfer learning methodology. Each of the three networks operates in a different colorspace (RGB, LCH, and HSV), chosen after analyzing the efficacy of various colorspace transformations for this image forensics problem. Our model outperforms the baselines in terms of accuracy, robustness to post-processing, and generalizability to other datasets. We conduct psychophysics experiments to understand how accurately humans can distinguish natural, computer graphics, and GAN images, and observe that humans find it difficult to classify these images, particularly the computer generated ones, indicating the need for computational algorithms for the task. We also analyze the behavior of our model through visual explanations to understand the salient regions that contribute to the model’s decision making, and compare them with manual explanations provided by human participants in the form of region markings. We observe similarities between the two kinds of explanation, indicating that the model makes its decisions in a meaningful way.
Mathematical morphology and region clustering based text information extraction from Malayalam news videos
Innovations such as improved internet data transfer, advanced digital data compression algorithms, and enhancements in web technology have enabled exponential growth in digital multimedia data. Among this massive multimedia data, news videos are of higher priority due to their rich, up-to-date information and historical evidence. This data is growing rapidly and unpredictably, which calls for an efficient and powerful method to index and retrieve it. Even though manual indexing is the most effective, it is the slowest and most expensive. Hence, automatic video indexing is considered an important research problem. In this work, we propose a Mathematical Morphology and Region Clustering based Text Information Extraction (TIE) method for Malayalam news videos, for Content Based Video Indexing and Retrieval (CBVIR). The morphological gradient acts as an edge detector, enhancing intensity variations to detect the text regions. Agglomerative clustering is then performed to select the significant text regions. The precision, recall, and F1-measure obtained for the proposed approach are 87.45%, 94.85%, and 0.91 respectively.
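The morphological gradient mentioned above is the difference between the dilation and the erosion of an image. The sketch below assumes a flat 3x3 structuring element, clipping at the image border, and a toy grayscale image; none of these details are stated in the abstract, and the function names are hypothetical.

```python
def neighbourhood(image, row, col):
    """Pixel values in the 3x3 window centred on (row, col), clipped at borders."""
    rows, cols = len(image), len(image[0])
    return [image[i][j]
            for i in range(max(0, row - 1), min(rows, row + 2))
            for j in range(max(0, col - 1), min(cols, col + 2))]


def morphological_gradient(image):
    """Dilation minus erosion with a flat 3x3 structuring element.

    Dilation takes the window maximum and erosion the window minimum, so the
    result is large wherever intensity changes sharply, i.e. at edges.
    """
    rows, cols = len(image), len(image[0])
    gradient = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = neighbourhood(image, r, c)
            out_row.append(max(window) - min(window))
        gradient.append(out_row)
    return gradient


# Toy grayscale frame: a bright 3x3 block (a stand-in for a text stroke)
# on a dark background.
frame = [[0,   0,   0,   0, 0],
         [0, 200, 200, 200, 0],
         [0, 200, 200, 200, 0],
         [0, 200, 200, 200, 0],
         [0,   0,   0,   0, 0]]

edges = morphological_gradient(frame)
```

The response is zero in the flat interior of the block and high along its boundary. That is the property that makes the gradient usable as an edge detector before candidate regions are grouped by clustering.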
Emotion Cognizance Improves Health Fake News Identification
Identifying fake news is increasingly being recognized as an important computational task with high potential social impact. Misinformation is routinely injected into almost every domain of news including politics, health, science, business, etc., among which, the fake news in the health domain poses serious risk and harm to health and well-being in modern societies. In this paper, we consider the utility of the affective character of news articles for fake news identification in the health domain and present evidence that emotion cognizant representations are significantly more suited for the task. We outline a simple technique that works by leveraging emotion intensity lexicons to develop emotion-amplified text representations and evaluate the utility of such a representation for identifying fake news relating to health in various supervised and unsupervised scenarios. The consistent and notable empirical gains that we observe over a range of technique types and parameter settings establish the utility of the emotional information in news articles, an often overlooked aspect, for the task of misinformation identification in the health domain
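One way an emotion-amplified text representation could be built from an emotion intensity lexicon is sketched below. The abstract does not describe the amplification mechanism, so everything here is an assumption: the idea of appending an emotion label after high-intensity words, the threshold value, the tiny hypothetical lexicon, and the function name `amplify_emotions`.

```python
# Hypothetical emotion intensity lexicon: word -> (emotion label, intensity in [0, 1]).
EMOTION_LEXICON = {
    "terrible": ("fear", 0.80),
    "panic": ("fear", 0.90),
    "miracle": ("joy", 0.75),
    "mild": ("joy", 0.20),
}


def amplify_emotions(text, lexicon=EMOTION_LEXICON, threshold=0.5):
    """Append the emotion label after each word whose intensity meets the threshold.

    Words missing from the lexicon, or below the threshold, are left unchanged.
    """
    amplified = []
    for word in text.lower().split():
        amplified.append(word)
        entry = lexicon.get(word)
        if entry is not None and entry[1] >= threshold:
            amplified.append(entry[0])
    return " ".join(amplified)


example = amplify_emotions("The vaccine caused terrible panic")
```

The amplified text could then be fed to any standard text representation (for example bag-of-words or embeddings) in place of the original, so that emotional content carries more weight in downstream supervised or unsupervised models.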
