1,721,006 research outputs found
AS new swarm intelligence coordination model inspired by collective prey retrieval and its application to image alignment
The impact of modeling overall argumentation with tree kernels
Several approaches have been proposed to model either the explicit sequential structure of an argumentative text or its implicit hierarchical structure. So far, the adequacy of these models of overall argumentation remains unclear. This paper asks what type of structure is actually important to tackle downstream tasks in computational argumentation. We analyze patterns in the overall argumentation of texts from three corpora. Then, we adapt the idea of positional tree kernels in order to capture sequential and hierarchical argumentative structure together for the first time. In systematic experiments for three text classification tasks, we find strong evidence for the impact of both types of structure. Our results suggest that either of them is necessary while their combination may be beneficial
Fake News, Disinformation, Propaganda, and Media Bias
The rise of Internet and social media changed not only how we consume information, but it also democratized the process of content creation and dissemination, thus making it easily available to anybody. Despite the hugely positive impact, this situation has the downside that the public was left unprotected against biased, deceptive, and disinformative content, which could now travel online at breaking-news speed and allegedly influence major events such as political elections, or disturb the efforts of governments and health officials to fight the ongoing COVID-19 pandemic. The research community responded to the issue, proposing a number of inter-connected research directions such as fact-checking, disinformation, misinformation, fake news, propaganda, and media bias detection. Below, we cover the mainstream research, and we also pay attention to less popular, but emerging research directions, such as propaganda detection, check-worthiness estimation, detecting previously fact-checked claims, and multimodality, which are of interest to human fact-checkers and journalists. We further cover relevant topics such as stance detection, source reliability estimation, detection of persuasion techniques in text and memes, and detecting malicious users in social media. Moreover, we discuss large-scale pre-trained language models, and the challenges and opportunities they offer for generating and for defending against neural fake news. Finally, we explore some recent efforts aiming at flattening the curve of the COVID-19 infodemic
"Kernelized" Self-Organizing Maps for Structured Data
The suitability of the well known kernels for trees, and the l esser known Self- Organizing Map for Structures for categorization tasks on structured data is inves- tigated in this paper. It is shown that a suitable combinatio n of the two approaches, by defining new kernels on the activation map of a Self-Organizing Map for Structures, can result in a system that is significantly more accur ate for categorization tasks on structured data. The effectiveness of the proposed approach is demon- strated experimentally on a relatively large corpus of XML formatted data
Fact-Checking, Fake News, Propaganda, Media Bias, and the COVID-19 Infodemic
Social media have democratized content creation and have made it easy for anybody to spread information online. However, stripping traditional media from their gate-keeping role has left the public unprotected against biased, deceptive and disinformative content, which could now travel online at breaking-news speed and influence major public events. For example, during the COVID-19 pandemic, a new blending of medical and political disinformation has given rise to the first global infodemic. We offer an overview of the emerging and inter-connected research areas of fact-checking, disinformation, "fake news'', propaganda, and media bias detection. We explore the general fact-checking pipeline and important elements thereof such as check-worthiness estimation, spotting previously fact-checked claims, stance detection, source reliability estimation, detection of persuasion techniques, and detecting malicious users in social media. We also cover large-scale pre-trained language models, and the challenges and opportunities they offer for generating and for defending against neural fake news. Finally, we discuss the ongoing COVID-19 infodemic
A new swarm intelligence coordination model inspired by collective prey retrieval and its application to image alignment
Proppy: Organizing the news based on their propagandistic content
Propaganda is a mechanism to influence public opinion, which is inherently present in extremely biased and fake news. Here, we propose a model to automatically assess the level of propagandistic content in an article based on different representations, from writing style and readability level to the presence of certain keywords. We experiment thoroughly with different variations of such a model on a new publicly available corpus, and we show that character n-grams and other style features outperform existing alternatives to identify propaganda based on word n-grams. Unlike previous work, we make sure that the test data comes from news sources that were unseen on training, thus penalizing learning algorithms that model the news sources used at training time as opposed to solving the actual task. We integrate our supervised model in a public website, which organizes recent articles covering the same event on the basis of their propagandistic contents. This allows users to quickly explore different perspectives of the same story, and it also enables investigative journalists to dig further into how different media use stories and propaganda to pursue their agenda
Selecting sentences versus selecting tree constituents for automatic question ranking
Community question answering (cQA) websites are focused on users who query questions onto an online forum, expecting for other users to provide them answers or suggestions. Unlike other social media, the length of the posted queries has no limits and queries tend to be multi-sentence elaborations combining context, actual questions, and irrelevant information. We approach the problem of question ranking: given a user's new question, to retrieve those previously-posted questions which could be equivalent, or highly relevant. This could prevent the posting of nearly-duplicate questions and provide the user with instantaneous answers. For the first time in cQA, we address the selection of relevant text-both at sentence- and at constituent-level- for parse-tree-based representations. Our supervised models for text selection boost the performance of a tree kernel-based machine learning model, allowing it to overtake the current state of the art on a recently released cQA evaluation framework
Learning to re-rank questions in community question answering using advanced features
We study the impact of different types of features for question ranking in community Question Answering: bag-of-words models (BoW), syntactic tree kernels (TKs) and rank features. It should be noted that structural kernels have never been applied to the question reranking task, i.e., question to question similarity, where they have to model paraphrase relations. Additionally, the informal text, typically present in forums, poses new challenges to the use of TKs. We compare our learning to rank (L2R) algorithms against a strong baseline given by the Google rank (GR). The results show that (i) our shallow structures used in TKs are robust enough to noisy data and (ii) improving GR requires effective BoW features and TKs along with an accurate model of GR features in the used L2R algorithm
- …
