Semi-supervised learning for classification of Nordic news articles

Author: Fossåen Nils Magne
Publication date: 01/01/2020

Master's thesis in Computer science. Semi-supervised learning covers the techniques that fall between supervised and unsupervised learning. It is commonly used in classification settings where less labeled data is available than unlabeled data. The goal is to extract additional information from the unlabeled data to improve on supervised classification. We explore several approaches to semi-supervised learning to improve the classification of Nordic news articles in the corpus provided. We explore self-training in several different configurations, along with methods of feature extraction and engineering. We also provide background and a baseline using common supervised methods, as well as different document representations such as word embeddings, so that we can compare our semi-supervised results against these methods. We will see that while some of the methods explored did not succeed, others did, and their performance is comparable to that of some of the supervised methods. We will also see some promising approaches for countering the imbalance problem when considering confident pseudo-labels.
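The self-training idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the nearest-centroid classifier on 1-D toy data, the confidence measure, and the names `centroids`, `predict`, `self_train`, and `threshold` are all assumptions chosen to keep the example self-contained.

```python
from statistics import mean

def centroids(labeled):
    """Mean feature value per class for 1-D toy data: {label: centroid}."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_class.items()}

def predict(cents, x):
    """Return (label, confidence) for the nearest centroid.

    Confidence is an assumed, illustrative measure: one minus the
    nearest distance divided by the sum of all distances.
    """
    dists = {y: abs(x - c) for y, c in cents.items()}
    label = min(dists, key=dists.get)
    total = sum(dists.values())
    conf = 1.0 if total == 0 else 1.0 - dists[label] / total
    return label, conf

def self_train(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Repeatedly pseudo-label confident points and retrain on them."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        remaining, added = [], 0
        for x in pool:
            label, conf = predict(cents, x)
            if conf >= threshold:
                labeled.append((x, label))  # accept the pseudo-label
                added += 1
            else:
                remaining.append(x)         # stay unlabeled for now
        pool = remaining
        if added == 0:                      # nothing confident left
            break
    return centroids(labeled), pool
```

Points that never reach the confidence threshold stay in the returned pool, which is one simple way to keep unreliable pseudo-labels out of the training set.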

NORA - Norwegian Open Research Archives

UiS Brage

Generative adversarial networks for bias flipping

Author: Le Nguyen Khoa
Publication date: 01/01/2020

Master's thesis in Computer science. Disinformation in media channels such as social media websites and online newspapers has become a major challenge for many organizations, governments, and scientific researchers. In connection with fake news, the political bias (left-wing or right-wing) of news articles has recently been receiving more attention. In this thesis, we leverage the Adversarially Regularized AutoEncoder (ARAE) model, which enhances the adversarial autoencoder (AAE) by learning a parameterized prior as a Generative Adversarial Network (GAN), to generate bias-flipped headlines. We perform experiments on multiple datasets and then discuss how these approaches contribute to the problems of bias flipping and bias detection.

NORA - Norwegian Open Research Archives

UiS Brage

Scaling Network Embeddings

Author: Maksyk Vladyslav
Publication date: 01/01/2020

Master's thesis in Computer Science. A recommendation system is an intelligent machine learning system that seeks to predict, for a customer, a ranked set of personalized products from a dynamic pool of diverse choices. We can define the main objective of such systems as ranking edges in an undirected, unweighted graph consisting of user and item nodes. Deep graph embeddings have recently attracted the interest of both academia and industry, mainly because of their simplicity and effectiveness in a variety of applications. The primary purpose of this thesis is to investigate existing graph embedding methods for recommendation algorithms. We aim to transform undirected, unweighted graphs into vectors, also known as graph embeddings, to obtain a representation suitable for different machine learning algorithms. We first introduce the reader to some existing, conventional approaches for creating such embeddings. We then present several modifications and improvements to the existing methods. Finally, we use several evaluation metrics to assess the performance of these modifications.

NORA - Norwegian Open Research Archives

UiS Brage

Graph-based Entity Recognition & Inference and Link Prediction in static Network

Author: Alam Junaid
Publication date: 15/06/2018

The amount of data we produce is increasing exponentially every year. According to former Google CEO Eric Schmidt, we now produce as much information in two days as we did from the dawn of mankind through 2003. The Oil & Gas industries produce millions of linked data items each day. However, the vast majority of the data is unstructured or semi-structured. To make good decisions, it is very important that we know our data. Many industries rely on insights from their data before taking any further action. Therefore, it is very important for the advancement of a company or an institution to have an overall view of the data it produces. For this thesis, we studied data produced by the Oil & Gas industries and provided to us by LOOPS, and we found that the data is usually linked data. Linked data sets can be interlinked with each other and become more useful through semantic queries. However, due to poor presentation of the data, much of the benefit that linked data can offer is not realized. In this thesis, we devised a system that extracts meaningful information from the semi-structured data and visualizes it as a graph. We then use the graph to gain insights into the data. The system can recognize entities in the graph and give important feedback by inferring more knowledge about the recognized entities. As noted, the data is interlinked with other data. However, in linked data some of the links may be missing. The more the data is linked, the more useful information we can learn from it. Therefore, we devoted a significant portion of our research to predicting possible missing links using supervised and unsupervised link prediction approaches.
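As an illustration of unsupervised link prediction, the sketch below ranks unconnected node pairs by the Jaccard coefficient of their neighbor sets. The abstract does not say which scoring function the thesis uses, so the Jaccard coefficient, the function name `jaccard_scores`, and the toy edge list are assumptions.

```python
from itertools import combinations

def jaccard_scores(edges):
    """Rank non-adjacent node pairs by neighbor-set overlap (Jaccard)."""
    # Build an undirected adjacency map from the edge list.
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(nbrs), 2):
        if v in nbrs[u]:
            continue  # already linked, nothing to predict
        union = nbrs[u] | nbrs[v]
        if union:
            scores[(u, v)] = len(nbrs[u] & nbrs[v]) / len(union)

    # Highest score first: the most likely missing links.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
ranked = jaccard_scores(edges)
```

In this toy graph the pair `("b", "c")` ranks first because both nodes share exactly the same neighbors. A supervised variant could feed scores like these to a classifier as features.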

UiS Brage

Analysis and presentation of Mars rover data (Analyse og presentasjon av Mars rover data)

Author: Fjellheim Markus
Publication date: 01/01/2020

Master's thesis in Computer science. Spacecraft and their instruments tend to collect far more data on their missions than can be transmitted back to Earth in a timely manner. This leads to the need to prioritize which data is downloaded and which is not. This thesis focuses on automatic caption generation for images of the surface of Mars taken by rovers, with the captions sent to Earth as a way to save bandwidth and make the images searchable. The high latency and low bandwidth between Earth and spacecraft complicate communication. High latency makes it difficult to control the craft, as the results are not visible until hours later. The low bandwidth does not help either, as downloading data from the craft takes a long time. As an example, downloading the music video of the song "Never Gonna Give You Up" by Rick Astley would take about 3.17 hours. Had it not been for the bandwidth limitation, almost all of the surface of Mars would already have been photographed by the Mars orbiter, but as of now only 3% has been. It is therefore important to be able to choose which images to download and which not to. Currently, decisions on which images are of interest are made by downloading thumbnail versions and/or highly compressed versions of the images. Another method is to use the pixel value difference between images to prevent identical or overly similar images from being downloaded. Future Mars missions might have modern radiation-safe GPUs on board, like the Snapdragon 820/855, for the purpose of data analysis. This would allow some of the data analysis to be done on board the rover, so that only the abstracted features, rather than the data itself, need to be transferred back to Earth. This allows for better determination of similarity between images and better decisions about which images are worth spending the valuable bandwidth on.
Machine learning capabilities and machine vision also increase the opportunities for autonomous control of the rover. Automatic caption generation for geological features in the images allows geologists on Earth to search for geological features and choose the images of interest based on the captions.
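The pixel-value-difference filter mentioned in the abstract can be sketched as follows. This is a simplified illustration, not the procedure used on any actual mission: images are flat lists of grayscale values, and the names `mean_abs_diff`, `select_for_download`, and the threshold `min_diff` are assumptions.

```python
def mean_abs_diff(img_a, img_b):
    """Mean absolute per-pixel difference between two equal-size images."""
    if not img_a or len(img_a) != len(img_b):
        raise ValueError("images must be non-empty and the same size")
    return sum(abs(a - b) for a, b in zip(img_a, img_b)) / len(img_a)

def select_for_download(images, min_diff=10.0):
    """Keep an image only if it differs enough from every image kept so far."""
    selected = []
    for name, pixels in images:
        if all(mean_abs_diff(pixels, kept) >= min_diff for _, kept in selected):
            selected.append((name, pixels))
    return [name for name, _ in selected]

images = [
    ("sol1_a", [0, 0, 0, 0]),
    ("sol1_b", [1, 2, 1, 0]),          # nearly identical to sol1_a
    ("sol2_a", [100, 100, 100, 100]),  # clearly different scene
]
chosen = select_for_download(images)
```

Here `sol1_b` is skipped because its mean difference from `sol1_a` is below the threshold, so bandwidth is spent only on `sol1_a` and `sol2_a`.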

NORA - Norwegian Open Research Archives

UiS Brage

Smart text editing/Extension for fact checking (Smart tekst redigering/Utvidelse for faktasjekk)

Authors: Ratdal Kevin; Hersi Mustafa
Publication date: 01/01/2021

Chrome extension, fake news, cosine similarity, fact checking, RESTful API

NORA - Norwegian Open Research Archives

UiS Brage

Detecting Fake News and Rumors in Twitter Using Deep Neural Networks

Author: Mjaaland Henrik
Publication date: 01/01/2020

Master's thesis in Computer Science. The scope of this thesis is to detect fake news by classifying news articles from the Politifact dataset as either real or fake based on article content, metadata, tweets, and retweets, using graph neural networks. Fake news generally spreads exponentially and more rapidly than real news. This is most likely because fake news is usually more novel or dramatic and contains more superlatives than real news. Tweets of fake news also tend to have more rumor path propagation hops than tweets of real news, meaning that they are retweeted more. Tweets of real news articles, on the other hand, tend to spread slowly and at a constant rate, and do not reach as many people overall. There are generally two characteristics that are used for detecting fake news: article content and rumor path propagation. Most existing works have presented models based solely on one of these characteristics, which has its advantages (e.g. reduced training time) but is also reflected in poor performance. This thesis proposes a hybrid model that takes metadata and both of the above-mentioned characteristics (article content and rumor path propagation in the form of a temporal pattern) as input, using a bidirectional LSTM with the Keras Sequential model. Article content is embedded using pre-trained GloVe word vectors. The metadata, which is continuous, is normalized and discretized. The rumor path propagation time series is computed using dates from metadata related to tweets and retweets. Some other deep learning and machine learning models are also implemented and tested for comparison. Experimental results demonstrated that the proposed model performs significantly better than all of these models.
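The two preprocessing steps described in the abstract, normalizing and discretizing continuous metadata and building a propagation time series from tweet dates, might look roughly like the sketch below. The abstract does not specify the exact scheme, so min-max scaling, equal-width bins, hourly buckets, and all function names here are assumptions.

```python
from datetime import datetime, timedelta

def min_max_normalize(values):
    """Scale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def discretize(normalized, n_bins=4):
    """Map values in [0, 1] to equal-width bin indices 0..n_bins-1."""
    return [min(int(v * n_bins), n_bins - 1) for v in normalized]

def propagation_series(timestamps, bucket=timedelta(hours=1)):
    """Count tweets/retweets per time bucket since the first one."""
    if not timestamps:
        return []
    start = min(timestamps)
    n_buckets = int((max(timestamps) - start) / bucket) + 1
    counts = [0] * n_buckets
    for ts in timestamps:
        counts[int((ts - start) / bucket)] += 1
    return counts

follower_bins = discretize(min_max_normalize([0, 5, 10, 20]))
series = propagation_series([
    datetime(2020, 1, 1, 12, 0),
    datetime(2020, 1, 1, 12, 30),
    datetime(2020, 1, 1, 14, 10),
])
```

The resulting bin indices and per-bucket counts are the kind of fixed-format inputs that a sequence model such as a bidirectional LSTM could consume alongside the embedded article text.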

NORA - Norwegian Open Research Archives

UiS Brage

Deep neural models to represent news events

Author: Chechelnytskyy Denys
Publication date: 15/05/2018

Master's thesis in Computer science. This thesis is dedicated to the background linking task for news articles, using deep neural network models. The goal is to retrieve similar articles based on the news story currently being viewed. We examined neural and non-neural representations of raw text and discussed the notions of similarity that a good model should identify and retrieve. We covered various deep neural network models and highlighted their advantages and disadvantages. Inspired by deep neural architectures in the area of Information Retrieval, we adapted the Deep Semantic Similarity Model (DSSM) to the background linking task. Our refactored DSSM architecture employs a convolutional neural network with multiple filters and regularization techniques. This convolutional network acts as an auto-encoder and learns compressed representations of news articles and news stories. Cosine similarity is used as the proximity metric to retrieve related news articles. Experimental results show that our adapted DSSM model is applicable to the background linking task and outperforms the baseline SVM model. We discovered that corpus distributions affect the performance of the model: a model trained on a news corpus containing mostly political and social news will perform poorly on a news corpus about sport and entertainment. Grid search and hyperparameter tuning are also important. Deep neural network architectures are powerful tools which can be used to solve complicated tasks and approximate nearly any function. Having a good-quality dataset is half of the success. We plan to adapt the DSSM model to various news corpora and apply it to different tasks, such as automatic linking of news articles to Wikipedia pages and linking news articles to news events. We assume this model can be extended to learn representations of a sequence of events for the task of linking background events.
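Retrieval by cosine similarity over learned representations can be sketched as follows. The vectors here are hand-written toy values standing in for the compressed representations a trained model would produce, and the names `cosine_similarity` and `retrieve` are assumptions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # define similarity to a zero vector as 0
    return dot / (norm_u * norm_v)

def retrieve(query_vec, article_vecs, k=3):
    """Return the k articles most similar to the query, best first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in article_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

articles = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top = retrieve([1.0, 0.0], articles, k=2)
```

Because cosine similarity ignores vector length, articles are ranked by the direction of their representation rather than by its magnitude.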

UiS Brage

Human-Guided Phasic Policy Gradient in Minecraft: Exploring Deep Reinforcement Learning with Human Preferences in Complex Environments

Author: Valvik Dag Hermann
Publication date: 01/01/2023

This study presents a novel approach to enhancing the performance of artificial agents in complex environments like Minecraft, where traditional reward-based learning strategies can be challenging to apply. To improve the efficacy and efficiency of fine-tuning a foundation model for complex tasks, we propose the Human-Guided Phasic Policy Gradient (HPPG) algorithm, which combines human preference learning with the Phasic Policy Gradient technique. Our key contributions include validating the use of behavioral cloning to improve agent performance and introducing the HPPG algorithm, which employs a reward predictor network to estimate rewards based on human preferences. We further explore the challenges associated with the HPPG algorithm and propose strategies to mitigate its limitations. Through our experiments, we demonstrate significant improvements in the agent’s performance when executing complex tasks in Minecraft, laying the groundwork for future developments in reinforcement learning algorithms for complex, real-world tasks without defined rewards. Our findings contribute to the broader goal of bridging the gap between artificial agents and human-like intelligence
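A reward predictor trained on human preferences is often fitted with a pairwise comparison loss. The sketch below shows one common formulation, a logistic (Bradley-Terry style) model over summed predicted rewards. It is assumed here for illustration and is not taken from the thesis; `preference_probability` and `preference_loss` are hypothetical names, and the reward lists stand in for the outputs of a predictor network.

```python
import math

def preference_probability(rewards_a, rewards_b):
    """Modelled probability that a human prefers segment A over segment B."""
    diff = sum(rewards_b) - sum(rewards_a)
    return 1.0 / (1.0 + math.exp(diff))

def preference_loss(rewards_a, rewards_b, human_prefers_a):
    """Cross-entropy between the modelled preference and the human label."""
    p_a = preference_probability(rewards_a, rewards_b)
    p = p_a if human_prefers_a else 1.0 - p_a
    return -math.log(p)

# Predicted per-step rewards for two short trajectory segments.
loss = preference_loss([0.5, 1.0], [0.2, 0.1], human_prefers_a=True)
```

Minimizing this loss over many labelled comparisons pushes the predictor to assign higher total reward to the segments that humans prefer, and the predicted reward can then drive a policy gradient method.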

UiS Brage

Adaptive Selection and Delivery of Rich Media Notifications to Mobile Users

Author: Jonassen Mats
Publication date: 2018

Master's thesis in Computer science. The ongoing increase in cellular network coverage is steadily increasing the availability of individuals all around the world. This availability enables notification solutions to achieve their goal of announcing new content with great success. Recently, notification systems have evolved to include media content. In cases where this content is of substantial quality and frequency, it may induce large resource consumption. We aim to limit the total resource consumption of media notifications while preserving the user experience. We achieve this by implementing established techniques for measuring the quality of content. We enable the use of these techniques by implementing a working system that tackles the practical issues of notification generation, device communication, and notification scheduling. We test the system using Spotify as our notification provider and compare our results to the standard FIFO approach to notifications. Using these tests, we find that correctly prioritizing content can massively increase a user's utilization of notification content.
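The contrast between FIFO delivery and quality-based prioritization under a resource budget can be illustrated with a small sketch. The tuple layout `(name, quality, size)`, the byte-style `budget`, and both function names are assumptions made for this example and do not describe the thesis's system.

```python
import heapq

def deliver_fifo(notifications, budget):
    """Deliver in arrival order, skipping items that no longer fit."""
    delivered, used = [], 0
    for name, _quality, size in notifications:
        if used + size <= budget:
            delivered.append(name)
            used += size
    return delivered

def deliver_by_priority(notifications, budget):
    """Deliver highest-quality items first, within the same budget."""
    # heapq is a min-heap, so negate quality; arrival order breaks ties.
    heap = [(-quality, order, name, size)
            for order, (name, quality, size) in enumerate(notifications)]
    heapq.heapify(heap)
    delivered, used = [], 0
    while heap:
        _, _, name, size = heapq.heappop(heap)
        if used + size <= budget:
            delivered.append(name)
            used += size
    return delivered

queue = [("n1", 0.2, 5), ("n2", 0.9, 5), ("n3", 0.6, 5)]
fifo = deliver_fifo(queue, budget=10)
prioritized = deliver_by_priority(queue, budget=10)
```

With the same budget, FIFO spends half of it on the lowest-quality item `n1`, while the prioritized variant delivers `n2` and `n3`.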

UiS Brage