Semi-supervised learning for classification of Nordic news articles
Master's thesis in Computer science
Semi-supervised learning covers the techniques that fall between supervised and unsupervised learning. It is commonly used in classification settings where labeled data is scarce compared to unlabeled data. The goal is to extract additional information from the unlabeled data to improve on supervised classification.
We explore several approaches to semi-supervised learning to improve the classification of Nordic news articles in the corpus provided. We examine self-training in several configurations, together with methods of feature extraction and engineering.
We also provide background and a baseline using common supervised methods, as well as different document representations such as word embeddings, so that we can compare our semi-supervised results against these methods.
We will see that while some of the methods explored did not succeed, others did, and their performance is comparable to that of some of the supervised methods. We will also see some promising approaches for countering the imbalance problem when selecting confident pseudo-labels.
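The self-training idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the nearest-centroid classifier, the margin-based confidence score, the threshold of 0.5, and the two-dimensional toy vectors are all assumptions chosen to keep the example dependency-free.

```python
from math import dist

def centroids(points, labels):
    """Compute the mean vector of each class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict_with_confidence(cents, p):
    """Return (label, confidence). Confidence is the relative margin
    between the two nearest centroids (assumes at least two classes)."""
    ranked = sorted((dist(p, c), y) for y, c in cents.items())
    (d1, y1), (d2, _) = ranked[0], ranked[1]
    conf = (d2 - d1) / (d2 + d1) if d2 + d1 > 0 else 0.0
    return y1, conf

def self_train(X_lab, y_lab, X_unlab, threshold=0.5, max_rounds=10):
    """Repeatedly add confidently pseudo-labeled points to the training set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        cents = centroids(X_lab, y_lab)
        remaining, added = [], 0
        for p in pool:
            y, conf = predict_with_confidence(cents, p)
            if conf >= threshold:
                X_lab.append(p)      # accept the pseudo-label
                y_lab.append(y)
                added += 1
            else:
                remaining.append(p)  # too uncertain, keep unlabeled
        pool = remaining
        if added == 0:               # nothing new learned, stop early
            break
    return centroids(X_lab, y_lab), pool

# Hypothetical toy data: two labeled documents, three unlabeled ones.
final_centroids, still_unlabeled = self_train(
    [(0.0, 0.0), (10.0, 10.0)], ["sports", "politics"],
    [(1.0, 1.0), (9.0, 9.0), (5.0, 5.2)],
)
```

Points near a class centroid are absorbed into the labeled set, while the ambiguous point in the middle stays unlabeled. A threshold that is too low would let wrong pseudo-labels reinforce themselves, which is one reason confidence selection matters for this family of methods.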
Generative adversarial networks for bias flipping
Master's thesis in Computer science
Disinformation in media channels such as social media websites and online newspapers has become a major challenge for many organizations, governments, and scientific researchers. In connection with fake news, the political bias (left-wing or right-wing) of news articles has recently been receiving more attention. In this thesis, we leverage the Adversarially Regularized AutoEncoder (ARAE) model, which enhances the adversarial autoencoder (AAE) by learning a parameterized prior as a Generative Adversarial Network (GAN), to generate bias-flipped headlines. We perform experiments with multiple datasets and then discuss how these approaches contribute to the bias flipping and detection problems.
Scaling Network Embeddings
Master's thesis in Computer Science
A recommendation system is an intelligent machine learning system that seeks to predict a ranked set of personalized products for a customer from a dynamic pool of diverse choices. We can define the main objective of such systems as ranking edges in an undirected, unweighted graph consisting of user and item nodes.
Deep graph embeddings have recently attracted the interest of both academia and industry, mainly because of their simplicity and effectiveness in a variety of applications. The primary purpose of this thesis is to study existing graph embedding methods for recommendation algorithms. We aim to transform undirected, unweighted graphs into vectors, also known as graph embeddings, to obtain a representation suitable for different machine learning algorithms. First, we introduce the reader to some existing, conventional approaches for creating such embeddings. We then present several modifications and improvements to the existing methods. Finally, we use several evaluation metrics to assess the performance of these modifications.
Graph-based Entity Recognition & Inference and Link Prediction in static Network
The size of data we are producing is exponentially increasing every year. According to
former Google CEO Eric Schmidt, we produce as much information in two days now
as we did from the dawn of mankind through 2003. The Oil & Gas industries produce
millions of linked data each day. However, a vast majority of the data are unstructured
or semi-structured data. To make a good decision, it is very important that we know
our data. Many industries rely on the insights of their data to take any further action.
Therefore, it is very important for the advancement of a company or an institution to
have an overall view of the data they are producing.
For this thesis, we studied data produced by Oil & Gas industries and provided
to us by LOOPS, and we found that the data are usually linked data. Linked data sets
can be interlinked with each other and become more useful through semantic queries.
However, due to poor presentation of the data, the benefit that linked data can offer
is often not realized.
In this thesis, we devised a system that extracts the meaningful information from the
semi-structured data and visualizes the data as a graph. We then use the
graph to gain insights into the data. The system can recognize entities in the graph
and give useful feedback by inferring more knowledge about the recognized entities.
As noted above, the data are interlinked with other data. However, in linked data,
some of the links between the data might be missing. The more the data are linked, the
more useful information we can learn from them. Therefore, we invested a significant portion
of our research in predicting the possible missing links between data using supervised
and unsupervised link prediction approaches.
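As an illustration of what unsupervised link prediction can look like, the sketch below scores unlinked node pairs by the Jaccard overlap of their neighbourhoods. The abstract does not say which scoring functions were used, so the Jaccard coefficient is an assumption, and the node names are hypothetical.

```python
from itertools import combinations

def jaccard_scores(edges):
    """Score every currently unlinked node pair by neighbourhood overlap."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    scores = []
    for u, v in combinations(sorted(nbrs), 2):
        if v in nbrs[u]:
            continue                      # already linked, nothing to predict
        shared = nbrs[u] & nbrs[v]
        if shared:                        # skip pairs with no common neighbour
            scores.append((len(shared) / len(nbrs[u] | nbrs[v]), u, v))
    return sorted(scores, reverse=True)   # most likely missing links first

# Hypothetical entities; a real graph would come from the extracted data.
edges = [
    ("well_A", "field_X"), ("well_B", "field_X"), ("well_C", "field_X"),
    ("well_A", "operator_1"), ("well_B", "operator_1"),
]
candidates = jaccard_scores(edges)
```

Here `well_A` and `well_B` share all of their neighbours, so the pair is ranked as the most likely missing link. A supervised variant would typically feed scores like this one as features into a classifier trained on known links.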
Analysis and presentation of Mars rover data
Master's thesis in Computer science
Spacecraft and their instruments tend to collect far more data on their missions than can be transmitted back to Earth in a timely manner. This makes it necessary to prioritize which data are downloaded. This thesis focuses on automatic caption generation for images that rovers take of the surface of Mars and send to Earth, as a way to save bandwidth and make the images searchable.
The high latency and low bandwidth between Earth and spacecraft complicate communication. High latency makes it difficult to control the craft, as the results of a command are not visible until hours later. The low bandwidth does not help either, as downloading data from the craft takes a long time. As an example, downloading the music video of the song "Never Gonna Give You Up" by Rick Astley would take about 3.17 hours. Had it not been for the bandwidth limitation, almost all of the surface of Mars would already have been photographed by the Mars orbiter, but as of now only 3% has been. It is therefore important to be able to choose which images to download and which to leave. Currently, one method for deciding which images are of interest is to download thumbnails and/or highly compressed versions of the images and judge from those. Another method is to use the pixel value difference between images to prevent identical or overly similar images from being downloaded.
Future Mars missions might have modern radiation-safe GPUs on board, like the Snapdragon 820/855, for the purpose of data analysis. This would allow some of the data analysis to be done on board the rover, so that only the abstracted features, rather than the data itself, need to be transferred back to Earth. It also allows image similarity to be determined more accurately and supports better decisions about which images are worth the valuable bandwidth. Machine learning and machine vision capabilities also increase the opportunities for autonomous control of the rover. Automatic caption generation for geological features in the images allows geologists on Earth to search for geological features and choose the images of interest based on the captions.
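The pixel-difference filter described above could be sketched as follows. This is only an illustration under stated assumptions: images are small grayscale grids given as nested lists, similarity is the mean absolute pixel difference, and the threshold `min_diff` is an arbitrary hypothetical value.

```python
def mean_abs_diff(img_a, img_b):
    """Mean absolute per-pixel difference of two equally sized grayscale images."""
    total = count = 0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count

def select_for_download(images, min_diff=10.0):
    """Return indices of images that differ enough from every image already kept."""
    kept = []
    for idx, img in enumerate(images):
        if all(mean_abs_diff(img, images[k]) >= min_diff for k in kept):
            kept.append(idx)
    return kept

# Hypothetical 2x2 frames: the second is nearly identical to the first.
frames = [
    [[10, 10], [10, 10]],
    [[11, 10], [10, 12]],
    [[200, 180], [10, 10]],
]
to_download = select_for_download(frames)
```

With these frames the near-duplicate second image is skipped and only the first and third are selected. Raw pixel differences are sensitive to lighting and small camera shifts, which is part of the motivation for comparing abstracted features instead.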
Smart text editing/Extension for fact checking
Chrome extension, fake news, cosine similarity, fact checking, RESTful AP
Detecting Fake News and Rumors in Twitter Using Deep Neural Networks
Master's thesis in Computer Science
The scope of this thesis is to detect fake news by classifying news articles from the Politifact dataset as either real or fake, based on article content, metadata, tweets, and retweets, using graph neural networks.
Fake news generally spreads exponentially and more rapidly than real news. This is most likely because fake news is usually more novel or dramatic and contains more superlatives than real news. Tweets of fake news also tend to have more rumor path propagation hops, meaning they are retweeted more than tweets of real news. Tweets of real news articles, on the other hand, tend to spread slowly and at a constant rate, and do not reach as many people overall.
There are generally two characteristics that are used for detecting fake news: article content and rumor path propagation. Most existing works have presented models based solely on one of these characteristics, which has its advantages (e.g. reduced training time) but also results in poor performance.
This thesis proposes a hybrid model that takes metadata and both of the above-mentioned characteristics (article content and rumor path propagation in the form of a temporal pattern) as input, using a bidirectional LSTM built with the Keras Sequential model. Article content is embedded using pre-trained GloVe word vectors. The metadata, which is continuous, is normalized and discretized. The rumor path propagation time series is computed from the dates in the metadata of tweets and retweets.
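To make the preprocessing steps concrete, here is a minimal sketch of how a propagation time series and normalized metadata might be derived. The bin width, the number of bins, min-max scaling, and the sample timestamps are assumptions for illustration. The abstract does not specify these details.

```python
from datetime import datetime, timedelta

def propagation_series(timestamps, bin_width=timedelta(hours=1), n_bins=6):
    """Count tweets and retweets per time bin, measured from the earliest post."""
    start = min(timestamps)
    series = [0] * n_bins
    for ts in timestamps:
        idx = int((ts - start) / bin_width)   # timedelta / timedelta -> float
        if idx < n_bins:                      # ignore posts after the window
            series[idx] += 1
    return series

def min_max_normalize(values):
    """Scale continuous metadata values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical tweet and retweet times for one article.
t0 = datetime(2020, 1, 1, 12, 0)
posts = [t0 + timedelta(minutes=m) for m in (0, 10, 50, 70, 330, 540)]
series = propagation_series(posts)
scaled = min_max_normalize([0, 5, 10])
```

A fixed-length series like this can be fed to a recurrent layer alongside the embedded article text, which is one plausible way to combine the two characteristics in a single model.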
Several other deep learning and machine learning models are also implemented and tested for comparison. Experimental results demonstrate that the proposed model performs significantly better than all of these models.
Deep neural models to represent news events
Master's thesis in Computer science
This thesis addresses the background linking task for news articles using deep neural network models. The goal is to retrieve articles similar to the news story currently being viewed. We examined neural and non-neural representations of raw text and discussed the notions of similarity that a good model should identify and retrieve. We covered various deep neural network models and highlighted their advantages and disadvantages.
Inspired by deep neural architectures in the area of Information Retrieval, we adapted the Deep Semantic Similarity Model (DSSM) to the background linking task. Our refactored DSSM architecture employs a convolutional neural network with multiple filters and regularization techniques. This convolutional network acts as an auto-encoder and learns compressed representations of news articles and news stories. Cosine similarity is used as the proximity metric to retrieve related news articles. Experimental results show that our adjusted DSSM model is applicable to the background linking task and outperforms the baseline SVM model.
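The retrieval step can be sketched as ranking candidate articles by cosine similarity to the query representation. The three-dimensional vectors and article names below are hypothetical stand-ins for the learned representations.

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_by_similarity(query_vec, candidates):
    """Return candidate names ordered from most to least similar to the query."""
    return sorted(
        candidates,
        key=lambda name: cosine_similarity(query_vec, candidates[name]),
        reverse=True,
    )

# Hypothetical compressed representations of a query story and three articles.
query = [1.0, 0.0, 1.0]
articles = {
    "article_a": [2.0, 0.0, 2.0],
    "article_b": [0.0, 1.0, 0.0],
    "article_c": [1.0, 1.0, 0.0],
}
ranking = rank_by_similarity(query, articles)
```

Because cosine similarity ignores vector length, `article_a` ranks first even though its vector is twice as long as the query's.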
We found that corpus distributions affect the performance of the model. A model trained on a news corpus containing mostly political and social news will perform poorly on a news corpus about sport and entertainment. Grid search and hyperparameter tuning are also important. Deep neural network architectures are powerful tools that can be used to solve complicated tasks and approximate nearly any function. Having a good-quality dataset is half of the success. We plan to adapt the DSSM model to various news corpora and apply it to different tasks, such as automatic linking of news articles to Wikipedia pages and linking news articles to news events. We assume this model can be extended to learn representations of a sequence of events for the task of linking background events.
Human-Guided Phasic Policy Gradient in Minecraft: Exploring Deep Reinforcement Learning with Human Preferences in Complex Environments
This study presents a novel approach to enhancing the performance of artificial agents in
complex environments like Minecraft, where traditional reward-based learning strategies
can be challenging to apply. To improve the efficacy and efficiency of fine-tuning a
foundation model for complex tasks, we propose the Human-Guided Phasic Policy
Gradient (HPPG) algorithm, which combines human preference learning with the Phasic
Policy Gradient technique. Our key contributions include validating the use of behavioral
cloning to improve agent performance and introducing the HPPG algorithm, which
employs a reward predictor network to estimate rewards based on human preferences.
We further explore the challenges associated with the HPPG algorithm and propose
strategies to mitigate its limitations. Through our experiments, we demonstrate significant
improvements in the agent’s performance when executing complex tasks in Minecraft,
laying the groundwork for future developments in reinforcement learning algorithms
for complex, real-world tasks without defined rewards. Our findings contribute to the
broader goal of bridging the gap between artificial agents and human-like intelligence.
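One common way to train a reward predictor from human preferences is a Bradley-Terry style model over pairs of trajectory segments. The abstract does not state which formulation HPPG uses, so the sketch below is an assumption, and the per-step reward values are hypothetical predictor outputs.

```python
from math import exp, log

def preference_probability(rewards_a, rewards_b):
    """Modelled probability that a human prefers segment A over segment B,
    based on the summed predicted rewards of each segment."""
    total_a, total_b = sum(rewards_a), sum(rewards_b)
    return 1.0 / (1.0 + exp(total_b - total_a))

def preference_loss(rewards_a, rewards_b, human_prefers_a):
    """Cross-entropy between the modelled preference and the human label."""
    p_a = preference_probability(rewards_a, rewards_b)
    return -log(p_a if human_prefers_a else 1.0 - p_a)

# Hypothetical per-step rewards predicted for two short segments.
segment_a = [0.5, 0.8, 0.9]
segment_b = [0.1, 0.2, 0.1]
loss_if_agrees = preference_loss(segment_a, segment_b, human_prefers_a=True)
loss_if_disagrees = preference_loss(segment_a, segment_b, human_prefers_a=False)
```

Minimizing this loss pushes the predictor to assign higher total reward to the segments humans prefer. The resulting predicted rewards can then stand in for an environment reward during policy optimization.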
Adaptive Selection and Delivery of Rich Media Notifications to Mobile Users
Master's thesis in Computer science
The ongoing increase in cellular network coverage is steadily increasing the availability
of individuals all around the world. This availability enables notification
solutions to achieve their goals of announcing new content with great success.
Recently notification systems have evolved to include media content. In cases
where this content is of substantial quality and frequency, it may induce large
resource consumption.
We aim to limit the total resource consumption of media notifications while
preserving the user experience. We achieve this by implementing established
techniques for measuring the quality of content. We enable the utilization of these
techniques by implementing a working system that tackles the practical issues of
notification generation, device communication and notification scheduling. We
test the system using Spotify as our notification provider and compare our results
to the standard FIFO approach of notifications. Using these tests we find that
correctly prioritizing content can massively increase a user's utilization of notification
content.
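The contrast between FIFO delivery and quality-based prioritization can be sketched with a priority queue. The quality scores, notification names, and the delivery budget are hypothetical. The abstract does not describe how the system computes or applies its scores.

```python
import heapq

def fifo_order(notifications, budget):
    """Deliver notifications in arrival order until the budget is used up."""
    return [name for name, _ in notifications[:budget]]

def priority_order(notifications, budget):
    """Deliver the highest-scored notifications first, up to the budget.
    Arrival order breaks ties between equal scores."""
    heap = [(-score, seq, name) for seq, (name, score) in enumerate(notifications)]
    heapq.heapify(heap)
    count = min(budget, len(heap))
    return [heapq.heappop(heap)[2] for _ in range(count)]

# Hypothetical (name, quality score) pairs in arrival order.
pending = [
    ("new_single", 0.4),
    ("podcast_episode", 0.9),
    ("playlist_update", 0.1),
    ("album_release", 0.7),
]
fifo_delivered = fifo_order(pending, budget=2)
priority_delivered = priority_order(pending, budget=2)
```

With a budget of two deliveries, FIFO sends whatever arrived first, while the prioritized scheduler spends the same budget on the two highest-scored items.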
