Real-Time Application Of Deepfake For De-Identification Privacy Preservation And Data Protection
In an era marked by mounting concerns over data privacy and protection, conventional regulatory measures have proven inadequate against cyber-attacks, complicating data sharing for research and development. Meanwhile, traditional face de-identification methods often result in the complete erasure of facial information, hampering facial behaviour analysis. This thesis addresses these challenges by proposing a real-time deepfake de-identification pipeline for privacy preservation and data protection. Leveraging a first-order motion model and a MediaPipe model of de-identification, the study investigates methods to accurately identify multiple faces within a single image, which is crucial for comprehensive deepfake models. Three distinct models were developed and tested to achieve de-identification of created deepfakes. Experimentation from various angles revealed differing levels of success, with considerations such as processing power, model openness, and training data quality influencing outcomes. Despite challenges, the study demonstrates the feasibility of real-time deepfake technology for privacy preservation and data protection. The proposed pipeline offers potential solutions to ethical concerns associated with data sharing, with implications extending to healthcare, autonomous vehicles, and unmanned aerial vehicle technology
Real Time Crime Prediction Using Social Media
There is no doubt that crime is on the increase and has a detrimental influence on a nation's economy, despite several studies on crime prediction that have attempted to minimise crime rates. Historically, data mining techniques for crime prediction models have often relied on historical information and are mostly country-specific. In fact, only a few of the earlier studies on crime prediction follow a standard data mining procedure. Hence, considering the current worldwide crime trend in which criminals routinely publish their criminal intent on social media and ask others to see and/or engage in different crimes, an alternative and more dynamic strategy is needed. The goal of this research is to improve the performance of crime prediction models. Thus, this thesis explores the potential of using information from social media (Twitter) for crime prediction in combination with historical crime data. It also identifies, using data mining techniques, the most relevant feature engineering needed for the United Kingdom dataset that could improve crime prediction model performance. Additionally, this study presents a function that could be used by every state in the United Kingdom for data cleansing, pre-processing and feature engineering. A Shiny app was also used to display the tweets' sentiment trends to help prevent crime in near-real time. Exploratory analysis is essential for revealing the data pre-processing and feature engineering needed prior to feeding the data into the machine learning model for efficient results. Based on earlier documented studies available, this is the first research to do a full exploratory analysis of historical British crime statistics using the stop-and-search historical dataset. Also, based on the findings from the exploratory study, an algorithm was created to clean the data and prepare it for further analysis and model creation. 
This is an enormous success because it provides a perfect dataset for future research, particularly for non-experts to utilise in constructing models to forecast crime or conducting investigations in around 32 police districts of the United Kingdom. Moreover, this study is the first to present a complete collection of geo-spatial parameters for training a crime prediction model by combining demographic data from the same source in the United Kingdom with hourly sentiment polarity that was not restricted to Twitter keyword search. Six unique base models that were frequently mentioned in the previous literature were selected, trained on the stop-and-search historical crime dataset, evaluated on test data and finally validated with the London and Kent crime datasets. Two different datasets were created from Twitter and historical data (historical crime data with Twitter sentiment score and historical data without Twitter sentiment score). Six of the most prevalent machine learning classifiers (Random Forest, Decision Tree, K-nearest neighbour, support vector machine, neural network and naïve Bayes) were trained and tested on these datasets. Additionally, the hyperparameters of each of the six models developed were tuned using random grid search. Voting classifiers and a logistic regression stacked ensemble of different models were also trained and tested on the same datasets to enhance the individual model performance. In addition, two combinations of stacked ensembles of multiple models were constructed to enhance and choose the most suitable models for crime prediction, and based on their performance, the appropriate prediction model for the UK dataset would be selected. 
In terms of how the research may be interpreted, it differs from most earlier studies that employed Twitter data in that several methodologies were used to show how each attribute contributed to the construction of the model, and the findings were discussed and interpreted in the context of the study. Further, a Shiny app visualisation tool was designed to display the tweets’ sentiment score, the text, the users’ screen name, and the tweets’ vicinity, which allows the investigation of any criminal actions in near-real time. The evaluation of the models revealed that Random Forest, Decision Tree, and K-nearest neighbour outperformed the other models. However, Decision Tree and Random Forest performed consistently better when evaluated on test data
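The voting ensemble mentioned in this abstract can be illustrated with a minimal hard-voting sketch. The abstract does not say whether hard or soft voting was used, so majority (hard) voting is an assumption here, and the model names and class labels are hypothetical placeholders rather than values from the thesis.

```python
from collections import Counter


def hard_vote(predictions):
    """Combine per-model label lists by majority vote.

    predictions: list of lists, one list of predicted labels per model,
    all of the same length (one label per sample).
    """
    voted = []
    # zip(*predictions) yields, for each sample, the labels from every model.
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest count.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three base models on three samples.
random_forest = ["crime", "no_crime", "crime"]
decision_tree = ["crime", "crime", "no_crime"]
knn = ["no_crime", "no_crime", "no_crime"]

ensemble = hard_vote([random_forest, decision_tree, knn])
```

With an odd number of models and two classes, as here, no ties can occur; a real ensemble would need an explicit tie-breaking rule.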
A Decentralised Peer-to-Peer Energy Trading Platform for Residential Homes
To achieve a sustainable and low-carbon energy system, it is necessary to develop novel solutions for the way household energy is consumed. Homes that have solar photovoltaic (PV) systems, electric vehicles (EVs), and microgrids can potentially transform the energy landscape by participating in a decentralised energy market. Previous blockchain-related work focuses on the general business use case and management; it does not address the technical feasibility, bidding strategies and practical value of the renewable market. Therefore, this research proposes SolarChain, a blockchain model for storing and accessing Peer-to-Peer (P2P) transactions in a secure manner. This study demonstrates an experimental blockchain platform, developed on Ethereum, that is being implemented to exchange electricity. The demonstration replicates a P2P network, including microgrids, solar-powered homes, and Vehicle-to-Grid (V2G) user nodes. The implementation includes use cases for P2P trading, smart contracts, tracking of buyer-and-seller exchanges, and comprehensive information on the implementation process. Notable features are the use of smart grids for dynamic pricing to balance supply and demand in microgrids, setting of interval periods and token prices, automated and autonomous operation, market clearing prices (MCP), experimentation on a testbed using Node.js and the web3.js API, and frontend user simulation with virtual consumers and prosumers derived from benchmarks. The proposed architecture is validated using a realistic user interface (UI), which provides 10 default smart contract buttons that users can utilise to run the simulation, and the Ethereum Virtual Machine (EVM) environment of the Ropsten Test Network. The research also examines Ethereum's constraints in the application at hand. By promoting local P2P energy communities, P2P platforms can lower infrastructure and transmission costs and help those communities reach cost efficiency and self-sufficiency. 
Keywords: Blockchain, Ethereum, Prosumers, Energy trading, Peer-to-peer (P2P), Smart Micro-grid, HOMERs, Electric vehicles (EVs), Solar PV
An Improved Optimized Link State Routing Protocol for Optimum Quality of Service Device-to Device Routing and Energy Efficiency
The traffic load of fifth generation (5G) and Beyond 5G (B5G) cellular networks is certain to expand significantly in the near future as a result of their flexibility, high speed, increased bandwidth, better connectivity and low latency. One relevant technique is device-to-device (D2D) communication in D2D cellular networks, which allows two or more devices to communicate with each other directly without traversing the cellular network. The three optimized link state routing (OLSR) protocol functionalities that are responsible for the challenges when implementing OLSR are link state processing, unsuitable multipoint relay (MPR) nodes and information base housekeeping. This research presents structured and energy-efficient link sensing, an MPR selection mechanism and database maintenance techniques. The research also considers the classification of the nodes using a supervised machine learning (ML) algorithm. The research considers the contributions (weights) of all four parameters and develops a mathematical model for the relay selection process. It replaces OLSR’s HELLO, HELLO, and TC message sequence with a new sequence that interweaves HELLO and TC messages with three database (DB) chores and LISTEN states. The new message sequence is DB_SELF, LISTEN, HELLO, DB_NEIGHBORS_RCVD, HI, DB_NEIGHBORS_BOTH, TC and DB_ROUTES. The LISTEN state occurs when new nodes are about to join the network; such nodes listen for ongoing communication before emitting a HELLO message. DB_CHORE states are durations during which nodes refrain from any form of broadcast; this period is used to complete necessary database updates. The proposed modification mandates that nodes always respond to a HELLO message with the new non-forwarding HI message. Multi-point relays (MPRs) are expected to immediately forward TC messages on behalf of their selectors after the expiration of DB_NEIGHBORS_BOTH. 
MPR nodes are not mandated to broadcast TC messages if the number of nodes and their OLSRv2 addresses remain unchanged after subsequent broadcasts of HELLO and HI messages, or if no node reported 2-hop symmetric connections. In addition, this research proposes an MPR selection mechanism that allows nodes to consider four (4) parameters, namely node battery level (BL), mobility speed (MB), node degree (ND) and connection to base station (CBS), for optimum relay selection. The MPR selection mechanism combines the multiple criteria into a single metric to reduce control overhead and energy dissipation. The proposed scheme was implemented in the NS-3 simulator and validated using a mathematical model, where both results show that the proposed routing protocol performs better than OLSR and OLSRv2 in terms of energy consumption, routing overhead, packet delivery ratio and end-to-end delay
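The idea of combining the four parameters (BL, MB, ND, CBS) into a single relay metric can be sketched as a weighted sum of normalised values. The abstract does not give the thesis's actual formula, so the weights, the normalisation bounds and the linear form below are all assumptions made for illustration. The sketch also covers only the scoring step; real MPR selection must additionally ensure that the chosen relays cover all 2-hop neighbours.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    name: str
    battery_level: float    # BL, fraction of full charge in [0, 1]
    mobility_speed: float   # MB, metres per second
    node_degree: int        # ND, number of symmetric neighbours
    connected_to_bs: bool   # CBS, direct connection to a base station


# Hypothetical weights and bounds; the thesis derives its own values.
WEIGHTS = {"BL": 0.4, "MB": 0.2, "ND": 0.2, "CBS": 0.2}
MAX_SPEED = 20.0
MAX_DEGREE = 10


def relay_score(n: Neighbor) -> float:
    """Return a single metric in [0, 1]; higher means a better relay."""
    # Slower nodes are assumed to make more stable relays.
    mobility = 1.0 - min(n.mobility_speed, MAX_SPEED) / MAX_SPEED
    degree = min(n.node_degree, MAX_DEGREE) / MAX_DEGREE
    cbs = 1.0 if n.connected_to_bs else 0.0
    return (WEIGHTS["BL"] * n.battery_level
            + WEIGHTS["MB"] * mobility
            + WEIGHTS["ND"] * degree
            + WEIGHTS["CBS"] * cbs)


def best_relay(neighbors):
    """Pick the neighbour with the highest combined score."""
    return max(neighbors, key=relay_score)


node_a = Neighbor("a", battery_level=0.9, mobility_speed=2.0,
                  node_degree=5, connected_to_bs=True)
node_b = Neighbor("b", battery_level=0.3, mobility_speed=15.0,
                  node_degree=8, connected_to_bs=False)
chosen = best_relay([node_a, node_b])
```

Collapsing the criteria into one number means a node only has to compare a single value per neighbour, which is consistent with the stated aim of reducing control overhead.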
Improving Predictive Process Analytics with Deep Learning and XAI
In this doctoral thesis, we explore the innovative application of the Tab Transformer architecture in the realm of predictive process mining, marking a significant advancement in forecasting subsequent events within activity sequences. Utilising the PM2 methodology, known for its structured approach in process mining, this study rigorously handles data processing, model development, and validation. This methodological choice is pivotal in leveraging the unique capabilities of the Tab Transformer, particularly its proficiency in processing multiple categorical features, a dimension often overlooked in previous research. The empirical analysis encompassed a novel dataset and extended to three additional publicly available datasets: MIMIC-IV Emergency Department (ED) Data, BPIC 2012, BPIC 2013, BPIC 2017, and BPIC Road Traffic. The model's performance was exemplary, achieving accuracies of 0.69, 0.812, 0.7301, 0.8766, and 0.78, and F1 scores of 0.67, 0.77, 0.70, 0.8533, and 0.734 in these datasets, respectively. A major contribution of this research is the introduction of the Tab Transformer to process mining, a first in the field. This approach not only demonstrates the model’s versatility across various data forms but also highlights the importance of integrating categorical features in process mining, providing a more nuanced understanding of the influencing factors in activity sequences. The thesis further distinguishes itself through the application of Explainable Artificial Intelligence (XAI) techniques, particularly SHAP and LIME. These tools were instrumental in demystifying the model’s decision-making processes, thereby enhancing its transparency, and fostering trust in AI systems. 
This integration challenges the notion of AI as impenetrable "black boxes," paving the way for AI systems that are not only effective but also interpretable and trustworthy. In conclusion, this thesis contributes significantly to the field of predictive process mining by pioneering the use of the Tab Transformer, emphasizing the role of categorical features, and advancing the cause of transparency in AI through the application of XAI. The findings and methodologies established in this study represent a benchmark for future research in this evolving domain
Optimizing Pandemic Control Strategies: A Deep Reinforcement Learning Approach in Public Health Management
COVID-19, the disease caused by the SARS-CoV-2 coronavirus, has paralysed the world and forced people to change their lifestyles. Since COVID-19 deaths are increasing daily, the disease has become a global public health issue. Different countries used different public health guidelines to avoid human-to-human transmission. The rules include personal hygiene, hand washing and sanitization, face masks and social distancing, comprehensive testing, and, in the worst case, lockdowns and travel restrictions. This research seeks the optimal lockdown and border control approach for timely lockdowns and travel limitations. This thesis uses UK data from the global pandemic dataset. A DRL algorithm was trained on the data to determine the timing of lockdowns and travel limitations. This is the first study to use deep reinforcement learning to determine the best UK lockdown and border control method. A unique base model, Duelling Double Deep Q-Network (D3QN), a variation of the Deep Q-Network algorithm (DQN), was trained on the COVID-19 epidemic dataset and evaluated on test data. Public health authorities and government will be able to execute prompt and appropriate lockdown and border control policies to minimise the disease’s spread, improving people’s quality of life and lowering costs. Initial lockdown and travel restrictions reduced the COVID-19 load. However, our agent advised the UK to lock down or restrict travel before or on the index case (the first recorded death). Moreover, the agent frequently called for a full lockdown, border closures, travel restrictions, and harsher security measures than public health authorities did. This study assesses the positive effects of preventing COVID-19’s spread on population health while considering its negative economic and social effects. Finally, the moving average reward was used to compare baselines
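For readers unfamiliar with D3QN, the two ideas it combines can be sketched in a few lines: the duelling aggregation Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)), and the Double DQN target, where the online network selects the next action and the target network evaluates it. This is a generic illustration of those standard formulas, not the thesis's implementation; the numeric values and the two-action setting are hypothetical.

```python
def dueling_q(value, advantages):
    """Combine a state value and per-action advantages into Q-values.

    Subtracting the mean advantage keeps V and A identifiable.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Compute the Double DQN bootstrap target for one transition."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # The online network chooses the action ...
    best_action = max(range(len(q_online_next)),
                      key=q_online_next.__getitem__)
    # ... and the target network supplies its value.
    return reward + gamma * q_target_next[best_action]


# Hypothetical numbers; index 0 = "no lockdown", index 1 = "lockdown".
q_values = dueling_q(1.0, [0.0, 1.0, 2.0])
target = double_dqn_target(reward=1.0, gamma=0.5,
                           q_online_next=[0.2, 0.9],
                           q_target_next=[4.0, 2.0],
                           done=False)
```

Separating action selection from action evaluation is what reduces the overestimation bias of plain DQN.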
Developing a Framework to Identify Professional Skills Required for Banking Sector Employee in UK using Natural Language Processing (NLP) Techniques
The banking sector is changing dramatically, and new studies reveal that many financial institutions are having difficulty keeping up with technological advancements and face an acute shortage of skilled workers. The banking industry is changing into a dynamic field where success requires a wide range of talents. For the industry to properly analyse, match, and develop personnel, a strong skill identification process is needed. The objective of this research is to establish a framework for determining the competencies needed by banking industry experts through data extraction from job postings on UK websites. Data is extracted from job vacancy websites using web-based annotation tools and Natural Language Processing (NLP) techniques. This study starts by conducting a thorough examination of the literature to investigate the theoretical underpinnings of NLP techniques, their applications in talent management and human resources within the banking industry, and their potential for skill identification. Next, textual data from job ads is processed using NLP techniques to extract and categorize talents unique to these categories. Advanced algorithms and approaches are used in the NLP-based development process to automatically extract skills from unstructured textual material, helping to ensure that the skills gathered are accurate and most relevant to the needs of the banking industry. To make sure the NLP-driven skill identification is accurate and up to date, the extracted skills are verified through expert feedback. In the final phase, machine learning models are employed to predict the skills required for banking sector employees. This study delves into various machine learning techniques, which are implemented within the framework. After preprocessing and training on skills extracted from job advertisements, these models are evaluated to assess their effectiveness in skill prediction. 
The results offer a detailed analysis of each model's performance, with metrics such as recall, precision, and F1-score being used for assessment. This comprehensive examination underscores the potential of machine learning in skill identification and highlights its relevance in the banking sector. Key Words: Machine Learning, Banking Sector, Employability, Data Mining, NLP, Semantic analysis, Skill assessment, Skill Recognition, Talent management
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation
counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings
are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that
only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into
account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
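The difference between the two counting rules can be made concrete with a small sketch. The data structure and author names below are hypothetical, and the sketch assumes that two authors are co-cited whenever a citing paper's reference list contains both of them under the chosen rule, including co-authors of the same cited work; the study's exact counting definition may differ.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations across citing papers.

    citing_papers: list of reference lists; each reference is an ordered
    list of author names for one cited work.
    max_authors: number of leading authors counted per cited work
    (1 = traditional first-author counting, 5 = first five authors).
    """
    counts = Counter()
    for references in citing_papers:
        # Authors regarded as cited by this paper under the chosen rule.
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        # Each unordered pair of cited authors is co-cited once per paper.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


# Two hypothetical citing papers, each with two cited works.
papers = [
    [["Smith", "Lee"], ["Jones", "Kim"]],
    [["Smith"], ["Kim", "Jones"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

In this toy example, first-author counting never pairs Jones with Kim, while first-five counting records them as co-cited in both papers, which illustrates how the non-traditional rule can surface links that the traditional rule misses.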
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship
