Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)
1506 research outputs found
Modified Convolutional Neural Network for Speaker Age and Gender Classification
Identifying a person's age and gender from speech signal characteristics poses a significant challenge in personal identity recognition systems, particularly when security considerations are involved. In signal processing applications such as speaker recognition, biometric identification, human-machine interface (HMI), and telecommunication, the estimation of age and gender from voice is a crucial and demanding problem. Deep learning models have demonstrated remarkable effectiveness in several signal processing domains. In this paper, we propose a modified one-dimensional convolutional neural network (1D-CNN) that identifies the age and gender of a speaker from MFCC speech features, combined with dimensionality reduction, random seeding, and cross-validation. We compare the modified 1D-CNN with machine learning models such as support vector classification (SVC), decision trees (DT), and random forests (RF). We apply dimensionality reduction techniques such as principal component analysis (PCA) and independent component analysis (ICA), along with random seeding and several cross-validation settings. We use the Children speech recording dataset, the Biometric Visions and Computing (BVC) dataset, and the Mozilla Common Voice dataset for estimating age and gender from speech. The models are evaluated and compared using metrics such as accuracy. The proposed 1D-CNN model shows promising performance compared with state-of-the-art (SOTA) approaches. Dimensionality reduction, the selection of speech features, and seeding have a significant impact on the performance of the proposed model.
Self-Supervised Learning for 3D Action Prediction Based on Past Completeness and Future Trend
The goal of the 3D action prediction task is to predict the action label corresponding to an incomplete 3D skeleton sequence. Existing studies are limited to the supervised framework. To eliminate the dependence of supervised learning on expensive labels, we propose a self-supervised learning method for 3D action prediction. We use three self-supervised tasks (action completeness perception, motion prediction, and global regularization) to allow the network to learn the past and future information embedded in the sequence of unfinished actions, i.e., the action completeness that has occurred and the future motion trend, and to optimize the feature space learned by the model. Some models ignore the past and future information embedded in partial sequences, which is the key to action prediction by humans. Based on our self-supervised method, we design two modules, an action completeness perceptron and a motion predictor, to complete missing information in partial inputs. We also propose a novel network structure that fuses partial and complete predictions to achieve more reasonable action prediction. We have conducted extensive experiments on different datasets, and the results validate the effectiveness of our proposed method.
An Approach Based on Coloured Petri Net and NSGA-II to Improve the Emergency Department
The health care sector faces major challenges in light of the appearance of new diseases. The emergency department (ED) is an important area of the hospital and plays a major role in meeting these challenges. The ED is a complex system due to the random flow of patients and the complex nature of its resources. In this paper, the authors model and simulate the ED by means of coloured Petri nets, and develop an NSGA-II algorithm to determine the appropriate amount of resources. The simulation model of the current system is then modified with the resource amounts obtained through the NSGA-II algorithm, and the results of the current system and the modified system are compared. This study was conducted in Hassani Abdelkader Hospital, located in the city of Sidi Bel Abbes, in western Algeria.
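The core selection idea behind NSGA-II, Pareto dominance, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the two objectives (mean waiting time and staffing cost, both minimised), the candidate values, and the function names `dominates` and `first_front` are assumptions made for the example, since the abstract does not state the objectives used.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def first_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of the points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical resource configurations, each scored as
# (mean waiting time in minutes, staffing cost in arbitrary units).
candidates = [(30.0, 10.0), (20.0, 14.0), (25.0, 16.0), (45.0, 8.0), (20.0, 18.0)]
front = first_front(candidates)
print(front)  # [0, 1, 3]
```

The full NSGA-II algorithm repeats this sorting to build successive fronts and adds crowding-distance ranking, crossover, and mutation; only the first front is shown here.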
Self-Supervised Learning for 3D Action Prediction with Graph Convolutional Recurrent Network
In view of the dependence of existing 3D action prediction research on labels, we propose a graph convolutional recurrent 3D action prediction method based on state discrimination and spatio-temporal self-supervised contrast learning. In the state discrimination task, cross-sample sampling and relative action completeness perception are used to train the model for generalized state information learning across instances and classes. In the spatio-temporal contrast task, spatio-temporal consistency information is introduced into the feature representation to enrich action semantics in features. Additionally, in order to fully extract spatio-temporal information in 3D action sequences, a spatio-temporal feature extraction network (STFEN) based on a graph convolutional recurrent network is designed. The experimental results on public datasets demonstrate the efficiency of the proposed methods.
Resource-Efficient Model for Deep Kernel Learning
According to the Hughes phenomenon, the major challenges encountered in computations with learning models come from the scale of complexity, e.g. the so-called curse of dimensionality. Approaches for accelerating learning computations range from the model level to the implementation level. The first type is rarely used in its basic form, perhaps because of the theoretical understanding of the underlying mathematics that it requires. We describe a model-level decomposition approach that combines decomposition of the objective function with decomposition of the data. We perform a feasibility analysis of the resulting algorithm in terms of both accuracy and scalability.
DeliteSeg: A Real-Time Semantic Segmentation Model for Predicting Small Objects and Object Contours
Semantic segmentation is one of the key technologies in the development of autonomous vehicles. Practical applications increasingly pursue a balance between effectiveness and efficiency. Many current lightweight segmentation models struggle to predict small objects and the edges between different objects. In this work, we propose DeliteSeg, a model with an encoder-decoder structure. First, we add deformable convolutional layers to the encoder, leveraging the advantages of deformable convolution to enable the model to better predict object edges. Second, we propose a new deep context aggregation module, DLPPM, which improves context information aggregation by fusing low-resolution feature maps of different scales multiple times, enabling the model to better predict small objects. Finally, we design a new lightweight attention decoder (LMD) that uses a spatial channel attention mechanism to refine feature maps at different levels, effectively recovering information. In extensive experiments, our network achieved 73.6 % mIoU and 123.7 FPS on the Cityscapes dataset and 73.9 % mIoU and 116.4 FPS on the CamVid dataset. The experimental results confirm that our proposed model makes appropriate trade-offs between accuracy and real-time performance.
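For readers unfamiliar with the mIoU figure quoted above, the metric can be sketched as below. This is a simplified illustration over flat label sequences, not the official Cityscapes or CamVid evaluation code; the function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions of this sketch.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes for flat label sequences.

    For each class c, IoU = |pred == c and target == c| / |pred == c or target == c|.
    Classes that appear in neither sequence are skipped.
    """
    intersection: List[int] = [0] * num_classes
    union: List[int] = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # A correct pixel counts once towards both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # A wrong pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


# Six "pixels" with three classes: per-class IoU is 1/3, 2/3 and 1/2.
score = mean_iou([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
print(round(score, 3))  # 0.5
```

In practice the counts are accumulated over all pixels of all test images before the per-class ratios are taken.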
New Family of Linear 3-Erasure Correcting Block Codes with Possible Application in Storage Systems
A construction of a new family of three-erasure-correcting linear block codes over GF(q) with characteristic two, together with their syndrome decoding procedures, is presented in this paper. The designed code distance of four was confirmed by demonstrating a decoding algorithm capable of correcting three erasures. The second confirmation was obtained from the weight spectra of selected codes, which were calculated using Krawtchouk polynomials derived from the weight spectra of their dual codes.
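The step of obtaining a weight spectrum from that of the dual code is, in its standard form, the MacWilliams identity expressed with Krawtchouk polynomials. The sketch below illustrates that general technique only; it does not use the codes from the paper. The example code (the binary [7,4] Hamming code, recovered from its dual [7,3] simplex code) and the function names are assumptions chosen for illustration.

```python
from math import comb
from typing import List


def krawtchouk(k: int, x: int, n: int, q: int = 2) -> int:
    """Krawtchouk polynomial K_k(x) for length n over an alphabet of size q."""
    return sum(
        (-1) ** j * (q - 1) ** (k - j) * comb(x, j) * comb(n - x, k - j)
        for j in range(k + 1)
    )


def weight_spectrum_from_dual(dual_spectrum: List[int], q: int = 2) -> List[int]:
    """Apply the MacWilliams identity: A_k = (1/|D|) * sum_i B_i * K_k(i),
    where B is the weight spectrum of the dual code D of length n."""
    n = len(dual_spectrum) - 1
    dual_size = sum(dual_spectrum)  # number of codewords in the dual code
    spectrum = []
    for k in range(n + 1):
        total = sum(b * krawtchouk(k, i, n, q) for i, b in enumerate(dual_spectrum))
        # For a valid spectrum the division is exact.
        spectrum.append(total // dual_size)
    return spectrum


# The [7,3] simplex code has one zero word and seven words of weight 4.
simplex = [1, 0, 0, 0, 7, 0, 0, 0]
hamming = weight_spectrum_from_dual(simplex)
print(hamming)  # [1, 0, 0, 7, 7, 0, 0, 1]

# The minimum distance is the smallest non-zero weight that occurs.
min_distance = next(w for w, count in enumerate(hamming) if w > 0 and count > 0)
print(min_distance)  # 3
```

Reading the minimum distance off the computed spectrum is the kind of check the abstract describes; for the paper's codes the smallest non-zero weight would be expected to equal the designed distance of four.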
Efficient Drone Detection Method Based on YOLOv8s Improvement
Combating illegal drone activities is an important task for national defense and security, and detecting drones quickly and accurately is the key. While there are many ways to detect drones, their inference is generally slow and complex. Therefore, in this work, we propose YOLOv8s-C3AS, an improved and efficient UAV detection method based on YOLOv8s. The approach has three main improvements. First, we propose a new Coordinate Channel Spatial Attention Module (CCSM) and add it to the backbone of the model to enable better feature extraction. Second, to solve the scale inconsistency problem of the YOLOv8s PANet, we propose a new adaptive fusion feature network (PANet-AF), which enables the model to better fuse the features of the three scales. Third, we use a more reasonable bounding box regression loss function, SIoU, which improves the detection accuracy of the model without cost. Finally, we refined and made public the drone dataset and conducted a series of experiments combined with the PASCAL VOC dataset. Our proposed approach achieves 77.2 % mAP, 98.9 % mAP_50, 87.1 % mAP_75 and 120.5 FPS on the drone dataset. Experiments demonstrate that our proposed method outperforms other methods, achieving high detection accuracy while maintaining faster inference speed and fewer model parameters. The drone dataset used for this research has been uploaded to Kaggle: https://www.kaggle.com/datasets/zhangtutu123/drone-dataset123/dat
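Bounding box regression losses such as SIoU, and the mAP_50 and mAP_75 thresholds quoted above, are built on the plain intersection-over-union of two boxes. The sketch below shows only that base quantity, not the SIoU loss itself; the `(x1, y1, x2, y2)` corner format and the function name `box_iou` are assumptions of this example.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
overlap = box_iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
print(round(overlap, 4))  # 0.1429
```

Under mAP_50 a detection counts as correct when this value is at least 0.5 against a ground-truth box, and under mAP_75 when it is at least 0.75.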
Triple-GCN: Enhanced Multi-Feature Graph Convolutional Network for Aspect-Based Sentiment Analysis
Aspect-Based Sentiment Analysis (ABSA) aims to predict the sentiment polarity of a given aspect word within a sentence. Recent studies frequently treat syntactic and semantic features as independent representations, thereby overlooking their intrinsic correlation. In addition, most existing methods neglect the significance of dependency types, which reduces the accuracy of sentiment analysis. Research based on cognitive theory indicates a mutual influence between syntax and semantics. Based on this, we propose an ABSA model based on an enhanced multi-feature graph convolutional network (Triple-GCN). First, a shared enhanced graph convolutional module is proposed to integrate syntactic and semantic information, and a thorough fusion of this information is then carried out. In addition, relation and adjacency matrices are used to reconstruct the hidden state vectors. The syntactic graph convolution module dynamically fuses hidden state vectors and dependency features. Additionally, a position weight encoding function is designed to capture sentiment dependencies by drawing attention to words near the aspect. On the semantic side, dynamic semantic graphs are constructed to capture semantic features. The model has been evaluated on three public datasets: Twitter, Laptop14, and Restaurant14. Compared with existing baseline models, this model is noticeably more effective.
Optimizing Security and Performance in Blockchain-Enhanced Federated Learning Through Participant Selection with Role Determination
Federated learning (FL) allows distributed devices to jointly train a global model while safeguarding the privacy of their local data. However, selecting and securing clients, especially in environments with potentially malicious participants, remains a critical challenge. This study proposes an innovative participant selection method to enhance both security and efficiency in centralized and decentralized FL frameworks. In the centralized framework, the method excludes clients with weak privacy protections and optimization capabilities, thus increasing overall system security. For decentralized FL, a blockchain-supported approach is introduced, which further strengthens the robustness of the system. Using a dynamic role assignment algorithm, roles such as worker, validator, and miner are allocated based on security and performance metrics for each training round. The findings show that this method performs on a par with scenarios free of malicious clients, demonstrating the value of blockchain technology in improving FL protocols. By addressing security vulnerabilities and improving training efficiency, this research contributes to the development of more secure and efficient FL systems, underscoring the importance of advanced participant selection and role assignment strategies.