Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)
    1506 research outputs found

    Ensemble Prediction of Business Process Remaining Time Based on Random Forest and XGBoost

    Business processes in information systems are complex and diverse, and a single machine learning method often overfits to noise or dataset-specific patterns in the training data. On large datasets the model is also computationally expensive, and the resulting poor performance on new data makes accurate monitoring and prediction of business processes difficult. To address this, a two-layer machine learning framework based on stacking, the Serial Stacking Framework, is presented. Starting from the event log, the method performs random group sampling with replacement, trains a multi-objective regression model, and applies multiple machine learning models in series: the predictions of each model are used to generate training data for the next, so that the predictive performance of the models accumulates sequentially. Random Forest and XGBoost are used as the concrete stacked ensemble models, and the proposed method is evaluated experimentally against existing advanced methods. The results show that the mean absolute error of the model built by the serial stacking framework with random group sampling and multi-objective regression is at least 2.14 % lower than that of the single machine learning models, the conventional stacking frameworks, and the latest methods.
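
The serial idea in the abstract above (each model's predictions feed the training of the next) can be sketched as follows. This is a minimal illustration, not the authors' implementation: one-dimensional least-squares lines stand in for Random Forest and XGBoost, the multi-objective regression step is omitted, and the names `fit_line`, `bootstrap`, and `serial_stack` are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for a real regressor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx

def bootstrap(rows, rng):
    """Draw len(rows) rows at random, with replacement."""
    return [rng.choice(rows) for _ in rows]

def serial_stack(rows, n_groups=5, seed=0):
    """rows: list of (feature, target) pairs. Returns a predict(feature) function."""
    rng = random.Random(seed)
    # Layer 1: one simple model per bootstrap group; their outputs are averaged.
    layer1 = []
    for _ in range(n_groups):
        sample = bootstrap(rows, rng)
        layer1.append(fit_line([x for x, _ in sample], [y for _, y in sample]))

    def predict1(x):
        return sum(a * x + b for a, b in layer1) / len(layer1)

    # Layer 2: trained on the layer-1 predictions instead of the raw feature.
    a2, b2 = fit_line([predict1(x) for x, _ in rows], [y for _, y in rows])
    return lambda x: a2 * predict1(x) + b2
```

For example, `serial_stack([(x, 3 * x + 2) for x in range(10)])` returns a predictor that recovers the underlying line on this noise-free toy data.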

    Enhancing Large-Scale Code Understanding Through Goal Structuring Notation and Large Language Models

    Large language models (LLMs) aid programmers in understanding code but are limited by input length when handling large codebases. To address this, we propose using Goal Structuring Notation (GSN) – originally developed for articulating assurance cases in complex engineering projects – to represent and break down large codebases. We introduce a tool that leverages LLMs to automatically convert large code into GSN. The generated GSN provides an overview that simplifies code comprehension and enhances communication among programmers. Experimental results demonstrate that our approach significantly increases programmers’ confidence levels and reduces task completion times

    Few-Shot Semantic Segmentation with Frequency Prototype Learning

    Few-shot semantic segmentation is a challenging task aimed at segmenting new objects in the query image with only a few annotated support images. Most advanced methods for this task focus on either global or local prototype learning through global average pooling (GAP) or clustering. However, due to the limitations of the averaging and clustering operations, these methods still fail to fully exploit the object information in the support images. To address these limitations, we propose a generalization of prototype learning in the frequency domain through multi-frequency pooling (MFP) to mine both local and global object information. Based on MFP, we further build a Frequency Prototype Network (FPNet) consisting of three novel designs. First, the Frequency Prototype Generation Module (FPGM) extracts frequency prototypes by MFP in the DCT domain to provide complete object guidance information. Then, the Prior Attention Mask Module (PAMM) produces a prior attention mask to identify a query target more precisely and retain high generalization. Finally, the Frequency Prototype Selection Module (FPSM) selects the most effective support prototypes to reduce redundancy. Extensive experiments on PASCAL-5i and COCO-20i demonstrate that our model achieves state-of-the-art performance in both 1-shot and 5-shot settings.
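
One way to see why pooling in the DCT domain generalizes GAP: projecting a feature map onto the lowest-frequency DCT-II basis function gives the sum of its entries, which is GAP up to a constant factor, while higher frequencies capture spatial variation that averaging discards. The sketch below illustrates only this relationship. The abstract does not give the exact MFP formulation, so the unnormalized DCT-II basis and the function names here are assumptions.

```python
import math

def frequency_pool(feature_map, u, v):
    """Project a 2-D feature map onto the (u, v) DCT-II basis function (unnormalized)."""
    h, w = len(feature_map), len(feature_map[0])
    total = 0.0
    for i, row in enumerate(feature_map):
        for j, value in enumerate(row):
            total += (value
                      * math.cos(math.pi * u * (i + 0.5) / h)
                      * math.cos(math.pi * v * (j + 0.5) / w))
    return total

def global_average_pool(feature_map):
    """Plain GAP: the mean of all entries."""
    h, w = len(feature_map), len(feature_map[0])
    return sum(sum(row) for row in feature_map) / (h * w)
```

With `fm = [[1, 2], [3, 4]]`, `frequency_pool(fm, 0, 0) / 4` equals `global_average_pool(fm)`, whereas `frequency_pool(fm, 1, 0)` is nonzero because the rows differ.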

    Comparative Visualisation of Algorithms and Data Structures

    Algorithms and data structures are principal parts of computer science education. For many students, however, it is not easy to master them due to their diversity and inherent complexity. The application of algorithm visualizations is a widely adopted approach, which can help to mitigate this difficulty. Within this work, we aim to improve the efficiency of the learning process in the field of algorithms and data structures. The main directions we use in this work to reach this goal are the introduction of comparative algorithm visualization and the implementation of the visualization tool based on contemporary standards. We analyze and compare several of the available solutions for algorithms and data structure visualization and evaluate them according to the provided functionalities. Further, we define a list of requirements, including the capability to compare selected algorithms visually. The practical outcome of this work is a web application that allows us to visualize and compare different algorithms and data structures in terms of their operation and efficiency. At the end of the paper, the proposed solution is evaluated in several ways

    Exploring Performance and Energy Optimization in Serverless Computing: A Review

    Serverless computing brings another revolution to cloud computing as function-as-a-service (FaaS) where the applications are abstracted as a group of functions. Serverless applications are cost-effective and manage resources efficiently but the lack of performance modeling and energy optimization affects the potential users' broad adoption of serverless computing. Performance enhancement and energy optimization are necessary to guarantee the serverless applications' service level agreement (SLA). This review paper presents various performance metrics in serverless computing, including cost, scalability, latency, energy consumption, resource utilization, fault tolerance, and response time. Based on these metrics, various performance modeling and energy optimization techniques have been explored to reduce energy consumption and improve system efficiency. Furthermore, the review investigates software platforms for implementing serverless computing, including AWS Lambda, Apache OpenWhisk, Azure Functions, and Google Cloud Functions, highlighting key findings and limitations. This comprehensive review serves as a guide for researchers, directing them toward new and promising research directions in the field

    Enhancing Real-Time Rumor Detection on Weibo Through User and Content Feature Integration

    Weibo has emerged as a vital platform for Chinese netizens to share information, but it has also given rise to numerous rumors. Real-time detection methods that do not rely on propagation features are the most effective way to curb the spread of these rumors. Current real-time detection methods that use deep learning to mine semantic features of rumor text lack sufficient generalization ability. Therefore, we propose a real-time rumor detection method integrating multiple user and content features. In addition to standard basic user features, our approach uses the user's historical posting data to extract two deep-level features: user rationality and professionalism. For content, in addition to standard statistical features, we use a graph attention network that considers edge weights to learn deep semantic features. The user and content features are concatenated and fed into a multi-layer perceptron for classification. Experimental results on a real Weibo dataset show that the proposed method achieves an accuracy of 92.6%, outperforming all compared baseline methods.

    Fault Feature Extraction in Rolling Bearings Using Time-Frequency Analysis and Optimized Variational Mode Decomposition

    This paper presents a comprehensive investigation of vibration signal analysis and fault feature extraction for rolling bearings. It studies a Variational Mode Decomposition (VMD) method enhanced with Tucked Swarm Algorithm (TSA) optimization and Maximum Correlated Kurtosis Deconvolution (MCKD), and proposes a method for identifying the optimal parameter configuration for VMD. The proposed method is applied to rolling bearing vibration signals, and its efficacy in feature extraction is validated through comparative analysis. Kurtosis, envelope spectral kurtosis, and other indicators serve as the basic features of the vibration signals. From these, a multi-feature vector dataset is constructed, and a Least Squares Support Vector Machine (LSSVM) is used as the fault type classifier to validate the effectiveness of the proposed feature extraction method. The results demonstrate that the fault identification accuracy achieved by the proposed method consistently exceeds 96 %.
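
Kurtosis, one of the basic features named in the abstract above, is the fourth central moment divided by the squared variance. Impulsive content, such as the periodic shocks a bearing defect produces, raises it. The sketch below uses the plain (non-excess, population) definition, which is an assumption, since the abstract does not say which variant the paper uses.

```python
def kurtosis(signal):
    """Population kurtosis m4 / m2**2 (not excess kurtosis).

    Raises ZeroDivisionError for a constant signal, whose variance is zero.
    """
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((x - mean) ** 2 for x in signal) / n  # variance
    m4 = sum((x - mean) ** 4 for x in signal) / n  # fourth central moment
    return m4 / (m2 ** 2)
```

A square-wave-like signal such as `[1, -1] * 8` has kurtosis 1.0, while a mostly flat signal with a single spike, such as `[0.0] * 15 + [4.0]`, scores far higher.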

    Research on Dense Detection Algorithm for Brown Mushroom Based on Improved YOLOv7

    For the complex environment of industrialized brown mushroom cultivation, a dense brown mushroom detection algorithm based on an improved YOLOv7 is proposed. It addresses the low real-time detection accuracy and speed and the high false detection rate of picking robots working on densely grown brown mushroom clusters. To prevent network degradation, improve detection accuracy and speed, and reduce computational cost, the ELAN_PS module replaces the original ELAN module. The AFPN network replaces the Neck of the original network for multi-scale fusion, allocating different spatial weights to feature maps to enhance the model's ability to separate dense targets. The MDIoU loss function is introduced as the bounding box loss to speed up the convergence of network training and improve the detection accuracy for densely occluded brown mushrooms. The improved algorithm is trained and tested on a self-built industrialized brown mushroom dataset. Compared to the original YOLOv7, detection speed increases by 15.5 %, detection accuracy by 6.4 %, and mean average precision (mAP) by 6.9 %.

    Abilities of Contrastive Soft Prompting for Open Domain Rhetorical Question Detection

    In this work, we start by demonstrating experimentally that rhetorical question detection is still a challenging task, even for state-of-the-art Large Language Models (LLMs). We propose an approach that boosts the performance of such LLMs by training a soft prompt in a way that enables building a joint embedding space from multiple loosely related corpora. The advantage of using a soft prompt rather than fine-tuning is that it limits training costs and combats overfitting and forgetting. Soft prompting is often viewed as a way to guide the model towards a specific known task, or to introduce new knowledge into the model through in-context learning. We show that soft prompting may also be used to modify the geometry of the embedding space, so that the distance between embeddings becomes semantically relevant for a target task, similarly to what is commonly achieved with contrastive fine-tuning. We exploit this property to combat data scarcity for the task of rhetorical question detection by merging several datasets into a joint semantic embedding space. On the standard Switchboard dataset, we demonstrate that the resulting BERT-based model nearly halves the number of errors compared to Flan-T5-XXL with only 5 few-shot labeled samples, thanks to this joint embedding space. We chose a BERT model for our experiments because S-BERT has already shown that contrastive fine-tuning of BERT leads to semantically meaningful representations. We therefore also show that this property of BERT transfers well to the soft-prompting paradigm. Finally, we qualitatively analyze the resulting embedding space and propose a few heuristic criteria to select appropriate related tasks for inclusion into the pool of training datasets.

    ADOCIL: Enhancing Image Classification with Attention Distillation for Online Class-Incremental Learning

    Catastrophic forgetting is a major challenge for online class-incremental learning. Existing replay-based methods have achieved a certain degree of effectiveness, but they are limited because they consider neither the quality of the samples nor the key semantic information in a single-pass data stream. To address these issues, we propose Online Class-Incremental Learning based on Attention Distillation (ADOCIL), a framework consisting of three parts. First, a two-stage sampling method is used in the replay stage to improve the quality of the sampled data. Second, we introduce Attention-based Dual-View Consistency (ADVC), which enables the model to fully explore the critical semantic information within a single-pass data stream. Third, to further mitigate catastrophic forgetting, we introduce attention distillation, which transfers the attention maps of the teacher model to the student model, thereby addressing the forgetting of historical tasks. Extensive experiments demonstrate the effectiveness of ADOCIL.
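
Attention distillation, as named in the abstract above, is commonly implemented as a penalty on the distance between teacher and student attention maps. The sketch below shows one common form, the squared L2 distance between unit-normalized maps. The abstract does not specify ADOCIL's actual loss, so this formulation and the function names are assumptions for illustration only.

```python
import math

def normalize(attention_map):
    """Flatten a 2-D attention map and scale it to unit L2 norm."""
    flat = [v for row in attention_map for v in row]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0  # avoid dividing by zero
    return [v / norm for v in flat]

def attention_distillation_loss(teacher_map, student_map):
    """Squared L2 distance between normalized teacher and student attention maps."""
    t, s = normalize(teacher_map), normalize(student_map)
    return sum((a - b) ** 2 for a, b in zip(t, s))
```

Because both maps are normalized first, the loss ignores overall scale and penalizes only differences in where the two models attend.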
