KTU Open Journal Systems (Kaunas University of Technology)
Image Enhancement Model for Open-Pit Mine Monitoring Based on Parallel Multi-Scale Feature Fusion
The workspace in open-pit mining systems often suffers from insufficient or uneven illumination due to spatial constraints and obstructions caused by large equipment or geotechnical structures, leading to degraded surveillance imagery and consequently impacting safety monitoring efforts. This study designed an open-pit mine surveillance image enhancement model based on a parallel multi-scale feature fusion Transformer to address the degradation of surveillance video images and leverage the superior expressive power of Transformer networks in visual image processing compared to other networks. The network architecture mainly processes and integrates full-size feature maps and various levels of downsampled feature maps in parallel, preserving both the semantic relationships of image elements and their overall structure. The downsampling process of the network aims to maximize the extraction and restoration of the luminance features of small-sized objects from low-resolution images. By integrating features from downsampling, full-size image processing effectively restores illumination, thereby enhancing the accuracy of the images. To reduce the computational demands of the Transformer structure and facilitate its application in monitoring imagery, we employed an orthogonal self-attention mechanism along both the rows and columns of the image to be processed. This mechanism shifts the network's computational demand from exponential to linear growth. During the training phase, the network model was trained using a dataset collected on-site to enhance the model's adaptability to field conditions. SSIM and PSNR test results confirm that this model performs exceptionally well in open-pit mining production systems.
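The abstract reports PSNR as one of its evaluation metrics without stating how it was computed. As a minimal sketch, assuming 8-bit grayscale images stored as nested lists and the standard definition PSNR = 10·log10(MAX² / MSE), the metric could be computed as follows. The function name `psnr` and the `max_value` parameter are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale images.

    Images are lists of rows; max_value is the assumed peak pixel value.
    """
    if len(reference) != len(enhanced) or any(
        len(r) != len(e) for r, e in zip(reference, enhanced)
    ):
        raise ValueError("images must have the same shape")

    pixel_count = sum(len(row) for row in reference)
    # Mean squared error over all pixel pairs.
    mse = sum(
        (a - b) ** 2
        for r, e in zip(reference, enhanced)
        for a, b in zip(r, e)
    ) / pixel_count

    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value indicates that the enhanced image is closer to the reference; identical images give an unbounded score.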
NAP-CycleGAN: A New Cyclegan-Based CT Images Synthesis Model For Clinical Image Reconstruction Using Brain MR Images
The intricate structure of the brain often necessitates the combined use of magnetic resonance (MR) and computed tomography (CT) imaging for comprehensive diagnostics in clinical care. However, certain patients cannot be exposed to radiation-intensive CT scans, leading to data scarcity and affecting subsequent treatment. In this regard, this paper proposes a new model, noise-attention-pix2pix-CycleGAN (NAP-CycleGAN), which replaces the generator with pix2pixHD, utilizing multi-scale strategies and context-aware modules. By integrating channel attention, the model effectively extracts relevant image features, allowing adaptive weight assignment and handling of long-range dependencies. Additionally, Gaussian noise is introduced to the discriminator to counteract adversarial sample attacks and prevent gradient vanishing. Furthermore, feature matching loss and cycle consistency loss are integrated to reduce image detail distortion. To verify its validity, the model is compared with seven state-of-the-art methods. The experimental results on the public brain dataset brain01 show that the proposed model outperforms these methods and that its synthetic CTs are closest to the original CT (RCT) images.
A Prediction Method for Highway Traffic Flow Based on the IHPO-VMD-LSTM-Informer Model
Accurate and timely predictions of highway traffic flow are crucial for implementing intelligent highway management. This paper introduces a novel prediction approach for highway traffic flow that employs the IHPO-VMD-LSTM-Informer model to enhance prediction accuracy. Initially, key indicators measuring highway traffic are identified, and Nonlinear Principal Component Analysis (NPCA) is applied to minimize the dimensionality and interdependence among these indicators. This reduction process replaces the original complex indicators with a smaller number of principal components, thereby simplifying the feature matrix's structure. Subsequently, Variational Modal Decomposition (VMD) processes historical highway traffic flow data, enhanced by the strategically improved Hunter-Prey Optimization (HPO) algorithm. This optimization facilitates adaptive parameter adjustments for the VMD, enabling effective decomposition of highway traffic flow time series data. The Sample Entropy (SE) of Intrinsic Modal Functions (IMFs) from this decomposition is used with the substantial indicators to form a comprehensive feature matrix. Then, the predictive module combines a Long Short-Term Memory (LSTM) network with the Informer architecture to accurately predict highway traffic flow from the feature matrix. The effectiveness of the proposed model is verified using the public motorway traffic dataset KDD CUP 2017. The results indicate that the proposed model outperforms available ones in terms of prediction accuracy, with MAPE and RMSE values of 8.09 and 2.84, respectively, thus significantly advancing intelligent highway management.
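The abstract reports MAPE and RMSE but does not give their formulas. As a minimal sketch, assuming the standard definitions (RMSE as the square root of the mean squared error, MAPE as the mean absolute relative error in percent) and nonzero observed values, the two metrics could be computed like this. The function names and sample numbers are illustrative and do not come from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted series."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every observed value is nonzero; MAPE is undefined otherwise.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical traffic counts, used only to exercise the functions.
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
print(rmse(observed, forecast))
print(mape(observed, forecast))
```

RMSE is expressed in the units of the traffic series, while MAPE is scale-free, which is why the two are often reported together.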
An Early Warning Model for Industrial Network Security Issues: A Crafted Strategy for High Accuracy Based on Machine Learning Approach
Industrial networks have become an important infrastructure. As they develop, their cybersecurity problems become increasingly prominent. Attacks on networks are advancing faster than ever, and their destructive force keeps growing. Thus, the available early warning technology for industrial network security issues requires more accuracy and timeliness, since considerable delays occur in real cases. The article proposes a high-accuracy strategy based on a machine-learning algorithm. Two aspects are underlined as a significant part of the problem: the nonlinear, high-dimensional data with different feature characteristics in cyber-attacks, and the low training efficiency of conventional early warning models for predicting attacks. Thus, the manuscript suggests a feature selection method based on the Tuna Swarm Optimization (TSO) algorithm to filter out redundant features and reduce the data’s dimensionality. Then, the Extreme Learning Machine (ELM) and Auto-Encoder (AE) are combined to construct a model called Extreme Learning Machine-Auto Encoder (ELM-AE), which serves as the basis of the early warning model for industrial network security. Afterward, the improved Whale Optimization Algorithm (I-WOA) is used to optimize the parameters of the ELM, yielding the optimized model. Finally, the optimized model is applied to detect attacks on industrial cyber security systems as an early warning method. The proposed model is tested by constructing an evaluation index system that measures how effectively the early warning system functions. The experimental results show that the proposed warning model for industrial network security issues achieves high warning accuracy and efficiency at the same time, providing an advanced early warning model for network attacks. With 92.64% precision and a 51.84 s average execution time, the proposed model outperforms the other methods.
Driver Fatigue Detection Based on Multiple Physiological Signals and an Improved Deep Belief Network
To accurately discriminate driver fatigue, multiple physiological signals of 10 drivers, including neck electromyography (EMG) and electroencephalography (EEG), were collected by a wireless body area network during actual driving. Noise was then removed from the signals by several denoising methods, and 22 features were extracted, including energy entropy, multiscale entropy, and other relevant features. Subsequently, a deep belief network (DBN) was used to further extract multi-domain features, and a grey wolf optimization algorithm was used to optimize the performance of the DBN. The results showed that the model built in the present work reached an accuracy of up to 96% in discriminating the fatigue states.
Bi-Encoder Polyp Net: A Novel Architecture for Enhanced Polyp Segmentation in Endoscopic Images
Automatic polyp segmentation in endoscopic images holds critical clinical value for early colorectal cancer diagnosis. While existing segmentation models have achieved notable progress, two key challenges persist in algorithmic performance improvement. First, dynamic adjustments of colonoscope tip orientation during examinations induce viewpoint variations, which amplify polyp appearance diversity and hinder robust feature learning. Second, the inherent similarity between polyps and surrounding tissues leads to blurred boundaries. Although convolutional neural networks (CNNs) have demonstrated significant advancements, their limitations in modeling global dependencies and reliance on aggressive downsampling operations often cause redundant network structures and local detail loss. To address these bottlenecks, we propose Bi-Encoder Polyp Net – a novel parallel architecture integrating Pyramid Vision Transformer and ResNet. This dual-branch design effectively captures global contextual dependencies while preserving low-level spatial details. A feature alignment module bridges the semantic gap between dual-branch feature maps, and an iterative semantic embedding unit further injects high-level semantic information into aligned low-level features. Extensive experiments across five public polyp segmentation benchmarks validate the network’s effectiveness, demonstrating superior capability in processing real-world colonoscopy images.
REWeather: A Unified Detection Framework for Automatic Driving Images Restoration and Enhancement in Adverse Weather
The rapid development of autonomous driving technology requires vehicle detection technology to continuously improve its accuracy, stability, and reliability to better meet the needs of self-driving. However, interference from adverse weather reduces the detection accuracy of autonomous vehicles and leads to missed and false detections, which seriously affects their safety. Therefore, we propose REWeather to solve these problems for autonomous vehicles in multiple adverse weather conditions. First, REWeather uses the Broad Learning System (BLS), which is simple and efficient, to classify the type of adverse weather, distinguishing among foggy, rainy, and snowy conditions. Because adverse weather degrades sensor input, simple dark channel and guided filtering methods are used to preprocess foggy and rainy images, respectively. Then, we feed the processed images into the Real-Enhanced Super-Resolution Generative Adversarial Networks (Real-ESRGAN) for further denoising and enhancement of the details of detected objects, enabling the sensor to recognize other targets on the road faster and better in adverse weather. To ensure the best detection results, we also use the latest Real-Time Detection Transformer (RT-DETR) as the detector to validate our work, and the final model is deployed on an edge device. Moreover, we combine several public datasets with our own collected data to build a real-world dataset containing a variety of adverse weather conditions to train and test our proposed framework, which brings it closer to real situations. The results demonstrate that our framework achieves a 3.8% improvement in mAP, significantly enhancing the detection capability of autonomous vehicles under adverse weather conditions.
A CLIP-Based Cross-Modal Matching Model for Image-Text Retrieval
In recent years, the demand for multimodal data retrieval has been growing rapidly. As two major modalities for information transmission, images and texts exhibit significant differences in feature distribution. To address challenges in image-text retrieval, such as balancing efficiency with performance and enhancing semantic modelling, this paper proposes an efficient cross-modal feature matching model based on the CLIP framework, comprising two parts: feature extraction and contrastive learning. During feature extraction, pre-trained ViT and BERT models are used to capture deep semantic features of images and texts, which achieve significant improvements in Feature Entropy (text: 4.27 vs. 3.62; image: 4.13 vs. 3.47) and Mutual Information (28.3% for text, 31.5% for image) compared with the baseline, indicating stronger semantic expressiveness and alignment. Through contrastive learning with a cosine-based loss function and Adam optimization, the model ensures stable convergence. Furthermore, preprocessing innovations such as removing redundant text tokens and Base64 image encoding boost training efficiency. Experiments on a dataset of 50,000 image-text pairs demonstrate that our model achieves high and stable retrieval performance, with R@1, R@5, and R@10 scores ranging from 80% to 90%. Compared to the classic DeViSE model, our approach yields improvements of 12.9%, 10.0%, and 9.0% across the three metrics, confirming the model’s superior accuracy and generalization in large-scale retrieval scenarios. Finally, the model is evaluated on image-text retrieval tasks, where it consistently demonstrates strong cross-modal matching capabilities and accurately captures the semantic associations between images and texts.
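The R@1, R@5, and R@10 scores quoted in the abstract are Recall@K values. As a minimal sketch of how such a score is typically obtained, the code below ranks gallery embeddings by cosine similarity and checks whether the matching item appears in the top K. It assumes that query i and gallery item i form the ground-truth pair; the function names and the tiny two-dimensional vectors are hypothetical and stand in for the ViT and BERT features used in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recall_at_k(queries, gallery, k):
    """Fraction of queries whose paired item (same index) ranks in the top k."""
    hits = 0
    for i, query in enumerate(queries):
        # Gallery indices sorted from most to least similar to the query.
        ranked = sorted(
            range(len(gallery)),
            key=lambda j: cosine(query, gallery[j]),
            reverse=True,
        )
        if i in ranked[:k]:
            hits += 1
    return hits / len(queries)


# Hypothetical text (query) and image (gallery) embeddings.
text_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.2]]
image_vecs = [[0.9, 0.1], [0.2, 1.0], [0.7, 0.7]]
print(recall_at_k(text_vecs, image_vecs, 1))
print(recall_at_k(text_vecs, image_vecs, 2))
```

In this toy data the third query ranks its paired image second, so it counts as a miss at K = 1 and a hit at K = 2.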
Event Sentiment and Cross-Country Herding Spillover Effects Using Machine Learning
This study investigates herding behaviour and its cross-country spillover effects in the UK, US, China, and Pakistan stock markets in the presence of event sentiment. We used three machine learning models for the empirical investigation: support vector regression, single-layer neural networks, and multi-layer neural networks. Daily data on the listed stocks were used. The results suggest a significant predictability of the Twitter sentiment of Brexit 2016 and COVID-19. Cross-country herding spillover is also evident from the UK to Pakistan and the US in the case of Brexit 2016. Similarly, there is a herding spillover effect from China to the Pakistan and UK stock markets. The overall results of machine learning models are more significant than those of linear regression models. Furthermore, the event sentiment improves the performance of the machine learning models. The study provides a deep insight for individual and institutional investors to take care of unpredicted events while constructing their international portfolios in these stock markets.
Sustaining in Uncertain Time: Investigating Pension Fund Performance during Market Stress
The academic discourse on stress in the global economy and financial markets has ignited discussions regarding regulatory oversight of pension fund management and investment strategies. This study investigates how pension funds (PF) respond to short-term and long-term risks, as well as their recovery periods following market shocks. To address these questions, we classify financial market stress, considering both short-term and long-term risks. Utilizing the change point detection technique and Bayesian average (Zhao et al., 2019), we analyse shifts in the dynamics of PF values managed by SEB and Swedbank from 2004 to 2023. The research explores not only the timing and number of change points but also their likelihood over time. Drawdowns, recovery rates, and timing ratios are particularly insightful for assessing PF performance during crises and market disturbances. These findings contribute to the understanding of PF behaviour in various market conditions and underscore the significance of adaptive investment strategies in navigating financial uncertainties.
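The abstract names drawdowns and recovery as key measures but does not define them. As a minimal sketch, assuming the common definition of drawdown as the relative fall from the running peak, and recovery time as the number of periods from the trough back to that peak, the two quantities could be computed from a series of fund unit values as follows. The function name and sample values are hypothetical and are not taken from the study.

```python
def drawdown_and_recovery(values):
    """Return (maximum drawdown, periods from trough to recovery).

    Maximum drawdown is a fraction of the preceding peak. Recovery is None
    if the series never regains that peak, and 0 if there is no drawdown.
    """
    peak = values[0]
    worst = 0.0
    worst_peak = values[0]
    trough_index = 0

    for i, value in enumerate(values):
        if value > peak:
            peak = value
        drawdown = (peak - value) / peak
        if drawdown > worst:
            worst, worst_peak, trough_index = drawdown, peak, i

    if worst == 0.0:
        return 0.0, 0

    # Search forward from the trough for the first return to the old peak.
    for j in range(trough_index + 1, len(values)):
        if values[j] >= worst_peak:
            return worst, j - trough_index
    return worst, None


# Hypothetical fund unit values, used only to exercise the function.
print(drawdown_and_recovery([100, 120, 90, 96, 110, 125]))
```

For the sample series the peak is 120 and the trough is 90, so the maximum drawdown is 25% and the peak is regained three periods after the trough.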