International Journal of Computer (IJC - Global Society of Scientific Research and Researchers, GSSRR)
PostgreSQL JSONB-based vs. Typed-column Indexing: A Benchmark for Read Queries
A document storage pattern is used in a wide range of business applications, particularly in systems with strong requirements for immutability and auditability. Using the JSONB data type is a common approach to achieving document storage in PostgreSQL, which supports indexing for JSONB. However, building an efficient system for such structures remains a challenge in terms of read latency, mainly due to payload size, recheck cost and value extraction.
In this paper, the comparative performance of regular typed-column indexes and JSONB-based indexes is evaluated across ten queries typical of a production application.
The collected metrics are used to visualize the latency of indexes on JSONB expressions and of typed-column indexes, and to run a one-sided statistical test.
The findings show that JSONB-based indexes are not an optimal solution in terms of read performance, and overall typed-column indexes are expected to be at least 20% faster for tables with 1 million records or more.
This benchmark is important to consider when designing efficient data storage for documents.
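The one-sided comparison mentioned in the abstract can be illustrated with a short sketch. It is not the authors' procedure: the abstract does not name the test, so a permutation test on mean latency is assumed here, and the function names and parameters are hypothetical.

```python
import random
from statistics import mean


def relative_speedup(jsonb_ms, typed_ms):
    """Fraction by which the mean typed-column latency is lower than the JSONB one."""
    return 1.0 - mean(typed_ms) / mean(jsonb_ms)


def one_sided_permutation_test(jsonb_ms, typed_ms, n_resamples=2000, seed=0):
    """Estimate a one-sided p-value for 'JSONB reads are slower than typed reads'.

    Assumption: a permutation test on the difference of means; the paper may
    have used a different one-sided test.
    """
    rng = random.Random(seed)
    observed = mean(jsonb_ms) - mean(typed_ms)
    pooled = list(jsonb_ms) + list(typed_ms)
    n = len(jsonb_ms)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the group labels are exchangeable.
        rng.shuffle(pooled)
        gap = mean(pooled[:n]) - mean(pooled[n:])
        if gap >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)
```

Both functions take plain lists of per-query latencies in milliseconds, one list per index type. A speedup of 0.2 or more together with a small p-value would correspond to the "at least 20% faster" reading of the abstract.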
Paradigms of ELT Data Pipeline Architectures for LLM Training
This article presents a systematic analysis of ELT pipeline architectures used in the training of large language models. The study is based on an interdisciplinary approach that integrates engineering principles of data infrastructure design, theoretical foundations of transformer architectures, and data flow automation practices under conditions of high source variability. Particular attention is given to the content analysis of scientific and applied publications addressing the role of LLMs in transformation loops, the implementation of agent-oriented solutions, and the support of multimodal adaptive pipelines. Key ELT architecture types are identified, including prompt-driven, agent-based, high-throughput, and cognitively enhanced solutions, reflecting varying levels of model involvement in data processing. The analysis shows that architectural shifts toward feedback integration and dynamic routing enable the creation of robust and adaptive solutions suited to contemporary training scenarios. Special emphasis is placed on issues related to data stream instability, the lack of benchmarks for agent-based systems, and insufficient integration of pipelines with model evaluation mechanisms. The paper proposes a conceptual classification of ELT paradigms and an outline of their adaptive evolution toward building scalable and logically coherent infrastructures. The article will be of interest to researchers in machine learning systems, LLM infrastructure developers, data platform architects, and professionals in digitalization and automation of AI training workflows.
Best Practices for Personal Data Protection in Scalable Enterprise Applications
This article examines existing methods for protecting personal data in corporate applications, considering modern challenges in dynamic and distributed cloud environments. The study includes an extensive analysis of encryption techniques, such as proxy encryption, DNA encryption, dual encryption with fragmentation, and homomorphic encryption, as well as mechanisms for ensuring data integrity and access control. These mechanisms include cryptographic hash functions, digital signatures, message authentication codes (MAC), role-based and attribute-based access control, federated authentication, and multi-factor authentication. The research also reviews publicly available studies found on the Internet. Particular attention is given to the specifics of data protection in cloud infrastructures, where high intensity, data fragmentation, and the lack of physical security control necessitate architectural solutions such as Trusted Virtual Data Centre (TVDc) and Tera Architecture, along with the integration of security measures into the software development lifecycle (SDLC). The materials presented in this study are relevant to researchers, system architects, and corporate IT infrastructure practitioners seeking to synthesize theoretical and empirical approaches to achieve a high level of information security in the face of rapidly evolving threats.
Modern Approaches to Automating QA Processes in the Context of Digital Transformation
The article examines approaches to automating quality assurance (QA) processes within the framework of digital transformation. Based on an extensive analysis of publicly available literature, the work describes how the transition from traditional manual methods to flexible automated solutions contributes to reducing the time required for developing test scripts, enhancing the accuracy of defect detection, and improving the overall efficiency of QA processes. The author’s hypothesis is that the integration of AI methods into QA processes not only shortens the time needed for test script development but also increases defect detection accuracy by optimizing test scenarios and employing flexible analysis algorithms. The scientific novelty of the article lies in the development of a new perspective on the use of automation methods in QA processes, made possible by the literature review. The material will be useful for other researchers as well as for professionals working in the fields of information technology, digital transformation management, and process automation who aim to integrate advanced testing methods into the infrastructure of modern IT systems. It is particularly valuable for academic teams, strategic analysts, and top managers seeking scientifically substantiated solutions for the optimization and sustainable development of QA processes in the dynamically evolving digital economy.
Performance Benchmarking of Traditional Machine Learning and Transformer Models for Multi-Class Text Classification
Text classification is a fundamental task in natural language processing (NLP), widely applied in areas such as spam detection, sentiment analysis, and text categorization. This study presents a comparative analysis of three distinct machine learning paradigms on a multi-class news classification dataset: traditional machine learning algorithms (Random Forest, XGBoost, support vector machine, and Naive Bayes), a custom-built transformer architecture, and transfer learning with pre-trained transformer models (BERT, DistilBERT, RoBERTa, ELECTRA). While traditional models provided competitive baselines with up to 90.47% accuracy, the custom transformer surpassed them, achieving 91% accuracy when trained from scratch. The highest performance was observed with transfer learning using pre-trained models, where RoBERTa achieved 94.54% accuracy, DistilBERT 94.32%, BERT 94.07%, and ELECTRA 93.66%. These findings highlight the significance of contextual embeddings and large-scale pretraining in advancing text classification performance.
Interrogating the preparedness of Uganda’s Higher Education and the Role of Information and Communication Technology in the era of COVID-19 Pandemic and its Aftermath
This article is based on a qualitative study conducted to establish the preparedness of Uganda’s higher education and the role of Information and Communication Technology (ICT) during the COVID-19 pandemic and its aftermath. The dynamics of the COVID-19 pandemic created an urgent need to adopt and scale up ICT in education. However, universities in Uganda were strategically ill-prepared to adopt and sustain digital and online methods of engagement, citing problems ranging from policy and curriculum deficits to a lack of staff and student preparedness. The inability of universities to adopt and sustain digital and virtual modes of operation demonstrated a strategic misalignment between ICT and higher education systems. In response, this study was intended to address this gap by identifying the key pedagogical challenges encountered and recommending appropriate strategic options based on global best practice in ICT and higher education. The study adopted a qualitative approach, using scoping literature review techniques and content analysis methods to extract evidence.
By way of reflective analysis and scrutiny, the study sought to establish how various universities coped with the COVID-19 lockdown, while identifying the relevant information in the domain of ICT and higher education systems. Data analysis followed a descriptive approach, linking the identified gaps with global best practices. The general implication of the study findings is that there was a strategic mismatch between ICT and higher education systems. The key challenges identified include a lack of pedagogical flexibility, inefficient social interaction among learners and instructors, a lack of self-directed and independent learning, restricted modes of assessment, and a lack of staff competence in ICT and pedagogical approaches. This study therefore provides strategic guidelines for higher education planners to formally integrate ICT functions and education systems.
The Impact of Cross-industry Collaboration on the Effectiveness of Devops Practices
This article examines the influence of interdepartmental and cross-industry collaboration on the effectiveness of DevOps practices. As the demand for faster release cycles increases, organizations face the challenge of simultaneously enhancing system stability and reducing error rates. The study is based on materials from DORA, reports by Deloitte, Atlassian, Brainhub, DevMio, and publications by Ahmad et al., Dryka et al., Offerman, and Smith. A comparative analysis is conducted on internal collaboration models involving development, operations, testing, and information security teams, as well as on knowledge exchange between IT organizations and representatives of the banking, telecommunications, and manufacturing sectors. The goal is to assess the impact of collaboration formats on release frequency, time to incident recovery, and the proportion of failed changes. The research methodology includes literature systematization, statistical analysis of performance metrics, and synthesis of empirical evidence. The results show that integrating professionals from different domains accelerates delivery cycles, lowers failure rates, and improves team satisfaction. The article is relevant to project managers, DevOps team leads, digital transformation experts, and consultants responsible for optimizing the development and operations of software solutions.
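The three metrics named in the abstract (release frequency, time to incident recovery, and the proportion of failed changes) can be computed from a simple deployment log. The sketch below is illustrative only: the `Deployment` record and the function names are hypothetical, and recovery time is measured from the deployment timestamp as a simplifying assumption, since the abstract does not say how the study measured it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical log record for one change shipped to production."""
    deployed_at: datetime
    failed: bool = False
    recovered_at: Optional[datetime] = None  # set once a failed change is recovered


def release_frequency_per_week(deployments: List[Deployment], period_days: int) -> float:
    """Average number of releases per week over the observed period."""
    return len(deployments) / (period_days / 7)


def change_failure_rate(deployments: List[Deployment]) -> float:
    """Proportion of changes that failed; 0.0 for an empty log."""
    if not deployments:
        return 0.0
    return sum(d.failed for d in deployments) / len(deployments)


def mean_time_to_recovery(deployments: List[Deployment]) -> Optional[timedelta]:
    """Mean time from a failed deployment to its recovery, or None if no data."""
    durations = [
        d.recovered_at - d.deployed_at
        for d in deployments
        if d.failed and d.recovered_at is not None
    ]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)
```

Comparing these values before and after a change in collaboration format is one way to quantify the effect the abstract describes.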
Topic Modeling for Web Page using LDA Algorithm and Web Content Mining: Testing and Evaluation
In recent years, website content has grown rapidly and become increasingly useful, and this information plays an important role in discovering knowledge on the web. This paper aims to test and evaluate our previous work on a new dataset. The previous system applied the LDA algorithm for topic modelling in web content mining and was tested and discussed with respect to different science content, a large dataset, and similarity values. According to the results on our new dataset (298 rows and 6 columns, covering Computer, Mathematical, Physics, and Chemistry Sciences), the system confirms that the LDA algorithm performs best on the web content mining dataset.
A Method for Identifying and Assessing Phishing Attacks in Communication Messages
Phishing attacks have become a significant threat in online communication platforms. These attacks exploit human vulnerabilities by using deceptive messages to steal sensitive information or distribute malicious content. This paper presents a comprehensive phishing detection system, leveraging machine learning and multi-layered analysis of URLs, files, and message content. The proposed system integrates URL analysis, file analysis, and text analysis services to identify potential threats effectively. Experimental results demonstrate the efficacy of the approach, achieving high accuracy in detecting phishing attempts. This research contributes to the field of cybersecurity by providing a robust framework for identifying and mitigating phishing risks in real-time communication.
Integrating Python Into Power BI for Analyzing and Predicting Digital Development: Case Study – Balkan Countries
Emerging technologies generate an abundance of data, and visualization plays a key role in interpreting it correctly. The accelerating growth in the volume and complexity of data emphasizes the need for advanced ways of transforming, visualizing, and analyzing it. Applying quantitative, empirical, and qualitative methods, this paper investigates the time required to perform complex data transformations in the Power BI visualization tool when they are carried out manually and when they are carried out with an integrated Python script. It analyzes the efficiency and practicality of both approaches through a specific example: the analysis and prediction of the digital development of selected Balkan countries. The research is based on secondary data on the determinants of digital progress taken from official sources, and on empirical observation of IT and BI professionals, whose task completion times were measured and whose subjective perceptions were collected through interviews. All results indicate that using Python scripts within Power BI significantly reduces the required work time and increases efficiency while improving the accuracy, user experience, and practicality of the tool, which is an important step towards adopting new advanced practices in visual data analysis.