NII Repository (National Institute of Informatics)
    2035 research outputs found

    WhiteME at U4 Shared Task: Hybrid Retrieval with Table-Structured Clues for Economic Table QA

    Full text link
    In recent years, large language models (LLMs) have attracted growing attention in Table Question Answering (TQA), particularly for extracting data from tables in documents. However, feeding an entire table to an LLM as long text often leads to incorrect answers, because most LLMs cannot inherently capture complex table structures. In this paper, we propose a cell extraction method for TQA that requires no manual identification, even for complex table headers. Our approach estimates table headers by computing similarities between a given question and individual cells via a hybrid retrieval mechanism that integrates a language model and TF-IDF. We then select as the answer the cells at the intersection of the most relevant row and column. Furthermore, the language model is trained using contrastive learning on a small dataset of question-header pairs to enhance performance. We evaluated our approach on the TQA dataset from the shared task "Unifying, Understanding, and Utilizing Unstructured Data in Financial Reports" (U4) held at the NTCIR-18 conference, in which our team (WhiteME) participated. The experimental results show that our pipeline achieves an accuracy of 74.6%, outperforming existing LLMs such as GPT-4o mini (63.9%). In summary, we found that focusing on the header relationships through our hybrid retrieval strategy effectively addresses structural uncertainties in complex tables.
    conference paper
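The hybrid header scoring described in the WhiteME abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the weight `alpha`, and the toy table are all hypothetical, and a character-trigram Jaccard similarity stands in for the fine-tuned language model so that the example runs with the standard library only.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase alphanumeric tokens; punctuation is dropped.
    return re.findall(r"[a-z0-9]+", text.lower())

def tfidf_vectors(texts):
    """Build simple TF-IDF vectors (as dicts) over the given texts."""
    docs = [Counter(tokenize(t)) for t in texts]
    n = len(docs)
    df = Counter(tok for d in docs for tok in d)
    idf = {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}
    return [{tok: tf * idf[tok] for tok, tf in d.items()} for d in docs]

def cosine(a, b):
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def char_trigram_sim(a, b):
    """ASSUMPTION: stand-in for a language-model similarity score."""
    grams = lambda s: {s[i:i + 3] for i in range(len(s) - 2)}
    ga, gb = grams(a.lower()), grams(b.lower())
    return len(ga & gb) / len(ga | gb) if ga | gb else 0.0

def best_header(question, headers, alpha=0.5):
    """Index of the header with the highest hybrid score (alpha is hypothetical)."""
    vecs = tfidf_vectors([question] + headers)
    q_vec, h_vecs = vecs[0], vecs[1:]
    scores = [
        alpha * char_trigram_sim(question, h) + (1 - alpha) * cosine(q_vec, hv)
        for h, hv in zip(headers, h_vecs)
    ]
    return max(range(len(headers)), key=scores.__getitem__)

def answer_cell(question, row_headers, col_headers, cells):
    """Pick the cell at the intersection of the best row and best column."""
    r = best_header(question, row_headers)
    c = best_header(question, col_headers)
    return r, c, cells[r][c]

# Toy table with made-up values, for illustration only.
row_headers = ["Net sales", "Operating income", "Total assets"]
col_headers = ["Fiscal year 2022", "Fiscal year 2023"]
cells = [["1,200", "1,350"], ["150", "180"], ["4,000", "4,400"]]
question = "What was the operating income in fiscal year 2023?"
result = answer_cell(question, row_headers, col_headers, cells)
print(result)
```

Scoring rows and columns independently keeps the question-to-header comparison cheap; the trade-off is that multi-level headers would need to be flattened into one string per row or column first.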

    SMM at the NTCIR-18 U4 Task

    Full text link
    This paper presents the methods and results of Team SMM for the U4 task at NTCIR-18. In the Table Retrieval subtask, we designed methods for table retrieval using a cell-level multi-vector retriever and a single-vector retriever to enhance retrieval accuracy. The retriever first narrows down candidate tables to the top 10 based on retrieval score. Then, a cross-encoder-based reranker classifies these candidates into three categories: positive, negative, and hard negative. Finally, the table with the highest probability of being positive is selected as the final retrieved result. For the Table Question Answering subtask, we employ a T5-based model for answer generation to produce multiple candidate answers and introduce a Cell ID Estimator that identifies which cells in the table were used as the basis for generating each candidate answer by leveraging cell, row, and column embeddings. The estimator then selects the final answer based on the highest supporting cell score. The test set is divided into public and private splits, inspired by Kaggle's evaluation methodology. The public split is used for leaderboard updates, while the private split ensures robustness by preventing models from overfitting to leaderboard data. Final evaluations include both splits to provide a more reliable assessment of model performance. In the formal run, our method achieved an accuracy of 97.70% (public) and 97.55% (private) for Table Retrieval (ID 62), and for Table Question Answering, 86.34% and 86.57% on cell ID and value prediction, respectively, on the public split, with corresponding accuracies of 82.76% and 81.94% on the private split.
    conference paper
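The retrieve-then-rerank selection described in the SMM abstract above can be sketched as follows. The retriever scores and reranker probabilities are hard-coded hypothetical values; in the described system they would come from the retrievers and the cross-encoder, which are not reproduced here. The function name `select_table` is also an assumption.

```python
def select_table(retrieval_scores, rerank_probs, top_k=10):
    """Keep the top_k tables by retrieval score, then return the one
    whose reranker probability of the "positive" class is highest."""
    candidates = sorted(retrieval_scores, key=retrieval_scores.get, reverse=True)[:top_k]
    return max(candidates, key=lambda table_id: rerank_probs[table_id]["positive"])

# Hypothetical scores for three tables.
retrieval_scores = {"t1": 0.90, "t2": 0.80, "t3": 0.10}
rerank_probs = {
    "t1": {"positive": 0.30, "negative": 0.20, "hard_negative": 0.50},
    "t2": {"positive": 0.60, "negative": 0.30, "hard_negative": 0.10},
    "t3": {"positive": 0.95, "negative": 0.03, "hard_negative": 0.02},
}

# top_k=2 here (the abstract uses 10) so that the narrowing step is visible.
chosen = select_table(retrieval_scores, rerank_probs, top_k=2)
print(chosen)
```

In this toy case `t3` has the highest positive probability but never reaches the reranker, which shows that the first-stage cutoff bounds what the second stage can recover.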

    UOM at the NTCIR-18 RadNLP Task

    Full text link
    The RadNLP 2024 (Natural Language Processing for Radiology) shared task at the international conference NTCIR-18 (English track) focuses on document classification for lung cancer staging, aiming to automatically determine the stage (i.e., the degree of progression) of lung cancer from radiology reports. Our approach involved data preprocessing, stratified data augmentation, and fine-tuning RadBERT, a transformer model pre-trained on radiology-specific text. We employed back-translation for data augmentation and 5-fold cross-validation to improve model robustness and address class imbalance. The results demonstrated that data augmentation significantly improved validation performance, with T accuracy increasing from 39.39% to 94.05% during K-fold validation and reaching 100% on the task validation set. However, a substantial performance gap was observed on the task test set, with joint accuracy dropping from 96.3% on the task validation set to 12.35%. This highlights challenges in model generalization due to limited dataset diversity and domain-specific language variability. This report details our methodology and results and discusses the challenges encountered, highlighting the need for further research to improve the robustness and generalizability of automated lung cancer staging from limited radiology reports.
    conference paper
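The UOM abstract above mentions 5-fold cross-validation together with class imbalance. A minimal sketch of one way to build class-balanced folds is shown below. It assumes the folds are stratified by label, which the abstract does not state, and the label names and counts are made up.

```python
from collections import defaultdict

def stratified_folds(labels, k=5):
    """Assign sample indices to k folds, round-robin within each class,
    so every fold keeps roughly the overall class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

# Hypothetical imbalanced label set: 10 reports of one class, 5 of another.
labels = ["T1"] * 10 + ["T4"] * 5
folds = stratified_folds(labels, k=5)
for i, fold in enumerate(folds):
    print(i, [labels[j] for j in fold])
```

Each fold serves once as the validation set while the remaining four are used for training. Any augmentation such as back-translation would normally be applied to the training folds only, so that paraphrases of a validation report do not leak into training.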

    UTY at the NTCIR-18 RadNLP 2024 Task: Possibilities and Limitations of a Hybrid Rule-Based and LLM Approach for Lung Cancer TNM Classification

    Full text link
    Automated extraction of TNM staging information from radiology reports is a challenging task that requires understanding complex clinical language and applying detailed staging criteria. In this paper, we present our approach to the NTCIR-18 RadNLP 2024 shared task on automated lung cancer staging from Japanese radiology reports. We developed a hybrid system that combines large language models (LLMs) with rule-based processing in a two-stage pipeline: first extracting structured information from reports using GPT-4o models, then applying classification rules to determine the appropriate TNM stages. Our approach employed different strategies for each classification component: a rule-based method for the complex T classification and a more flexible LLM-based approach for N and M classifications. Evaluation results showed strong performance on the validation dataset (joint accuracy of 0.8148) but revealed a significant drop in T classification performance on the test dataset (from 0.8704 to 0.4769), while N and M classifications maintained high accuracy levels. This performance disparity highlights the trade-offs between rule-based precision and LLM flexibility in clinical NLP systems. Our findings suggest that balancing these approaches and leveraging larger development datasets could improve the robustness of automated cancer staging systems for real-world clinical applications.
    conference paper

    STMK24 NTCIR18 U4 Table QA Submission

    Full text link
    This paper reports the methods, results, and analysis of STMK24 for the NTCIR-U4 Table QA (TQA) task. STMK24 approaches TQA as a Visual Document Understanding task, and tables are transformed into three different modalities: image, text, and layout of the content. To capture table structure in a simple way, our model is trained to infer the cell IDs of the tables, and the cell values are then extracted automatically through rule-based conversion. We investigated the impact of each modality on Table QA performance and confirmed that the model achieves high cell ID inference accuracy when utilizing all modalities.
    conference paper

    SPARC Japan Seminar 2024 "Beyond the Open Access Mandate: Toward the World to Come": Information Literacy in the Open Access Era (Presentation Materials)

    Full text link
    SPARC Japan Seminar 2024 "Beyond the Open Access Mandate: Toward the World to Come". Venue: held online. Date and time: Thursday, January 30, 2025, 13:00 to 17:00.
    conference presentation

    SPARC Japan Seminar 2024 "Beyond the Open Access Mandate: Toward the World to Come": The "2030 Digital Library" as a Vision of University Libraries after the Open Access Mandate (Document)

    Full text link
    SPARC Japan Seminar 2024 "Beyond the Open Access Mandate: Toward the World to Come". Venue: held online. Date and time: Thursday, January 30, 2025, 13:00 to 17:00.
    conference presentation

    Comprehensive IT Training for University Librarians 2025: Practical Exercise Scenario

    No full text
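    Training name: FY2025 Comprehensive IT Training for University Librarians. Dates: Wednesday, August 20 to Friday, August 22, 2025. Organizer: National Institute of Informatics.
    conference presentation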

    Minutes of the 3rd Meeting of the Research Data Infrastructure Steering Committee, FY2024 (Reiwa 6)

    Full text link
    conference output

    FTRI at the NTCIR-18 FinArg-2 Task: Identify Temporal Reference in Earnings Conference Calls

    Full text link
    FinArg-2 is part of the NTCIR Financial Argument shared task series, which aims to improve argument understanding in financial analysis. FinArg-2 introduces "Temporal Inference of Financial Arguments", focusing on the assessment of temporal information, a distinct phenomenon in financial opinions. FTRI participated in the Earnings Conference Calls (ECC) subtask of FinArg-2, where models must identify the temporal reference associated with an argument. In the initial stage, we experimented with a variety of transformer models using several configurations at the preprocessing and training stages. BERT-Base-Uncased, BERT-Large-Uncased, and RoBERTa-Base-Uncased performed slightly better than the other models, so we fine-tuned only those models as our baselines. Our first output, FTRI_ECC_1, uses a transformer encoder approach with BERT-Large and achieves 71.43% Micro F1 and 68.58% Macro F1. Our second output, FTRI_ECC_2, applies an attention mask over Claim, Premise, and (Year + Quarter) with BERT-Base and achieves 69.05% Micro F1 and 65.76% Macro F1. Our third output, FTRI_ECC_3, combines TF-IDF (Claim + Premise) and one-hot encoding (Year + Quarter) with BERT-Base and achieves 77.38% Micro F1 and 75.07% Macro F1, the best result in this ECC subtask. The evaluation results show that all three of our outputs rank in the top 4 among participants on both Micro and Macro F1.
    conference paper
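The FTRI abstract above reports both Micro F1 and Macro F1. A minimal sketch of how the two averages differ for single-label classification is given below; the labels and predictions are made up and are unrelated to the FinArg-2 data, and the function name `f1_scores` is an assumption.

```python
def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = {label: 0 for label in labels}
    fp = {label: 0 for label in labels}
    fn = {label: 0 for label in labels}
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gets a false positive
            fn[truth] += 1  # true class gets a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 values, so every class counts equally.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts over all classes, so every sample counts equally.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro

# Hypothetical labels: class 0 is frequent, classes 1 and 2 are rare.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 1, 1]
micro_f1, macro_f1 = f1_scores(y_true, y_pred)
print(round(micro_f1, 4), round(macro_f1, 4))
```

Because the rare class 2 is never predicted, its per-class F1 is zero, which pulls Macro F1 well below Micro F1. A gap in that direction, as in the reported scores, is what weaker performance on less frequent classes would produce.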

    2,022 full texts
    2,035 metadata records
    Updated in last 30 days.