NII Repository (National Institute of Informatics)

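Agenda for the 29th Meeting on Promoting Collaboration and Cooperation between University Libraries and the National Institute of Informatics

Publication venue: Meeting on Promoting Collaboration and Cooperation between University Libraries and the National Institute of Informatics
Publication date: 12/02/2025

Meeting: 29th Meeting on Promoting Collaboration and Cooperation between University Libraries and the National Institute of Informatics. Venue: online. Date and time: Wednesday, 12 February 2025, 15:00-17:00.

Document type: conference output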

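Minutes of the 2nd Meeting of the Research Data Infrastructure Steering Committee, FY2025 (Reiwa 7)

Publication venue: Research Data Infrastructure Steering Committee
Publication date: 10/11/2025

The question-and-answer portions of the following agenda items are not public. 2. [Report] On the beneficiary-pays model for GakuNin RDM

Document type: conference output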

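Let's Try Doing Something Elaborate with SQL

Author: 花原稔
Publication venue: National Institute of Informatics
Publication date: 21/08/2025

Training course: FY2025 Comprehensive IT Training for University Librarians. Period: Wednesday, 20 August to Friday, 22 August 2025. Organizer: National Institute of Informatics.

Document type: conference presentation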

Overview of the NTCIR-18 FairWeb-2 Task

Author: Sijie Tao, Tetsuya Sakai, Junjie Wang, Hanpei Fang, Yuxiang Zhang, Haitao Li, Yiteng Tu, Nuo Chen, Maria Maistro
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

This paper provides an overview of the NTCIR-18 FairWeb-2 Task. Our task considers not only document relevance but also group fairness. We designed two subtasks: the Web Search Subtask and the Conversational Search Subtask. We designed three types of search topics for this task: researchers (R), movies (M), and YouTube contents (Y). For each topic type, attribute sets are defined for considering group fairness. For the Web Search Subtask, we received 23 runs from five teams, including six runs from the organisers' team. For the Conversational Search Subtask, we received four runs from two teams, including one run from the organisers' team. In this paper, we describe the task, the test collection construction, and the official evaluation results of the submitted runs.

Document type: conference paper

AIDAVANCE at the NTCIR-18 FinArg-2 Task: Making the Most of Small Language Models

Author: Hugo Dutra, Leonardo Martinho, Gabriel Assis, Jonnathan Carvalho, Aline Paes
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

This paper presents AIDAVANCE's approach to Subtask 2 (Detection of Argument Temporal References) of the NTCIR-18 FinArg-2 Task. We explored different classification strategies, including direct multi-class classification, a hierarchical cascade approach that first identifies the presence of a temporal reference before further categorization, and an LLM-based argument rewriting method. Our best model, a fine-tuned mDeBERTa using the multi-class approach, ranked fourth overall, achieving a Micro-F1 score of 0.6905 and a Macro-F1 score of 0.6711. Our findings reinforce that fine-tuning smaller encoder models remains an effective strategy for specialized classification tasks, even outperforming state-of-the-art LLMs.

Document type: conference paper
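
As an aside on the two scores reported in the abstract above: Micro-F1 pools true positives, false positives, and false negatives over all classes before computing F1, while Macro-F1 averages the per-class F1 scores so that rare classes count as much as frequent ones. The following is a minimal sketch of that distinction under standard definitions. It is not the task's official scorer, and the function names and the labels `"a"` and `"b"` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def f1_from_counts(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when the class never occurs."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1_from_counts(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1_from_counts(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical toy data: three items of class "a", one of class "b".
micro, macro = micro_macro_f1(["a", "a", "a", "b"], ["a", "a", "b", "b"])
print(round(micro, 4), round(macro, 4))
```

In the single-label setting sketched here, Micro-F1 coincides with accuracy, which is one reason a class-imbalanced task typically reports Macro-F1 alongside it.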

UPxSocio at NTCIR-18 MedNLP-CHAT Task: Similarity-Based Few-Shot Example Selection for Prompt-Based Detection

Author: Michael Van Supranes, Martin Augustine Borlongan, Joseph Ryan Lansangan, Genelyn Ma. Sarte, Shaowen Peng, Shoko Wakamiya, Eiji Aramaki
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

This paper presents our submission to the MedNLP-CHAT Task at NTCIR-18, which focuses on detecting medical, ethical, and legal risks in chatbot-generated responses. We propose a two-step prompt-based classification framework using the Gemini-1.5-flash model. The method first generates support statements to guide reasoning, which are then integrated into a few-shot prompt for final classification. We evaluated our approach on the English versions of the Japanese and German subtasks, submitting two systems per subtask that varied in example selection strategy and label distribution. Our systems achieved strong performance in detecting medical risks, particularly in the German subtask, while ethical and legal risks were more challenging. To better understand the design factors influencing performance, we conducted ablation studies across 24 prompt variants. Logistic regression and CHAID analyses revealed that accuracy depends on complex interactions between subtask language, example similarity, actual label, and selection method. Higher similarity improves classification of risk-present cases but harms performance on risk-absent cases, indicating a trade-off between recall and false positives. The k-nearest method was more effective under high similarity, while k-spread offered balanced results across classes. Although the two-step prompting strategy did not show a statistically significant advantage overall, the best-performing configuration used five support statements, with diminishing gains beyond that. Our findings suggest that optimized prompt design, particularly with controlled support and example selection, can improve risk detection without requiring large-scale training or high computational resources.

Document type: conference paper

NURad at the NTCIR-18 RadNLP Task

Author: Marina Higashi, Rintaro Ito, Keita Kato, Ryota Asai, Shingo Iwano, Shinji Naganawa
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

Lung cancer is the most common cause of cancer death in Japan. The TNM classification is essential for lung cancer diagnosis and treatment planning, and CT imaging plays a crucial role in its evaluation. However, the number of thoracic radiologists is limited in Japan. The development of a system to automatically extract TNM classification from radiology reports would be beneficial to radiologists and other clinicians. Large language models (LLMs) have recently shown remarkable progress in natural language processing, opening new possibilities for medical applications. The NURad team participated in the NTCIR-18 Natural Language Processing for Radiology (RadNLP) task. This paper describes our approach to the problem and discusses the official results. We explored different prompts, LLM models (Llama3, Open AI O1pro, Google Gemini 2.0, Google Notebook LM), and data types (Japanese and English). We also investigated fine-tuning with clinical data. The final model, utilizing a short prompt and trained on both Japanese and English datasets using Google Notebook LM, did not incorporate clinical data. Our final model with Google Notebook LM achieved a TNM (fine) score of 0.93 on the validation dataset. However, the score decreased to 0.54 on the test dataset. This decline was more pronounced for the T classification than for the N and M classifications. This study demonstrates the potential of LLMs for automated TNM classification from radiology reports, but also highlights challenges in generalization to unseen data, particularly for T classification. Further research is needed to improve the robustness and accuracy of LLM-based TNM classification systems.

Document type: conference paper

Automated Lung Cancer Staging from Radiological Reports: A Large Language Model Approach for the NTCIR-18 RadNLP Task

Author: Takahito Nakajima
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

Lung cancer TNM classification from narrative radiology reports presents challenges due to expression variability and complex relationships between findings. This study develops an automated TNM classification system utilizing large language models (LLMs) with supervised fine-tuning (SFT) and specialized prompting (SP) approaches. We evaluated our system on the NTCIR-18 RadNLP 2024 Task dataset, achieving 72.69% (Japanese) and 55.56% (English) fine-grained accuracy, ranking 5th among 15 teams. Our system demonstrated particularly high performance in N-factor classification (>93.98% accuracy) and in the subtask of textual analysis (ranking 1st in Japanese and 3rd in English tracks). Error analysis revealed challenges in interpreting complex expressions and implicit information. This system shows potential for clinical workflow optimization, standardization of TNM classification, and educational support, with implications for improving cancer staging practices.

Document type: conference paper

YMX2L at the NTCIR-18 Transfer-2 Task

Author: Riku Mizuguchi, Takeshi Yamazaki, Shuhei Yamamoto
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

This paper presents the participation of the YMX2L research team in the NTCIR-18 Transfer-2 Dense Multimodal Retrieval (DMR) task. Our approach focuses on the integration of visual and sensor data, leveraging data augmentation techniques and object detection to enhance retrieval performance. The experimental results demonstrate the effectiveness of our proposed methods and highlight key features that contribute to addressing the challenges of multimodal dense retrieval.

Document type: conference paper

Overview of the NTCIR-18 U4 Task

Author: Yasutomo Kimura, Sato Eisaku, Kazuma Kadowaki, Hokuto Ototake
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

This paper provides an overview of the NTCIR-18 U4 shared task, which focuses on unifying, understanding, and utilizing unstructured data in financial reports. This task aims to improve methods for extracting and analyzing information, particularly from tables, within annual securities reports. These reports are crucial for understanding a company's financial performance, yet their complex and varied table structures present significant challenges for automated processing. To address these issues, the task comprises two subtasks, Table Retrieval and Table Question Answering, designed to evaluate and advance system capabilities for handling real-world financial documents. The dataset, drawn from TOPIX100 companies, encompasses diverse table formats and content, serving as a rigorous test bed for participants. Performance is assessed via a leaderboard that evaluates JSON-formatted system outputs, promoting transparent and reproducible results. Ten active teams participated in the NTCIR-18 U4 task, making a total of 210 submissions.

Document type: conference paper

2,022 full texts
2,035 metadata records

Updated in last 30 days.
