NII Repository (National Institute of Informatics)

SPARC Japan NewsLetter NO.48

Publication venue: National Institute of Informatics
Publication date: 2025

■ SPARC Japan Activity Reports
Support for arXiv.org [p.1]
Support for CLOCKSS [p.2]
Support for the SCOAP3 [p.2]
Contributions allocated for SCOAP3 Phase 4 (2025–2027) [p.3]

■ SPARC Japan Seminar Report
Outline [p.4]
Presentation Abstracts and Speakers [p.4]
Panel Discussion [p.10]
Attendee Feedback [p.12]
Afterword [p.12]

Document type: article

Multiple Tables in Relational Databases

Author: 瀬尾崇一郎
Publication venue: National Institute of Informatics
Publication date: 21/08/2025

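Training course: FY2025 Comprehensive IT Training for University Librarians
Dates: Wednesday, August 20 to Friday, August 22, 2025
Organizer: National Institute of Informatics

Document type: conference presentation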

Handouts for the 44th Meeting of the Committee on Building Future Scholarly Information Systems (これからの学術情報システム構築検討委員会)

Publication venue: Committee on Building Future Scholarly Information Systems (これからの学術情報システム構築検討委員会)
Publication date: 30/10/2025

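Meeting: 44th Meeting of the Committee on Building Future Scholarly Information Systems
Venue: Online
Date and time: October 30, 2025 (Wed), 10:00–12:00

Document type: conference output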

Overview of the NTCIR-18 Automatic Evaluation of LLMs (AEOLLM) Task

Authors: Junjie Chen, Haitao Li, Zhumin Chu, Yiqun Liu, Qingyao Ai
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

In this paper, we provide an overview of the NTCIR-18 Automatic Evaluation of LLMs (AEOLLM) task. As large language models (LLMs) grow popular in both academia and industry, how to evaluate their capabilities effectively has become an increasingly critical yet still challenging issue. Existing methods fall into two types: manual evaluation, which is expensive, and automatic evaluation, which faces many limitations, including task format (most tasks are multiple-choice questions) and evaluation criteria (dominated by reference-based metrics). To advance automatic evaluation, we propose the AEOLLM task, which focuses on generative tasks and encourages reference-free methods. In addition, we set up diverse subtasks, such as dialogue generation, text expansion, summary generation, and non-factoid question answering, to test different methods comprehensively. This year, we received 48 runs from 4 teams in total. This paper describes the background of the task, the data set, the evaluation measures, and the evaluation results.

Document type: conference paper

SCUNLP-3 at the NTCIR-18 FinArg-2 Task: Template-Based Prompting and Augmentation

Authors: Pan Hongrui, Wu Jheng-Long
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

Social media claims often have shifting validity that influences downstream tasks like misinformation detection, financial predictions, and domain-specific decisions. This study proposes a novel approach that merges original text with automatically generated template text to highlight temporal cues. By integrating this enriched data into the training process, the model more effectively gauges how long a claim remains reliable, even when its relevance rapidly evolves. This strategy addresses the challenge of ephemeral statements whose validity fluctuates as new information emerges. Experimental results underscore the method’s effectiveness, achieving a macro-F1 score of 78.10%. These findings highlight the importance of systematically assessing claim longevity, providing a pathway to more robust content analysis and better-informed decisions in ever-changing online environments.

Document type: conference paper

vitrivr-engine at the NTCIR-18 Lifelog-6 Task

Author: Luca Rossetto
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

This paper discusses vitrivr's participation in the Lifelog Semantic Access subtask of the 6th edition of the NTCIR Lifelog task. The approach is based on the system that participated in the 2024 Lifelog Search Challenge and replaces only the interactive query interface with an LLM-based query transformation method. All results are generated in one pass without any further re-processing or refinement.

Document type: conference paper

NTCIR-18 MedNLP-CHAT Determining Medical, Ethical and Legal Risks in Patient-Doctor Conversations: Task Overview

Authors: Eiji Aramaki, Shoko Wakamiya, Shuntaro Yada, Shohei Hisada, Tomohiro Nishiyama, Lenard Paulo Tamayo, Jingnan Xiao, Axalia Levenchaud, Pierre Zweigenbaum, Christoph Otto, Jerycho Pasniczek, Philippe Thomas, Nathan Pohl, Wiebke Duettmann, Lisa Raithel, Roland Roller
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

This paper presents an overview of the Medical Natural Language Processing for AI Chat (MedNLP-CHAT) task, conducted as part of the shared task at NTCIR-18. Recently, medical chatbot services have emerged as a promising solution to address the shortage of medical and healthcare professionals. However, the potential risks associated with these chatbots remain insufficiently understood. Given this context, we designed the MedNLP-CHAT task to evaluate medical chatbots from multiple risk perspectives, including medical, legal, and ethical aspects. In this shared task, participants were required to analyze a given medical question along with the corresponding chatbot response and determine whether the response posed a potential medical, legal, or ethical risk (binary classification). Nine teams participated in this task, applying different approaches and yielding valuable insights.

Document type: conference paper

IMNTPU at NTCIR-18 MedNLP-CHAT Task: Evaluating Agentic AI for Multilingual Risk Assessment in Medical Chatbots

Authors: Jun-Yu Wu, Cheng-Yun Wu, Bor-Jen Chen, Wen-Hsin Hsiao, Min-Yuh Day
Publication venue: NII Institutional Repository
Publication date: 06/06/2025

The IMNTPU team presents a multilingual evaluation of Agentic AI for chatbot risk classification in the NTCIR-18 MedNLP-CHAT task. Our framework integrates fine-tuned small models, optimized few-shot prompting with GPT-4o, and multi-agent aggregation via majority and trust-weighted voting. Results show that Agentic AI enhances decision consistency, especially in subjective tasks like ethical risk, but yields limited gains in structured domains such as medical and legal assessment. Language-specific outcomes reveal that annotation quality and linguistic complexity jointly affect model performance, with Japanese systems showing the most stability. Confidence analysis highlights a decoupling between model certainty and accuracy, underscoring the need for adaptive trust and calibration strategies. Building on these insights, we propose a Trust-Guided Agentic AI architecture featuring self-consistency filtering, dynamic trust updating, and Chain-of-Thought prompting to further improve reliability in safety-critical AI systems.

Document type: conference paper

Summary of Minutes of the 1st Meeting of the CiNii Research Working Group, FY2025 (Reiwa 7)

Publication venue: CiNii Research Working Group, Research Data Infrastructure Steering Committee (研究データ基盤運営委員会CiNii Research作業部会)
Publication date: 19/05/2025

Document type: conference output

SPARC Japan Seminar 2024 "Beyond the Open Access Mandate: Toward the World to Come": Strengthening Research Capacity and Open Access in Japan (Presentation Materials)

Author: 大隅典子
Publication venue: National Institute of Informatics
Publication date: 30/01/2025

Event: SPARC Japan Seminar 2024 "Beyond the Open Access Mandate: Toward the World to Come"
Venue: Online
Date and time: Thursday, January 30, 2025, 13:00–17:00

Document type: conference presentation

2,022 full texts
2,035 metadata records

Updated in last 30 days.
