NII Repository (National Institute of Informatics)
    2035 research outputs found

    SPARC Japan NewsLetter NO.48

    ■ SPARC Japan Activity Reports: Support for arXiv.org [p.1]; Support for CLOCKSS [p.2]; Support for the SCOAP3 [p.2]; Contributions allocated for SCOAP3 Phase 4 (2025–2027) [p.3]
    ■ SPARC Japan Seminar Report: Outline [p.4]; Presentation Abstracts and Speakers [p.4]; Panel Discussion [p.10]; Attendee Feedback [p.12]; Afterword [p.12]
    article

    Multiple Tables in a Relational Database

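    Training course: FY2025 Comprehensive IT Training for University Librarians. Dates: Wednesday, August 20 to Friday, August 22, 2025. Organizer: National Institute of Informatics.
    conference presentation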

    Handouts for the 44th Meeting of the Committee for Building Future Scholarly Information Systems

    Meeting: 44th Meeting of the Committee for Building Future Scholarly Information Systems. Venue: online. Date and time: October 30, 2025 (Wed), 10:00 to 12:00.
    conference output

    Overview of the NTCIR-18 Automatic Evaluation of LLMs (AEOLLM) Task

    In this paper, we provide an overview of the NTCIR-18 Automatic Evaluation of LLMs (AEOLLM) task. As large language models (LLMs) grow popular in both academia and industry, how to effectively evaluate the capacity of LLMs becomes an increasingly critical but still challenging issue. Existing methods can be divided into two types: manual evaluation, which is expensive, and automatic evaluation, which faces many limitations including task format (the majority belong to multiple-choice questions) and evaluation criteria (occupied by reference-based metrics). To advance the innovation of automatic evaluation, we propose the AEOLLM task which focuses on generative tasks and encourages reference-free methods. Besides, we set up diverse subtasks such as dialogue generation, text expansion, summary generation and non-factoid question answering to comprehensively test different methods. This year, we received 48 runs from 4 teams in total. This paper will describe the background of the task, the data set, the evaluation measures and the evaluation results, respectively.
    conference paper

    SCUNLP-3 at the NTCIR-18 FinArg-2 Task: Template-Based Prompting and Augmentation

    Social media claims often have shifting validity that influences downstream tasks like misinformation detection, financial predictions, and domain-specific decisions. This study proposes a novel approach that merges original text with automatically generated template text to highlight temporal cues. By integrating this enriched data into the training process, the model more effectively gauges how long a claim remains reliable, even when its relevance rapidly evolves. This strategy addresses the challenge of ephemeral statements whose validity fluctuates as new information emerges. Experimental results underscore the method’s effectiveness, achieving a macro-F1 score of 78.10%. These findings highlight the importance of systematically assessing claim longevity, providing a pathway to more robust content analysis and better-informed decisions in ever-changing online environments.
    conference paper

    vitrivr-engine at the NTCIR-18 Lifelog-6 Task

    This paper discusses vitrivr's participation in the Lifelog Semantic Access subtask of the 6th edition of the NTCIR Lifelog. It is based on the system that participated in the 2024 Lifelog Search Challenge and only replaces the interactive query interface with an LLM-based query transformation method. All results are generated in one pass without any further re-processing or refinement.
    conference paper

    NTCIR-18 MedNLP-CHAT Determining Medical, Ethical and Legal Risks in Patient-Doctor Conversations: Task Overview

    This paper presents an overview of the Medical Natural Language Processing for AI Chat (MedNLP-CHAT) task, conducted as part of the shared task at NTCIR-18. Recently, medical chatbot services have emerged as a promising solution to address the shortage of medical and healthcare professionals. However, the potential risks associated with these chatbots remain insufficiently understood. Given this context, we designed the MedNLP-CHAT task to evaluate medical chatbots from multiple risk perspectives, including medical, legal, and ethical aspects. In this shared task, participants were required to analyze a given medical question along with the corresponding chatbot response and determine whether the response posed a potential medical, legal, or ethical risk (binary classification). Nine teams participated in this task applying different approaches, yielding valuable insights.
    conference paper

    IMNTPU at NTCIR-18 MedNLP-CHAT Task: Evaluating Agentic AI for Multilingual Risk Assessment in Medical Chatbots

    The IMNTPU team presents a multilingual evaluation of Agentic AI for chatbot risk classification in the NTCIR-18 MedNLP-CHAT task. Our framework integrates fine-tuned small models, optimized few-shot prompting with GPT-4o, and multi-agent aggregation via majority and trust-weighted voting. Results show that Agentic AI enhances decision consistency, especially in subjective tasks like ethical risk, but yields limited gains in structured domains such as medical and legal assessment. Language-specific outcomes reveal that annotation quality and linguistic complexity jointly affect model performance, with Japanese systems showing the most stability. Confidence analysis highlights a decoupling between model certainty and accuracy, underscoring the need for adaptive trust and calibration strategies. Building on these insights, we propose a Trust-Guided Agentic AI architecture featuring self-consistency filtering, dynamic trust updating, and Chain-of-Thought prompting to further improve reliability in safety-critical AI systems.
    conference paper

    Summary of Proceedings of the 1st CiNii Research Working Group Meeting, FY2025 (Reiwa 7)

    conference output

    SPARC Japan Seminar 2024, "Beyond the Open Access Mandate: Toward the World to Come": Strengthening Research Capacity and Open Access in Japan (Presentation Materials)

    SPARC Japan Seminar 2024, "Beyond the Open Access Mandate: Toward the World to Come". Venue: held online. Date and time: Thursday, January 30, 2025, 13:00 to 17:00.
    conference presentation

    2,022 full texts
    2,035 metadata records
    Updated in last 30 days.