REPLUG: Retrieval-Augmented Black-Box Language Models

Author: Seo Minjoon
Publication date: 2024

KAIST Institutional Repository

Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering

Authors: Seo Minjoon, Yang Sohee
Publication date: 08/06/2021

KAIST Institutional Repository

Rethinking the Role of Proxy Rewards in Language Model Alignment

Authors: Kim Sungdong, Seo Minjoon
Publication date: 2024

KAIST Institutional Repository

On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning

Authors: Seo Minjoon, Kim Gee Wook
Publication date: 2024

KAIST Institutional Repository

Towards Reliable and Practical Phishing Detection

Authors: Cho Hyowon, Seo Minjoon
Publication date: 2025

As phishing attacks become more prevalent, demand is growing for more robust detection technologies. Building on recent advances in AI, we discuss how to construct a reliable and practical phishing detection system using language models. For this system, we introduce the first large-scale Korean dataset for phishing detection, encompassing six types of phishing attacks. We consider multiple factors in building a real-time detection system for edge devices, such as model size, Speech-To-Text quality, split length, training technique, and multi-task learning. We evaluate the model in two ways: in-domain detection performance, and detection of unseen attacks, which we refer to as zero-day performance. Additionally, we demonstrate the importance of accurate comparison groups and evaluation datasets, showing that voice phishing detection performs reasonably well while smishing detection remains challenging. Both the dataset and the trained model will be available upon request.

KAIST Institutional Repository

Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners

Authors: Seo Minjoon, Ye Seonghyeon, Kim Doyoung
Publication date: 2023

KAIST Institutional Repository

Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Utilization

Authors: Seo Minjoon, Park Sue Hyun, Ko Miyoung
Publication date: 2024

KAIST Institutional Repository

How Do Large Language Models Acquire Factual Knowledge During Pretraining?

Authors: Seo Minjoon, Chang Hoyeon, Park Jinho, Ye Seonghyeon, Yang Sohee
Publication date: 2024

KAIST Institutional Repository

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Authors: Kim Seungone, Seo Minjoon, Suk Juyoung
Publication date: 2024

KAIST Institutional Repository

Gradient Ascent Post-training Enhances Language Model Generalization

Authors: Kim Sungdong, Seo Minjoon, Jang Joel, Yoon Dongkeun
Publication date: 2023

In this work, we empirically show that updating pretrained LMs (350M, 1.3B, 2.7B) with just a few steps of Gradient Ascent Post-training (GAP) on random, unlabeled text corpora enhances their zero-shot generalization capabilities across diverse NLP tasks. Specifically, we show that GAP can allow LMs to become comparable to 2-3x larger LMs across 12 different NLP tasks. We also show that applying GAP on out-of-distribution corpora leads to the most reliable performance improvements. Our findings indicate that GAP can be a promising method for improving the generalization capability of LMs without any task-specific fine-tuning.

KAIST Institutional Repository