[2023 Spring NLP Seminar] Mutual Information Alleviates Hallucinations in Abstractive Summarization (EMNLP 2022)

AI/NLP 2023. 3. 28. 19:35

A paper on an abstractive summarization model, with its abstract summarized by ChatGPT (I wonder which one performs better 🫢)

Abstract / Introduction

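Background
- Hallucination: a problem that frequently arises in abstractive summarization, the task of generating a short summary from a long document
Limitation
- Existing models tend to generate content that does not appear in the source document
- As a result they convey incorrect information; earlier work tried various ways to prevent this, but according to the authors no technique was both efficient and robust
Our Approach
- Proposes a simple criterion for when a model becomes more likely to hallucinate
  - When the model's uncertainty is high, it is more likely to generate content that does not appear in the source
  - This rests on the hypothesis that the behavior arises because the model prioritizes tokens that appeared frequently in the training data
- The authors therefore propose a decoding strategy that, when the model is uncertain, optimizes the pointwise mutual information (PMI) between the source document and the target token rather than only the probability of the target token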
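
The decoding idea above can be sketched for a single decoding step. This is only a minimal illustration, not the authors' implementation: using entropy as the uncertainty measure, the threshold `tau`, the weight `lam`, and the function names are all assumptions made for the example, and the toy distributions are invented.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def gated_pmi_scores(p_cond, p_uncond, lam=1.0, tau=1.0):
    """Score every candidate token for one decoding step.

    p_cond   : p(token | source, prefix), from the summarization model
    p_uncond : p(token | prefix), from a model that does not see the source
    lam, tau : hypothetical PMI weight and entropy threshold (assumptions)

    Both distributions are assumed to be strictly positive.
    """
    # Assumption: the model counts as "uncertain" when entropy reaches tau.
    uncertain = entropy(p_cond) >= tau
    scores = []
    for pc, pu in zip(p_cond, p_uncond):
        score = math.log(pc)  # standard log-probability objective
        if uncertain:
            # Subtracting the source-free log-probability gives a PMI-style
            # score that penalizes tokens that are likely regardless of source.
            score -= lam * math.log(pu)
        scores.append(score)
    return scores


# Toy step with a 3-token vocabulary; token 0 is frequent in general text.
p_cond = [0.4, 0.35, 0.25]
p_uncond = [0.7, 0.2, 0.1]
scores = gated_pmi_scores(p_cond, p_uncond)
best = max(range(len(scores)), key=lambda i: scores[i])
# The gated score prefers token 2, while plain log-probability prefers token 0.
print(best)
```

When the conditional distribution is peaked (low entropy), the function returns plain log-probabilities, so decoding is unchanged in confident steps.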

https://arxiv.org/abs/2210.13210

https://github.com/VanderpoelLiam/CPMI

After the presentation

Cases where a model generates fake information, as ChatGPT does, can also be regarded as hallucinations
Why do hallucinations occur?
- A problem with particular training corpora: the ground-truth (human-written) summaries contain outside information that cannot be inferred from the source text
- Model architecture
  - The usual LM decoding objective: standard log-probability
  - This paper uses a different decoding strategy: Pointwise Mutual Information Decoding
Pointwise Mutual Information Decoding
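
For reference, the two objectives can be written side by side. This uses the textbook definition of pointwise mutual information; the symbols x (source document) and y (summary) are notation assumed here, not taken from these notes.

```latex
% Standard decoding: maximize the log-probability of the summary given the source
\mathbf{y}^{\star} = \arg\max_{\mathbf{y}} \; \log p(\mathbf{y} \mid \mathbf{x})

% Pointwise mutual information between source and summary
\mathrm{pmi}(\mathbf{x}; \mathbf{y})
  = \log \frac{p(\mathbf{x}, \mathbf{y})}{p(\mathbf{x})\, p(\mathbf{y})}
  = \log p(\mathbf{y} \mid \mathbf{x}) - \log p(\mathbf{y})

% PMI decoding: maximize the PMI instead
\mathbf{y}^{\star} = \arg\max_{\mathbf{y}} \; \bigl[ \log p(\mathbf{y} \mid \mathbf{x}) - \log p(\mathbf{y}) \bigr]
```

The extra term, minus log p(y), lowers the score of summaries that are likely even without looking at the source, which is where unsupported content tends to come from.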

Presentation by Chaehee

https://velog.io/@chaehee/Mutual-Information-Alleviates-Hallucinations-in-Abstractive-Summarization-EMNLP-2022
