[Story Generation Study Week 02 : Story Generation & Story Completion] Event Representations for Automated Story Generation with Deep Neural Nets (AAAI, 2018) Review

AI/NLP 2022. 6. 29. 16:16

[Story Generation Study Week 02 : Story Generation & Story Completion]

[commonsense] A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories (NAACL, 2016)
[event] Event Representations for Automated Story Generation with Deep Neural Nets (AAAI, 2018)
Strategies for structuring story generation (ACL, 2019)

This session of the study looks at relatively early Story Generation papers.

Abstract & Introduction

What is Automated Story Generation?
- Definition: selecting a sequence of events, actions, or words that can be told as a story

⇩

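Previous work [1]
- Most story generation systems relied on symbolic planning or case-based reasoning
  - Limitation: they can only tell stories about topics covered by their domain knowledge / the results reflect hand-authored human knowledge more than the algorithm or engineering itself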

Open Story Generation [2]
- Open Story Generation aims to generate a story about any topic (↔ unlike earlier systems, which could only cover manually authored domain knowledge)

⇩

This paper ... [3]
- Tackles the Open Story Generation problem above with a recurrent encoder-decoder NN (e.g. Seq2Seq)
- Data: a corpus of movie plot summaries extracted from Wikipedia → chosen to cover as many topics as possible

The authors' idea [4,5]
- Limitations
  - Limitation 1: a model built at the character or word level can produce grammatically correct sentences, but not a narrative that stays coherent across sentences
  - Limitation 2: sentence-level events, on the other hand, make it hard to find any correlation between events
    - Even in a large corpus, each sentence most likely appears only once → the Event Sparsity problem!
- How to produce a coherent story? → "Event Representation!"
  - If the basic semantic information can be extracted from the sentences of existing stories, the model can learn from it what structure a good story has
  - Those templates can then be used to generate new stories

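[What is the Event Sparsity problem?]

["Choi Bul-am was playing with his grandson.", "Grandson: Good morning ~", .....]

⇩

1. <"Choi Bul-am", "play", ø, "with grandson">, <"grandson", "say", ø, "good morning">, ...
2. <"Entity 1", "play", ø, "with grandson">, <"Entity 2", "say", ø, "good morning">, ....

→ Represented as in 1, each event is likely to occur only once,
but represented abstractly as in 2 (Choi Bul-am → Entity 1), the same event is far more likely to recur (which resolves Event Sparsity)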
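
The effect of abstraction on sparsity can be checked with a quick count. This is only a toy sketch: the three events and the `generalize` helper are made up for illustration and are not taken from the paper's data or code.

```python
from collections import Counter

# Toy events as (subject, verb, object, modifier) tuples; the names are invented.
raw_events = [
    ("Choi Bul-am", "play", None, "grandson"),
    ("Kim", "play", None, "grandson"),
    ("Lee", "play", None, "grandson"),
]

def generalize(event):
    """Replace the subject with a placeholder tag (hypothetical scheme)."""
    _, verb, obj, modifier = event
    return ("Entity1", verb, obj, modifier)

raw_counts = Counter(raw_events)
generalized_counts = Counter(generalize(e) for e in raw_events)

print(len(raw_counts))          # 3 distinct events, each seen once
print(len(generalized_counts))  # 1 distinct event, seen three times
```

With the concrete names, every event is unique; after abstraction the same event shows up three times, which gives a model something to learn from.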

⇩

How can the basic semantic information in the sentences of the dataset be extracted well?
By finding an Event Representation that expresses it well!

⇩

That Event Representation is then used to generate new stories

Contributions [6~8]
1. An event representation, event2event, and event2sentence [6,7]
  - event2event : generates the event representations that follow the current event
  - event2sentence : translates abstract events into natural, human-readable sentences
2. A full story generation pipeline that combines the event2event and event2sentence models [8]

[Three-line summary]

1. Within Automated Story Generation, the paper targets Open Story Generation and uses an Event Representation to keep the story coherent. The rest of this review covers what an Event Representation is,

2. the Event2Event model, which generates new event representations from the current events, and

3. the Event2Sentence model, which turns the new events into natural sentences ~!

Event Representation

: This section explains the core concept, the Event Representation.

: It also looks at how performance changes with the level of abstraction of the input, and how well the meaning is preserved!

Structure of an Event Representation
- 4-tuple <s, v, o, m> (all stemmed)
  - s : subject
  - v : verb
  - o : object
  - m : modifier
- If there is no object or modifier, the slot is filled with ø (EmptyParameter)

Extracting Event Representations
- One or more events can be extracted from each sentence
- “John and Mary went to the store,”
  - <john, go, store, ∅>
  - <mary, go, store, ∅>
- Average number of events per sentence = 2.69
- Average number of sentences per story = 14.515
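
As a rough sketch of the 4-tuple and of how one sentence can yield several events: the `Event` type and `events_from_clause` helper below are hypothetical, and they assume parsing and lemmatization have already been done by some other tool (the hard part is not shown here).

```python
from typing import List, NamedTuple

EMPTY = "∅"  # stands in for the EmptyParameter slot

class Event(NamedTuple):
    s: str  # subject
    v: str  # verb (already reduced to its base form)
    o: str  # object
    m: str  # modifier

def events_from_clause(subjects: List[str], verb: str,
                       obj: str = EMPTY, modifier: str = EMPTY) -> List[Event]:
    """Emit one event per subject of an already-parsed clause."""
    return [Event(subject.lower(), verb, obj, modifier) for subject in subjects]

# "John and Mary went to the store" with the parse supplied by hand.
events = events_from_clause(["John", "Mary"], "go", obj="store")
for event in events:
    print(event)
```

The conjoined subject "John and Mary" produces two events that share the same verb and object, matching the example above.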

Variations of the Event Representation : raising the level of abstraction! (to reduce sparsity)
1. <s, v, o, m> : the simplest form
  - <Choi Bul-am, play, ∅, with grandson>
2. Generalized
  - “PERSON” names were replaced with the tag <NE>n
    - <Choi Bul-am, play, ∅, with grandson> → <NE1, play, ∅, NE2>
  - Other named entities were labeled as their NER category (e.g. LOCATION, ORGANIZATION, etc.).
3. Named Entity Numbering
  - The paper tried two schemes
    1. Reset the named entity numbering at every sentence (sentence NEs)
    2. Reset the named entity numbering at every input-output pair (continued NEs)
4. Adding Genre Information
  - 5-tuple <s, v, o, m, g>
    - <Choi Bul-am, play, ∅, with grandson> → <Choi Bul-am, play, ∅, with grandson, humor>
  - g : genre cluster number (100 clusters obtained by topic modeling with LDA)
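
The two numbering schemes are easier to see side by side. The function below is a minimal sketch, not the authors' code: the `people` set stands in for a real NER step, and starting the numbering at 0 is an arbitrary choice made here.

```python
def generalize_pair(sentences, people, continued=False):
    """Replace person names with <NE>n tags.

    sentences: the sentences of one input-output pair, each a list of (s, v, o, m) tuples.
    people:    strings treated as PERSON names (assumed to come from an NER tool).
    continued: False restarts numbering in every sentence ("sentence NEs");
               True keeps one numbering across the whole pair ("continued NEs").
    """
    ids = {}
    result = []
    for sentence in sentences:
        if not continued:
            ids = {}  # forget the entities seen in earlier sentences
        new_sentence = []
        for event in sentence:
            new_event = []
            for slot in event:
                if slot in people:
                    if slot not in ids:
                        ids[slot] = len(ids)
                    new_event.append(f"<NE>{ids[slot]}")
                else:
                    new_event.append(slot)
            new_sentence.append(tuple(new_event))
        result.append(new_sentence)
    return result

pair = [[("john", "meet", "mary", "∅")], [("mary", "leave", "∅", "∅")]]
people = {"john", "mary"}

sentence_nes = generalize_pair(pair, people, continued=False)
continued_nes = generalize_pair(pair, people, continued=True)
print(sentence_nes[1])   # mary is renumbered from scratch in the second sentence
print(continued_nes[1])  # mary keeps the number she got in the first sentence
```

With sentence NEs the same tag can refer to different characters in adjacent sentences; with continued NEs the tag stays attached to the same character within the pair.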

⇒ The final story can then be expressed as a sequence of event representations like the ones above!

⇩

⇒ In other words, a story can be generated by maximizing the probability of the next event!

Event-to-Event

: This section looks at how performance changes when each of the event representations defined above is fed into the Event2Event model

: Evaluation - given an input event x from a story, how well does the model predict the next event y? / measured with Perplexity...

model : a recurrent multi-layer encoder-decoder network based on (Sutskever, Vinyals, and Le 2014) - a Seq2seq model
input :

output :

metrics: Perplexity / BLEU

1) Perplexity

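: PPL indicates how many choices, on average, the language model is deliberating between at a given step (it can be read as the model's 'degree of confusion')

: A higher PPL does not mean better performance; the 'lower' the PPL, the better the language model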
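
A small numeric sketch of the usual definition (the exponential of the average negative log-probability the model assigns to the observed tokens). The probability lists are invented for illustration.

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-probability of the observed tokens."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

# A model that always hesitates between 4 equally likely options.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])

# A model that is fairly sure about every token.
confident = perplexity([0.9, 0.8, 0.95, 0.9])

print(uniform)    # about 4, i.e. "choosing among 4 options"
print(confident)  # close to 1, so much less confused
```

The uniform case comes out at roughly 4, which is where the "number of choices" reading of PPL comes from.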

2) BLEU

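: A way to measure translation quality by comparing how similar the machine translation output (prediction) is to a human translation (ground truth)

: The degree to which the words of the generated sentence appear in the reference sentence

: Based on n-grams (n = 1~4 in this paper)

- Measures how much the n-grams overlap (precision)
- Corrects for sentence length, penalizing outputs that are too short (Brevity Penalty)
- Corrects for the inflated score when the same word is repeated (Clipping)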
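
The three ingredients (n-gram precision, clipping, brevity penalty) fit in a few lines. This is a simplified single-reference, sentence-level sketch without smoothing, written for illustration; it is not the evaluation script used in the paper.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, counting each
    n-gram at most as often as it occurs in the reference (clipping)."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total

def brevity_penalty(candidate, reference):
    """Penalize candidates that are shorter than the reference."""
    c, r = len(candidate), len(reference)
    if c == 0:
        return 0.0
    return 1.0 if c > r else math.exp(1 - r / c)

def bleu(candidate, reference, max_n=4):
    precisions = [clipped_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(candidate, reference) * math.exp(log_mean)

reference = "the cat sat on the mat".split()
repeated = "the the the the".split()

# Without clipping this would be 4/4; "the" occurs only twice in the reference.
print(clipped_precision(repeated, reference, 1))  # 0.5
print(bleu(reference, reference))                 # identical sentences score 1.0
```

The repeated-word candidate shows why clipping matters: every token matches the reference, yet only two of the four matches are credited.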

Experimental Results

Original word events have a perplexity similar to that of the original sentences, but generalizing with <NE> tags improves it markedly!
Overall, the generalized events gave better perplexity scores.
- Using bigrams gave higher BLEU scores, and
- adding genre information lowered perplexity
  - every event in one sentence could be related to every event in the next sentence!
The original word events got higher BLEU scores, but BLEU was low across all experiments, because BLEU is not an appropriate measure for this task!
- BLEU judges how well the input is recreated (whereas what matters here is predicting what comes next, not recreating!)
- Perplexity is the better metric here, because it reflects how well the model predicts the entire test dataset ...

Event-to-Sentence

: Events alone are not intuitive for people to read, so Event2Sentence is the model that turns the generated events into sentences!

: Evaluation - how well does it translate events into natural-language sentences?

Model : LSTM RNN - beam search decoder (B=5)
Input : the events of a particular representation
Output : a newly-generated sentence based on the input event(s).
Metrics : Perplexity / BLEU
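
For readers unfamiliar with beam search, here is a generic sketch of the decoding idea with a beam width like the B=5 mentioned above. The `table` of next-token probabilities is a made-up stand-in for a trained decoder, and the function is not the paper's implementation.

```python
import math

def beam_search(next_probs, start, end, beam_width=5, max_len=10):
    """Keep the beam_width best partial sequences at every step.

    next_probs(seq) must return a dict {token: probability} for the next token.
    """
    beams = [([start], 0.0)]  # (sequence, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, p in next_probs(seq).items():
                if p > 0:
                    candidates.append((seq + [token], score + math.log(p)))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == end:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses if needed
    return max(finished, key=lambda c: c[1])[0]

# Toy "model": the next token depends only on the previous one.
table = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"dog": 0.55, "cat": 0.45},
    "a": {"gun": 0.9, "cat": 0.1},
    "dog": {"</s>": 1.0},
    "cat": {"</s>": 1.0},
    "gun": {"</s>": 1.0},
}

def next_probs(seq):
    return table[seq[-1]]

greedy = beam_search(next_probs, "<s>", "</s>", beam_width=1)
wide = beam_search(next_probs, "<s>", "</s>", beam_width=5)
print(greedy)  # ['<s>', 'the', 'dog', '</s>']
print(wide)    # ['<s>', 'a', 'gun', '</s>']
```

With a beam of 1 the decoder commits to "the" and ends up with a sequence of probability 0.33, while the wider beam keeps "a" alive and finds the 0.36 sequence.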

[New approach 1 : Generalized Sentence]

Turning generalized events into natural-language sentences is a harder task, because the model would have to predict the character and place names that were replaced by <NE> tags
- So the paper proposes a new approach: train event2sentence on generalized sentences!
- Example
  - Original Sentence : The remaining craft launches a Buzz droid at the ARC 1 7 0 which lands near the Clone Trooper rear gunner who uses a can of Buzz Spray to dislodge the robot.
  - Generalized Sentence : The remaining activity.n.01 launches a happening.n.01 droid at the ORGANIZATION 1 7 0 which property.n.01 near the person.n.01 enlisted person.n.01 rear skilled worker.n.01 who uses a instrumentality.n.03 of happening.n.01 chemical.n.01 to dislodge the device.n.01

[New approach 2 : S+P Sentence (split and prune sentences)]

Instead of treating a long sentence that carries several events as a single unit, it is split into several shorter sentences!
- Remove prepositional phrases, split on conjunctions, and so on, to cut the sentence into small pieces
- Example
  - Original Sentence : Lenny begins to walk away but Sam insults him so he turns and fires, but the gun explodes in his hand.
  - S+P Sentence : Lenny begins to walk away. Sam insults him. He turns and fires. The gun explodes.

Experimental Results

Here too, perplexity is better with generalized sentences,
but with split and pruned sentences, BLEU is higher when the original words are kept

Generalized events with full-length generalized sentences get a higher BLEU score than the original words
With S+P sentences, however, the pattern is reversed ~
Both S+P and word generalization reduce event sparsity, so combining the two throws away too much information

[Three-line summary]

1. An Event Representation was used to abstract the events contained in a sentence, and performance changes with the level of abstraction of the event representation fed in as input ~

2. The Event2Event model uses Seq2Seq and was evaluated on how well it generates the events of the next sentence from the events of the input sentence

Future Work

I'll be back with the next paper I'm presenting,, hehe

Link to the next paper :

https://asidefine.tistory.com/194

Conclusions

Event Representation matters in Automated Story Generation (as the title suggests .. )
The paper proposes a way to choose a representation that reduces event sparsity while preserving the semantic meaning of the story data
- because event sparsity lowers story generation performance..
It builds a story generation pipeline made of event2event / event2sentence
- event2event : Data → event representation
- event2sentence : abstract, generalized event representation → natural language ..
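
How the two stages chain together can be sketched with stand-in functions. Everything below is hypothetical: `toy_event2event` is a lookup table and `toy_event2sentence` just joins the slots, whereas the paper trains neural networks for both steps.

```python
EMPTY = "∅"

# Hand-written "next event" table standing in for a trained event2event model.
NEXT_EVENT = {
    ("<NE>0", "enter", "room", EMPTY): ("<NE>0", "see", "<NE>1", EMPTY),
    ("<NE>0", "see", "<NE>1", EMPTY): ("<NE>1", "smile", EMPTY, EMPTY),
}

def toy_event2event(event):
    """Return the next event, or None when the toy table has no continuation."""
    return NEXT_EVENT.get(event)

def toy_event2sentence(event):
    """Stand-in for event2sentence: join the non-empty slots into a string."""
    return " ".join(slot for slot in event if slot != EMPTY) + "."

def generate_story(seed_event, event2event, event2sentence, max_events=5):
    """Roll event2event forward from a seed, then realize every event as text."""
    events = [seed_event]
    while len(events) < max_events:
        next_event = event2event(events[-1])
        if next_event is None:
            break
        events.append(next_event)
    return [event2sentence(event) for event in events]

story = generate_story(("<NE>0", "enter", "room", EMPTY),
                       toy_event2event, toy_event2sentence)
for line in story:
    print(line)
```

Keeping the two stages as separate callables mirrors the split in the pipeline: the first decides what happens next, the second decides how to say it.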

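Question before studying
- How is performance evaluated in the Story Generation field?
  - How natural or how fun a story is would have to be a qualitative evaluation ...

Answer after studying
- How is performance evaluated in the Story Generation field?
  - With Perplexity and BLEU !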

Reference

https://arxiv.org/abs/1706.01331

Event Representations for Automated Story Generation with Deep Neural Nets
