[2023 Spring NLP Seminar] SimCSE: Simple Contrastive Learning of Sentence Embeddings (EMNLP 2021) Brief Review

AI/NLP 2023. 3. 15. 14:46


Abstract / Introduction

SimCSE: a simple contrastive learning framework that advances sentence embeddings
- Prior work: SBERT (Reimers and Gurevych, 2019), among others
- Unsupervised approach
  - Positive pair: the same sentence passed through the encoder twice, each time with a different standard dropout mask
  - Negative pairs: the other sentences in the same mini-batch
  - This simple method outperforms next-sentence prediction objectives and discrete data augmentation (word deletion, word replacement, etc.)
  - Dropout acts like data augmentation; without dropout, representation collapse occurs
- Supervised approach
  - Uses natural language inference (NLI) datasets, which originally have 3 labels (entailment, neutral, contradiction)
  - Positive pair: entailment
  - Negative pair: contradiction (used as hard negatives)
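
The objective described in the list above can be sketched in a few lines of dependency-free Python. This is an illustration under stated assumptions, not the authors' implementation: the names `cosine` and `info_nce`, the list-of-floats representation, and the default temperature are all chosen here for the example. In the unsupervised setting `anchors` and `positives` would be two dropout-perturbed encodings of the same sentences; in the supervised setting they would be premise and entailment embeddings, with contradiction embeddings passed as `hard_negatives`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, hard_negatives=None, temperature=0.05):
    """Mean in-batch contrastive loss (illustrative sketch).

    anchors[i] should match positives[i]; every other positives[j] in the
    batch is treated as a negative. If hard_negatives is given, those
    vectors are appended as extra candidates for every anchor.
    The temperature default is an assumption for this example.
    """
    candidates = list(positives) + list(hard_negatives or [])
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, c) / temperature for c in candidates]
        # Numerically stable log-sum-exp over all candidates
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability of picking the true positive (index i)
        total += log_denominator - logits[i]
    return total / len(anchors)
```

With toy vectors, the loss is small when each anchor is closest to its own positive and grows when the positives are shuffled or when a hard negative sits near an anchor.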

After the presentation

Dropout can be seen as a kind of data augmentation
- because it injects noise
- The baselines applied discrete augmentation (crop, word deletion, word replacement, etc.) and fine-tuned BERT-base
- Simple dropout performed better than the discrete augmentations
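
To make the contrast concrete, here is a minimal sketch of two of the discrete augmentations named above next to dropout-style noise. The function names, the probabilities, and the idea of applying dropout to a single vector are assumptions for illustration only; in SimCSE the noise comes from the encoder's own dropout inside the Transformer layers, not from a separate step on the output.

```python
import random


def word_deletion(sentence, p=0.1, rng=random):
    """Discrete augmentation: drop each token with probability p (keep at least one)."""
    tokens = sentence.split()
    kept = [t for t in tokens if rng.random() >= p]
    return " ".join(kept or tokens[:1])


def crop(sentence, keep_ratio=0.9, rng=random):
    """Discrete augmentation: keep one contiguous span of the tokens."""
    tokens = sentence.split()
    if not tokens:
        return sentence
    span = max(1, int(len(tokens) * keep_ratio))
    start = rng.randint(0, len(tokens) - span)
    return " ".join(tokens[start:start + span])


def dropout(vector, p=0.1, rng=random):
    """Continuous noise: zero each unit with probability p and rescale the survivors."""
    return [0.0 if rng.random() < p else x / (1.0 - p) for x in vector]
```

The discrete operations change the input text itself and can alter its meaning, whereas dropout leaves the sentence intact and only perturbs the hidden representation, so two passes over the same sentence give two slightly different views of it.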
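word embedding ↔ sentence embedding
- When context has to be taken into account, as with homonyms, an anisotropic distribution may be needed? (or has this been shown?)
  - But if all the words are packed densely together, distance computations become unreliable
- So for sentence embeddings, the embedding space needs to be made isotropic to solve this problem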
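
One common proxy for how anisotropic a set of embeddings is (an assumption here, not a measure taken from the presentation) is the average cosine similarity over all pairs: values near 1 mean the vectors sit in a narrow cone, where distances carry little information. The helper names and toy vectors below are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all distinct pairs; higher means more anisotropic."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy data (hypothetical): a narrow cone versus directions spread around the origin
narrow_cone = [[1.0, 0.1], [1.0, 0.0], [1.0, -0.1]]
spread_out = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
```

Comparing `mean_pairwise_cosine(narrow_cone)` with `mean_pairwise_cosine(spread_out)` shows the difference: in the narrow cone every pair looks almost identical, which is the distance problem the notes describe.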
On a task that retrieves similar sentences from the Flickr30k dataset given a random sentence as the query, SimCSE did better than SBERT
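
The retrieval setup can be sketched as a nearest-neighbour search by cosine similarity. This is only an illustration: the `retrieve` function, the `(sentence, vector)` corpus layout, and the two-dimensional toy vectors are hypothetical, and a real run would use embeddings produced by SimCSE or SBERT.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def retrieve(query_vec, corpus, k=3):
    """Return the k sentences whose embeddings are most similar to the query.

    corpus is a list of (sentence, embedding) pairs.
    """
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [sentence for sentence, _ in ranked[:k]]


# Hypothetical toy corpus with made-up 2-d "embeddings"
corpus = [
    ("a dog runs", [1.0, 0.0]),
    ("a puppy runs", [0.9, 0.1]),
    ("stock prices fell", [0.0, 1.0]),
]
```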
Contrastive learning + BERT-base achieves SOTA!
- A simple self-supervised approach: even simpler than next sentence prediction, yet more effective!
- It learns to place similar words close together in embedding space: could this be used in a multilingual setting?

# MultipleNegativesRankingLoss from SBERT (sentence-transformers)

from typing import Dict, Iterable

import torch
from torch import Tensor, nn
from sentence_transformers import SentenceTransformer, util


class MultipleNegativesRankingLoss(nn.Module):
    def __init__(self, model: SentenceTransformer, scale: float = 20.0, similarity_fct=util.cos_sim):
        """
        :param model: SentenceTransformer model
        :param scale: Output of similarity function is multiplied by scale value
        :param similarity_fct: similarity function between sentence embeddings. By default, cos_sim. Can also be set to dot product (and then set scale to 1)
        """
        super().__init__()
        self.model = model
        self.scale = scale
        self.similarity_fct = similarity_fct
        self.cross_entropy_loss = nn.CrossEntropyLoss()

    def forward(self, sentence_features: Iterable[Dict[str, Tensor]], labels: Tensor):
        # One embedding matrix per input column: anchors first, then positives (and any further columns)
        reps = [self.model(sentence_feature)['sentence_embedding'] for sentence_feature in sentence_features]
        embeddings_a = reps[0]
        embeddings_b = torch.cat(reps[1:])

        # scores[i][j] is the scaled similarity between a[i] and b[j]
        scores = self.similarity_fct(embeddings_a, embeddings_b) * self.scale
        # Example a[i] should match with b[i], so the target class for row i is i
        labels = torch.arange(len(scores), dtype=torch.long, device=scores.device)
        return self.cross_entropy_loss(scores, labels)
        
#### or simply, ####
from sentence_transformers import SentenceTransformer, losses, InputExample

# `model` is assumed to be an already-loaded SentenceTransformer instance
train_loss = losses.MultipleNegativesRankingLoss(model=model)

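Looking at forward(), it computes the cosine similarity between every anchor embedding and every candidate embedding, multiplies it by scale, and applies cross-entropy with index i as the target for row i. In other words, a[i] should match b[i], and every other b[j] in the batch acts as a negative.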
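
Written out for a batch of N pairs, the quantity that forward() returns can be stated as below. This is a restatement of the code rather than a formula quoted from the library documentation; it assumes exactly one positive column and no extra columns, with s standing for `scale` and sim for `similarity_fct`.

```latex
\mathcal{L} \;=\; -\frac{1}{N}\sum_{i=1}^{N}
\log \frac{\exp\big(s \cdot \mathrm{sim}(a_i, b_i)\big)}
          {\sum_{j=1}^{N} \exp\big(s \cdot \mathrm{sim}(a_i, b_j)\big)}
```

Multiplying by s = 20 is the same as dividing by a temperature of 1/20 = 0.05. If further columns are passed in, `torch.cat(reps[1:])` appends them to the candidates, so the sum in the denominator runs over those rows as well.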

Minhan's presentation

https://velog.io/@zvezda/SimCSE-Simple-Contrastive-Learning-of-Sentence-Embeddings-EMNLP-2021

