[2023 Spring Lab Seminar] When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer (NAACL, 2022)
AI/NLP, 2023. 4. 25. 18:39
Abstract + Introduction
- Research Goal
- Which linguistic properties help cross-lingual zero-shot transfer?
- Method
- 16 natural-synthetic language pairs (4 languages x 4 transformations)
- Languages: 1) English, 2) French, 3) Arabic, and 4) Hindi
- 4 downstream tasks: 1) NLI, 2) NER, 3) POS tagging, and 4) QA
- Transformations: 1) inverting word order, 2) permuting word order, 3) altering the script, and 4) varying the syntax
- Findings
- The absence of subword overlap degrades transfer when languages differ in their word order.
- Token embedding alignment correlates strongly with zero-shot transfer.
- Using pre-training corpora from similar sources boosts transfer when compared to corpora from different sources.
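The word-order transformations listed under Method can be illustrated with a minimal sketch. This is not the authors' code: the function names, the fixed seed, and the `remap_tokens` helper (a hypothetical stand-in for a script change that removes all subword overlap) are assumptions made for illustration only.

```python
import random


def invert(tokens):
    """Inversion: reverse the word order of a sentence."""
    return tokens[::-1]


def permute(tokens, seed=0):
    """Permutation: shuffle the word order.

    The fixed seed is an assumption so the example is reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(tokens)
    rng.shuffle(shuffled)
    return shuffled


def remap_tokens(tokens, marker="#"):
    """Hypothetical script change: map every token one-to-one to a new
    symbol, so the synthetic language shares no tokens with the original."""
    return [marker + tok for tok in tokens]


sentence = "the cat sat on the mat".split()
print(invert(sentence))
print(permute(sentence))
print(remap_tokens(sentence))
```

Each function turns a natural-language sentence into its synthetic counterpart while changing only one property, which is what lets the effect of that property on transfer be isolated.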
Results
- Similar Word Orders
- Sub-word Overlap
- Same Domain
- Embedding Alignment
- Zero-shot performance is strongly correlated with embedding alignment
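The correlation between embedding alignment and zero-shot performance can be sketched as below. This is a sketch under stated assumptions, not the paper's exact procedure: alignment is assumed here to be nearest-neighbor retrieval accuracy under cosine similarity, the correlation is a simple Spearman rank correlation without tie handling, and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def alignment_accuracy(src, tgt):
    """Assumed alignment metric: fraction of source token embeddings whose
    nearest target embedding is their own counterpart (src[i] <-> tgt[i])."""
    hits = 0
    for i, u in enumerate(src):
        best = max(range(len(tgt)), key=lambda j: cosine(u, tgt[j]))
        hits += int(best == i)
    return hits / len(src)


def ranks(values):
    """Rank positions of the values (0 = smallest); ties are not handled."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for tie-free data."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

With one alignment score and one zero-shot score per language pair, `spearman(alignment_scores, zero_shot_scores)` gives a single number summarizing how closely the two move together.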
https://aclanthology.org/2022.naacl-main.264/