Augmenting LLMs Through External Knowledge - RAG (Retrieval-Augmented Generation)


땽뚕 2024. 4. 22. 10:31


One limitation of LLMs
- They make limited use of up-to-date knowledge, specific cases, and private information.
Retrieval-Augmented Generation (RAG) is one way to address this problem.

What is RAG (Retrieval-Augmented Generation)?

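RAG
- Extracts a query from the input prompt and uses that query to retrieve relevant information from an external knowledge source.
- Appends the retrieved information to the original prompt and passes it to the LLM, which generates the final response from it.
- A RAG system therefore consists of three key components: retrieval, generation, and augmentation.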
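The retrieve, augment, generate flow described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `toy_retrieve`, `toy_generate`, the prompt template, and the two-sentence knowledge list are all hypothetical stand-ins for a real retriever and a real LLM call.

```python
from typing import Callable, List


def build_augmented_prompt(query: str, retrieved: List[str]) -> str:
    """Augmentation: append the retrieved passages to the original query."""
    context = "\n".join(f"[{i + 1}] {passage}" for i, passage in enumerate(retrieved))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def rag_answer(
    query: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> str:
    """Run the three steps in order: retrieve -> augment -> generate."""
    passages = retrieve(query)
    prompt = build_augmented_prompt(query, passages)
    return generate(prompt)


def toy_retrieve(query: str) -> List[str]:
    """Hypothetical retriever: returns documents sharing a word longer than 3 characters."""
    knowledge = ["The office is closed on Fridays.", "Paris is the capital of France."]
    keywords = {w for w in query.lower().strip("?").split() if len(w) > 3}
    return [doc for doc in knowledge if keywords & set(doc.lower().strip(".").split())]


def toy_generate(prompt: str) -> str:
    """Hypothetical generator: a real system would send the prompt to an LLM here."""
    return f"(an LLM would answer here, given {len(prompt)} prompt characters)"


answer = rag_answer("What is the capital of France?", toy_retrieve, toy_generate)
```

Passing the retriever and generator in as functions keeps the three components separate, so either one can be replaced without touching the augmentation step.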

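How RAG differs from Fine-Tuning and Prompt Engineering
- The three can be distinguished by 1) how much external knowledge they require and 2) how much model adaptation they require.
- (1) Prompt Engineering
  - Relies on the model's inherent capabilities, with minimal need for external knowledge or model adaptation.
- (2) RAG
  - Similar to giving the model a tailored textbook for looking things up; ideal for tasks that need precise information retrieval.
- (3) Fine-Tuning
  - Similar to a student internalizing knowledge over time; suited to scenarios that require reproducing a specific structure, style, or format.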

Types of RAG

1) Naive RAG

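The earliest methodology, which emerged as ChatGPT began drawing attention.
It consists of indexing, retrieval, and generation.
- 1) Indexing
  - Starts by cleaning and extracting raw data in formats such as PDF, HTML, Word, and Markdown.
  - The data is converted to a uniform plain-text format -> the text is split into chunks -> each chunk is encoded into a vector representation with an embedding model and stored in a vector database (VectorDB).
- 2) Retrieval
  - On receiving a user query, the RAG system converts it into a vector representation using the same encoding model that was used during indexing.
  - It computes similarity scores between the query vector and the chunk vectors in the indexed corpus.
  - The top K chunks with the highest similarity are retrieved.
- 3) Generation
  - The query and the selected documents are combined into a single prompt and fed to the LLM.
Drawbacks
- The retrieval step may select chunks that are entirely unrelated to the query.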
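The indexing and retrieval steps above might look something like the sketch below. It is a toy under stated assumptions: `embed` is a bag-of-words counter standing in for a real embedding model, the "vector database" is a plain Python list, and the sample documents, chunk size, and prompt template are made up for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def chunk_text(text: str, chunk_size: int = 8) -> List[str]:
    """Split plain text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents: List[str], chunk_size: int = 8) -> List[Tuple[str, Counter]]:
    """Indexing: chunk each document and store (chunk, vector) pairs."""
    return [(chunk, embed(chunk)) for doc in documents for chunk in chunk_text(doc, chunk_size)]


def retrieve(query: str, index: List[Tuple[str, Counter]], k: int = 2) -> List[Tuple[str, float]]:
    """Retrieval: embed the query with the same embed() and return the top-k (chunk, score) pairs."""
    q_vec = embed(query)
    scored = [(chunk, cosine(q_vec, vec)) for chunk, vec in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def build_prompt(query: str, chunks: List[str]) -> str:
    """Generation input: merge the query and the selected chunks into one prompt."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"


documents = [
    "Vector databases store embeddings for similarity search.",
    "Bananas are rich in potassium.",
]
index = build_index(documents)
query = "How do vector databases store embeddings?"
hits = retrieve(query, index, k=2)
prompt = build_prompt(query, [chunk for chunk, _ in hits])
```

With `k=2`, the banana chunk is still returned even though its similarity score is 0, because top-K selection always returns K chunks. This is one way the unrelated-chunk drawback can arise.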

2) Advanced RAG

To overcome Naive RAG's limitations in the retrieval step, Advanced RAG adds 1) a pre-retrieval process and 2) a post-retrieval process.
1) Pre-retrieval Process
- Improves the quality of the content being indexed.
- Available methods: enhancing data granularity, optimizing index structures, adding metadata, alignment optimization, and mixed retrieval
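Two of the methods listed above, adding metadata and mixed retrieval, could be combined along these lines. This is only a sketch under assumptions: the `Chunk` class, the `where` filter, the `alpha` weighting, and the hand-written two-dimensional vectors are hypothetical, and the keyword score is a crude stand-in for a real sparse retriever.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class Chunk:
    text: str
    vector: Sequence[float]                                 # would come from an embedding model
    metadata: Dict[str, str] = field(default_factory=dict)  # e.g. source, date, section


def dense_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that appear in the chunk text."""
    q_terms = set(query.lower().split())
    return len(q_terms & set(text.lower().split())) / len(q_terms) if q_terms else 0.0


def mixed_retrieve(
    query: str,
    query_vec: Sequence[float],
    chunks: List[Chunk],
    where: Optional[Dict[str, str]] = None,
    alpha: float = 0.5,
    k: int = 2,
) -> List[Chunk]:
    """Filter by metadata, then rank by a weighted mix of dense and keyword scores."""
    filters = where or {}
    candidates = [c for c in chunks if all(c.metadata.get(key) == val for key, val in filters.items())]

    def score(c: Chunk) -> float:
        return alpha * dense_score(query_vec, c.vector) + (1 - alpha) * keyword_score(query, c.text)

    return sorted(candidates, key=score, reverse=True)[:k]


chunks = [
    Chunk("refund policy for online orders", [1.0, 0.0], {"source": "faq"}),
    Chunk("shipping times for online orders", [0.0, 1.0], {"source": "faq"}),
    Chunk("refund policy draft", [1.0, 0.0], {"source": "internal"}),
]
top = mixed_retrieve("refund policy", [1.0, 0.0], chunks, where={"source": "faq"}, k=1)
```

The metadata filter narrows the candidate set before scoring, so the internal draft never competes with the published FAQ entry even though its vector is identical.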
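2) Post-retrieval Process
- Once relevant context has been retrieved, it is important to merge it with the query effectively.
- Available methods: reranking chunks, context compression
  - Reranking
    - Relocates the most relevant content to the edges of the prompt.
    - LlamaIndex, LangChain, and HayStack implement this approach.
- The focus is on selecting essential information, emphasizing important parts, and shortening the context to be processed.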
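As a rough illustration of the two post-retrieval ideas above, the sketch below reorders scored chunks so that the most relevant ones sit at the start and end of the context, and applies a very crude extractive compression. Both functions are hypothetical helpers written for this post, not the API of LlamaIndex, LangChain, or HayStack, and the scores in the example are made up.

```python
import re
from typing import List, Tuple


def reorder_to_edges(scored_chunks: List[Tuple[str, float]]) -> List[str]:
    """Place the highest-scoring chunks at the edges and the weakest in the middle."""
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    front: List[str] = []
    back: List[str] = []
    for i, (chunk, _) in enumerate(ranked):
        # Alternate: best goes to the front, second best to the back, and so on.
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]


def compress(chunk: str, query: str) -> str:
    """Crude compression (assumption): keep only sentences that share a word with the query."""
    q_terms = set(re.findall(r"\w+", query.lower()))
    sentences = re.split(r"(?<=[.!?])\s+", chunk.strip())
    kept = [s for s in sentences if q_terms & set(re.findall(r"\w+", s.lower()))]
    return " ".join(kept)


scored = [("A", 0.9), ("B", 0.8), ("C", 0.5), ("D", 0.3), ("E", 0.1)]
ordered = reorder_to_edges(scored)

shortened = compress(
    "RAG retrieves documents. The weather is nice. Retrieval uses vectors.",
    "How does RAG use vectors",
)
```

Here the two strongest chunks, `A` and `B`, end up first and last, while the weakest, `E`, lands in the middle. A production system would typically compute the scores with a dedicated reranking model rather than reuse the retrieval scores.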

3) Modular RAG

An approach that builds on the two strategies above, for example by changing how the individual modules work.

RAG Tools Comparison - LlamaIndex vs. LangChain

Related Papers

- Retrieval-Augmented Generation for Large Language Models: A Survey. https://arxiv.org/abs/2312.10997
- Large Language Models: A Survey. https://arxiv.org/abs/2402.06196
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. https://arxiv.org/abs/2005.11401
