Spring AI RAG

Mr.LR, December 6, 2024

What is RAG

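**RAG (Retrieval-Augmented Generation)** is a technique that combines information retrieval with generative AI. It is commonly used to improve the accuracy and completeness of answers from a large language model (LLM). The core idea has two parts:

- Retrieval: fetch information relevant to the user's query from an external knowledge source (a database, documents, an API, a vector store, and so on).
- Generation: pass the retrieved information to the LLM so that it answers from up-to-date or more accurate data instead of relying only on what it learned during training.

In other words, the model can give more accurate answers because it works from the internal documents and knowledge bases that we supply.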
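
The two steps above can be sketched in plain Java without any framework. This is only a toy illustration: the class `RagFlowSketch`, the sample documents, and the word-overlap scoring are all made up for this example, and a real system would use a vector store for retrieval and send the resulting prompt to an LLM.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

class RagFlowSketch {

    // Hypothetical in-memory knowledge base; a real setup would use a vector store.
    static final List<String> KNOWLEDGE_BASE = List.of(
            "Oracle index tuning notes: check the execution plan first.",
            "Spring AI provides a ChatClient for talking to models.",
            "A function-based index stores the result of an expression.");

    // Retrieval step: rank documents by how many query words they contain.
    // This is a toy stand-in for a real similarity search.
    static List<String> retrieve(String query, int topK) {
        String[] words = query.toLowerCase(Locale.ROOT).split("\\W+");
        List<String> ranked = new ArrayList<>(KNOWLEDGE_BASE);
        ranked.sort(Comparator.comparingInt((String doc) -> overlap(doc, words)).reversed());
        return ranked.subList(0, Math.min(topK, ranked.size()));
    }

    // Count how many of the query words occur somewhere in the document.
    static int overlap(String doc, String[] words) {
        String lower = doc.toLowerCase(Locale.ROOT);
        int hits = 0;
        for (String w : words) {
            if (!w.isEmpty() && lower.contains(w)) {
                hits++;
            }
        }
        return hits;
    }

    // Generation step input: the retrieved context is placed in front of the question.
    // The returned string is what would be sent to the LLM.
    static String buildPrompt(String query, List<String> context) {
        return "Context:\n" + String.join("\n", context) + "\n\nQuestion: " + query;
    }

    public static void main(String[] args) {
        String query = "How do I start Oracle index tuning?";
        System.out.println(buildPrompt(query, retrieve(query, 1)));
    }
}
```

The point of the sketch is the shape of the pipeline: retrieval narrows the knowledge base to a few relevant passages, and only those passages travel to the model together with the question.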

QuestionAnswerAdvisor

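A vector database stores data that the AI model does not know about. When a user question is sent to the AI model, QuestionAnswerAdvisor queries the vector database for documents related to that question.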
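
To make "related documents" concrete, here is a minimal sketch of a similarity lookup, assuming documents and the question have already been turned into embedding vectors and that cosine similarity is the ranking metric. The class `SimilaritySearchSketch`, the document ids, and the three-dimensional vectors are hypothetical; the actual metric and index structure depend on the vector store in use.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SimilaritySearchSketch {

    // Cosine similarity between two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Return the ids of the k stored vectors most similar to the query vector.
    static List<String> topK(Map<String, double[]> store, double[] query, int k) {
        return store.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> cosine(e.getValue(), query)).reversed())
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up embeddings; real ones come from an embedding model and have many more dimensions.
        Map<String, double[]> store = Map.of(
                "doc-index-tuning", new double[] {0.9, 0.1, 0.0},
                "doc-spring-boot", new double[] {0.1, 0.9, 0.2},
                "doc-cooking", new double[] {0.0, 0.2, 0.9});
        double[] queryEmbedding = {0.8, 0.2, 0.1};
        System.out.println(topK(store, queryEmbedding, 2));
    }
}
```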

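Building on the earlier example, we store this site's Oracle index optimization article in the vector database, then perform retrieval-augmented generation (RAG) by passing a QuestionAnswerAdvisor instance to ChatClient: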

    // Assumes chatModel and elasticsearchVectorStore are fields injected into this controller, as in the earlier example.
    @RequestMapping(value = "/chat/rag", produces = "text/event-stream;charset=UTF-8")
    public Flux<String> chatStreamWithDatabase(@RequestParam String prompt) {
        // 1. Define the prompt template; {question_answer_context} is replaced with the documents retrieved from the vector database.
        String promptWithContext = """
                Context information is below.
                ---------------------
                {question_answer_context}
                ---------------------
                Given the context and the provided history information, and not prior knowledge, reply to the user's comment. If the answer is not in the context, tell the user that you cannot answer the question.
                """;
        return ChatClient.create(chatModel).prompt()
                .user(prompt)
                // 2. At runtime, QuestionAnswerAdvisor fills the {question_answer_context} placeholder with the retrieved documents.
                //    The query is now the user's question plus the filled-in template.
                .advisors(new QuestionAnswerAdvisor(elasticsearchVectorStore, SearchRequest.builder().build(), promptWithContext))
                .stream()
                // 3. Send the query to the model and stream the answer text back as it arrives.
                //    With the text/event-stream content type, each emitted String goes out as one SSE data event.
                .content();
    }
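
The placeholder substitution described in step 2 can be sketched in plain Java as follows. This is an illustration of the idea only, not Spring AI's actual implementation: the class `ContextTemplateSketch` and the method `augment` are hypothetical, and the exact way the advisor joins documents and combines the template with the user text may differ between Spring AI versions.

```java
import java.util.List;

class ContextTemplateSketch {

    // Placeholder name taken from the prompt template in the example above.
    static final String PLACEHOLDER = "{question_answer_context}";

    // Join the retrieved document texts, substitute them into the template,
    // and append the filled-in template to the user's question.
    static String augment(String userQuestion, String template, List<String> retrievedDocs) {
        String context = String.join("\n", retrievedDocs);
        return userQuestion + "\n" + template.replace(PLACEHOLDER, context);
    }

    public static void main(String[] args) {
        String template = "Context information is below.\n"
                + "---------------------\n"
                + PLACEHOLDER + "\n"
                + "---------------------\n"
                + "If the answer is not in the context, say that you cannot answer.";
        // Made-up document snippets standing in for vector store results.
        List<String> docs = List.of("Snippet A from the knowledge base.", "Snippet B from the knowledge base.");
        System.out.println(augment("How should I tune this index?", template, docs));
    }
}
```

Seeing the final text laid out this way makes it easier to debug a RAG endpoint: if the answer is wrong, first check whether the relevant snippet ever reached the filled-in template.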