Feature/issue 25/create chat with repo compoent #38

innovation64 · 2024-01-13T08:51:49Z

refactor: refactor the code
style: change the gradio UI
fix: remove useless files
fix: add config.yml

Umpire2018 · 2024-01-13T09:04:09Z

repo_agent/chat_with_repo/rag.py

+        self.logger.debug(f"Results: {results}")
+        chunkrecall = self.extract_and_format_documents(result)
+        retrieved_documents = results['documents'][0]
+        response = self.rag(prompt,retrieved_documents)


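Also, because the documents are not chunked, the content of each indexed block is very long, so it is not practical to send several results to the model together: that would exceed 4096 tokens. Recalling about 3 results already adds up to roughly 24,900 tokens, so I put all the relevant ones into the embedding recall output and did not pass them through the LLM for the final answer.

So the current approach is to give the LLM only the Top 1 RAG result, i.e. the content of the md document?

Yes, that is how it works for now.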
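The token-budget constraint discussed above can be illustrated with a minimal sketch. It is not the code in this PR (which passes only the Top 1 result to the model); the helper names `approx_token_count` and `select_within_budget` are hypothetical, and the "four characters per token" estimate is a crude assumption that a real tokenizer would replace.

```python
from typing import List


def approx_token_count(text: str) -> int:
    """Very rough estimate (assumption): about one token per four characters."""
    return max(1, len(text) // 4)


def select_within_budget(documents: List[str], max_tokens: int = 4096) -> List[str]:
    """Keep ranked documents, in order, until the next one would exceed the budget.

    If even the top-ranked document is too long, it is truncated so that
    at least one result is always passed on.
    """
    selected: List[str] = []
    used = 0
    for doc in documents:
        cost = approx_token_count(doc)
        if used + cost > max_tokens:
            break
        selected.append(doc)
        used += cost
    if not selected and documents:
        # Truncate the best match to roughly max_tokens under the same estimate.
        selected.append(documents[0][: max_tokens * 4])
    return selected
```

With unchunked documents of several thousand tokens each, a selection like this will usually return a single document, which matches the Top 1 behaviour described in the thread.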

Umpire2018 · 2024-01-13T09:14:47Z

repo_agent/chat_with_repo/main.py

+    assistant = RepoAssistant(api_key, api_base, db_path,log_file)
+    md_contents = assistant.json_data.extract_md_contents()
+    assistant.chroma_data.create_vector_store(md_contents)
+    GradioInterface(assistant.respond)


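I have reservations about this way of writing it. As I see it, the purpose of the current main is to express the call flow clearly, but this line wraps up the following steps: 1. instantiating the class, 2. passing the name of a RepoAssistant method, 3. handing the assistant.respond method to the GradioInterface class to execute. That is at odds with the goal of clarity.

The four steps are: 1. pass in the relevant configuration, 2. get the md contents, 3. put them into the vector database, 4. start the query UI. The main RAG logic is already encapsulated, and for the things we said last time could be made singletons, I have basically given each one its own class.

My comment is specifically about the line GradioInterface(assistant.respond), which feels too heavy, but that is only my personal opinion. Git shows the preceding four lines by default.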
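One way to make that last step more explicit is to separate wiring the callback from starting the UI. The sketch below is only an illustration: `ChatInterface` is a hypothetical stand-in for the PR's `GradioInterface`, `respond` is a placeholder for `assistant.respond`, and it assumes the original constructor both builds and launches the UI, which the excerpt does not confirm.

```python
from typing import Callable


class ChatInterface:
    """Hypothetical stand-in for GradioInterface (not the PR's actual class)."""

    def __init__(self, respond_fn: Callable[[str], str]) -> None:
        # Only store the callback; constructing the object has no side effects.
        self.respond_fn = respond_fn
        self.launched = False

    def launch(self) -> None:
        # A real implementation would build and start the Gradio UI here.
        self.launched = True


def respond(message: str) -> str:
    """Placeholder for assistant.respond."""
    return f"echo: {message}"


def main() -> ChatInterface:
    interface = ChatInterface(respond)  # wire the callback
    interface.launch()                  # start the UI as a visible, separate step
    return interface
```

Written this way, main still reads as a list of steps, and the point at which the UI starts is visible at the call site.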

Umpire2018 · 2024-01-13T09:16:47Z

repo_agent/chat_with_repo/rag.py

+        chroma_data = ChromaManager(api_key, api_base)
+        self.textanslys = textanslys
+        self.json_data = json_data
+        self.chroma_data = chroma_data


If it were me, I would write it like this: self.chroma_data = ChromaManager(api_key, api_base)

OK, I will simplify it.

Umpire2018 · 2024-01-15T15:57:29Z

repo_agent/chat_with_repo/logger.py

+        file_handler.setLevel(log_level)
+        formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
+        file_handler.setFormatter(formatter)
+        self.logger.addHandler(file_handler)


In the next version, let's use from loguru import logger; it is a library that simplifies logging configuration.

OK, I will switch to it in the next update.
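As a rough idea of what that change could look like, here is a minimal sketch that mirrors the handler setup in the diff above using loguru's `logger.add`. The function name `add_file_sink`, the default level, and the exact format string are assumptions for illustration, not code from the repository. loguru is imported inside the function only so that the module still loads where the package is not installed; in the project itself a top-level `from loguru import logger` would be the natural form.

```python
# Format roughly equivalent to the logging.Formatter string in the diff
# ('%(asctime)s - %(name)s - %(levelname)s - %(message)s'), in loguru syntax.
LOG_FORMAT = "{time:YYYY-MM-DD HH:mm:ss} - {name} - {level} - {message}"


def add_file_sink(log_file: str, level: str = "DEBUG") -> int:
    """Attach a file sink to loguru's global logger and return its handler id."""
    from loguru import logger  # third-party dependency: pip install loguru

    # One call replaces creating a FileHandler, a Formatter, and addHandler.
    return logger.add(log_file, level=level, format=LOG_FORMAT)
```

The returned id can later be passed to `logger.remove` to detach the sink.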

innovation64 added 2 commits January 13, 2024 12:34

fix: refactor chat_with_repo code

38cae67

style: change the space and file name

d36e677

Umpire2018 requested a review from LOGIC-10 January 13, 2024 08:57

Umpire2018 reviewed Jan 13, 2024

fix: fix the grammar mistake of space problem

84aa320

Umpire2018 merged commit c684dac into OpenBMB:chat_with_repo Jan 15, 2024

Umpire2018 reviewed Jan 15, 2024
