Merge branch 'main' into git-status-check
LOGIC-10 committed Feb 27, 2024
2 parents 569f9f5 + 0f7e4bf commit 07c25d4
Showing 25 changed files with 1,881 additions and 1,740 deletions.
8 changes: 5 additions & 3 deletions .gitignore
@@ -8,7 +8,9 @@ __pycache__/
*/.DS_Store
*.py[cod]
*$py.class

chroma_db/
.vscode
.project_doc_record
# C extensions
*.so

@@ -171,8 +173,8 @@ prompt_output/
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# IDE
.vscode
# VS Code
.vscode/

# RepoAgent
.project_doc_record
28 changes: 18 additions & 10 deletions README.md
@@ -18,10 +18,12 @@ Traditionally, creating and maintaining software documentation demanded signific

- **🤖 Automatically detects changes in Git repositories, tracking additions, deletions, and modifications of files.**
- **📝 Independently analyzes the code structure through AST, generating documents for individual objects.**
- **🔍 Accurately identifies inter-object invocation relationships, enriching the global perspective of document content.**
- **🔍 Accurately identifies bidirectional inter-object invocation relationships, enriching the global perspective of document content.**
- **📚 Seamlessly replaces Markdown content based on changes, maintaining consistency in documentation.**
- **🕙 Executes multi-threaded concurrent operations, enhancing the efficiency of document generation.**
- **👭 Offers a sustainable, automated documentation update method for team collaboration.**
- **😍 Displays code documentation attractively, with a per-project document book powered by Gitbook.**
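
The AST analysis and bidirectional invocation tracking listed above can be illustrated with Python's standard `ast` module. This is only a minimal sketch under stated assumptions: the helper names `list_objects`, `called_names`, and `invert_calls` and the sample source are hypothetical, it resolves only plain-name calls such as `foo()`, and it is not RepoAgent's actual implementation.

```python
import ast

# Hypothetical sample source used only for this illustration.
SOURCE = '''
class Greeter:
    def greet(self, name):
        return format_name(name)

def format_name(name):
    return name.title()
'''


def list_objects(source: str) -> list[tuple[str, str]]:
    """Return (kind, name) for every class and function found in the source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            found.append(("class", node.name))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found.append(("function", node.name))
    return found


def called_names(source: str) -> dict[str, set[str]]:
    """Map each function to the plain names it calls (caller -> callees)."""
    calls = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            callees = set()
            for inner in ast.walk(node):
                # Only direct calls like foo() are tracked; method calls
                # such as obj.foo() are skipped in this sketch.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
            calls[node.name] = callees
    return calls


def invert_calls(calls: dict[str, set[str]]) -> dict[str, set[str]]:
    """Build the reverse direction (callee -> callers)."""
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    return callers
```

Keeping both mappings lets a documentation generator describe, for each object, what it calls and what calls it.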


## 🚀 Getting Started

@@ -92,19 +94,20 @@ api_keys:
...
default_completion_kwargs:
model: gpt-4
model: gpt-4-1106
temperature: 0.2
request_timeout: 60
max_thread_count: int # Multithreading is supported to speed up the process
repo_path: /path/to/your/repo
project_hierarchy: .project_hierarchy # This is a folder, where we store the project hierarchy and metainfo. This can be shared with your team members.
Markdown_Docs_folder: Markdown_Docs # The folder in the root directory of your target repository to store the documentation.
ignore_list: ["ignore_file1.py", "ignore_file2.py", "ignore_directory"] # Ignore some py files or folders that you don't want to generate documentation for by giving relative paths in ignore_list.
whitelist_path: /path/of/whitelist_path_json # If you provide the whitelist JSON, only the given part will be processed. This is useful for very large projects such as Hugging Face Transformers.
whitelist_path: /path/of/whitelist_path_json # If you provide a whitelist JSON with the same structure as the Metainfo, RepoAgent will only process the given part. This is useful for very large projects such as Hugging Face Transformers.
language: en # Two-letter language codes (ISO 639-1 codes), e.g. `language: en` for English. Refer to Supported Language for more languages.
max_thread_count: 10 # Multithreading is supported to speed up the process
max_document_tokens: 1024 # The maximum number of tokens in a generated document
log_level: info
```
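
As a rough illustration of how the three settings added in this hunk (`max_thread_count`, `max_document_tokens`, `log_level`) might be checked after the YAML is loaded, here is a minimal sketch. The function `validate_settings`, the default values (copied from the example above), and the accepted log level names are assumptions for illustration, not RepoAgent's real loader.

```python
import logging

# Assumed defaults, copied from the example config above.
DEFAULTS = {"max_thread_count": 10, "max_document_tokens": 1024, "log_level": "info"}

# Assumed set of accepted level names, mapped to the standard logging constants.
LOG_LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
    "critical": logging.CRITICAL,
}


def validate_settings(settings: dict) -> dict:
    """Merge settings over the defaults and reject obviously invalid values."""
    merged = {**DEFAULTS, **settings}
    for key in ("max_thread_count", "max_document_tokens"):
        value = merged[key]
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{key} must be a positive integer, got {value!r}")
    level = str(merged["log_level"]).lower()
    if level not in LOG_LEVELS:
        raise ValueError(f"log_level must be one of {sorted(LOG_LEVELS)}, got {level!r}")
    # Store the numeric level so it can be passed straight to logging.
    merged["log_level"] = LOG_LEVELS[level]
    return merged
```

Failing early on a bad thread count or token limit is cheaper than discovering it halfway through a long generation run.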

### Run RepoAgent
@@ -162,13 +165,16 @@ After execution, RepoAgent will automatically modify the staged files in the tar

The generated documentation is stored in the specified folder in the root directory of the target repository and renders as shown below:
![Documentation](assets/images/Doc_example.png)
![Documentation](assets/images/8_documents.png)

We utilized the default model **gpt-3.5-turbo** to generate documentation for the [**XAgent**](https://github.com/OpenBMB/XAgent) project, which comprises approximately **270,000 lines** of code. You can view the results of this generation in the Markdown_Docs directory of the XAgent project on GitHub. For enhanced documentation quality, we suggest considering more advanced models like **gpt-4** or **gpt-4-1106-preview**.
We utilized the default model **gpt-3.5-turbo** to generate documentation for the [**XAgent**](https://github.com/OpenBMB/XAgent) project, which comprises approximately **270,000 lines** of code. You can view the results of this generation in the Markdown_Docs directory of the XAgent project on GitHub. For enhanced documentation quality, we suggest considering more advanced models like **gpt-4-1106** or **gpt-4-0125-preview**.

**In the end, you can flexibly adjust the output format, template, and other aspects of the document by customizing the prompt. We are excited about your exploration of a more scientific approach to Automated Technical Writing and your contributions to the community.**

### Using chat with repo
### Exploring chat with repo
We conceptualize **Chat With Repo** as a unified gateway for these downstream applications, acting as a connector that links RepoAgent to human users and other AI agents. Our future research will focus on adapting the interface to various downstream applications and customizing it to meet their unique characteristics and implementation requirements.

Here we demonstrate a preliminary prototype of one of our downstream tasks: automatic Q&A for issues and code explanation. You can start the server by running the following command.
```bash
python -m repo_agent.chat_with_repo
```
@@ -182,7 +188,9 @@ python -m repo_agent.chat_with_repo
- [x] Automatically generate better visualizations such as Gitbook
- [ ] Generate README.md automatically combining with the global documentation
- [ ] **Multi-programming-language support** Support more programming languages like Java, C or C++, etc.
- [ ] Local model support like Llama, chatGLM, Qianwen, GLM4, etc.
- [ ] Local model support like Llama, chatGLM, Qwen, GLM4, etc.
- [X] Automatically generate Gitbook for better visualization effects


## 🇺🇳 Supported Language

@@ -227,11 +235,11 @@ Set the target language with the two-letter language codes (ISO 639-1 codes), Cl

```bibtex
@misc{RepoAgent,
author = {Qinyu Luo, Yining Ye, Shihao Liang, Arno},
author = {Qinyu Luo, Yining Ye, Shihao Liang, Zhong Zhang, Arno, Yang Li},
title = {RepoAgent: A LLM-based Intelligent tool for repository understanding and documentation writing},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/LOGIC-10/RepoAgent}},
howpublished = {\url{https://github.com/OpenBMB/RepoAgent}},
}
```
31 changes: 23 additions & 8 deletions README_CN.md
@@ -12,16 +12,17 @@ RepoAgent is an open-source project powered by Large Language Models (LLMs) that aims to

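Traditionally, creating and maintaining software documentation demanded significant human effort and expertise, a challenge for small teams without dedicated personnel. The introduction of Large Language Models (LLMs) such as GPT has changed this, enabling AI to handle most of the documentation process. This shift lets human developers focus on verification and fine-tuning, greatly reducing the manual burden of documentation.

**🏆 Our goal is to create an intelligent documentation assistant that helps people read and understand repositories and generates documentation, ultimately helping people improve efficiency and save time.**
**🏆 Our goal is to create an intelligent documentation assistant that automatically generates and maintains documentation and helps people read and understand repositories, ultimately helping people improve efficiency and save time.**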

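# 🪭 Features

- **🤖 Automatically detects changes in Git repositories, tracking additions, deletions, and modifications of files.**
- **📝 Independently analyzes the code structure through AST, generating documents for individual objects.**
- **🔍 Accurately identifies inter-object invocation relationships, enriching the global perspective of document content.**
- **📝 Independently analyzes the code structure through deep recursion plus AST, generating documents for individual objects.**
- **🔍 Accurately identifies bidirectional inter-object invocation relationships, enriching the global perspective of document content.**
- **📚 Seamlessly replaces Markdown content based on changes, maintaining consistency in documentation.**
- **🕙 Executes multi-threaded concurrent operations, enhancing the efficiency of document generation.**
- **👭 Offers a sustainable, automated documentation update method for team collaboration.**
- **😍 Presents documentation as an attractive document book (Gitbook).**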
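
The multi-threaded generation mentioned in the list above can be sketched with the standard library's `ThreadPoolExecutor`, bounded by a `max_thread_count` value like the one in the config. This is an illustrative sketch only: `generate_doc` is a hypothetical placeholder for a real model request, and `generate_all` is not a RepoAgent function.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_doc(object_name: str) -> str:
    """Hypothetical stand-in for the model request that documents one object."""
    return f"Documentation for {object_name}"


def generate_all(object_names, max_thread_count: int = 10) -> dict[str, str]:
    """Document every object concurrently, using at most max_thread_count threads."""
    names = list(object_names)
    with ThreadPoolExecutor(max_workers=max_thread_count) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        return dict(zip(names, pool.map(generate_doc, names)))
```

Threads suit this workload because each task mostly waits on network I/O, so a small pool keeps several requests in flight without the overhead of separate processes.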

# 📦 Installation
First, make sure Python 3.9 or later is installed on your machine.
@@ -66,12 +67,15 @@ Markdown_Docs_folder: Markdown_Docs # In the root directory of the target repository, used to store docu
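ignore_list: ["ignore_file1.py", "ignore_file2.py", "ignore_directory"] # Ignore .py files or folders that you don't want documented by giving their relative paths in ignore_list

language: zh # Two-letter language code (ISO 639-1), e.g. `language: en` for English; see Supported Language for more languages
max_thread_count: 10 # Multi-threaded execution is supported to speed up documentation generation
max_document_tokens: 1024 # Maximum length allowed for each object's documentation (e.g. a class or function)
log_level: info # Log verbosity level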
```
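
To make the `ignore_list` behaviour more concrete, here is one way relative-path matching could work. It is a sketch under assumptions: the helper `is_ignored` is hypothetical, and it treats an entry as matching either the exact file or any path inside that directory. RepoAgent's actual matching rules may differ.

```python
from pathlib import PurePosixPath


def is_ignored(relative_path: str, ignore_list: list[str]) -> bool:
    """Return True if the path equals an ignore entry or lies inside an ignored folder."""
    path = PurePosixPath(relative_path)
    for entry in ignore_list:
        ignored = PurePosixPath(entry)
        # Comparing whole path components avoids treating "ignore_directory_2"
        # as a match for "ignore_directory".
        if path == ignored or ignored in path.parents:
            return True
    return False
```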
## Run RepoAgent
Enter the RepoAgent root directory and run the following command in the terminal:
```
python repo_agent/runner.py
```sh
python -m repo_agent
```
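If this is your first time generating documentation for the target repository, RepoAgent automatically creates a JSON file that maintains the global structure information, along with a folder named Markdown_Docs in the root directory of the target repository to store the documentation.
The paths of the global structure JSON file and the documentation folder can both be configured in `config.yml`.
@@ -117,11 +121,22 @@ The RepoAgent hook is triggered automatically on git commit and checks what you staged with git add in the previous step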

The generated documentation is stored in the specified folder in the root directory of the target repository and renders as shown below:
![Documentation](assets/images/Doc_example.png)
![Documentation](assets/images/8_documents.png)


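We used the default model **gpt-3.5-turbo** to generate documentation for [**XAgent**](https://github.com/OpenBMB/XAgent), a medium-to-large project of about **270,000 lines** of code. You can view the results in the Markdown_Docs directory of the XAgent project. For better documentation quality, we recommend more advanced models such as **gpt-4** or **gpt-4-1106-preview**.
We used the default model **gpt-3.5-turbo** to generate documentation for [**XAgent**](https://github.com/OpenBMB/XAgent), a medium-to-large project of about **270,000 lines** of code. You can view the results in the Markdown_Docs directory of the XAgent project. For better documentation quality, we recommend more advanced models such as **gpt-4-1106** or **gpt-4-0125-preview**.

**Finally, you can flexibly adjust the output format, template, and other aspects of the documentation by customizing the prompt. We welcome your exploration of more scientific prompts for automated technical writing and your contributions to the community.**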

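### Exploring chat with repo

We regard chatting with the repository as a unified entry point for all downstream applications, serving as the interface that connects RepoAgent with human users and other AI agents. Our future research will explore adapting this interface to various downstream applications and meeting their unique characteristics and real-world requirements.

Here we demonstrate a preliminary prototype of one of our downstream tasks: automatic Q&A for issues and code explanation. You can start the server by running the following command in the terminal.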
```bash
python -m repo_agent.chat_with_repo
```

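# ✅ Future Work

- [x] Identification and maintenance of the parent-child hierarchy between objects
@@ -174,12 +189,12 @@ The RepoAgent hook is triggered automatically on git commit and checks what you staged with git add in the previous step
# 📊 Citing Us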
```bibtex
@misc{RepoAgent,
author = {Qinyu Luo, Yining Ye, Shihao Liang, Arno},
author = {Qinyu Luo, Yining Ye, Shihao Liang, Zhong Zhang, Arno, Yang Li},
title = {RepoAgent: A LLM-based Intelligent tool for repository understanding and documentation writing},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/LOGIC-10/RepoAgent}},
howpublished = {\url{https://github.com/OpenBMB/RepoAgent}},
}
```

Binary file added assets/images/8_documents.png
Binary file modified assets/images/Doc_example.png
Binary file modified assets/images/RepoAgent.png
17 changes: 11 additions & 6 deletions config.yml.template
@@ -17,23 +17,27 @@ api_keys:
- api_key: sk-XXXX
base_url: https://example.com/v1/
model: gpt-4
gpt-4-1106:
- api_key: sk-XXXX
base_url: https://example.com/v1/
model: gpt-4-1106
gpt-4-32k:
- api_key: sk-XXXX
base_url: https://example.com/v1/
api_type: XXX
api_version: XXX
engine: gpt4-32
gpt-4-1106:
- api_key: sk-XXXX
base_url: https://example.com/v1/
model: gpt-4-1106
gpt-4-0125-preview:
- api_key: sk-XXXX
base_url: https://example.com/v1/
model: gpt-4-0125-preview

default_completion_kwargs:
model: gpt-3.5-turbo
temperature: 0.2
request_timeout: 60

max_thread_count: 5


repo_path: /path/to/your/local/repo
project_hierarchy: .project_doc_record # Please NOTE that this is a folder where you can store your project hierarchy and share it with your team members.
@@ -42,5 +46,6 @@ ignore_list: ["ignore_file1.py", "ignore_file2.py", "ignore_directory"] # option
whitelist_path: # If whitelist_path is not None, docs are generated only for the whitelist

language: zh

max_thread_count: 5
max_document_tokens: 1024 # The maximum number of tokens in a generated document
log_level: info
4 changes: 2 additions & 2 deletions display/Makefile
@@ -28,8 +28,8 @@ RESET := $(shell tput -Txterm sgr0)
# We need nodejs 10.x to run gitbook, this target will install nodejs 10.x
################################################################################
env_install:
chmod +x ./scripts/install-nodejs.sh
./scripts/install-nodejs.sh
chmod +x ./scripts/install_nodejs.sh
./scripts/install_nodejs.sh

## init nodejs 10.x env
init_env: env_install
7 changes: 4 additions & 3 deletions display/README_DISPLAY.md
@@ -30,11 +30,12 @@ Tasks:

```

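Note that `make init_env` is still being tested; you can install nodejs 10 yourself according to your system.
You can run `make init_env` directly to install nvm and nodejs 10, or install nodejs 10 yourself according to your system.
On Windows, open a command prompt with administrator privileges and then enter the commands.

Then run `make init` to initialize the gitbook runtime environment (`make init` only needs to be run once).

Once the environment is ready, you can run `make generate` as many times as needed; after changing the configuration or `book.json`, simply rerun `make generate` to redeploy.
Once the environment is ready, you can run `make serve` as many times as needed; after changing the configuration or `book.json`, simply rerun `make serve` to redeploy.

On success, the command line output looks like this: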

@@ -52,7 +53,7 @@ Serving book on http://localhost:4000

## Future TODO List:

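[ ] One-click automatic environment setup
[] One-click automatic environment setup

[ ] (If setting up the environment locally proves difficult) one-click Docker deployment of gitbook and upload
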
5 changes: 5 additions & 0 deletions display/scripts/install_nodejs.sh
100644 → 100755
@@ -1,5 +1,10 @@
#!/bin/bash

export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh" # This loads nvm
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion" # This loads nvm bash_completion


# Check whether nvm is already installed
check_nvm_installed() {
if [ -s "$NVM_DIR/nvm.sh" ]; then