diff --git a/README.md b/README.md
index 814277d5..7e3c7e54 100644
--- a/README.md
+++ b/README.md
@@ -7,13 +7,17 @@ ChatGLM-6B 使用了和 ChatGPT 相似的技术，针对中文问答和对话进
 
 不过，由于 ChatGLM-6B 的规模较小，目前已知其具有相当多的[**局限性**](#局限性)，如事实性/数学逻辑错误，可能生成有害/有偏见内容，较弱的上下文能力，自我认知混乱，以及对英文指示生成与中文指示完全矛盾的内容。请大家在使用前了解这些问题，以免产生误解。更大的基于1300亿参数[GLM-130B](https://github.com/THUDM/GLM-130B)的ChatGLM正在内测开发中。
 
+欢迎体验 Hugging Face Spaces 上的[在线演示](https://huggingface.co/spaces/ysharma/ChatGLM-6b_Gradio_Streaming)。
+
 *Read this in [English](README_en.md).*
 
 ## 更新信息
+
 **[2023/03/23]** 增加API部署（感谢 [@LemonQu-GIT](https://github.com/LemonQu-GIT)）。增加Embedding量化模型[ChatGLM-6B-INT4-QE](https://huggingface.co/THUDM/chatglm-6b-int4-qe)。增加对基于Apple Silicon的Mac上GPU加速的支持。
 
 **[2023/03/19]** 增加流式输出接口 `stream_chat`，已更新到网页版和命令行 Demo。修复输出中的中文标点。增加量化后的模型 [ChatGLM-6B-INT4](https://huggingface.co/THUDM/chatglm-6b-int4)
 
+
 ## 友情链接
 以下是部分基于本仓库开发的开源项目：
 * [ChatGLM-MNN](https://github.com/wangzhaode/ChatGLM-MNN): 一个基于 MNN 的 ChatGLM-6B C++ 推理实现，支持根据显存大小自动分配计算任务给 GPU 和 CPU
@@ -28,17 +32,17 @@ ChatGLM-6B 使用了和 ChatGPT 相似的技术，针对中文问答和对话进
 
 ### 硬件需求
 
-| **量化等级**    | **最低 GPU 显存** |
-| -------------- | ----------------- |
-| FP16（无量化）   | 13 GB             |
-| INT8           | 10 GB              |
-| INT4           | 6 GB               |
+| **量化等级** | **最低 GPU 显存** |
+| ------------------ | ----------------------- |
+| FP16（无量化）     | 13 GB                   |
+| INT8               | 10 GB                   |
+| INT4               | 6 GB                    |
 
 ### 环境安装
 
 使用 pip 安装依赖：`pip install -r requirements.txt`，其中 `transformers` 库版本推荐为 `4.26.1`，但理论上不低于 `4.23.1` 即可。
 
-### 代码调用 
+### 代码调用
 
 可以通过如下代码调用 ChatGLM-6B 模型来生成对话：
 
@@ -63,6 +67,7 @@ ChatGLM-6B 使用了和 ChatGPT 相似的技术，针对中文问答和对话进
 
 如果这些方法无法帮助你入睡,你可以考虑咨询医生或睡眠专家,寻求进一步的建议。
 ```
+
 完整的模型实现可以在 [Hugging Face Hub](https://huggingface.co/THUDM/chatglm-6b) 上查看。如果你从 Hugging Face Hub 上下载checkpoint的速度较慢，也可以从[这里](https://cloud.tsinghua.edu.cn/d/fb9f16d6dc8f482596c2/)手动下载。
 
 ### Demo
@@ -78,7 +83,7 @@ cd ChatGLM-6B
 
 ![web-demo](resources/web-demo.gif)
 
-首先安装 Gradio：`pip install gradio`，然后运行仓库中的 [web_demo.py](web_demo.py)： 
+首先安装 Gradio：`pip install gradio`，然后运行仓库中的 [web_demo.py](web_demo.py)：
 
 ```shell
 python web_demo.py
@@ -88,6 +93,17 @@ python web_demo.py
 
 感谢 [@AdamBear](https://github.com/AdamBear) 实现了基于 Streamlit 的网页版 Demo，运行方式见[#117](https://github.com/THUDM/ChatGLM-6B/pull/117).
 
+
+
+
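+#### 网页版 Demo (Chat with OpenAI Wikipedia pages)
+
+基于 ChatGLM-6B，结合 LangChain 和 FAISS vectorstore 实现的知识库对话 Demo，代码位于 [example_with_langchain_and_vectorstore](example_with_langchain_and_vectorstore) 目录。
+
+![vectorstore-chat](image/README/1679635888842.png)
+
+该 Demo 基于 Streamlit，在上述目录下通过 `streamlit run webapp_with_vectorstore.py` 启动；语义检索部分使用 OpenAI Embeddings，需要先设置 `OPENAI_API_KEY`，回答仍完全由 ChatGLM 生成。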
+
 #### 命令行 Demo
 
 ![cli-demo](resources/cli-demo.png)
@@ -98,20 +114,26 @@ python web_demo.py
 python cli_demo.py
 ```
 
-程序会在命令行中进行交互式的对话，在命令行中输入指示并回车即可生成回复，输入`clear`可以清空对话历史，输入`stop`终止程序。
+程序会在命令行中进行交互式的对话，在命令行中输入指示并回车即可生成回复，输入 `clear` 可以清空对话历史，输入 `stop` 终止程序。
 
 ### API部署
-首先需要安装额外的依赖`pip install fastapi uvicorn`，然后运行仓库中的[api.py](api.py)：
+
+首先需要安装额外的依赖 `pip install fastapi uvicorn`，然后运行仓库中的[api.py](api.py)：
+
 ```shell
 python api.py
 ```
+
 默认部署在本地的8000端口，通过POST方法进行调用
+
 ```shell
 curl -X POST "http://127.0.0.1:8000" \
      -H 'Content-Type: application/json' \
      -d '{"prompt": "你好", "history": []}'
 ```
+
 得到的返回值为
+
 ```shell
 {
   "response":"你好👋！我是人工智能助手 ChatGLM-6B，很高兴见到你，欢迎问我任何问题。",
@@ -122,7 +144,9 @@ curl -X POST "http://127.0.0.1:8000" \
 ```
 
 ## 低成本部署
+
 ### 模型量化
+
 默认情况下，模型以 FP16 精度加载，运行上述代码需要大概 13GB 显存。如果你的 GPU 显存有限，可以尝试以量化方式加载模型，使用方法如下：
 
 ```python
@@ -135,24 +159,27 @@ model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).ha
 模型量化会带来一定的性能损失，经过测试，ChatGLM-6B 在 4-bit 量化下仍然能够进行自然流畅的生成。使用 [GPT-Q](https://arxiv.org/abs/2210.17323) 等量化方案可以进一步压缩量化精度/提升相同量化精度下的模型性能，欢迎大家提出对应的 Pull Request。
 
 **[2023/03/19]** 量化过程需要在内存中首先加载 FP16 格式的模型，消耗大概 13GB 的内存。如果你的内存不足的话，可以直接加载量化后的模型，仅需大概 5.2GB 的内存：
+
 ```python
 model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True).half().cuda()
 ```
 
 **[2023/03/24]** 我们进一步提供了对Embedding量化后的模型，模型参数仅占用4.3 GB显存：
+
 ```python
 model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True).half().cuda()
 ```
 
-
-
 ### CPU 部署
+
 如果你没有 GPU 硬件的话，也可以在 CPU 上进行推理，但是推理速度会更慢。使用方法如下（需要大概 32GB 内存）
+
 ```python
 model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).float()
 ```
 
 **[2023/03/19]** 如果你的内存不足，可以直接加载量化后的模型：
+
 ```python
 model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4",trust_remote_code=True).float()
 ```
@@ -160,14 +187,19 @@ model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4",trust_remote_code=True
 如果遇到了报错 `Could not find module 'nvcuda.dll'` 或者 `RuntimeError: Unknown platform: darwin` (MacOS) 的话请参考这个[Issue](https://github.com/THUDM/ChatGLM-6B/issues/6#issuecomment-1470060041).
 
 ### Mac 上的 GPU 加速
+
 对于搭载了Apple Silicon的Mac（以及MacBook），可以使用 MPS 后端来在 GPU 上运行 ChatGLM-6B。首先需要参考 Apple 的 [官方说明](https://developer.apple.com/metal/pytorch) 安装 PyTorch-Nightly。然后将模型仓库 clone 到本地
+
 ```shell
 git clone https://huggingface.co/THUDM/chatglm-6b
 ```
+
 将代码中的模型加载改为从本地加载，并使用 mps 后端
+
 ```python
 model = AutoModel.from_pretrained("your local path", trust_remote_code=True).half().to('mps')
 ```
+
 即可使用在 Mac 上使用 GPU 加速模型推理。
 
 ## ChatGLM-6B 示例
@@ -231,28 +263,27 @@ model = AutoModel.from_pretrained("your local path", trust_remote_code=True).hal
 由于 ChatGLM-6B 的小规模，其能力仍然有许多局限性。以下是我们目前发现的一些问题：
 
 - 模型容量较小：6B 的小容量，决定了其相对较弱的模型记忆和语言能力。在面对许多事实性知识任务时，ChatGLM-6B 可能会生成不正确的信息；它也不擅长逻辑类问题（如数学、编程）的解答。
-    <details><summary><b>点击查看例子</b></summary>
-    
-    ![](limitations/factual_error.png)
-    
-    ![](limitations/math_error.png)
-    
-    </details>
-  
-- 产生有害说明或有偏见的内容：ChatGLM-6B 只是一个初步与人类意图对齐的语言模型，可能会生成有害、有偏见的内容。（内容可能具有冒犯性，此处不展示）
 
-- 英文能力不足：ChatGLM-6B 训练时使用的指示/回答大部分都是中文的，仅有极小一部分英文内容。因此，如果输入英文指示，回复的质量远不如中文，甚至与中文指示下的内容矛盾，并且出现中英夹杂的情况。
+  <details><summary><b>点击查看例子</b></summary>
+
+  ![](limitations/factual_error.png)
+
+  ![](limitations/math_error.png)
 
+  </details>
+- 产生有害说明或有偏见的内容：ChatGLM-6B 只是一个初步与人类意图对齐的语言模型，可能会生成有害、有偏见的内容。（内容可能具有冒犯性，此处不展示）
+- 英文能力不足：ChatGLM-6B 训练时使用的指示/回答大部分都是中文的，仅有极小一部分英文内容。因此，如果输入英文指示，回复的质量远不如中文，甚至与中文指示下的内容矛盾，并且出现中英夹杂的情况。
 - 易被误导，对话能力较弱：ChatGLM-6B 对话能力还比较弱，而且 “自我认知” 存在问题，并很容易被误导并产生错误的言论。例如当前版本的模型在被误导的情况下，会在自我认知上发生偏差。
-    <details><summary><b>点击查看例子</b></summary>
 
-    ![](limitations/self-confusion_google.jpg)
-    
-    ![](limitations/self-confusion_openai.jpg)
-    
-    ![](limitations/self-confusion_tencent.jpg)
-    
-    </details>
+  <details><summary><b>点击查看例子</b></summary>
+
+  ![](limitations/self-confusion_google.jpg)
+
+  ![](limitations/self-confusion_openai.jpg)
+
+  ![](limitations/self-confusion_tencent.jpg)
+
+  </details>
 
 ## 协议
 
@@ -272,6 +303,7 @@ model = AutoModel.from_pretrained("your local path", trust_remote_code=True).hal
   url={https://openreview.net/forum?id=-Aw0rrrPUF}
 }
 ```
+
 ```
 @inproceedings{du2022glm,
   title={GLM: General Language Model Pretraining with Autoregressive Blank Infilling},
diff --git a/example_with_langchain_and_vectorstore/chat_backend.py b/example_with_langchain_and_vectorstore/chat_backend.py
new file mode 100644
index 00000000..37d3ea2f
--- /dev/null
+++ b/example_with_langchain_and_vectorstore/chat_backend.py
@@ -0,0 +1,111 @@
+# Backend helpers for the Streamlit demo: plain ChatGLM chat plus a LangChain
+# retrieval chain over a saved FAISS vector store.
+import os
+
+import streamlit as st
+from langchain.embeddings.openai import OpenAIEmbeddings
+from langchain.vectorstores import Chroma
+from langchain.text_splitter import CharacterTextSplitter
+from langchain.chains import (
+    ChatVectorDBChain,
+    QAWithSourcesChain,
+    VectorDBQAWithSourcesChain,
+)
+from langchain.prompts.prompt import PromptTemplate
+
+from langchain.docstore.document import Document
+from langchain.vectorstores.faiss import FAISS
+from langchain.chat_models import ChatOpenAI
+from langchain.prompts.chat import (
+    ChatPromptTemplate,
+    SystemMessagePromptTemplate,
+    AIMessagePromptTemplate,
+    HumanMessagePromptTemplate,
+)
+from transformers import AutoTokenizer, AutoModel
+
+# Set up the OpenAI API key.
+# It is used only for the embedding-based semantic search in the LangChain vector store;
+# completions are still generated purely by the ChatGLM model.
+os.environ.setdefault("OPENAI_API_KEY", "")  # keep a key that is already exported
+
+
+@st.cache_resource()
+def get_chat_glm():
+    tokenizer = AutoTokenizer.from_pretrained(
+        "THUDM/chatglm-6b-int4", trust_remote_code=True
+    )
+    model = (
+        AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)
+        .half()
+        .cuda()
+    )
+    model = model.eval()
+    return model, tokenizer
+
+
+def chat_with_agent(user_input, temperature=0.2, max_tokens=800, chat_history=None):
+    model, tokenizer = get_chat_glm()
+    response, updated_history = model.chat(
+        tokenizer,
+        user_input,
+        history=chat_history,
+        temperature=temperature,
+        max_length=max_tokens,
+    )
+    return response, updated_history
+
+
+# LangChain-related features
+def init_wiki_agent(
+    index_dir,
+    max_token=800,
+    temperature=0.3,
+):
+
+    embeddings = OpenAIEmbeddings()
+    if index_dir:
+        vectorstore = FAISS.load_local(index_dir, embeddings=embeddings)
+    else:
+        raise ValueError("Need saved vector store location")
+    system_template = """使用以下文段, 简洁和专业的来回答用户的问题。
+如果无法从中得到答案，请说 "不知道" 或 "没有足够的相关信息". 不要试图编造答案。 答案请使用中文.
+----------------
+{context}
+----------------
+"""
+    messages = [
+        SystemMessagePromptTemplate.from_template(system_template),
+        HumanMessagePromptTemplate.from_template("{question}"),
+    ]
+    prompt = ChatPromptTemplate.from_messages(messages)
+    # qa = ChatVectorDBChain.from_llm(llm=ChatOpenAI(temperature=temperature, max_tokens=max_token),
+    #                                  vectorstore=vectorstore,
+    #                                  qa_prompt=prompt)
+
+    condense_prompt_template = """任务: 给一段对话和一个后续问题，将后续问题改写成一个独立的问题。(确保问题是完整的, 没有模糊的指代)
+聊天记录：
+{chat_history}
+###
+
+后续问题：{question}
+
+改写后的独立, 完整的问题："""
+    new_question_prompt = PromptTemplate.from_template(condense_prompt_template)
+    # Imported lazily because chatglm_llm loads the ChatGLM model at import time.
+    from chatglm_llm import ChatGLM_G
+
+    qa = ChatVectorDBChain.from_llm(
+        llm=ChatGLM_G(),
+        vectorstore=vectorstore,
+        qa_prompt=prompt,
+        condense_question_prompt=new_question_prompt,
+    )
+    qa.return_source_documents = True
+    qa.top_k_docs_for_context = 3
+    return qa
+
+
+def get_wiki_agent_answer(query, qa, chat_history=None):
+    result = qa({"question": query, "chat_history": chat_history or []})
+    return result
diff --git a/example_with_langchain_and_vectorstore/chat_style.css b/example_with_langchain_and_vectorstore/chat_style.css
new file mode 100644
index 00000000..109208ca
--- /dev/null
+++ b/example_with_langchain_and_vectorstore/chat_style.css
@@ -0,0 +1,146 @@
+.image-left {
+  display: inline-block;
+  vertical-align: middle;
+  margin-right: 1em;
+}
+.conversation-container {
+  padding: 1em;
+  border-radius: 10px;
+  /* margin-bottom: 1em; */
+  margin-top: 1em;
+}
+
+.conversation-container.user {
+  background-color: rgba(217, 217, 227, 0.4);
+}
+
+.conversation-container.bot {
+  /* background-color: rgba(247,247,248,0.3); */
+  margin-bottom: 0em;
+}
+.text-area-input {
+  height: 5em;
+  margin-bottom: 1em;
+  font-size: 1.2rem;
+}
+.conversation-scroll {
+  height: 70vh;
+  overflow-y: scroll;
+}
+
+[data-testid="stForm"] {
+  /* width: 55vw;
+max-width: 80wh;
+margin-left: -15vw; */
+  /* max-height: 70vh;
+ overflow-y: scroll; */
+  width: 70%;
+  margin-left: 15%;
+}
+
+[data-testid="stForm"] .stTextArea {
+  box-shadow: 0 5px 6px -4px #c7cdce;
+  width: 60% !important;
+  margin-left: 20% !important;
+}
+
+footer {
+  /* margin-left: -20vw; */
+}
+
+/* [data-testid="stMarkdownContainer"]:has(.bot){
+    margin-bottom: 0px;
+    margin-top: 0px;
+} */
+
+[data-testid="stForm"]
+  [data-testid="stVerticalBlock"]
+  [data-testid="stVerticalBlock"]:has(div.conversation-container) {
+  max-height: 55vh !important;
+  overflow-y: auto !important;
+  overflow-x: hidden;
+  /* font-size: 1.2rem; */
+  margin-right: 5px;
+}
+
+[data-testid="stForm"]
+  [data-testid="stVerticalBlock"]
+  [data-testid="stVerticalBlock"]
+  p {
+  /* font-size: 1.2rem; */
+}
+
+[data-testid="stForm"] .stImage {
+  /* width: 6rem !important;
+     */
+  width: 5rem !important;
+}
+
+[data-testid="stForm"] .img {
+  /* width: 6rem !important;
+     */
+  max-width: 85px;
+}
+
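+/*
+ * "Josefin Slab" is loaded from Google Fonts by a <link> tag emitted in
+ * webapp_with_vectorstore.py, so no @font-face rule is needed here (a rule
+ * without a src descriptor is ignored by browsers anyway).
+ */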
+header,
+h1,
+h2,
+h3 [class*="css"] {
+  font-family: "Josefin Slab", serif;
+  font-style: normal;
+  font-weight: 300;
+}
+
+h1 {
+  font-weight: bold;
+}
+
+body {
+  font-size: medium;
+}
+
+/* .hhh{
+    color: black;
+    background-color:#fff;
+}
+
+.show:hover .hhh{
+  color: white;
+} */
+
+[data-testid="stForm"] .stButton {
+  box-shadow: 0 5px 6px -4px #c7cdce;
+  /*   width: auto;
+  font-size: 20pt;
+  float: right;
+  box-shadow: rgba(44, 43, 43, 0.5) 0px 0px 0px 0.2rem; */
+  /* width: 60%; */
+}
+
+.stButton button {
+  width: 100%;
+}
+
+::-webkit-scrollbar:vertical {
+  width: 10px;
+}
+
+/* Track */
+::-webkit-scrollbar-track:vertical {
+  background: #f1f1f1;
+}
+
+/* Handle */
+::-webkit-scrollbar-thumb:vertical {
+  background: #888;
+}
+
+/* Handle on hover */
+::-webkit-scrollbar-thumb:vertical:hover {
+  background: #555;
+}
diff --git a/example_with_langchain_and_vectorstore/chatglm_llm.py b/example_with_langchain_and_vectorstore/chatglm_llm.py
new file mode 100644
index 00000000..ccba777d
--- /dev/null
+++ b/example_with_langchain_and_vectorstore/chatglm_llm.py
@@ -0,0 +1,45 @@
+"""ChatGLM_G wraps the ChatGLM model to fit the LangChain LLM interface. It may not be an optimal implementation."""
+from typing import List, Optional
+
+from langchain.llms.base import LLM
+from langchain.llms.utils import enforce_stop_tokens
+from transformers import AutoTokenizer, AutoModel
+
+
+class ChatGLM_G(LLM):
+
+    tokenizer = AutoTokenizer.from_pretrained(
+        "THUDM/chatglm-6b-int4", trust_remote_code=True
+    )
+    model = (
+        AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)
+        .half()
+        .cuda()
+    )
+    history = []
+
+    @property
+    def _llm_type(self) -> str:
+        return "ChatGLM_G"
+
+    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
+        response, updated_history = self.model.chat(
+            self.tokenizer, prompt, history=self.history, max_length=10000
+        )
+        print("history: ", self.history)
+        if stop is not None:
+            response = enforce_stop_tokens(response, stop)
+        self.history = updated_history
+        return response
+
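+    def __call__(self, prompt: str, stop: Optional[List[str]] = None) -> str:
+        """Return ChatGLM's reply to ``prompt``.
+
+        This overrides ``LLM.__call__`` and delegates to ``_call`` so that the
+        two entry points cannot drift apart: the history stored on the
+        instance is passed to the model and updated after every call, and the
+        reply is cut at the first ``stop`` sequence if any are given. Note
+        that overriding ``__call__`` bypasses whatever the LangChain base
+        class does around ``_call``.
+        """
+        return self._call(prompt, stop)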
diff --git a/example_with_langchain_and_vectorstore/index/how_to_avoid_climate_change_chinese_vectorstore/index.faiss b/example_with_langchain_and_vectorstore/index/how_to_avoid_climate_change_chinese_vectorstore/index.faiss
new file mode 100644
index 00000000..739d11db
Binary files /dev/null and b/example_with_langchain_and_vectorstore/index/how_to_avoid_climate_change_chinese_vectorstore/index.faiss differ
diff --git a/example_with_langchain_and_vectorstore/index/how_to_avoid_climate_change_chinese_vectorstore/index.pkl b/example_with_langchain_and_vectorstore/index/how_to_avoid_climate_change_chinese_vectorstore/index.pkl
new file mode 100644
index 00000000..777a52e6
Binary files /dev/null and b/example_with_langchain_and_vectorstore/index/how_to_avoid_climate_change_chinese_vectorstore/index.pkl differ
diff --git a/example_with_langchain_and_vectorstore/index/wiki_faiss_2023_03_06/index.faiss b/example_with_langchain_and_vectorstore/index/wiki_faiss_2023_03_06/index.faiss
new file mode 100644
index 00000000..6558696d
Binary files /dev/null and b/example_with_langchain_and_vectorstore/index/wiki_faiss_2023_03_06/index.faiss differ
diff --git a/example_with_langchain_and_vectorstore/index/wiki_faiss_2023_03_06/index.pkl b/example_with_langchain_and_vectorstore/index/wiki_faiss_2023_03_06/index.pkl
new file mode 100644
index 00000000..3191ae8e
Binary files /dev/null and b/example_with_langchain_and_vectorstore/index/wiki_faiss_2023_03_06/index.pkl differ
diff --git a/example_with_langchain_and_vectorstore/logo.png b/example_with_langchain_and_vectorstore/logo.png
new file mode 100644
index 00000000..63af50ac
Binary files /dev/null and b/example_with_langchain_and_vectorstore/logo.png differ
diff --git a/example_with_langchain_and_vectorstore/webapp_with_vectorstore.py b/example_with_langchain_and_vectorstore/webapp_with_vectorstore.py
new file mode 100644
index 00000000..71ebac45
--- /dev/null
+++ b/example_with_langchain_and_vectorstore/webapp_with_vectorstore.py
@@ -0,0 +1,371 @@
+import html
+import os
+import uuid
+
+import streamlit as st
+from PIL import Image
+
+# Local module with the ChatGLM chat and LangChain retrieval helpers.
+from chat_backend import chat_with_agent, init_wiki_agent, get_wiki_agent_answer
+
+path = os.path.dirname(__file__)
+
+
+icon_img = Image.open(os.path.join(path, "logo.png"))
+
+USER_NAME = "Me"
+AGENT_NAME = "Helpbot"
+
+
+st.set_page_config(
+    page_title="ChatGLM",
+    page_icon=icon_img,
+    layout="wide",
+    # initial_sidebar_state="collapsed",
+)
+
+st.write(
+    "<style>div.block-container{padding-top:1rem;}</style>", unsafe_allow_html=True
+)
+
+
+def local_css(file_name):
+    with open(file_name) as f:
+        st.markdown(f"<style>{f.read()}</style>", unsafe_allow_html=True)
+
+
+def remote_css(url):
+    st.markdown(f'<link href="{url}" rel="stylesheet">', unsafe_allow_html=True)
+
+
+def icon(icon_name):
+    st.markdown(f'<i class="material-icons">{icon_name}</i>', unsafe_allow_html=True)
+
+
+def javascript(source: str) -> None:
+    """loading javascript correctly"""
+    div_id = uuid.uuid4()
+
+    st.markdown(
+        f"""
+    <div style="display:none" id="{div_id}">
+        <iframe src="javascript: \
+            var script = document.createElement('script'); \
+            script.type = 'text/javascript'; \
+            script.text = {html.escape(repr(source))}; \
+            var div = window.parent.document.getElementById('{div_id}'); \
+            div.appendChild(script); \
+            div.parentElement.parentElement.parentElement.style.display = 'none'; \
+        "/>
+    </div>
+    """,
+        unsafe_allow_html=True,
+    )
+
+
+local_css(os.path.join(path, "chat_style.css"))
+
+
+st.markdown(
+    """
+<link href="https://fonts.googleapis.com/css2?family=Josefin+Slab&display=swap" rel="stylesheet">
+    """,
+    unsafe_allow_html=True,
+)
+
+# User Input and Send button
+user_profile_image = "https://img1.baidu.com/it/u=3150659458,3834452201&fm=253&fmt=auto&app=138&f=JPEG?w=369&h=378"
+
+chatgpt_profile_image = (
+    "https://cdn.dribbble.com/users/722835/screenshots/4082720/bot_icon.gif"
+)
+
+
+def display_chat_log(cur_container):
+    for cur_conversation in st.session_state["chat_log"]:
+        for msg in cur_conversation:
+            if msg["role"] == USER_NAME:
+                cur_container.markdown(
+                    "<div class='conversation-container user'><img src='{}' class='image-left' width='50'><br> {} </div>".format(
+                        user_profile_image, html.escape(msg["content"])
+                    ),
+                    unsafe_allow_html=True,
+                )
+            else:
+                cur_container.markdown(
+                    "<div class='conversation-container bot'><img src='{}' class='image-left' width='50'></div>".format(
+                        chatgpt_profile_image
+                    ),
+                    unsafe_allow_html=True,
+                )
+                cur_container.markdown(
+                    "&nbsp;&nbsp;&nbsp;" + msg["content"], unsafe_allow_html=True
+                )
+
+
+def dict_to_github_markdown(data, has_section=False):
+    wiki_logo = "https://upload.wikimedia.org/wikipedia/en/thumb/8/80/Wikipedia-logo-v2.svg/1200px-Wikipedia-logo-v2.svg.png"
+    slack_logo = (
+        "https://cdn.freebiesupply.com/logos/large/2x/slack-1-logo-png-transparent.png"
+    )
+    book_logo = "https://cdn-icons-png.flaticon.com/512/182/182956.png"
+    markdown = ""
+    for item in data:
+        if "url" in item:
+            title = item["title"]
+            url = item["url"]
+            if has_section:
+                section = item["section"]
+                title_text_and_section = f"{title} - {section}"
+            else:
+                title_text_and_section = title
+            if "wikipedia" in url:
+                logo = wiki_logo
+            elif "slack" in url:
+                logo = slack_logo
+            else:
+                logo = None
+            if len(title_text_and_section) > 50:
+                title_text_and_section = title_text_and_section[:50] + "..."
+            hyperlink = f"[{title_text_and_section}]({url})"
+            if logo:
+                markdown += f"&nbsp;&nbsp;  <img src='{logo}' width='20' height='20'> {hyperlink} "
+            else:
+                markdown += f"&nbsp;&nbsp;  {hyperlink}"
+        elif "chapter" in item or "page" in item:
+            # Book sources carry "chapter" and/or "page" metadata.
+            if "chapter" in item and "page" in item:
+                chapter = item["chapter"]
+                if "总排放量" in chapter:
+                    chapter = chapter.split("总排放量")[0]  # for better display
+                page = item["page"]
+                title_text_and_section = f"{chapter} (p. {page})"
+            elif "page" in item:
+                page = item["page"]
+                title_text_and_section = f"p. {page}"
+            elif "chapter" in item:
+                chapter = item["chapter"]
+                title_text_and_section = chapter
+            else:
+                continue
+            markdown += f"&nbsp;&nbsp;  <img src='{book_logo}' width='20' height='20' title='Book' alt='Book'> {title_text_and_section} "
+    return markdown
+
+
+if "bot_desc" not in st.session_state:
+    st.session_state["bot_desc"] = "General conversational Chatbot based on ChatGLM"
+
+
+def clean_agent():
+    st.session_state["chat_log"] = [[]]
+    st.session_state["messages"] = None
+    st.session_state["agent"] = None
+    st.session_state["agent_chat_history"] = []
+    if "agent_selected_str" in st.session_state:
+        cur_agent = st.session_state["agent_selected_str"]
+        # Set description
+        bot_description = {
+            "Chat": "General conversational Chatbot based on ChatGLM",
+            "AI Wikipedia Agent": "Chat with a knowledge base (OpenAI-related Wikipedia pages).",
+            "Climate Book Agent": "Chat with Book: How to Avoid a Climate Disaster",
+        }
+
+        st.session_state["bot_desc"] = bot_description[cur_agent]
+
+
+# Sidebar
+st.sidebar.subheader("Model Settings")
+agent_selected = st.sidebar.selectbox(
+    label="Agent",
+    options=["Chat", "AI Wikipedia Agent", "Climate Book Agent"],
+    index=0,
+    on_change=clean_agent,
+    key="agent_selected_str",
+    help="""Select the agent to chat with.\n\n
+Chat: general conversational chatbot based on ChatGLM.\n\n
+AI Wikipedia Agent: chat with a knowledge base (OpenAI-related Wikipedia pages).\n\n
+Climate Book Agent: chat with Bill Gates's book "How to Avoid a Climate Disaster".
+""",
+)
+max_token_selected = st.sidebar.slider(
+    label="Model Max Length",
+    min_value=50,
+    max_value=4500,
+    value=500,
+    step=50,
+    help="Passed to ChatGLM's chat() as max_length, i.e. the maximum total number of tokens (prompt plus reply). Only the Chat agent uses this value; the retrieval agents currently ignore it.",
+)
+tempature_selected = st.sidebar.number_input(
+    label="Model Temperature",
+    min_value=0.1,  # sampling in transformers requires a strictly positive temperature
+    max_value=1.0,
+    value=0.2,
+    step=0.1,
+    help="Controls randomness: Lowering results in less random completions. As the temperature approaches zero, the model will become deterministic and repetitive.",
+)
+
+# Dynamic conversation display
+if "chat_log" not in st.session_state:
+    st.session_state["chat_log"] = [[]]
+
+if "messages" not in st.session_state:
+    st.session_state["messages"] = None
+if "agent_chat_history" not in st.session_state:
+    st.session_state["agent_chat_history"] = []
+
+if "agent" not in st.session_state:
+    st.session_state["agent"] = None
+
+
+with st.form(key="user_question", clear_on_submit=True):
+
+    # Title and Image in same line
+    # Use user chatgpt profile image
+    c1, c2 = st.columns((9, 1))
+    c1.write("# ChatGLM")
+    c1.write(f"### {st.session_state['bot_desc']}")
+
+    help_bot_icon = (
+        f'<img src="{chatgpt_profile_image}" width="60" style="vertical-align:middle">'
+    )
+    app_log_image = Image.open(os.path.join(path, "logo.png"))
+    c2.image(app_log_image)
+
+    conversation_main_container = st.container()
+
+    user_input = st.text_area(
+        "", key="user_input", height=20, placeholder="Ask me anything!"
+    )
+    # set button on the right
+    _, c_clean_btn, c_btn, _ = st.columns([5.2, 1, 1.8, 2])
+    send_button = c_btn.form_submit_button(label="Send")
+    clean_button = c_clean_btn.form_submit_button(label="Clear")
+    if clean_button:
+        clean_agent()
+
+    conversation = []
+    if send_button:
+        if user_input:
+            with st.spinner("Thinking..."):
+                # Determine which agent to call:
+                if agent_selected == "Chat":
+
+                    output, cur_chat_history = chat_with_agent(
+                        user_input,
+                        temperature=tempature_selected,
+                        max_tokens=max_token_selected,
+                        chat_history=st.session_state["messages"],
+                    )
+
+                    # Update chat history
+                    st.session_state["messages"] = cur_chat_history
+                    # Update overall displayed conversations
+                    conversation.append({"role": USER_NAME, "content": user_input})
+                    conversation.append({"role": AGENT_NAME, "content": output})
+                elif agent_selected == "AI Wikipedia Agent":
+                    if (
+                        "agent" not in st.session_state
+                        or st.session_state.agent is None
+                    ):
+                        st.session_state.agent = init_wiki_agent(
+                            index_dir="index/wiki_faiss_2023_03_06",
+                            max_token=max_token_selected,
+                            temperature=tempature_selected,
+                        )
+                    output_dict = get_wiki_agent_answer(
+                        user_input,
+                        st.session_state.agent,
+                        chat_history=st.session_state["agent_chat_history"],
+                    )
+                    output = output_dict["answer"]
+
+                    output_sources = [
+                        c.metadata for c in list(output_dict["source_documents"])
+                    ]
+
+                    st.session_state["agent_chat_history"].append((user_input, output))
+
+                    conversation.append({"role": USER_NAME, "content": user_input})
+                    conversation.append(
+                        {
+                            "role": AGENT_NAME,
+                            "content": output
+                            + "\n\n&nbsp;&nbsp;&nbsp;**Sources:** "
+                            + dict_to_github_markdown(output_sources, has_section=True),
+                        }
+                    )
+                elif agent_selected == "Climate Book Agent":
+                    if (
+                        "agent" not in st.session_state
+                        or st.session_state.agent is None
+                    ):
+                        st.session_state.agent = init_wiki_agent(
+                            index_dir="index/how_to_avoid_climate_change_chinese_vectorstore",
+                            max_token=max_token_selected,
+                            temperature=tempature_selected,
+                        )
+                    output_dict = get_wiki_agent_answer(
+                        user_input,
+                        st.session_state.agent,
+                        chat_history=st.session_state["agent_chat_history"],
+                    )
+                    output = output_dict["answer"]
+
+                    output_sources = [
+                        c.metadata for c in list(output_dict["source_documents"])
+                    ]
+
+                    st.session_state["agent_chat_history"].append((user_input, output))
+
+                    conversation.append({"role": USER_NAME, "content": user_input})
+                    conversation.append(
+                        {
+                            "role": AGENT_NAME,
+                            "content": output
+                            + "\n\n&nbsp;&nbsp;&nbsp;**Sources:** "
+                            + dict_to_github_markdown(output_sources, has_section=True),
+                        }
+                    )
+
+                st.session_state["chat_log"].append(conversation)
+        col99, col1 = st.columns([999, 1])
+        with col99:
+            display_chat_log(conversation_main_container)
+        with col1:
+
+            # Scroll to bottom of conversation
+            scroll_to_element = """
+var element = document.getElementsByClassName('conversation-container')[
+document.getElementsByClassName('conversation-container').length - 1
+];
+if (element) { element.scrollIntoView({behavior: 'smooth', block: 'start'}); }
+            """
+            javascript(scroll_to_element)
+
+
+def footer():
+    style = """
+    <style>
+      #MainMenu {visibility: hidden;}
+      footer {visibility: hidden;}
+    </style>
+  """
+
+    myargs = [
+        "Made with ChatGLM models, check out the models on ",
+        '<img src="https://cdn-icons-png.flaticon.com/512/25/25231.png" width="18" height="18" style="margin: 0em">',
+        ' <a href="https://github.com/THUDM/ChatGLM-6B" target="_blank">official repo</a>',
+        "!",
+        "<br>",
+    ]
+
+    st.markdown(style, unsafe_allow_html=True)
+    st.markdown(
+        '<div style="left: 0; bottom: 0; margin: 0px 0px 0px 0px; width: 100%; text-align: center; height: 30px; opacity: 0.8;">'
+        + "".join(myargs)
+        + "</div>",
+        unsafe_allow_html=True,
+    )
+
+
+footer()
diff --git a/examples/vectorstore_chat.png b/examples/vectorstore_chat.png
new file mode 100644
index 00000000..b47cec95
Binary files /dev/null and b/examples/vectorstore_chat.png differ
diff --git a/image/README/1679635888842.png b/image/README/1679635888842.png
new file mode 100644
index 00000000..b47cec95
Binary files /dev/null and b/image/README/1679635888842.png differ