常见问题

以下是一些常见的问题和回答

Q: 我要提出问题，怎么办

A: 首先，你要观察一下你的问题是否有没有被解决，建议翻看以往的Issue和Discussion，如果有，先按照他们的方法来做。如果没有，按照以下步骤

这是一个bug还是一个讨论问题，如果是讨论问题，放在disscusion，如果是bug和feature，放在issue。
如果要提出feature，提交一份对应的PR会让开发者更重视你的问题，否则你的问题很有可能被直接关闭。

Q: ValueError: Found modules on cpu/disk. Using Exllama backend requires all the modules to be on GPU. You can deactivate exllama backend by setting disable_exllama=True in the quantization config object.

A: 这是Fschat依赖源码的问题，请查看以下解决方式，通过修改'Fschat'库中的对应内容。

https://github.com/lm-sys/FastChat/issues/2459

https://stackoverflow.com/questions/76983305/fine-tuning-thebloke-llama-2-13b-chat-gptq-model-with-hugging-face-transformers

Q: AttributeError: 'ChatGLMTokenizer' object has no attribute 'tokenizer'

A: 查看以下Issue

https://github.com/chatchat-space/Langchain-Chatchat/issues/1835

Q: 使用Qwen API key 报错 multiple wodgets with the same key＝“

A: 确保你的key是dashscope平台的key。并保证dashscope依赖满足我们的依赖版本。

Q：linux下向量化PDF文件时出错：`ImportError: 从文件 *.pdf 加载文档时出错：libGL.so.1: cannot open shared object file: No such file or directory`

A：这是系统缺少必要的动态库，可以手动安装：libgl1-mesa-glx 和 libglib2.0-0

Q: 各种Int4模型无法载入

A. 由于各种Int4模型与Fp16模型并不相似，且量化技术可能有所不同，无法载入可能是因为fschat不支持或者缺少对应的依赖，需要查看对应仓库的issue获得更多信息。开发组没有针对Int4模型进行优化。

Q1: 本项目支持哪些文件格式？

A1: 目前已测试支持 txt、docx、md、pdf、csv、html、json 等格式文件

更多文件格式请参考 langchain 文档。目前已知文档中若含有特殊字符，可能存在文件无法加载的问题。

Q2: 使用过程中 Python 包 `nltk`发生了 `Resource punkt not found.`报错，该如何解决？

A2: 方法一：https://github.com/nltk/nltk_data/raw/gh-pages/packages/tokenizers/punkt.zip 中的 packages/tokenizers 解压，放到 nltk_data/tokenizers 存储路径下。

nltk_data 存储路径可以通过 nltk.data.path 查询。

方法二：执行python代码

import nltk
nltk.download()

Q3: 使用过程中 Python 包 `nltk`发生了 `Resource averaged_perceptron_tagger not found.`报错，该如何解决？

A3:

方法一：将 https://github.com/nltk/nltk_data/blob/gh-pages/packages/taggers/averaged_perceptron_tagger.zip 下载，解压放到 nltk_data/taggers 存储路径下。

nltk_data 存储路径可以通过 nltk.data.path 查询。

方法二：执行python代码

import nltk
nltk.download()

Q4: 本项目可否在 colab 中运行？

A4: 可以尝试使用 chatglm-6b-int4 模型在 colab 中运行，需要注意的是，如需在 colab 中运行 Web UI，需将 webui.py中 demo.queue(concurrency_count=3).launch( server_name='0.0.0.0', share=False, inbrowser=False)中参数 share设置为 True。

Q5: 在 Anaconda 中使用 pip 安装包无效如何解决？

A5: 此问题是系统环境问题，详细见在Anaconda中使用pip安装包无效问题

Q6: 本项目中所需模型如何下载至本地？

A6: 本项目中使用的模型均为 huggingface.com 中可下载的开源模型，以默认选择的 chatglm-6b和 text2vec-large-chinese模型为例，下载模型可执行如下代码：

# 安装 git lfs
$ git lfs install

# 下载 LLM 模型
$ git clone https://huggingface.co/THUDM/chatglm-6b /your_path/chatglm-6b

# 下载 Embedding 模型
$ git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese /your_path/text2vec

# 模型需要更新时，可打开模型所在文件夹后拉取最新模型文件/代码
$ git pull

Q7: `huggingface.com`中模型下载速度较慢怎么办？

A7: 可使用本项目用到的模型权重文件百度网盘地址：

ernie-3.0-base-zh.zip 链接: https://pan.baidu.com/s/1CIvKnD3qzE-orFouA8qvNQ?pwd=4wih
ernie-3.0-nano-zh.zip 链接: https://pan.baidu.com/s/1Fh8fgzVdavf5P1omAJJ-Zw?pwd=q6s5
text2vec-large-chinese.zip 链接: https://pan.baidu.com/s/1sMyPzBIXdEzHygftEoyBuA?pwd=4xs7
chatglm-6b-int4-qe.zip 链接: https://pan.baidu.com/s/1DDKMOMHtNZccOOBGWIOYww?pwd=22ji
chatglm-6b-int4.zip 链接: https://pan.baidu.com/s/1pvZ6pMzovjhkA6uPcRLuJA?pwd=3gjd
chatglm-6b.zip 链接: https://pan.baidu.com/s/1B-MpsVVs1GHhteVBetaquw?pwd=djay

Q8: 老版本和新版本无法兼容怎么办？

A8: 保存老版本的配置文件，删除老版本代码并下载新版本代码后，根据新版本的配置文件格式进行修改。

在 0.2.6后，运行环境和配置文件发生重大变化，建议重新配置环境和配置文件，并重建知识库。

Q9: 显卡内存爆了，提示 "OutOfMemoryError: CUDA out of memory"

A9: VECTOR_SEARCH_TOP_K 和 HISTORY_LEN 的值调低，比如 VECTOR_SEARCH_TOP_K = 3 和 LLM_HISTORY_LEN = 2，这样由 query 和 context 拼接得到的 prompt 会变短，会减少内存的占用。或者使用量化模型减少显存占用。

Q10: 执行 `pip install -r requirements.txt` 过程中遇到 python 包，如 langchain 找不到对应版本的问题

A10: 更换 pypi 源后重新安装，如阿里源、清华源等，网络条件允许时建议直接使用 pypi.org 源，具体操作命令如下：

# 使用 pypi 源
$ pip install -r requirements.txt -i https://pypi.python.org/simple

或

# 使用阿里源
$ pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/

或

# 使用清华源
$ pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/

Q11: 启动 api.py 时 upload_file 接口抛出 `partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' (most likely due to a circular import)`

A11: 这是由于 charset_normalizer 模块版本过高导致的，需要降低低 charset_normalizer 的版本,测试在 charset_normalizer==2.1.0 上可用。

Q12: 调用api中的 `bing_search_chat` 接口时，报出 `Failed to establish a new connection: [Errno 110] Connection timed out`

A12: 这是因为服务器加了防火墙，需要联系管理员加白名单，如果公司的服务器的话，就别想了GG--!

Q13: 加载 chatglm-6b-int8 或 chatglm-6b-int4 抛出 `RuntimeError: Only Tensors of floating point andcomplex dtype can require gradients`

A13: 疑为 chatglm 的 quantization 的问题或 torch 版本差异问题，针对已经变为 Parameter 的 torch.zeros 矩阵也执行 Parameter 操作，从而抛出 RuntimeError: Only Tensors of floating point andcomplex dtype can require gradients。解决办法是在 chatglm 项目的原始文件中的 quantization.py 文件 374 行改为：

    try:
        self.weight =Parameter(self.weight.to(kwargs["device"]), requires_grad=False)
    except Exception as e:
        pass

如果上述方式不起作用，则在.cache/hugggingface/modules/目录下针对chatglm项目的原始文件中的quantization.py文件执行上述操作，若软链接不止一个，按照错误提示选择正确的路径。

注：虽然模型可以顺利加载但在cpu上仍存在推理失败的可能：即针对每个问题，模型一直输出gugugugu。

因此，最好不要试图用cpu加载量化模型，原因可能是目前python主流量化包的量化操作是在gpu上执行的,会天然地存在gap。

Q14: 修改配置中路径后，加载 text2vec-large-chinese 依然提示 `WARNING: No sentence-transformers model found with name text2vec-large-chinese. Creating a new one with MEAN pooling.`

A14: 尝试更换 embedding，如 text2vec-base-chinese，请在 configs/model_config.py 文件中，修改 text2vec-base参数为本地路径，绝对路径或者相对路径均可

Q16: 使用pg向量库建表报错

A15: 需要手动安装对应的vector扩展(连接pg执行 CREATE EXTENSION IF NOT EXISTS vector)

Q16: pymilvus 连接超时

A16.pymilvus版本需要匹配和milvus对应否则会超时参考pymilvus==2.1.3

Q17: 使用vllm推理加速框架时，已经下载了模型但出现HuggingFace通信问题

A17: 参照如下代码修改python环境下/site-packages/vllm/model_executor/weight_utils.py文件的prepare_hf_model_weights函数如下对应代码：

    if not is_local:
        # Use file lock to prevent multiple processes from
        # downloading the same model weights at the same time.
        model_path_temp = os.path.join(
            os.getenv("HOME"),
            ".cache/huggingface/hub",
            "models--" + model_name_or_path.replace("/", "--"),
            "snapshots/",
        )
        downloaded = False
        if os.path.exists(model_path_temp):
            temp_last_dir = os.listdir(model_path_temp)[-1]
            model_path_temp = os.path.join(model_path_temp, temp_last_dir)
            base_pattern = os.path.join(model_path_temp, "pytorch_model*.bin")
            files = glob.glob(base_pattern)
            if len(files) > 0:
                downloaded = True

        if downloaded:
           hf_folder = model_path_temp
        else:
            with get_lock(model_name_or_path, cache_dir):
                hf_folder = snapshot_download(model_name_or_path,
                                            allow_patterns=allow_patterns,
                                            cache_dir=cache_dir,
                                            tqdm_class=Disabledtqdm)
    else:
        hf_folder = model_name_or_path

Q18: `/xxx/base_model_worer.py` 报 `assert r.status_code == 200` 错误

A：这个错误是本地模型进程注册到 fastchat controller 失败了。一般有两种原因：1、开了系统全局代理，关闭即可。2、DEFAULT_BIND_HOST 设为'0.0.0.0'，改成'127.0.0.1' 或本机实际 IP 即可。或者更新到最新版本代码也可以解决。

Q19: 使用vllm后端加速，无返回且不报错。

A: fschat=0.2.33的vllm_worker脚本代码有bug, 如需使用，需源码修改fastchat.server.vllm_worker，将103行中sampling_params = SamplingParams的参数stop=list(stop)修改为stop= [i for i in stop if i!=""]

Q20: chatglm3-6b对话中出现"<|user|>"标签，且自问自答。

A20: 这个问题是由于fastchat==0.2.33并没有正确适配chatglm3的对话模板，如需消除该问题，需修改fschat源码：

第一步：修改fastchat/conversation.py，将约472行关于chatglm3的模板修改为：

# ChatGLM3 default template
register_conv_template(
    Conversation(
        name="chatglm3",
        system_template="<|system|>\n {system_message}",
        roles=("",""),
        sep_style=SeparatorStyle.CHATGLM3,
        stop_token_ids=[
            2,
        ],  # "<|user|>", "<|observation|>", "</s>"
    )
)

第二步:在fastchat/model/目录下增加chatglm3.py脚本，脚本见https://github.com/hzg0601/Fast-Chatchat/blob/main/fastchat/model/model_chatglm3.py

第三步：脚本头部import model_chatglm3,修改fastchat/model/model_adapter.py的get_generate_stream_function函数，将` is_chatglm = "chatglm"....elif is_xft: return generate_stream_xft"修改为：

    is_chatglm = "chatglm" in model_type and "chatglm3" not in model_type
    is_chatglm3 = "chatglm3" in model_type
    is_falcon = "rwforcausallm" in model_type
    is_codet5p = "codet5p" in model_type 
    is_peft = "peft" in model_type
    is_exllama = "exllama" in model_type
    is_xft = "xft" in model_type

    if is_chatglm:
        return generate_stream_chatglm
    elif is_chatglm3:
        return generate_stream_chatglm3
    elif is_falcon:
        return generate_stream_falcon
    elif is_codet5p:
        return generate_stream_codet5p
    elif is_exllama:
        return generate_stream_exllama
    elif is_xft:
        return generate_stream_xft

导航栏，一切从这里出发

常见问题

Q: 我要提出问题，怎么办

Q: ValueError: Found modules on cpu/disk. Using Exllama backend requires all the modules to be on GPU. You can deactivate exllama backend by setting disable_exllama=True in the quantization config object.

Q: AttributeError: 'ChatGLMTokenizer' object has no attribute 'tokenizer'

Q: 使用Qwen API key 报错 multiple wodgets with the same key＝“

Q：linux下向量化PDF文件时出错：ImportError: 从文件 *.pdf 加载文档时出错：libGL.so.1: cannot open shared object file: No such file or directory

Q: 各种Int4模型无法载入

Q1: 本项目支持哪些文件格式？

Q2: 使用过程中 Python 包 nltk发生了 Resource punkt not found.报错，该如何解决？

Q3: 使用过程中 Python 包 nltk发生了 Resource averaged_perceptron_tagger not found.报错，该如何解决？

Q4: 本项目可否在 colab 中运行？

Q5: 在 Anaconda 中使用 pip 安装包无效如何解决？

Q6: 本项目中所需模型如何下载至本地？

Q7: huggingface.com中模型下载速度较慢怎么办？

Q8: 老版本和新版本无法兼容怎么办？

Q9: 显卡内存爆了，提示 "OutOfMemoryError: CUDA out of memory"

Q10: 执行 pip install -r requirements.txt 过程中遇到 python 包，如 langchain 找不到对应版本的问题

Q11: 启动 api.py 时 upload_file 接口抛出 partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' (most likely due to a circular import)

Q12: 调用api中的 bing_search_chat 接口时，报出 Failed to establish a new connection: [Errno 110] Connection timed out

Q13: 加载 chatglm-6b-int8 或 chatglm-6b-int4 抛出 RuntimeError: Only Tensors of floating point andcomplex dtype can require gradients

Q14: 修改配置中路径后，加载 text2vec-large-chinese 依然提示 WARNING: No sentence-transformers model found with name text2vec-large-chinese. Creating a new one with MEAN pooling.