
assertion error #563

Closed
TomiLikesToCode opened this issue Sep 7, 2023 · 6 comments · Fixed by #649
Labels
bug Something isn't working

Comments

@TomiLikesToCode

Search before asking

  • I had searched in the issues and found no similar issues.

Operating system information

Windows

Python version information

3.10

DB-GPT version

main

Related scenes

  • Chat Data
  • Chat Excel
  • Chat DB
  • Chat Knowledge
  • Dashboard
  • Plugins

Installation Information

Device information

gpu count 1
cpu count 1

Models information

orca_mini_v3_7b.ggmlv3.q8_0

What happened

D:\AI\DB-GPT>python pilot/server/dbgpt_server.py
2023-09-07 20:10:00 | INFO | numexpr.utils | NumExpr defaulting to 8 threads.
2023-09-07 20:10:06 | INFO | sentence_transformers.SentenceTransformer | Load pretrained SentenceTransformer: d:\ai\db-gpt\models\bge-large-en
2023-09-07 20:10:09 | INFO | sentence_transformers.SentenceTransformer | Use pytorch device: cuda
Add file db, db_name: sqlite_default_sqlite, db_type: sqlite, db_path: data/default_sqlite.db
add db connect info error2!Constraint Error: Duplicate key "db_name: sqlite_default_sqlite" violates unique constraint. If this is an unexpected constraint violation please double check with the known index limitations section in our documentation (docs - sql - indexes).
d:\ai\db-gpt\pilot
Model Unified Deployment Mode!
2023-09-07 20:10:10 | INFO | model_worker | Worker params:
=========================== ModelWorkerParameters ===========================
model_name: orca_mini_v3_7b
model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
worker_type: None
worker_class: None
host: 0.0.0.0
port: 8000
limit_model_concurrency: 5
standalone: False
register: True
worker_register_host: None
controller_addr: None
send_heartbeat: True
heartbeat_interval: 20
======================================================================
2023-09-07 20:10:10 | INFO | model_worker | Worker params:
=========================== ModelWorkerParameters ===========================
model_name: orca_mini_v3_7b
model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
worker_type: None
worker_class: None
host: 0.0.0.0
port: 8000
limit_model_concurrency: 5
standalone: False
register: True
worker_register_host: None
controller_addr: None
send_heartbeat: True
heartbeat_interval: 20
======================================================================
2023-09-07 20:10:10 | INFO | model_worker | Not register current to controller, register: False, controller_addr: None
Found llm model adapter with model path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin, <pilot.model.adapter.LlamaCppAdapater object at 0x0000029C8D61B400>
2023-09-07 20:10:10 | INFO | LOGGER | Found llm model adapter with model path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin, <pilot.model.adapter.LlamaCppAdapater object at 0x0000029C8D61B400>
Get model chat adapter with model path d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin, <pilot.server.chat_adapter.LlamaCppChatAdapter object at 0x0000029C8D6689A0>
2023-09-07 20:10:10 | INFO | model_worker | Init empty instances list for orca_mini_v3_7b@llm
2023-09-07 20:10:10 | INFO | model_worker | [DefaultModelWorker] Parameters of device is None, use cuda
2023-09-07 20:10:10 | INFO | model_worker | Begin start all worker, apply_req: None
2023-09-07 20:10:10 | INFO | model_worker | Apply to all workers: [WorkerRunData(worker_key='orca_mini_v3_7b@llm', worker=<pilot.model.worker.default_worker.DefaultModelWorker object at 0x0000029C8D5C7CA0>, worker_params=ModelWorkerParameters(model_name='orca_mini_v3_7b', model_path='d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', worker_type='llm', worker_class=None, host='0.0.0.0', port=8000, limit_model_concurrency=5, standalone=False, register=False, worker_register_host=None, controller_addr=None, send_heartbeat=True, heartbeat_interval=20), model_params=LlamaCppModelParameters(model_name='orca_mini_v3_7b', model_path='d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', device='cuda', model_type='llama.cpp', prompt_template=None, max_context_size=4096, num_gpus=None, max_gpu_memory=None, cpu_offloading=False, load_8bit=True, load_4bit=False, quant_type='nf4', use_double_quant=True, compute_dtype=None, trust_remote_code=True, verbose=False, seed=-1, n_threads=None, n_batch=512, n_gpu_layers=1000000000, n_gqa=None, rms_norm_eps=5e-06, cache_capacity=None, prefer_cpu=False), stop_event=<asyncio.locks.Event object at 0x0000029C8D6690F0 [unset]>, semaphore=<asyncio.locks.Semaphore object at 0x0000029C8D669FF0 [unlocked, value:5]>, command_args=[], _heartbeat_future=None, _last_heartbeat=None)]
2023-09-07 20:10:10 | INFO | model_worker | Begin load model, model params:
=========================== LlamaCppModelParameters ===========================
model_name: orca_mini_v3_7b
model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
device: cuda
model_type: llama.cpp
prompt_template: None
max_context_size: 4096
num_gpus: None
max_gpu_memory: None
cpu_offloading: False
load_8bit: True
load_4bit: False
quant_type: nf4
use_double_quant: True
compute_dtype: None
trust_remote_code: True
verbose: False
seed: -1
n_threads: None
n_batch: 512
n_gpu_layers: 1000000000
n_gqa: None
rms_norm_eps: 5e-06
cache_capacity: None
prefer_cpu: False
======================================================================
model_params:
=========================== LlamaCppModelParameters ===========================
model_name: orca_mini_v3_7b
model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
device: cuda
model_type: llama.cpp
prompt_template: None
max_context_size: 4096
num_gpus: None
max_gpu_memory: None
cpu_offloading: False
load_8bit: True
load_4bit: False
quant_type: nf4
use_double_quant: True
compute_dtype: None
trust_remote_code: True
verbose: False
seed: -1
n_threads: None
n_batch: 512
n_gpu_layers: 1000000000
n_gqa: None
rms_norm_eps: 5e-06
cache_capacity: None
prefer_cpu: False
======================================================================
2023-09-07 20:10:10 | INFO | LOGGER | model_params:
=========================== LlamaCppModelParameters ===========================
model_name: orca_mini_v3_7b
model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
device: cuda
model_type: llama.cpp
prompt_template: None
max_context_size: 4096
num_gpus: None
max_gpu_memory: None
cpu_offloading: False
load_8bit: True
load_4bit: False
quant_type: nf4
use_double_quant: True
compute_dtype: None
trust_remote_code: True
verbose: False
seed: -1
n_threads: None
n_batch: 512
n_gpu_layers: 1000000000
n_gqa: None
rms_norm_eps: 5e-06
cache_capacity: None
prefer_cpu: False
======================================================================
[(0, 'name', '', 0, None, 0), (1, 'seq', '', 0, None, 0)]
[(0, 'order_id', 'INTEGER', 0, None, 1), (1, 'user_id', 'INTEGER', 0, None, 0), (2, 'product_id', 'INTEGER', 0, None, 0), (3, 'quantity', 'INTEGER', 0, None, 0), (4, 'order_date', 'DATE', 0, None, 0)]
[(0, 'product_id', 'INTEGER', 0, None, 1), (1, 'product_name', 'VARCHAR(100)', 0, None, 0), (2, 'product_price', 'REAL', 0, None, 0)]
[(0, 'student_id', 'INTEGER', 0, None, 1), (1, 'student_name', 'VARCHAR(100)', 0, None, 0), (2, 'major', 'VARCHAR(100)', 0, None, 0), (3, 'year_of_enrollment', 'INTEGER', 0, None, 0), (4, 'student_age', 'INTEGER', 0, None, 0)]
[(0, 'case_id', 'INTEGER', 0, None, 1), (1, 'scenario_name', 'VARCHAR(100)', 0, None, 0), (2, 'scenario_description', 'TEXT', 0, None, 0), (3, 'test_question', 'VARCHAR(500)', 0, None, 0), (4, 'expected_sql', 'TEXT', 0, None, 0), (5, 'correct_output', 'TEXT', 0, None, 0)]
[(0, 'user_id', 'INTEGER', 0, None, 1), (1, 'user_name', 'VARCHAR(100)', 0, None, 0), (2, 'user_email', 'VARCHAR(100)', 0, None, 0), (3, 'registration_date', 'DATE', 0, None, 0), (4, 'user_country', 'VARCHAR(100)', 0, None, 0)]
[(0, 'course_id', 'INTEGER', 0, None, 1), (1, 'course_name', 'VARCHAR(100)', 0, None, 0), (2, 'credit', 'REAL', 0, None, 0)]
[(0, 'student_id', 'INTEGER', 0, None, 1), (1, 'course_id', 'INTEGER', 0, None, 2), (2, 'score', 'INTEGER', 0, None, 0), (3, 'semester', 'VARCHAR(50)', 0, None, 0)]
2023-09-07 20:10:10 | INFO | sentence_transformers.SentenceTransformer | Load pretrained SentenceTransformer: d:\ai\db-gpt\models\bge-large-en
Llama.cpp use cpu
2023-09-07 20:10:10 | INFO | LOGGER | Llama.cpp use cpu
Llama.cpp use cpu
2023-09-07 20:10:10 | INFO | LOGGER | Llama.cpp use cpu
Cache capacity is 0 bytes
2023-09-07 20:10:10 | INFO | LOGGER | Cache capacity is 0 bytes
Load LLama model with params: {'model_path': 'd:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', 'n_ctx': 4096, 'seed': -1, 'n_threads': None, 'n_batch': 512, 'use_mmap': True, 'use_mlock': False, 'low_vram': False, 'n_gpu_layers': 1000000000, 'n_gqa': None, 'logits_all': True, 'rms_norm_eps': 5e-06}
2023-09-07 20:10:10 | INFO | LOGGER | Load LLama model with params: {'model_path': 'd:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', 'n_ctx': 4096, 'seed': -1, 'n_threads': None, 'n_batch': 512, 'use_mmap': True, 'use_mlock': False, 'low_vram': False, 'n_gpu_layers': 1000000000, 'n_gqa': None, 'logits_all': True, 'rms_norm_eps': 5e-06}
gguf_init_from_file: invalid magic number 4f44213c
error loading model: llama_model_loader: failed to load model from d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
  File "D:\AI\DB-GPT\pilot\server\dbgpt_server.py", line 115, in <module>
    initialize_worker_manager_in_client(
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 657, in initialize_worker_manager_in_client
    loop.run_until_complete(
  File "C:\Users\Tomi\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 646, in run_until_complete
    return future.result()
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 341, in _start_all_worker
    await self._apply_worker(apply_req, _start_worker)
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 313, in _apply_worker
    return await asyncio.gather(
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 324, in _start_worker
    worker_run_data.worker.start(
  File "d:\ai\db-gpt\pilot\model\worker\default_worker.py", line 78, in start
    self.model, self.tokenizer = self.ml.loader_with_params(model_params)
  File "d:\ai\db-gpt\pilot\model\loader.py", line 121, in loader_with_params
    return llamacpp_loader(llm_adapter, model_params)
  File "d:\ai\db-gpt\pilot\model\loader.py", line 347, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_path, model_params)
  File "d:\ai\db-gpt\pilot\model\llm\llama_cpp\llama_cpp.py", line 85, in from_pretrained
    result.model = Llama(**params)
  File "C:\Users\Tomi\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 323, in __init__
    assert self.model is not None
AssertionError
2023-09-07 20:10:14 | INFO | sentence_transformers.SentenceTransformer | Use pytorch device: cuda
2023-09-07 20:10:14 | INFO | chromadb | Running Chroma using direct local API.
2023-09-07 20:10:14 | WARNING | chromadb | Using embedded DuckDB with persistence: data will be stored in: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_profile.vectordb
2023-09-07 20:10:14 | INFO | clickhouse_connect.driver.ctypes | Successfully imported ClickHouse Connect C data optimizations
2023-09-07 20:10:14 | INFO | clickhouse_connect.json_impl | Using orjson library for writing JSON byte strings
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 0 embeddings
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 1 collections
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | collection with name langchain already exists, returning existing collection
2023-09-07 20:10:14 | INFO | db_summary | init db profile success...
2023-09-07 20:10:14 | INFO | chromadb | Running Chroma using direct local API.
2023-09-07 20:10:14 | WARNING | chromadb | Using embedded DuckDB with persistence: data will be stored in: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_summary.vectordb
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 0 embeddings
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 1 collections
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | collection with name langchain already exists, returning existing collection
2023-09-07 20:10:14 | INFO | db_summary | db summary embedding success
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | Persisting DB to disk, putting it in the save folder: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_summary.vectordb
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | Persisting DB to disk, putting it in the save folder: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_profile.vectordb
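
For context on why the failure surfaces as a bare AssertionError: `llama_cpp.Llama.__init__` asserts that the native model handle was created, so any unreadable model file shows up this way rather than as a descriptive exception. A minimal sketch of the failing call (not DB-GPT code; the model path is the one from the log above):

```python
# Minimal sketch, assuming llama-cpp-python is installed; not DB-GPT code.
from llama_cpp import Llama

MODEL_PATH = r"d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin"  # path from the log

try:
    llm = Llama(model_path=MODEL_PATH)
except AssertionError:
    # Llama.__init__ asserts the C-level model handle is non-NULL, so a
    # wrong-format or corrupt file produces a bare AssertionError. Re-raise
    # with a message that points at the real problem.
    raise RuntimeError(
        f"llama.cpp could not load {MODEL_PATH}; verify the file is a valid "
        "model in a format this llama.cpp build supports"
    ) from None
```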

What you expected to happen

To run the LLM model as normal.

How to reproduce

Follow the "Installation From Source" guide and download the model listed above.

Additional context

myenv.txt

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
TomiLikesToCode added the bug (Something isn't working) and Waiting for reply labels on Sep 7, 2023
TomiLikesToCode changed the title from "[Bug] [Module Name] Bug title" to "assertion error" on Sep 7, 2023
@fangyinc
Collaborator

fangyinc commented Sep 8, 2023

Hi @TomiLikesToCode, thanks for the report. llama.cpp support is indeed broken at the moment; I suspect it has something to do with this update. We are working on it.
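
For readers hitting the same error: the update referenced above is llama.cpp's migration to the GGUF file format. As a hedged sketch (the exact version boundary is an assumption; verify against the llama-cpp-python changelog), you can check whether the installed binding still accepts GGML files:

```python
# Hedged sketch: newer llama-cpp-python releases (roughly v0.1.79 and later;
# this boundary is an assumption, check the changelog) bundle a llama.cpp
# that only loads GGUF files, so a .ggmlv3 model such as
# orca_mini_v3_7b.ggmlv3.q8_0.bin can no longer be opened by them.
from importlib.metadata import version

raw = version("llama-cpp-python")  # raises PackageNotFoundError if not installed
nums = []
for part in raw.split(".")[:3]:
    digits = "".join(ch for ch in part if ch.isdigit())  # tolerate local suffixes
    nums.append(int(digits) if digits else 0)

if tuple(nums) >= (0, 1, 79):
    print("GGUF-only build: convert the model to .gguf or install an older wheel")
else:
    print("this build should still accept GGML v3 (.ggmlv3) files")
```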

@TomiLikesToCode
Author

So I can't use it at this time?


@fangyinc
Collaborator

fangyinc commented Sep 9, 2023


Yes. We will fix this problem in a future release; in the meantime, you can try one of the other installation methods. If you have any questions, please let me know.

@TomiLikesToCode
Author

TomiLikesToCode commented Sep 10, 2023


What do you mean by other install methods? Can I just download from the releases page and run that? I hate Docker and don't want to use it.

@fangyinc
Collaborator


The problem right now is that deploying models with llama.cpp is not supported. You can try other models instead, for example with this configuration:

LLM_MODEL=vicuna-13b-v1.5

More details can be found at Installation From Source.

Aries-ckt added a commit that referenced this issue Oct 7, 2023
Close #567 
Close #644
Close #563

**Other**
- Fix raise Exception when stop DB-GPT
@yumemio

yumemio commented Oct 31, 2023

I'm way too late to reply, but separately from the GGML/GGUF incompatibility issue...

gguf_init_from_file: invalid magic number 4f44213c

...the magic number implies that your model file is actually a webpage. Re-downloading the model file might help.
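
For anyone who wants to double-check this, here is a small sketch (using the model path from the log above) that reads the file header the same way gguf_init_from_file does, as a little-endian uint32:

```python
# Sketch: decode the "invalid magic number 4f44213c" reported in the log.
import struct

with open(r"d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin", "rb") as f:
    head = f.read(4)

(magic,) = struct.unpack("<I", head)  # first 4 bytes as little-endian uint32
print(f"first bytes: {head!r} -> magic {magic:#010x}")
# b"GGUF" -> 0x46554747  (a GGUF file)
# b"tjgg" -> 0x67676a74  (GGJT, i.e. a GGML v3 file)
# b"<!DO" -> 0x4f44213c  (the start of "<!DOCTYPE html>")
```

0x4f44213c is exactly the bytes b"<!DO" read as a little-endian uint32, so an HTML error page saved in place of the model reproduces the reported magic number precisely.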
