
assertion error #563

Closed
TomiLikesToCode opened this issue Sep 7, 2023 · 6 comments · Fixed by #649
Labels
bug Something isn't working

Comments

@TomiLikesToCode

Search before asking

  • I had searched in the issues and found no similar issues.

Operating system information

Windows

Python version information

3.10

DB-GPT version

main

Related scenes

  • Chat Data
  • Chat Excel
  • Chat DB
  • Chat Knowledge
  • Dashboard
  • Plugins

Installation Information

Device information

gpu count 1
cpu count 1

Models information

orca_mini_v3_7b.ggmlv3.q8_0

What happened

D:\AI\DB-GPT>python pilot/server/dbgpt_server.py
2023-09-07 20:10:00 | INFO | numexpr.utils | NumExpr defaulting to 8 threads.
2023-09-07 20:10:06 | INFO | sentence_transformers.SentenceTransformer | Load pretrained SentenceTransformer: d:\ai\db-gpt\models\bge-large-en
2023-09-07 20:10:09 | INFO | sentence_transformers.SentenceTransformer | Use pytorch device: cuda
Add file db, db_name: sqlite_default_sqlite, db_type: sqlite, db_path: data/default_sqlite.db
add db connect info error2!Constraint Error: Duplicate key "db_name: sqlite_default_sqlite" violates unique constraint. If this is an unexpected constraint violation please double check with the known index limitations section in our documentation (docs - sql - indexes).
d:\ai\db-gpt\pilot
Model Unified Deployment Mode!
2023-09-07 20:10:10 | INFO | model_worker | Worker params:
=========================== ModelWorkerParameters ===========================
model_name: orca_mini_v3_7b
model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
worker_type: None
worker_class: None
host: 0.0.0.0
port: 8000
limit_model_concurrency: 5
standalone: False
register: True
worker_register_host: None
controller_addr: None
send_heartbeat: True
heartbeat_interval: 20
======================================================================
2023-09-07 20:10:10 | INFO | model_worker | Worker params:
=========================== ModelWorkerParameters ===========================
model_name: orca_mini_v3_7b
model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
worker_type: None
worker_class: None
host: 0.0.0.0
port: 8000
limit_model_concurrency: 5
standalone: False
register: True
worker_register_host: None
controller_addr: None
send_heartbeat: True
heartbeat_interval: 20
======================================================================
2023-09-07 20:10:10 | INFO | model_worker | Not register current to controller, register: False, controller_addr: None
Found llm model adapter with model path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin, <pilot.model.adapter.LlamaCppAdapater object at 0x0000029C8D61B400>
2023-09-07 20:10:10 | INFO | LOGGER | Found llm model adapter with model path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin, <pilot.model.adapter.LlamaCppAdapater object at 0x0000029C8D61B400>
Get model chat adapter with model path d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin, <pilot.server.chat_adapter.LlamaCppChatAdapter object at 0x0000029C8D6689A0>
2023-09-07 20:10:10 | INFO | model_worker | Init empty instances list for orca_mini_v3_7b@llm
2023-09-07 20:10:10 | INFO | model_worker | [DefaultModelWorker] Parameters of device is None, use cuda
2023-09-07 20:10:10 | INFO | model_worker | Begin start all worker, apply_req: None
2023-09-07 20:10:10 | INFO | model_worker | Apply to all workers: [WorkerRunData(worker_key='orca_mini_v3_7b@llm', worker=<pilot.model.worker.default_worker.DefaultModelWorker object at 0x0000029C8D5C7CA0>, worker_params=ModelWorkerParameters(model_name='orca_mini_v3_7b', model_path='d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', worker_type='llm', worker_class=None, host='0.0.0.0', port=8000, limit_model_concurrency=5, standalone=False, register=False, worker_register_host=None, controller_addr=None, send_heartbeat=True, heartbeat_interval=20), model_params=LlamaCppModelParameters(model_name='orca_mini_v3_7b', model_path='d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', device='cuda', model_type='llama.cpp', prompt_template=None, max_context_size=4096, num_gpus=None, max_gpu_memory=None, cpu_offloading=False, load_8bit=True, load_4bit=False, quant_type='nf4', use_double_quant=True, compute_dtype=None, trust_remote_code=True, verbose=False, seed=-1, n_threads=None, n_batch=512, n_gpu_layers=1000000000, n_gqa=None, rms_norm_eps=5e-06, cache_capacity=None, prefer_cpu=False), stop_event=<asyncio.locks.Event object at 0x0000029C8D6690F0 [unset]>, semaphore=<asyncio.locks.Semaphore object at 0x0000029C8D669FF0 [unlocked, value:5]>, command_args=[], _heartbeat_future=None, _last_heartbeat=None)]
2023-09-07 20:10:10 | INFO | model_worker | Begin load model, model params:
=========================== LlamaCppModelParameters ===========================
model_name: orca_mini_v3_7b
model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
device: cuda
model_type: llama.cpp
prompt_template: None
max_context_size: 4096
num_gpus: None
max_gpu_memory: None
cpu_offloading: False
load_8bit: True
load_4bit: False
quant_type: nf4
use_double_quant: True
compute_dtype: None
trust_remote_code: True
verbose: False
seed: -1
n_threads: None
n_batch: 512
n_gpu_layers: 1000000000
n_gqa: None
rms_norm_eps: 5e-06
cache_capacity: None
prefer_cpu: False
======================================================================
model_params:
=========================== LlamaCppModelParameters ===========================
model_name: orca_mini_v3_7b
model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
device: cuda
model_type: llama.cpp
prompt_template: None
max_context_size: 4096
num_gpus: None
max_gpu_memory: None
cpu_offloading: False
load_8bit: True
load_4bit: False
quant_type: nf4
use_double_quant: True
compute_dtype: None
trust_remote_code: True
verbose: False
seed: -1
n_threads: None
n_batch: 512
n_gpu_layers: 1000000000
n_gqa: None
rms_norm_eps: 5e-06
cache_capacity: None
prefer_cpu: False
======================================================================
2023-09-07 20:10:10 | INFO | LOGGER | model_params:
=========================== LlamaCppModelParameters ===========================
model_name: orca_mini_v3_7b
model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
device: cuda
model_type: llama.cpp
prompt_template: None
max_context_size: 4096
num_gpus: None
max_gpu_memory: None
cpu_offloading: False
load_8bit: True
load_4bit: False
quant_type: nf4
use_double_quant: True
compute_dtype: None
trust_remote_code: True
verbose: False
seed: -1
n_threads: None
n_batch: 512
n_gpu_layers: 1000000000
n_gqa: None
rms_norm_eps: 5e-06
cache_capacity: None
prefer_cpu: False
======================================================================
[(0, 'name', '', 0, None, 0), (1, 'seq', '', 0, None, 0)]
[(0, 'order_id', 'INTEGER', 0, None, 1), (1, 'user_id', 'INTEGER', 0, None, 0), (2, 'product_id', 'INTEGER', 0, None, 0), (3, 'quantity', 'INTEGER', 0, None, 0), (4, 'order_date', 'DATE', 0, None, 0)]
[(0, 'product_id', 'INTEGER', 0, None, 1), (1, 'product_name', 'VARCHAR(100)', 0, None, 0), (2, 'product_price', 'REAL', 0, None, 0)]
[(0, 'student_id', 'INTEGER', 0, None, 1), (1, 'student_name', 'VARCHAR(100)', 0, None, 0), (2, 'major', 'VARCHAR(100)', 0, None, 0), (3, 'year_of_enrollment', 'INTEGER', 0, None, 0), (4, 'student_age', 'INTEGER', 0, None, 0)]
[(0, 'case_id', 'INTEGER', 0, None, 1), (1, 'scenario_name', 'VARCHAR(100)', 0, None, 0), (2, 'scenario_description', 'TEXT', 0, None, 0), (3, 'test_question', 'VARCHAR(500)', 0, None, 0), (4, 'expected_sql', 'TEXT', 0, None, 0), (5, 'correct_output', 'TEXT', 0, None, 0)]
[(0, 'user_id', 'INTEGER', 0, None, 1), (1, 'user_name', 'VARCHAR(100)', 0, None, 0), (2, 'user_email', 'VARCHAR(100)', 0, None, 0), (3, 'registration_date', 'DATE', 0, None, 0), (4, 'user_country', 'VARCHAR(100)', 0, None, 0)]
[(0, 'course_id', 'INTEGER', 0, None, 1), (1, 'course_name', 'VARCHAR(100)', 0, None, 0), (2, 'credit', 'REAL', 0, None, 0)]
[(0, 'student_id', 'INTEGER', 0, None, 1), (1, 'course_id', 'INTEGER', 0, None, 2), (2, 'score', 'INTEGER', 0, None, 0), (3, 'semester', 'VARCHAR(50)', 0, None, 0)]
2023-09-07 20:10:10 | INFO | sentence_transformers.SentenceTransformer | Load pretrained SentenceTransformer: d:\ai\db-gpt\models\bge-large-en
Llama.cpp use cpu
2023-09-07 20:10:10 | INFO | LOGGER | Llama.cpp use cpu
Llama.cpp use cpu
2023-09-07 20:10:10 | INFO | LOGGER | Llama.cpp use cpu
Cache capacity is 0 bytes
2023-09-07 20:10:10 | INFO | LOGGER | Cache capacity is 0 bytes
Load LLama model with params: {'model_path': 'd:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', 'n_ctx': 4096, 'seed': -1, 'n_threads': None, 'n_batch': 512, 'use_mmap': True, 'use_mlock': False, 'low_vram': False, 'n_gpu_layers': 1000000000, 'n_gqa': None, 'logits_all': True, 'rms_norm_eps': 5e-06}
2023-09-07 20:10:10 | INFO | LOGGER | Load LLama model with params: {'model_path': 'd:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', 'n_ctx': 4096, 'seed': -1, 'n_threads': None, 'n_batch': 512, 'use_mmap': True, 'use_mlock': False, 'low_vram': False, 'n_gpu_layers': 1000000000, 'n_gqa': None, 'logits_all': True, 'rms_norm_eps': 5e-06}
gguf_init_from_file: invalid magic number 4f44213c
error loading model: llama_model_loader: failed to load model from d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
  File "D:\AI\DB-GPT\pilot\server\dbgpt_server.py", line 115, in <module>
    initialize_worker_manager_in_client(
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 657, in initialize_worker_manager_in_client
    loop.run_until_complete(
  File "C:\Users\Tomi\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 646, in run_until_complete
    return future.result()
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 341, in _start_all_worker
    await self._apply_worker(apply_req, _start_worker)
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 313, in _apply_worker
    return await asyncio.gather(
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 324, in _start_worker
    worker_run_data.worker.start(
  File "d:\ai\db-gpt\pilot\model\worker\default_worker.py", line 78, in start
    self.model, self.tokenizer = self.ml.loader_with_params(model_params)
  File "d:\ai\db-gpt\pilot\model\loader.py", line 121, in loader_with_params
    return llamacpp_loader(llm_adapter, model_params)
  File "d:\ai\db-gpt\pilot\model\loader.py", line 347, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_path, model_params)
  File "d:\ai\db-gpt\pilot\model\llm\llama_cpp\llama_cpp.py", line 85, in from_pretrained
    result.model = Llama(**params)
  File "C:\Users\Tomi\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 323, in __init__
    assert self.model is not None
AssertionError
2023-09-07 20:10:14 | INFO | sentence_transformers.SentenceTransformer | Use pytorch device: cuda
2023-09-07 20:10:14 | INFO | chromadb | Running Chroma using direct local API.
2023-09-07 20:10:14 | WARNING | chromadb | Using embedded DuckDB with persistence: data will be stored in: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_profile.vectordb
2023-09-07 20:10:14 | INFO | clickhouse_connect.driver.ctypes | Successfully imported ClickHouse Connect C data optimizations
2023-09-07 20:10:14 | INFO | clickhouse_connect.json_impl | Using orjson library for writing JSON byte strings
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 0 embeddings
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 1 collections
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | collection with name langchain already exists, returning existing collection
2023-09-07 20:10:14 | INFO | db_summary | init db profile success...
2023-09-07 20:10:14 | INFO | chromadb | Running Chroma using direct local API.
2023-09-07 20:10:14 | WARNING | chromadb | Using embedded DuckDB with persistence: data will be stored in: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_summary.vectordb
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 0 embeddings
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 1 collections
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | collection with name langchain already exists, returning existing collection
2023-09-07 20:10:14 | INFO | db_summary | db summary embedding success
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | Persisting DB to disk, putting it in the save folder: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_summary.vectordb
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | Persisting DB to disk, putting it in the save folder: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_profile.vectordb
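
For context on why the failure surfaces as a bare AssertionError: `llama_cpp.Llama.__init__` asserts that the native model handle was created, so any unreadable model file shows up this way rather than as a descriptive exception. A minimal sketch of the failing call (not DB-GPT code; the model path is the one from the log above):

```python
# Minimal sketch, assuming llama-cpp-python is installed; not DB-GPT code.
from llama_cpp import Llama

MODEL_PATH = r"d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin"  # path from the log

try:
    llm = Llama(model_path=MODEL_PATH)
except AssertionError:
    # Llama.__init__ asserts the C-level model handle is non-NULL, so a
    # wrong-format or corrupt file produces a bare AssertionError. Re-raise
    # with a message that points at the real problem.
    raise RuntimeError(
        f"llama.cpp could not load {MODEL_PATH}; verify the file is a valid "
        "model in a format this llama.cpp build supports"
    ) from None
```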

What you expected to happen

To run the LLM model as normal.

How to reproduce

Follow the "Installation From Source" guide and download the model listed above.

Additional context

myenv.txt

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
TomiLikesToCode added the bug (Something isn't working) and Waiting for reply labels on Sep 7, 2023
TomiLikesToCode changed the title from "[Bug] [Module Name] Bug title" to "assertion error" on Sep 7, 2023
@fangyinc
Collaborator

fangyinc commented Sep 8, 2023

Hi @TomiLikesToCode, thanks for the report. llama.cpp support is indeed broken at the moment; I suspect it has something to do with this update. We are working on it.
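
For readers hitting the same error: the update referenced above is llama.cpp's migration to the GGUF file format. As a hedged sketch (the exact version boundary is an assumption; verify against the llama-cpp-python changelog), you can check whether the installed binding still accepts GGML files:

```python
# Hedged sketch: newer llama-cpp-python releases (roughly v0.1.79 and later;
# this boundary is an assumption, check the changelog) bundle a llama.cpp
# that only loads GGUF files, so a .ggmlv3 model such as
# orca_mini_v3_7b.ggmlv3.q8_0.bin can no longer be opened by them.
from importlib.metadata import version

raw = version("llama-cpp-python")  # raises PackageNotFoundError if not installed
nums = []
for part in raw.split(".")[:3]:
    digits = "".join(ch for ch in part if ch.isdigit())  # tolerate local suffixes
    nums.append(int(digits) if digits else 0)

if tuple(nums) >= (0, 1, 79):
    print("GGUF-only build: convert the model to .gguf or install an older wheel")
else:
    print("this build should still accept GGML v3 (.ggmlv3) files")
```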

@TomiLikesToCode
Author

So I can't use it at this time?


@fangyinc
Collaborator

fangyinc commented Sep 9, 2023


Yes. We will fix this problem in a future release; in the meantime, you can try one of the other installation methods. If you have any questions, please let me know.

@TomiLikesToCode
Author

TomiLikesToCode commented Sep 10, 2023


What do you mean by other install methods? Can I just download from the releases page and run that? I hate Docker and don't want to use it.

@fangyinc
Collaborator


The problem right now is that deploying models with llama.cpp is not supported. You can try other models instead, for example with this configuration:

LLM_MODEL=vicuna-13b-v1.5

More details can be found at Installation From Source.

Aries-ckt added a commit that referenced this issue Oct 7, 2023
Close #567 
Close #644
Close #563

**Other**
- Fix raise Exception when stop DB-GPT
@yumemio

yumemio commented Oct 31, 2023

I'm way too late to reply, but separately from the GGML/GGUF incompatibility issue...

gguf_init_from_file: invalid magic number 4f44213c

...the magic number implies that your model file is actually a webpage. Re-downloading the model file might help.
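
For anyone who wants to double-check this, here is a small sketch (using the model path from the log above) that reads the file header the same way gguf_init_from_file does, as a little-endian uint32:

```python
# Sketch: decode the "invalid magic number 4f44213c" reported in the log.
import struct

with open(r"d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin", "rb") as f:
    head = f.read(4)

(magic,) = struct.unpack("<I", head)  # first 4 bytes as little-endian uint32
print(f"first bytes: {head!r} -> magic {magic:#010x}")
# b"GGUF" -> 0x46554747  (a GGUF file)
# b"tjgg" -> 0x67676a74  (GGJT, i.e. a GGML v3 file)
# b"<!DO" -> 0x4f44213c  (the start of "<!DOCTYPE html>")
```

0x4f44213c is exactly the bytes b"<!DO" read as a little-endian uint32, so an HTML error page saved in place of the model reproduces the reported magic number precisely.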
