
Naive query fails with "'NoneType' object is not subscriptable" #306

Open
rcoundon opened this issue Nov 19, 2024 · 12 comments

@rcoundon

When issuing a naive query like this:

rag.query(query, param=QueryParam(mode="naive"))

This fails with:

  File ".../LightRAG/lightrag/operate.py", line 1083, in <lambda>
    key=lambda x: x["content"],
                  ~^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable

Running a local, global, or hybrid query works fine.

This seems to have been introduced by the changes deployed on 19 Nov (UK time).
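For what it's worth, the failure mode can be reproduced in isolation: the truncation helper applies `key` to every item, so a single `None` in the chunk list (e.g. a chunk id missing from the text-chunk store) raises exactly this TypeError. A minimal sketch, assuming that behavior; the word-count stand-in below is illustrative, not the real tiktoken-based `truncate_list_by_token_size`:

```python
# Simplified stand-in for lightrag.utils.truncate_list_by_token_size;
# the real helper counts tiktoken tokens, here we just count words.
def truncate_list_by_token_size(list_data, key, max_token_size):
    tokens = 0
    result = []
    for data in list_data:
        tokens += len(key(data).split())  # key(None) raises before we can check
        if tokens > max_token_size:
            break
        result.append(data)
    return result

# A chunk id that is missing from storage yields None instead of a dict...
chunks = [{"content": "hello world"}, None]

try:
    truncate_list_by_token_size(chunks, key=lambda x: x["content"], max_token_size=100)
except TypeError as e:
    # ...and indexing None inside the key function raises:
    print(e)  # 'NoneType' object is not subscriptable
```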

@sougannkyou

I'm running into the same problem.

@LarFii
Collaborator

LarFii commented Nov 27, 2024

Do you have a more detailed log? I haven't been able to identify this issue based on the current logs.

@dipakmeher

Same issue. Is there any solution to this? I am using llama3 and nomic-embed-text, and I have changed the context length for llama3 to 32768 as well.

INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"
Extracting entities from chunks: 95%|████████████████████▉ | 40/42 [14:20<00:39, 19.79s/chunk]INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"
Extracting entities from chunks: 98%|█████████████████████▍| 41/42 [14:31<00:17, 17.17s/chunk]INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"
Extracting entities from chunks: 100%|██████████████████████| 42/42 [14:32<00:00, 20.78s/chunk]
INFO:lightrag:Inserting entities into storage...
Inserting entities: 0entity [00:00, ?entity/s]
INFO:lightrag:Inserting relationships into storage...
Inserting relationships: 0relationship [00:00, ?relationship/s]
WARNING:lightrag:Didn't extract any entities, maybe your LLM is not working
WARNING:lightrag:No new entities and relationships found
INFO:lightrag:Writing graph with 0 nodes, 0 edges
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
Traceback (most recent call last):
File "/scratch/dmeher/GraphRAGProject/LightRAG/examples/lightrag_ollama_demo.py", line 35, in <module>
rag.query("What are the top themes in this story?", param=QueryParam(mode="naive"))
File "/scratch/dmeher/GraphRAGProject/LightRAG/lightrag/lightrag.py", line 414, in query
return loop.run_until_complete(self.aquery(query, param))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/dmeher/custom_env/miniforge/envs/lightrag_env/lib/python3.12/asyncio/base_events.py", line 664, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/scratch/dmeher/GraphRAGProject/LightRAG/lightrag/lightrag.py", line 428, in aquery
response = await naive_query(
^^^^^^^^^^^^^^^^^^
File "/scratch/dmeher/GraphRAGProject/LightRAG/lightrag/operate.py", line 994, in naive_query
maybe_trun_chunks = truncate_list_by_token_size(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/dmeher/GraphRAGProject/LightRAG/lightrag/utils.py", line 191, in truncate_list_by_token_size
tokens += len(encode_string_by_tiktoken(key(data)))
^^^^^^^^^
File "/scratch/dmeher/GraphRAGProject/LightRAG/lightrag/operate.py", line 996, in <lambda>
key=lambda x: x["content"],
~^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable

@xandernewton

I have the same error. From both of our logs, it looks like the warning "Didn't extract any relationships, maybe your LLM is not working" is the issue. The relationships and entities files are empty as well.

INFO:lightrag:[New Docs] inserting 436 docs
Chunking documents: 100%|██████████| 436/436 [00:00<00:00, 1388.09doc/s]
INFO:lightrag:[New Chunks] inserting 436 chunks
INFO:lightrag:Inserting 436 vectors to chunks
Generating embeddings: 100%|██████████| 14/14 [00:11<00:00,  1.23batch/s]
INFO:lightrag:[Entity Extraction]...
Extracting entities from chunks:   0%|          | 0/436 [00:00<?, ?chunk/s]
⠦ Processed 436 chunks, 42 entities(duplicated), 0 relations(duplicated)
Extracting entities from chunks: 100%|██████████| 436/436 [00:00<00:00, 604.99chunk/s]
INFO:lightrag:Inserting entities into storage...
Inserting entities: 100%|██████████| 35/35 [00:00<00:00, 8682.32entity/s]
INFO:lightrag:Inserting relationships into storage...
Inserting relationships: 0relationship [00:00, ?relationship/s]
WARNING:lightrag:Didn't extract any relationships, maybe your LLM is not working
WARNING:lightrag:No new entities and relationships found
INFO:lightrag:Writing graph with 35 nodes, 0 edges

@matthewcoole

I'm experiencing the same issue, where entities don't seem to be extracted correctly, leading to a missing response when querying:

  File "/home/mpc/github/LightRAG/lightrag/operate.py", line 996, in <lambda>
    key=lambda x: x["content"],
                  ~^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable

I've tried just running examples/lightrag_ollama_demo.py with mistral-nemo, gemma2:2b, and llama3.1:8b. I suspected the issue was that the LLM response wasn't in the exact format LightRAG expected (some of the responses in kv_store_llm_response_cache.json seemed to suggest this). Still no luck...

@xandernewton

Rolling back to the last version on PyPI and not installing from source resolved my issue.

@dipakmeher

Rolling back to the last version on PyPI and not installing from source resolved my issue.

Thanks for sharing! What was the last version you installed from PyPI? Which models did you use specifically for text generation and embedding generation? Did you set the context length of your model to 32,768?

@elitedeveloper

import os

from lightrag import LightRAG, QueryParam
from lightrag.llm import hf_model_complete, hf_embedding
from lightrag.utils import EmbeddingFunc
from transformers import AutoModel, AutoTokenizer

WORKING_DIR = "./dickens"

if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=hf_model_complete,
    llm_model_name="meta-llama/Llama-3.2-1B",
    chunk_token_size=16,
    chunk_overlap_token_size=2,
    embedding_func=EmbeddingFunc(
        embedding_dim=384,
        max_token_size=5000,
        func=lambda texts: hf_embedding(
            texts,
            tokenizer=AutoTokenizer.from_pretrained(
                "sentence-transformers/all-MiniLM-L6-v2"
            ),
            embed_model=AutoModel.from_pretrained(
                "sentence-transformers/all-MiniLM-L6-v2"
            ),
        ),
    ),
)

with open("story.txt", "r", encoding="utf-8") as f:
    rag.insert(f.read())

# Perform naive search
print(
    rag.query("What did villagers looked for?", param=QueryParam(mode="naive"))
)

# Perform local search
print(
    rag.query("What did villagers looked for?", param=QueryParam(mode="local"))
)

# Perform global search
print(
    rag.query("What did villagers looked for?", param=QueryParam(mode="global"))
)

# Perform hybrid search
print(
    rag.query("What did villagers looked for?", param=QueryParam(mode="hybrid"))
)

I am using Google Colab and have also tried locally, but I get the same error.


TypeError Traceback (most recent call last)
in <cell line: 40>()
39 # Perform naive search
40 print(
---> 41 rag.query("What did villagers looked for?", param=QueryParam(mode="naive"))
42 )
43

7 frames
/usr/local/lib/python3.10/dist-packages/lightrag/operate.py in <lambda>(x)
1081 maybe_trun_chunks = truncate_list_by_token_size(
1082 chunks,
-> 1083 key=lambda x: x["content"],
1084 max_token_size=query_param.max_token_for_text_unit,
1085 )

TypeError: 'NoneType' object is not subscriptable

@mcgillkwok

Query result when using naive mode:
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
Traceback (most recent call last):
File "/Users/mcgill/LightRAG/examples/test.py", line 49, in <module>
result = rag.query(query, param=QueryParam(mode=mode))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mcgill/LightRAG/lightrag/lightrag.py", line 414, in query
return loop.run_until_complete(self.aquery(query, param))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/Users/mcgill/LightRAG/lightrag/lightrag.py", line 428, in aquery
response = await naive_query(
^^^^^^^^^^^^^^^^^^
File "/Users/mcgill/LightRAG/lightrag/operate.py", line 1001, in naive_query
maybe_trun_chunks = truncate_list_by_token_size(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mcgill/LightRAG/lightrag/utils.py", line 191, in truncate_list_by_token_size
tokens += len(encode_string_by_tiktoken(key(data)))
^^^^^^^^^
File "/Users/mcgill/LightRAG/lightrag/operate.py", line 1003, in <lambda>
key=lambda x: x["content"],
~^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable

@davyWong3

davyWong3 commented Dec 3, 2024

Rolling back to the last version on PyPI and not installing from source resolved my issue.

Same with me. Recent changes made to the main branch may have caused this issue.

@einsqing

einsqing commented Dec 3, 2024

Same with me

@zhenya-zhu

Same with me. I rolled back to commit 186cd34 (2024-11-14), and then lightrag_ollama_demo.py works.

magicyuan876 added a commit to magicyuan876/LightRAG that referenced this issue Dec 9, 2024
HKUDS#306
The main changes include:
Added validation when storing text chunk data, so that only valid data is stored
Added an empty-list check before processing text chunks
Filtered out invalid data before truncating text chunks
Added more warning log messages
Query-side changes:
Added a validity check on chunks, filtering out invalid chunks.
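The query-side guard described above might look roughly like this. This is only a sketch under the assumptions stated in the thread (chunks can come back as `None` or without usable content); `filter_valid_chunks` is a hypothetical helper name, not the actual patched LightRAG code:

```python
import logging

logger = logging.getLogger("lightrag")

def filter_valid_chunks(chunks):
    """Drop entries that are None or lack a non-empty 'content' field,
    so they never reach truncate_list_by_token_size."""
    valid = [c for c in chunks if c is not None and c.get("content")]
    if len(valid) < len(chunks):
        logger.warning("Filtered out %d invalid chunks", len(chunks) - len(valid))
    return valid

chunks = [{"content": "hello"}, None, {"content": ""}]
print(filter_valid_chunks(chunks))  # [{'content': 'hello'}]
```

With a guard like this in place, an empty result list can then be handled explicitly (e.g. return a "no context found" response) instead of crashing inside the truncation helper.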
LarFii added a commit that referenced this issue Dec 9, 2024