
Token Sequence Length Exceeds Limit Despite model_tokens Parameter in Ollama Model #768

Closed
JSchmie opened this issue Oct 25, 2024 · 18 comments

Comments

JSchmie commented Oct 25, 2024

Describe the bug
The model_tokens parameter in the graph_config dictionary is not being applied to the Ollama model within the SmartScraperGraph setup. Despite setting model_tokens to 128000, the output still shows an error indicating that the token sequence length exceeds the model's limit (2231 > 1024), causing indexing errors.

To Reproduce
Steps to reproduce the behavior:

  1. Set up a SmartScraperGraph using the code below.
  2. Configure the graph_config dictionary, specifying model_tokens: 128000 under the "llm" section.
  3. Run the scraper with smart_scraper_graph.run().
  4. Observe the error regarding token sequence length.

Expected behavior
The model_tokens parameter should be applied to Ollama's model, ensuring that the model respects the 128000-token length specified without raising indexing errors.

Code

from scrapegraphai.graphs import SmartScraperGraph

ollama_base_url = 'http://localhost:11434'
graph_config = {
    "llm": {
        "model": "ollama/mistral",
        "temperature": 1,
        "format": "json",
        "model_tokens": 128000,
        "base_url": ollama_base_url
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "base_url": ollama_base_url
    },
}

smart_scraper_graph = SmartScraperGraph(
    prompt="What is this website about?",
    source="my.example.com",
    config=graph_config
)
result = smart_scraper_graph.run()
print(result)

Error Message

Token indices sequence length is longer than the specified maximum sequence length for this model (2231 > 1024). Running this sequence through the model will result in indexing errors.

Desktop:

  • OS: Ubuntu 22.04.5 LTS
  • Browser: Chromium
  • Version:
    • Python 3.12.7
    • scrapegraphai 1.26.7
    • Torch 2.5 (Torch should not be necessary as Ollama is being used)

Additional context: Ollama typically uses the num_ctx parameter to set context length. It seems that model_tokens does not directly influence the model's context length, suggesting a possible oversight or misconfiguration in how the SmartScraperGraph handles token length parameters with Ollama models.
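
For comparison, here is a minimal sketch of how the context window would be set when calling Ollama directly over its REST API (the /api/generate endpoint and the options.num_ctx field follow Ollama's API documentation; the model and prompt values are placeholders):

import requests

# Minimal sketch: setting the context window when calling Ollama directly.
# Ollama reads the context length from options.num_ctx; it knows nothing about
# a model_tokens key, which is a ScrapeGraphAI-side setting.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral",                  # placeholder model name
        "prompt": "What is this website about?",
        "options": {"num_ctx": 128000},      # context length, analogous to model_tokens
        "stream": False,
    },
)
response.raise_for_status()
print(response.json()["response"])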

Thank you for taking the time to look into this issue! I appreciate any guidance or suggestions you can provide to help resolve this problem. Your assistance means a lot, and I'm looking forward to any insights you might have on how to apply the model_tokens parameter correctly with Ollama. Thanks again for your help!

@VinciGit00 (Collaborator)

please update to the new examples

@VinciGit00 (Collaborator)

they are legacy

JSchmie commented Oct 25, 2024

Can you point me towards the latest examples? I followed this one.

@VinciGit00 (Collaborator)

look at this link https://github.com/ScrapeGraphAI/Scrapegraph-ai/tree/main/examples

JSchmie commented Oct 26, 2024

I've looked into the examples, and I noticed that in this example and other examples related to Ollama, the context window is set using model_tokens. However, in the example for simple web scraping, the context window isn’t modified at all.

I really like your project, but without being able to increase the context window to make full use of the model, I won’t be able to use this framework effectively. Could you please provide a short code snippet or guidance on changing the context length in the latest version?

@VinciGit00 (Collaborator)

ok, can you specify the context_window inside? like this:

graph_config = {
    "llm": {
        "model": "ollama/mistral",
        "temperature": 1,
        "format": "json",
        "model_tokens": 128000,
        "base_url": ollama_base_url
    }
}

@VinciGit00 (Collaborator)

Btw which model of mistral are you using?
These are the available models https://ollama.com/library/mistral

JSchmie commented Oct 27, 2024

ok, can you specify the context_window inside? like this graph_config = { "llm": { "model": "ollama/mistral", "temperature": 1, "format": "json", 'model_tokens': 128000, "base_url": ollama_base_url }

As you can see from my example, I followed this procedure. I also attempted to execute it without embeddings for debugging purposes; however, the identical error persists. I am using Ollama version 0.3.14.

Btw which model of mistral are you using? These are the available models https://ollama.com/library/mistral

I just use the latest Mistral model, but I also tried llama3.1:8b and llama3.1:70b, which have a context length of 128k, as well as gemma2:9b.

@f-aguzzi (Member)

@JSchmie I fixed this in #773. The model_tokens dictionary key was only available with model instances before this, but now it's accessible for all models.

The pull request will be merged into the development branch (pre/beta) first, so a few days will pass before the fix will be available in a stable release.
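
To make the intent concrete, here is a purely hypothetical, self-contained illustration of what "accessible for all models" means (this is not the actual code from #773; the function name and the fallback value are made up for the sketch):

from typing import Any, Dict

# Hypothetical illustration only -- not the real diff from #773.
# Idea: the token limit is read from the llm config for every provider,
# not only when a pre-built model instance is passed in.
DEFAULT_MODEL_TOKENS = 8192  # made-up fallback, for the sketch only

def resolve_model_tokens(llm_config: Dict[str, Any]) -> int:
    """Return the context length to use for chunking, regardless of provider."""
    return int(llm_config.get("model_tokens", DEFAULT_MODEL_TOKENS))

print(resolve_model_tokens({"model": "ollama/mistral", "model_tokens": 128000}))  # -> 128000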

JSchmie commented Oct 28, 2024

@JSchmie I fixed this in #773. The model_tokens dictionary key was only available with model instances before this, but now it's accessible for all models.

The pull request will be merged into the development branch (pre/beta) first, so a few days will pass before the fix will be available in a stable release.

I have installed your branch using:

pip install --force-reinstall git+https://github.com/ScrapeGraphAI/Scrapegraph-ai.git@768-fix-model-tokens

But unfortunately, I cannot confirm that it works. I still get the error:

Token indices sequence length is longer than the specified maximum sequence length for this model (11148 > 1024). Running this sequence through the model will result in indexing errors

I can confirm that self.model_token is set to 128000. Furthermore, I tried setting llm_params["num_ctx"] = self.model_token here, since ChatOllama also uses num_ctx to set the context window (see documentation here), but it still does not work.
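
For clarity, this is the kind of wiring I was trying to reproduce; a minimal sketch, assuming the dict handed to ChatOllama roughly mirrors the llm section of graph_config (num_ctx is a real ChatOllama field; the mapping itself is my assumption):

from langchain_ollama import ChatOllama

# Sketch of the workaround described above (assumed mapping, not the actual
# ScrapeGraphAI internals): forward the configured token limit to ChatOllama
# as num_ctx, since Ollama does not understand a model_tokens key.
llm_params = {
    "model": "mistral",
    "base_url": "http://localhost:11434",
    "format": "json",
}
model_token = 128000  # value taken from graph_config["llm"]["model_tokens"]
llm_params["num_ctx"] = model_token

llm = ChatOllama(**llm_params)
print(llm.num_ctx)  # 128000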

@VinciGit00 (Collaborator)

hi please update to the new beta

github-actions bot pushed a commit that referenced this issue Oct 29, 2024
## [1.27.0-beta.13](v1.27.0-beta.12...v1.27.0-beta.13) (2024-10-29)

### Bug Fixes

* **AbstractGraph:** manually select model tokens ([f79f399](f79f399)), closes [#768](#768)

🎉 This issue has been resolved in version 1.27.0-beta.13 🎉

The release is available on:

Your semantic-release bot 📦🚀

JSchmie commented Oct 29, 2024

@VinciGit00 I tried that adjustment, and while the error persists:

Token indices sequence length is longer than the specified maximum sequence length for this model (11148 > 1024). Running this sequence through the model will result in indexing errors

the results are looking significantly better now! Could it be that this error is being thrown unintentionally?

@f-aguzzi (Member)

@JSchmie the error is coming from LangChain and not from ScrapeGraphAI. Using ollama/mistral will call Mistral 7B, which has a context window of 1024 tokens.

JSchmie commented Oct 29, 2024

Yes, but the error still occurs when I am using:

graph_config = {
    "llm": {
        "model": "ollama/llama3.1:8b",
        "temperature": 1,
        "format": "json",
        "model_tokens": 128000,
        "base_url": ollama_base_url
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "base_url": ollama_base_url
    },
}

JSchmie commented Oct 29, 2024

Update:

I believe I found a crucial issue, which may stem from Ollama itself. In their API documentation, they note:

Important:
When format is set to json, the output will always be a well-formed JSON object. It’s essential to also instruct the model to respond in JSON.

Until now, I wasn't aware of this limitation. If the model doesn’t respond in JSON, it outputs a series of newline characters. Given that inputs can sometimes be quite large, the model might ignore the instruction to respond in JSON, potentially leading to significant quality discrepancies.

Interestingly, when using LangChain directly, this issue doesn’t occur, and the context length is applied correctly. I’ve included the code below, which may be helpful for debugging.

import requests
from bs4 import BeautifulSoup
from langchain_ollama import ChatOllama

# Define the URL to fetch content from
url = "https://github.com/ScrapeGraphAI/Scrapegraph-ai"

# Send a GET request to fetch the raw HTML content from the URL
response = requests.get(url)
response.raise_for_status()  # Raise an exception if an HTTP error occurs

# Parse the HTML content with BeautifulSoup
soup = BeautifulSoup(response.content, "html.parser")

# Extract and clean up text content from HTML, removing tags and adding line breaks
text_content = soup.get_text(separator='\n', strip=True)

# Create a prompt to ask the language model (LLM) what the website is about
# JSON format is explicitly requested in the prompt
prompt = f"""
        USE JSON!!!
        What is this website about?
        {text_content}
        """

# Initialize the language model with specific configurations
llm = ChatOllama(
    base_url='http://localhost:11434',  # Specify the base URL for the LLM server
    model='llama3.1:8b',               # Define the model to use
    num_ctx=128000,                    # Set the maximum context length for the LLM
    format='json'                      # Request JSON output format from the LLM
)

# Invoke the LLM with the prompt and print its response
print(llm.invoke(prompt))

The output looks like this:

AIMessage(content='{ "type": "json", "result": { "website": "scrapegraphai.com", "library_name": "ScrapeGraphAI", "description": "A Python library for scraping leveraging large language models.", "license": "MIT license" } }\n\n \n\n\n\n\n\n \n\n\n\n ', additional_kwargs={}, response_metadata={'model': 'llama3.1:8b', 'created_at': '2024-10-29T13:35:28.025049704Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 5262412590, 'load_duration': 4070193548, 'prompt_eval_count': 2328, 'prompt_eval_duration': 385116000, 'eval_count': 61, 'eval_duration': 761894000}, id='run-34027abd-c2ea-433e-8eb5-3bb57b5e97a2-0', usage_metadata={'input_tokens': 2328, 'output_tokens': 61, 'total_tokens': 2389})

github-actions bot pushed a commit that referenced this issue Oct 30, 2024
## [1.28.0-beta.1](v1.27.0...v1.28.0-beta.1) (2024-10-30)

### Features

* add new mistral models ([6914170](6914170))
* refactoring of the base_graph ([12a6c18](12a6c18))

### Bug Fixes

* **AbstractGraph:** manually select model tokens ([f79f399](f79f399)), closes [#768](#768)

### CI

* **release:** 1.27.0-beta.11 [skip ci] ([3b2cadc](3b2cadc))
* **release:** 1.27.0-beta.12 [skip ci] ([62369e3](62369e3))
* **release:** 1.27.0-beta.13 [skip ci] ([deed355](deed355)), closes [#768](#768)

🎉 This issue has been resolved in version 1.28.0-beta.1 🎉

The release is available on:

Your semantic-release bot 📦🚀

github-actions bot pushed a commit that referenced this issue Nov 1, 2024
## [1.28.0](v1.27.0...v1.28.0) (2024-11-01)

### Features

* add new mistral models ([6914170](6914170))
* refactoring of the base_graph ([12a6c18](12a6c18))
* update generate answer ([7172b32](7172b32))

### Bug Fixes

* **AbstractGraph:** manually select model tokens ([f79f399](f79f399)), closes [#768](#768)

### CI

* **release:** 1.27.0-beta.11 [skip ci] ([3b2cadc](3b2cadc))
* **release:** 1.27.0-beta.12 [skip ci] ([62369e3](62369e3))
* **release:** 1.27.0-beta.13 [skip ci] ([deed355](deed355)), closes [#768](#768)
* **release:** 1.28.0-beta.1 [skip ci] ([8cbe582](8cbe582)), closes [#768](#768) [#768](#768)
* **release:** 1.28.0-beta.2 [skip ci] ([7e3598d](7e3598d))

github-actions bot commented Nov 1, 2024

🎉 This issue has been resolved in version 1.28.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
