Browserbase Haystack Fetcher

Browserbase is a developer platform to reliably run, manage, and monitor headless browsers.

Power your AI data retrievals with:

Serverless Infrastructure providing reliable browsers to extract data from complex UIs
Stealth Mode with included fingerprinting tactics and automatic captcha solving
Session Debugger to inspect your browser session with a network timeline and logs
Live Debug to quickly debug your automation

Installation and setup

Get an API key and Project ID from browserbase.com and set them as environment variables (BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID).
Install the required dependencies:

pip install browserbase-haystack
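
Before running the fetcher, it can help to confirm that both environment variables are actually set. The following is a minimal sketch of such a check; the helper name missing_browserbase_vars is hypothetical and not part of browserbase-haystack.

```python
import os

# The two variable names come from the setup step above.
REQUIRED_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")


def missing_browserbase_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_browserbase_vars()
print("Missing variables:", ", ".join(missing) if missing else "none")
```

If the list is not empty, export the missing variables in your shell before creating a BrowserbaseFetcher.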

Usage

You can load webpages into Haystack using BrowserbaseFetcher. Optionally, set the text_content parameter to True to convert the pages to a text-only representation.

Standalone

from browserbase_haystack import BrowserbaseFetcher

browserbase_fetcher = BrowserbaseFetcher()
browserbase_fetcher.run(urls=["https://example.com"], text_content=False)
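
The pipeline example in this README connects the fetcher's "documents" output, so the call above presumably returns a dictionary with a "documents" list. The sketch below shows one way to preview that result. It assumes each item exposes a content attribute, as Haystack Document objects do; FakeDocument and preview_documents are hypothetical stand-ins so the example runs without a Browserbase account.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Offline stand-in for a Haystack Document (assumption: has .content and .meta)."""
    content: str
    meta: dict = field(default_factory=dict)


def preview_documents(result, max_chars=80):
    """Return one short, single-line preview per document in a fetcher result."""
    previews = []
    for doc in result.get("documents", []):
        text = (doc.content or "").strip().replace("\n", " ")
        previews.append(text[:max_chars])
    return previews


# Shape assumed for the value returned by browserbase_fetcher.run(...)
result = {"documents": [FakeDocument(content="<h1>Example Domain</h1>\n<p>Sample page</p>")]}
for line in preview_documents(result, max_chars=20):
    print(line)
```

With a real fetcher, pass the return value of browserbase_fetcher.run(...) to preview_documents in place of the stand-in result.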

In a pipeline

from haystack import Pipeline
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from browserbase_haystack import BrowserbaseFetcher

prompt_template = (
    "Tell me the titles of the given pages. Pages: {{ documents }}"
)
prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator()

browserbase_fetcher = BrowserbaseFetcher()

pipe = Pipeline()
pipe.add_component("fetcher", browserbase_fetcher)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)

pipe.connect("fetcher.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.prompt")
result = pipe.run(data={"fetcher": {"urls": ["https://example.com"]}})

Parameters

urls Required. A list of URLs to fetch.
text_content Optional. Retrieve only the text content of each page. Default is False.
session_id Optional. The ID of an existing Browserbase session to reuse.
proxy Optional. Enable or disable proxies.
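
As a rough illustration of how these parameters fit together, the sketch below builds a keyword-argument dictionary and leaves out the optional values that were not supplied. The helper build_fetch_kwargs is hypothetical, and passing session_id and proxy to run() is an assumption; this README only shows urls and text_content being passed that way.

```python
def build_fetch_kwargs(urls, text_content=False, session_id=None, proxy=None):
    """Assemble keyword arguments for a fetch call (hypothetical helper).

    Optional values left as None are omitted so the fetcher's own defaults apply.
    """
    if isinstance(urls, str) or not urls:
        raise ValueError("urls must be a non-empty list of URL strings")
    kwargs = {"urls": list(urls), "text_content": bool(text_content)}
    if session_id is not None:
        kwargs["session_id"] = session_id
    if proxy is not None:
        kwargs["proxy"] = bool(proxy)
    return kwargs


kwargs = build_fetch_kwargs(["https://example.com"], text_content=True, proxy=True)
print(kwargs)
# Assumed usage with a real fetcher: browserbase_fetcher.run(**kwargs)
```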
