
Search Google in 2023, and save to long term memory #507

Closed
wants to merge 4 commits

Conversation


@gigabit-eth commented Apr 8, 2023

Background

This is essentially a culmination of a handful of different PRs rolled into one, with some GPT-4 and CodeGenie optimization.

I'll start with the biggest one, #121: changing the scraper from BeautifulSoup to Selenium. I also believe this is the best option going forward, as it's much more robust.
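For context, here is a minimal sketch of what a Selenium-based scraper looks like; the function name and driver options are illustrative, not the exact code in this diff:

```python
# Minimal illustrative sketch of a Selenium-based scraper -- not the exact
# code in this PR. Assumes selenium, beautifulsoup4 and a chromedriver
# binary are installed.
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def scrape_text_with_selenium(url: str) -> str:
    """Load a page in headless Chrome and return its visible text."""
    options = Options()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # page_source reflects the DOM after JavaScript has run, which is
        # what requests + BeautifulSoup alone cannot see.
        soup = BeautifulSoup(driver.page_source, "html.parser")
        for tag in soup(["script", "style"]):
            tag.extract()
        return soup.get_text(separator="\n", strip=True)
    finally:
        driver.quit()
```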

But once we have access to the web, we need data from 2023 (#185), and a more robust prompt structure wouldn't hurt either.

Finally, when AutoGPT finds better data it needs to overwrite the memory; this comment helped a lot. I also had CodeGenie optimize the code for faster processing.

Changes

  • Added the `persistent_memory = []` stub back to prevent errors
  • Added `string_key_memory = {}` to handle the `memory_ovr` error "no attribute 'string_key_memory'" (see the sketch after this list)
  • Removed a space typo in prompt.txt
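For context, a minimal sketch of what those two stubs look like; the `memory_ovr` overwrite logic shown around them is an assumption for illustration, not copied from the diff:

```python
# Illustrative sketch of the stubs described above. The memory_ovr logic
# around them is an assumption for illustration, not the exact PR code.
persistent_memory = []   # re-added so code paths that append to it don't crash
string_key_memory = {}   # keyed store backing the memory_ovr (overwrite) path

def memory_ovr(key: str, value: str) -> None:
    """Overwrite an existing memory entry when better/newer data is found."""
    string_key_memory[key] = value       # newer data replaces the old entry
    if value not in persistent_memory:
        persistent_memory.append(value)  # flat log of everything stored
```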

Test Plan

Google searches no longer time out, and when a search does return something, the data now gets saved to long-term memory (previously it couldn't be). I've regularly been able to run 15–20 minute sessions before hitting any errors.

⭐️ 26 min 🎥
https://youtu.be/yM_yxVn4y2I

Change Safety

  • I have added tests to cover my changes
  • I have considered potential risks and mitigations for my changes

I have not added extensive test coverage, but I did have an uninterrupted session of 35 minutes before hitting another error. Mind you, this was all on --gpt3only, as I don't have access to GPT-4 yet. AutoGPT knew it needed to check the price of a cryptocurrency multiple times, so it reasoned it would be more efficient to deploy a bot to constantly watch this page for price changes; if I had had GPT-4, it would have started building Python bots for me.

@SamPrinceFranklin left a comment


The changes in the PR include switching the scraper to Selenium, improving the prompt structure, and optimizing the code for faster processing using CodeGenie. The author has added tests and considered potential risks. The changes have resulted in more robust scraping and longer uninterrupted sessions. The addition of GPT4 could further improve efficiency in deploying bots. Overall, the changes seem to be positive and well-considered.

@yuanyowwu left a comment


This is my first review on an open-source project, so I'm leaving it as a comment.

But in general, I think this brings in some welcome changes.

Thanks!


Adding time, I think, is a great change!

@gigabit-eth (Author):

Thanks for taking the time to review.

.env.template: outdated review thread (resolved)


Agree that a better web scraper should be implemented.

```diff
@@ -1,28 +1,18 @@
 import browse
 import json
-from memory import PineconeMemory
+import memory as mem
```


Question - if Pinecone isn't being used, how does the program know which mem to use?

@gigabit-eth (Author):

Good question. Honestly, I couldn't figure out why that worked and not PineconeMemory.


There is no usage of memory here. The only command that needed memory was memory.add, which seems to have been removed from the prompt.

There is very little need for the memory.add command in the first place, since we're already storing to the embeddings vector database for every command.
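For readers unfamiliar with that flow, here is a rough sketch of the main-loop behaviour being described, where every command result gets pushed into the vector store. Only PineconeMemory and memory.add come from the discussion above; the dispatcher passed in is a hypothetical stand-in for Auto-GPT's own command execution:

```python
# Rough sketch of "storing to the embeddings vector database for every
# command". PineconeMemory and memory.add come from the discussion above;
# the execute_command callable is a hypothetical stand-in for Auto-GPT's
# own command dispatcher.
from memory import PineconeMemory  # import as it appears in the diff above

memory = PineconeMemory()

def run_command_and_remember(execute_command, command_name: str, arguments: dict) -> str:
    result = execute_command(command_name, arguments)
    # Every result is embedded and stored, which is why an explicit
    # memory_add command in the prompt is largely redundant.
    memory.add(f"Command: {command_name}\nArguments: {arguments}\nResult: {result}")
    return result
```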

```diff
@@ -118,62 +105,26 @@ def get_datetime():

 def google_search(query, num_results=8):
-    search_results = []
-    for j in ddg(query, max_results=num_results):
```

Not using ddg (duck duck go) anymore?
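For reference, the traceback later in this thread shows `from googlesearch import search` in commands.py, so the replacement presumably looks roughly like the sketch below. The exact `search()` signature differs between the various googlesearch packages, so treat this as an assumption rather than the PR's actual code:

```python
# Assumed shape of the googlesearch-based replacement for the ddg() version
# shown above. The search() signature varies between the "google" and
# "googlesearch-python" packages, so this is illustrative, not the PR's code.
import json
from googlesearch import search

def google_search(query, num_results=8):
    search_results = []
    for url in search(query, num_results=num_results):
        search_results.append(url)
    return json.dumps(search_results, ensure_ascii=False, indent=4)
```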

@JustinHinh commented Apr 11, 2023

EDIT: NVM I had to reinstall requirements.txt

This is no longer an issue

I have this error:

```
(base) PS C:\Users\USERNAME\Source\Repos\Auto-GPT> python scripts/main.py --gpt3only
Traceback (most recent call last):
    import commands as cmd
  File "C:\Users\USERNAME\Source\Repos\Auto-GPT\scripts\commands.py", line 12, in <module>
    from googlesearch import search
ModuleNotFoundError: No module named 'googlesearch'
```
There is no module named googlesearch???
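As the edit at the top of this comment notes, the googlesearch module is a new dependency pulled in by this branch, so reinstalling the project requirements resolves the error:

```
pip install -r requirements.txt
```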

@gigabit-eth deleted the devv branch on April 15, 2023 at 02:51