Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

browse: (1) apply url validation also to scrape_links(), (2) add unit-tests for scrape_links() #780

Conversation

coditamar
Copy link
Contributor

Background

• Validating URLs is important and we wish to apply it to all browse functions and not only scrape_text()
• Continuing to add unit-test to enable us to develop with confidence, this time adding tests to scrape_links()

Changes

  1. extracting the URL validation from scrape_text() into a separate function get_validated_response()
  2. applied get_validated_response() also to scrape_links()
  3. added tests for scrape_links() in tests/test_browse_scrape_links.py

Documentation

Functions include comments.
test_browse_scrape_links.py includes explanations.

Test Plan

Tests where added

PR Quality Checklist

  • [v] My pull request is atomic and focuses on a single change.
  • [v] I have thoroughly tested my changes with multiple different prompts.
  • [v] I have considered potential risks and mitigations for my changes.
  • [v] I have documented my changes clearly and comprehensively.
  • [v] I have not snuck in any "extra" small tweaks changes

@nponeccop nponeccop mentioned this pull request Apr 11, 2023
1 task
assert result[0] == "Google (https://www.google.com)"
assert result[1] == "GitHub (https://github.com)"
assert result[2] == "CodiumAI (https://www.codium.ai)"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are non-PEP8 compliant whitespace. Remove them (but ensure that there is a final CRLF

nponeccop
nponeccop previously approved these changes Apr 12, 2023
Copy link
Contributor

@nponeccop nponeccop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now you overdone it. You should leave the final line separator. The exact rules are pretty hard, you can just run the flake8 and see if it stops complaining at the EOL :-\

But this is lesser evil, I approve

richbeales
richbeales previously approved these changes Apr 12, 2023
@nponeccop
Copy link
Contributor

@coditamar There are conflicts now

@coditamar
Copy link
Contributor Author

I will make the needed changes or open a clean PR

@coditamar coditamar dismissed stale reviews from richbeales and nponeccop via c63645c April 12, 2023 19:41
@coditamar
Copy link
Contributor Author

Overall changed:
browse:
(1) apply url validation also to scrape_links(),
(2) add unit-tests for scrape_links(),
(3) fix url validation (cleaned conflict),
(4) moved related unit-test under the right folder

To be frank, I think that the changes that made the conflicts were a bit messy -- it wasn't "dry" for example

I hope I made the code cleaner

@nponeccop @richbeales

@richbeales
Copy link
Contributor

Conflicting files
scripts/browse.py
tests/test_browse_scrape_text.py

@richbeales richbeales merged commit 4e4af3e into Significant-Gravitas:master Apr 13, 2023
sindlinger pushed a commit to Orgsindlinger/Auto-GPT-WebUI that referenced this pull request Sep 25, 2024
…ape_links_test_and_validate

browse: (1) apply url validation also to scrape_links(), (2) add unit-tests for scrape_links()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants