browse: (1) apply url validation also to scrape_links(), (2) add unit-tests for scrape_links() #780

coditamar · 2023-04-11T08:30:17Z

Background

• Validating URLs is important, and we want to apply it to all browse functions, not only scrape_text().
• We continue adding unit tests so that we can develop with confidence; this time the tests cover scrape_links().

Changes

Extracted the URL validation from scrape_text() into a separate function, get_validated_response().
Applied get_validated_response() to scrape_links() as well.
Added tests for scrape_links() in tests/test_browse_scrape_links.py.
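
To illustrate the refactoring described above, here is a minimal sketch of a shared validation helper. It is not the code from this PR: the signature, the injected `fetch` callable (standing in for something like `requests.get`), the `is_valid_url` helper, and the exact error strings are all assumptions made for the example. Only the idea of a single get_validated_response() used by both scrape functions, and the shared "status_code >= 400" error path, comes from the thread.

```python
from urllib.parse import urlparse


def is_valid_url(url):
    """Hypothetical check: accept only http(s) URLs that name a host."""
    try:
        parts = urlparse(url)
    except ValueError:
        # urlparse raises ValueError for some malformed inputs.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def get_validated_response(url, fetch):
    """Validate `url`, then fetch it.

    Returns a (response, error_message) pair; exactly one of them is None.
    `fetch` is an assumed callable that takes a URL and returns an object
    with a `status_code` attribute.
    """
    if not is_valid_url(url):
        return None, "Error: Invalid URL"
    try:
        response = fetch(url)
    except Exception as exc:  # network failures, timeouts, and so on
        return None, f"Error: {exc}"
    if response.status_code >= 400:
        # One shared message, so every caller reports HTTP errors identically.
        return None, f"Error: HTTP {response.status_code} error"
    return response, None
```

With a helper shaped like this, scrape_text() and scrape_links() can each start by calling it and return the error message unchanged when one is present, which keeps the validation in one place.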

Documentation

Functions include comments.
test_browse_scrape_links.py includes explanations.

Test Plan

Tests were added.
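
As a rough illustration of the kind of test involved, the sketch below checks that links come back in the "text (url)" form quoted in the review of tests/test_browse_scrape_links.py. The `format_links` function and `_LinkCollector` class are hypothetical stand-ins built on the standard library so the example runs on its own; they are not the project's scrape_links() implementation.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collect anchors as 'text (absolute-url)' strings (illustrative only)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            self.links.append(f"{text} ({self._href})")
            self._href = None


def format_links(html, base_url):
    """Hypothetical stand-in for the link-formatting step of scrape_links()."""
    collector = _LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


def test_format_links_returns_text_and_url():
    html = (
        '<a href="https://www.google.com">Google</a>'
        '<a href="https://github.com">GitHub</a>'
        '<a href="https://www.codium.ai">CodiumAI</a>'
    )
    result = format_links(html, "https://www.example.com")
    assert result[0] == "Google (https://www.google.com)"
    assert result[1] == "GitHub (https://github.com)"
    assert result[2] == "CodiumAI (https://www.codium.ai)"
```

Feeding the parser a fixed HTML string keeps the test independent of the network, which is the main reason to separate fetching from link extraction.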

PR Quality Checklist

[v] My pull request is atomic and focuses on a single change.
[v] I have thoroughly tested my changes with multiple different prompts.
[v] I have considered potential risks and mitigations for my changes.
[v] I have documented my changes clearly and comprehensively.
[v] I have not snuck in any "extra" small tweaks or changes.

nponeccop · 2023-04-12T09:00:52Z

tests/test_browse_scrape_links.py

+        assert result[0] == "Google (https://www.google.com)"
+        assert result[1] == "GitHub (https://github.com)"
+        assert result[2] == "CodiumAI (https://www.codium.ai)"
+


These lines are whitespace that does not comply with PEP 8. Remove them (but ensure that there is a final CRLF).

nponeccop

Now you've overdone it. You should leave the final line separator. The exact rules are pretty hard to remember; you can just run flake8 and see whether it stops complaining about the EOL :-\

But this is the lesser evil, so I approve.
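
For readers unsure what "leave the final line separator" means in practice, here is a simplified sketch of the two end-of-file conditions at play. It is loosely modeled on the pycodestyle checks W292 (no newline at end of file) and W391 (blank line at end of file) that flake8 reports; the function name `eof_problems` is made up for this example, and the real checker may behave differently in edge cases, so running flake8 itself remains the authoritative test.

```python
def eof_problems(source):
    """Return the end-of-file style codes that `source` would likely trigger.

    Simplified approximation: a file should end with exactly one newline,
    with no blank lines after the last line of code.
    """
    problems = []
    if not source:
        return problems
    if not source.endswith("\n"):
        problems.append("W292")  # the final line separator was removed
    lines = source.split("\n")
    if source.endswith("\n"):
        # split() leaves an empty string after the final newline; drop it.
        lines = lines[:-1]
    if lines and lines[-1].strip() == "":
        problems.append("W391")  # extra blank line(s) at the end
    return problems
```

In this model, a file ending in `assert ...\n` is clean, one ending in `assert ...\n\n` has the extra blank line that the first review comment objected to, and one ending in `assert ...` with no newline is the overcorrection mentioned above.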

nponeccop · 2023-04-12T18:53:27Z

@coditamar There are conflicts now

coditamar · 2023-04-12T19:00:24Z

I will make the needed changes or open a clean PR

coditamar · 2023-04-12T19:44:15Z

Overall changed:
browse:
(1) apply url validation also to scrape_links(),
(2) add unit-tests for scrape_links(),
(3) fix url validation (cleaned conflict),
(4) moved related unit-test under the right folder

To be frank, I think the changes that caused the conflicts were a bit messy; they weren't DRY, for example.

I hope I made the code cleaner

@nponeccop @richbeales

richbeales · 2023-04-12T20:46:05Z

Conflicting files
scripts/browse.py
tests/test_browse_scrape_text.py

coditamar and others added 4 commits April 11, 2023 11:17

browse: (1) apply validation also to scrape_links(), (2) add tests fo…

2d5d013

…r scrape_links()

browse: make scrape_links() & scrape_text() "status_code >= 400" erro…

64c21ee

…r message the same

Merge branch 'master' into browse_scrape_links_test_and_validate

f6c8a0f

Merge remote-tracking branch 'upstream/master'

1210ba4

nponeccop approved these changes Apr 11, 2023

nponeccop mentioned this pull request Apr 11, 2023

PR batch 3 #709

Closed

1 task

coditamar added 3 commits April 12, 2023 11:48

Merge remote-tracking branch 'upstream/master'

98778ce

Merge remote-tracking branch 'origin/master' into browse_scrape_links…

e8b7a11

…_test_and_validate

Merge remote-tracking branch 'upstream/master'

354fc76

nponeccop suggested changes Apr 12, 2023

coditamar added 3 commits April 12, 2023 12:18

Merge remote-tracking branch 'upstream/master' into browse_scrape_lin…

1a71590

…ks_test_and_validate

Merge branch 'master' into browse_scrape_links_test_and_validate

11abb90

removing compliant whitespace

2ec42bf

nponeccop previously approved these changes Apr 12, 2023

richbeales previously approved these changes Apr 12, 2023

coditamar added 2 commits April 12, 2023 22:31

Merge branch 'master' into browse_scrape_links_test_and_validate

7c0c896

redo suggested changes. move unit test files to the fitting directory

c63645c

coditamar dismissed stale reviews from richbeales and nponeccop via c63645c April 12, 2023 19:41

minor style

57bca36

coditamar added 5 commits April 12, 2023 23:51

Merge branch 'master' into browse_scrape_links_test_and_validate

54478b3

flake8 style

a40ccc1

flake8 style

9f972f4

flake8 style

bf3c76c

flake8 style

3e53e97

nponeccop approved these changes Apr 12, 2023

richbeales approved these changes Apr 13, 2023

richbeales merged commit 4e4af3e into Significant-Gravitas:master Apr 13, 2023
