Releases: simonw/shot-scraper

0.9

14 Mar 00:05
9c73ed6
  • New shot-scraper javascript command for executing JavaScript against a web page and returning the result to the console as JSON: #38

    % shot-scraper javascript datasette.io document.title
    "Datasette: An open source multi-tool for exploring and publishing data"
    

    This can be used for web scraping and data extraction. Any JavaScript errors will cause the command to return an exit code of 1, so this can also be used to run tests against a website from within a continuous integration environment such as GitHub Actions.

  • The shot-scraper pdf and shot-scraper accessibility commands can both now be used with local files in addition to URLs. #37

  • The output: key is no longer required in YAML shot configuration: if omitted, an automatic filename will be used instead. #40

  • An empty YAML file no longer produces an error. #41
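
As a rough illustration of the continuous integration use mentioned for the javascript command, the sketch below wraps a command and reports success or failure from its exit code. The helper name check_page is hypothetical and not part of shot-scraper; the only behaviour relied on is the one stated above, that JavaScript errors cause an exit code of 1.

```shell
# check_page is a hypothetical helper, not part of shot-scraper.
# It runs the command given as arguments and turns its exit code
# into a pass/fail message on standard error.
check_page() {
  if "$@"; then
    echo "check passed" >&2
  else
    echo "check failed: $*" >&2
    return 1
  fi
}
```

In a CI step this might be called as `check_page shot-scraper javascript datasette.io document.title`, so that a non-zero exit code from the command fails the step.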

0.8

13 Mar 17:05
18f3397
  • shot-scraper can now take screenshots of local files on disk: #35

    shot-scraper index.html -o index.png
    
  • If you call shot-scraper on a URL with no protocol, http:// will be assumed. Redirects will be followed:

    shot-scraper datasette.io -o datasette.png
    

0.7

13 Mar 04:49
  • The shot-scraper shot and shot-scraper pdf commands both now default to writing a file to disk if no filename is specified, using a name derived from the URL. If you want to write the PNG or PDF content to standard output you can do so using -o -. #32
  • New --retina flag for shot-scraper shot and shot-scraper multi which causes the screenshot to be taken with a device scale factor of 2. #33
  • shot-scraper shot --devtools option opens an interactive browser window with the browser developer tools enabled. #34
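
Because -o - sends the image bytes to standard output, the result can be piped into other tools. The sketch below is one possible consumer: is_png is a hypothetical helper (not part of shot-scraper) that reads standard input and checks for the first four bytes of the PNG signature.

```shell
# is_png is a hypothetical helper, not part of shot-scraper.
# It reads the first 4 bytes from standard input and compares them
# with the start of the PNG file signature (hex 89 50 4e 47).
is_png() {
  sig="$(head -c 4 | od -An -tx1 | tr -d ' \n')"
  [ "$sig" = "89504e47" ]
}

# Example use, assuming shot-scraper is installed:
#   shot-scraper shot https://datasette.io/ -o - | is_png && echo "got a PNG"
```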

0.6

12 Mar 21:30
f06ef3d
  • Now supports taking screenshots of pages that require authentication. #18

    The following command will open a browser window for the specified website, wait for you to manually authenticate and hit <enter> in the terminal, and then write the resulting authentication context out to auth.json:

    shot-scraper auth https://github.com/ auth.json
    

    You can then take authenticated screenshots like this:

    shot-scraper https://github.com/notifications \
      --auth auth.json -o notifications.png
    

    The -a/--auth option is also supported by the multi, pdf and accessibility commands.

  • The shot-scraper command can now open a browser in which you can interact with a page before the screenshot is taken: #31

    shot-scraper https://simonwillison.net/ \
      -o after-interaction.png \
      --height 800 --interactive
    

    This will output:

    Hit <enter> to take the shot and close the browser window:
      # And after you hit <enter>...
    Screenshot of 'https://simonwillison.net/' written to 'after-interaction.png'
    
  • You can now pass multiple CSS selectors in order to take a screenshot of the smallest area that encompasses all of the content referenced by those selectors: #21

    shot-scraper https://simonwillison.net/ \
      -s '#bighead' -s .overband \
      -o bighead-multi-selector.png
    

    Add --padding 20 to include an additional 20px of padding around the specified area.

    The YAML format used by shot-scraper multi now also supports multiple CSS selectors, which look like this:

    - output: bighead-multi-selector.png
      url: https://simonwillison.net/
      selectors:
      - "#bighead"
      - .overband
      padding: 20
  • Scripted tests can now be run using tests/run_examples.sh #29
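
Since -a/--auth is also accepted by the multi, pdf and accessibility commands, the same auth.json can be reused across them. The sketch below shows one way to do that; with_auth is a hypothetical wrapper (not part of shot-scraper), and the file names are placeholders. Each line is prefixed with echo so that it only prints the command line it would run.

```shell
# Authentication context written earlier by:
#   shot-scraper auth https://github.com/ auth.json
AUTH_FILE="auth.json"

# with_auth is a hypothetical wrapper, not part of shot-scraper.
# It runs the given command with "--auth $AUTH_FILE" appended.
with_auth() {
  "$@" --auth "$AUTH_FILE"
}

# Preview the resulting command lines; remove "echo" to run them for real.
with_auth echo shot-scraper pdf https://github.com/notifications -o notifications.pdf
with_auth echo shot-scraper accessibility https://github.com/notifications -o notifications.json
with_auth echo shot-scraper multi shots.yml
```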

0.5

12 Mar 19:17
be88ac2
  • New shot-scraper pdf command for creating a PDF export of a web page. #24
  • shot-scraper accessibility --javascript option for executing custom JavaScript before taking the accessibility snapshot. #23
  • shot-scraper accessibility -o filename.json option. #25
  • README demos section now links to @newshomepages Twitter bot by @palewire
  • README now includes tips on executing JavaScript. #20
  • README now includes the --help output of the various commands.

0.4

10 Mar 22:33
b974f18
  • Added shot-scraper accessibility URL command, which dumps out a JSON copy of the Chromium accessibility tree for the page. #22
  • Fixed error in the --help output for the shot-scraper multi command.

0.3

09 Mar 22:24
e50f656
  • Added a live demo, in the shot-scraper-demo repository. #14
  • New --quality 80 option for outputting smaller JPEG images with the specified quality. #15
  • New --wait 2000 option for waiting the specified number of milliseconds before taking the screenshot. #16

0.2

09 Mar 21:05
ccfd42a
  • shot-scraper --selector SELECTOR option to specify an element on the page using a CSS selector and take a screenshot of just that element. #8
  • selector: ... key in YAML file to specify an element by CSS selector.
  • --javascript SCRIPT option to specify custom JavaScript to be executed after the page has loaded but before the screenshot is taken. #12
  • javascript: key in YAML to specify JavaScript to execute.
  • --width and --height options to set the width and height of the browser window used for the screenshot. If a height is specified, the resulting screenshot will be that height rather than being the full height of the page. #13
  • Equivalent width: and height: keys in the YAML configuration.
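
To show how the YAML keys listed above might fit together, the sketch below writes a small shots.yml with two entries. The output file names, the dimensions and the JavaScript snippet are illustrative assumptions; the url and output keys follow the layout used in the shot-scraper multi examples.

```shell
# Write a hypothetical shots.yml using the selector:, javascript:,
# width: and height: keys. File names and values are placeholders.
cat > shots.yml <<'EOF'
- output: bighead.png
  url: https://simonwillison.net/
  selector: "#bighead"
- output: simonwillison-800.png
  url: https://simonwillison.net/
  width: 1280
  height: 800
  javascript: |
    document.body.style.backgroundColor = 'pink';
EOF

# Then, assuming shot-scraper is installed:
#   shot-scraper multi shots.yml
```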

0.1

09 Mar 19:16
f4149e1
  • Switched from npm Playwright to Python Playwright. #3
  • New shot-scraper install command for installing the browser needed by Playwright. #6
  • New shot-scraper shot URL command (also the default if you just run shot-scraper ...) which takes a single screenshot. #5
  • shot-scraper multi shots.yml command now executes the YAML file with a list of shots in it.

0.1a0

08 Mar 23:10
Pre-release
  • Initial prototype. #1