> Add this link to your feedreader <
An RSS feed of the alert boxes that COTA posts in the header of COTA.com. These alerts are most often used for notice of reroutes.
This feed updates once per day. If you need more timely updates than that, I'm sorry.
This feed contains the following information, using the screenshotted alert as an example:
- The alert's header: REROUTES AHEAD
- The alert's description: Dec. 2 | Holiday Hop
- The link title: LEARN MORE
- The link URL: https://www.cota.com/reroutes/cota-reroutes-holiday-hop-231202.pdf
- The date of the scrape that generated this alert
This feed does not contain:
- Any information only found at the link
- Any information parsed from the reroute PDF, such as:
  - which lines are affected
  - where the reroute is located
  - where you can catch your bus
- Any information not contained in the alert box
If you want additional information on items in this feed, you'll need to click the link and/or contact COTA.
When COTA posts a reroute PDF to COTA.com, they don't always announce it. When they do announce it, they only announce it on enshittified privately-owned social media networks: Twitter, Facebook, Instagram. If you don't have logins for those accounts, either you use a proxy like Nitter or you're locked out.
COTA also sometimes posts a notice via their GTFS feed about the reroute, but historically speaking, their GTFS alerts only say that there will be a reroute affecting a route. Their alerts do not, generally speaking, say the location of the reroute.
I don't want to manually check the COTA website every day; I want to receive notifications in the tools that I habitually use. So I wrote this scraper to make an RSS feed.
GitHub serves the RSS feed with the incorrect Content-Type header of text/html. If this causes problems for your feedreader, consult your feedreader's documentation. FreshRSS supports adding #force_feed to the end of the feed URL to force the software to interpret the file as application/rss+xml.
This project scrapes and archives the contents of the "Alerts" box which is intermittently present in the header of COTA.com. COTA.com is a Gatsby app, and that header is baked in directly. While examining the site's source code one day, I discovered a reference to the WordPress site which powers the Gatsby app. From there, I examined the read-only side of its WP-JSON API, and discovered that the "Alerts" box appears to be powered by the Advanced Custom Fields plugin for WordPress. So rather than scraping the COTA.com Gatsby app directly, I check the ACF endpoint to see if there's a new Alert posted.
See cota-reroute-pdf-rss/scraper.bash, line 3 at commit 11a9319.
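As a rough illustration of the fetch step described above, here is a minimal Python sketch. It is not the project's bash scraper. The endpoint URL is left as a parameter because this README does not state it, and `fetch_alert` and the output path are hypothetical names chosen for the example.

```python
import json
import urllib.request
from pathlib import Path


def fetch_alert(endpoint_url: str, out_path: Path) -> dict:
    """Download a JSON document and save a copy for later parsing.

    endpoint_url is an assumption: pass whatever WP-JSON/ACF URL
    actually serves the alert data. out_path is a hypothetical
    location such as build/alert.json.
    """
    # Fetch and decode the JSON payload.
    with urllib.request.urlopen(endpoint_url, timeout=30) as response:
        payload = json.loads(response.read().decode("utf-8"))

    # Create the build directory if needed, then write the payload to disk.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload
```

Writing the raw response to disk before parsing keeps the fetch and parse steps separate, so a parsing bug can be debugged against the saved file without hitting the site again.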
The alert gets saved as JSON to a temporary build/
directory. I then use a PHP script to parse the JSON file and add any new alerts to a CSV file listing all historical alerts.
See cota-reroute-pdf-rss/parser.php, lines 79 to 89 at commit 11a9319.
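The "add any new alerts" step can be sketched roughly as follows. This is a Python stand-in, not the project's PHP parser, and the column names in `FIELDS` are assumptions made for the example. It treats an alert as new when its link URL does not already appear in the CSV.

```python
import csv
from pathlib import Path

# Hypothetical column names; the real CSV layout may differ.
FIELDS = ["scraped", "header", "description", "link_title", "link_url"]


def append_if_new(csv_path: Path, alert: dict) -> bool:
    """Append alert to the CSV unless its link_url is already listed.

    Returns True if a row was added, False if the alert was a duplicate.
    """
    rows = []
    if csv_path.exists():
        with csv_path.open(newline="", encoding="utf-8") as f:
            rows = list(csv.DictReader(f))

    # An alert counts as already seen if its URL matches any existing row.
    if any(row.get("link_url") == alert["link_url"] for row in rows):
        return False

    # Write the header only when starting a new (or empty) file.
    needs_header = not csv_path.exists() or csv_path.stat().st_size == 0
    with csv_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if needs_header:
            writer.writeheader()
        writer.writerow({key: alert.get(key, "") for key in FIELDS})
    return True
```

Because a duplicate alert leaves the file untouched, a day with no new alert produces no change for Git to commit.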
Then with a different PHP script I convert that CSV to the RSS feed.
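A CSV-to-RSS conversion along these lines might look like the sketch below. It is written in Python rather than the project's PHP, and it assumes hypothetical CSV columns named `scraped` (an ISO date), `header`, `description`, and `link_url`. The function name and the feed title and link parameters are also illustrative.

```python
import csv
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime
from pathlib import Path


def csv_to_rss(csv_path: Path, feed_title: str, feed_link: str) -> str:
    """Build a minimal RSS 2.0 document with one item per CSV row."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_title

    with csv_path.open(newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = (
                f"{row['header']}: {row['description']}"
            )
            ET.SubElement(item, "link").text = row["link_url"]
            # The PDF URL doubles as a stable unique identifier.
            ET.SubElement(item, "guid").text = row["link_url"]
            # Assumption: the scrape date is stored as an ISO date, read as UTC.
            scraped = datetime.fromisoformat(row["scraped"]).replace(
                tzinfo=timezone.utc
            )
            ET.SubElement(item, "pubDate").text = format_datetime(scraped)

    return ET.tostring(rss, encoding="unicode")
```

Regenerating the whole feed from the CSV on every run keeps the CSV as the single source of truth; the RSS file only changes when the CSV does.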
If the current alert's PDF URL isn't already in the CSV, the new line in the CSV results in a change to the RSS feed, and those changes result in a committable diff in Git. The GitHub Action which runs this scraper then commits the change to the main branch, which results in the updated RSS file becoming available at https://raw.githubusercontent.com/benlk/cota-reroute-pdf-rss/main/rss.xml
See cota-reroute-pdf-rss/.github/workflows/parse-and-generate.yml, lines 37 to 44 at commit 11a9319.
This project uses GitHub Actions à la Simon Willison to perform the scrape. The process of writing this was greatly aided by Lasse Benninga's blog post on GitHub Actions scrapers, which simplifies and expands on Simon Willison's model.