Prices from all retail chains in Israel are public. One can find hints about the format here. Let's try to do something with this data!
POC: it works just well enough for an end-to-end demo. Click to see a video:
Scan a product barcode from the web application to see:
- Price/promotions (at the Shufersal next to where I live...)
- Ingredients, allergens, food quality warnings (data from Shufersal)
- openfoodfacts info if available
- python3.8
- pip install -r requirements.txt (ideally use virtualenv...). When you update packages, edit requirements.in and call pip-compile requirements.in.
- nodejs for the web application
To run the tests:
green
To start the app:
npm install
npm run dev
#=> listening on port 3000
Click here to display a test barcode. See the notes below for tips on how to access the app from a mobile device.
To start the backend:
cd prices
uvicorn server:app --reload
#=> 127.0.0.1:8000/barcode/0123456789
# Docs:
#=> 127.0.0.1:8000/docs
#=> 127.0.0.1:8000/redoc
#=> 127.0.0.1:8000/openapi.json
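Once the backend is running, the barcode route can also be queried from a script. This is a minimal sketch, assuming uvicorn's default address (127.0.0.1:8000) and assuming the route answers with JSON; `barcode_url` and `fetch_barcode` are hypothetical helper names, not part of this project.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: uvicorn was started with its default host and port.
BASE_URL = "http://127.0.0.1:8000"


def barcode_url(barcode, base_url=BASE_URL):
    """Build the URL of the /barcode/<code> route for a given barcode."""
    return f"{base_url.rstrip('/')}/barcode/{quote(barcode, safe='')}"


def fetch_barcode(barcode, base_url=BASE_URL, timeout=5.0):
    """GET the route and decode the body (assumed to be JSON)."""
    with urlopen(barcode_url(barcode, base_url), timeout=timeout) as response:
        return json.load(response)
```

With the server up, `fetch_barcode("0123456789")` should return whatever the backend knows about the test barcode; the shape of that payload is not documented here, so inspect /docs before relying on specific fields.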
- fetch prices shufersal
- model to work nicely with all this data
- find best promotions
- openfoodfacts: fetch db, use it
- shufersal: images, descriptions
- save prices in a database, query it. makes sense to have a materialized view of "current prices"..?
- crawl all shufersal images and metadata (save html somewhere, then process..)
- setup daily crawl (for 1 store for now...)
- scan barcode
- browser webAPI - but we need a polyfill for wide support...
- scandit.com / Dynamsoft work great but are commercial
- best (almost only?) open-source project: quaggaJS
https://serratus.github.io/quaggaJS/
https://github.com/ericblade/quagga2 (https://github.com/ericblade/quagga2/commits/master/src)
https://morioh.com/p/1963935c62db
https://github.com/ericblade/quagga2-react-example
- query backend
- display results
- native?
- fetch prices of more retail chains and stores
- price distribution: histogram per product, show products with most variance, correlated high prices vs average...
- geolocation, choose store, show map...
- save prices over time, see them
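The "current prices" idea from the list above can be sketched with SQLite from the standard library. SQLite only offers plain views (PostgreSQL, for example, has `CREATE MATERIALIZED VIEW`), so this sketch uses a regular view over an append-only table. The table, columns, and helper names are all assumptions for illustration, not this project's schema.

```python
import sqlite3

# Hypothetical schema: one row per observed price, never updated in place.
SCHEMA = """
CREATE TABLE prices (
    barcode    TEXT NOT NULL,
    store_id   TEXT NOT NULL,
    price      REAL NOT NULL,
    fetched_at TEXT NOT NULL  -- ISO-8601 timestamp, sorts chronologically as text
);
-- Latest observation per (barcode, store): the "current prices".
CREATE VIEW current_prices AS
SELECT p.barcode, p.store_id, p.price, p.fetched_at
FROM prices AS p
WHERE p.fetched_at = (
    SELECT MAX(q.fetched_at)
    FROM prices AS q
    WHERE q.barcode = p.barcode AND q.store_id = p.store_id
);
"""


def open_db(path=":memory:"):
    """Create the table and the view in a fresh database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_price(conn, barcode, store_id, price, fetched_at):
    """Append one price observation (history is kept, nothing is overwritten)."""
    conn.execute(
        "INSERT INTO prices VALUES (?, ?, ?, ?)",
        (barcode, store_id, price, fetched_at),
    )


def current_price(conn, barcode, store_id):
    """Return the most recent price for a product in a store, or None."""
    row = conn.execute(
        "SELECT price FROM current_prices WHERE barcode = ? AND store_id = ?",
        (barcode, store_id),
    ).fetchone()
    return row[0] if row else None
```

Keeping the full history in `prices` also covers the "save prices over time" item; a materialized view would only become interesting once the table is large enough for the subquery to be slow.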
In python:
- korenLazar/supermarket-scraping has tons of scrapers and understands what the promos mean
- beyond-io/superx with
  - a DB (schema)
  - mappings of misc attr names between chains: def standardize_weight_name(self, unit_in_hebrew)
- elikochva/openprices, not much, just with all scrapers, and:
- kimi-codes/PriceScan, mysql (schema), compose
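To illustrate what mapping attribute names between chains can look like, here is a small sketch in the spirit of the `standardize_weight_name` signature mentioned above. It is not the superx implementation: the alias table contains a few common Hebrew unit spellings chosen for illustration, and the canonical codes (`kg`, `g`, `l`, `ml`, `unit`) are assumptions.

```python
# Hypothetical alias table: common Hebrew unit spellings, not the list used
# by beyond-io/superx or by any particular retail chain.
UNIT_ALIASES = {
    "kg": ('ק"ג', "קילוגרם"),
    "g": ("גרם",),
    "l": ("ליטר",),
    "ml": ('מ"ל', "מיליליטר"),
    "unit": ("יחידה",),
}

# Reverse index: Hebrew spelling -> canonical unit code.
_LOOKUP = {
    alias: unit for unit, aliases in UNIT_ALIASES.items() for alias in aliases
}


def standardize_weight_name(unit_in_hebrew):
    """Return a canonical unit code, or "unknown" for unrecognized input."""
    # Trim whitespace and normalize the Hebrew gershayim (U+05F4) to an
    # ASCII double quote so both spellings of an abbreviation match.
    cleaned = unit_in_hebrew.strip().replace("\u05f4", '"')
    return _LOOKUP.get(cleaned, "unknown")
```

Returning "unknown" instead of raising keeps a crawl going when a chain introduces a new spelling; the unknown values can then be logged and added to the table.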
Notable:
- imrigp/SuperPrice, pretty nice, working API, with a maintained demo online, search...
- ganoti/prices in java, with complete crawlers
- yonicd/supermarketprices with baskets to price including gas - in R.
To access the app from a mobile device:
- Make sure your dev server is visible from your mobile:
  - on Windows, make sure your network is defined as a "private" network
  - if you use VSCode remote debugging and forward ports from your laptop to your server, you'll need settings>forward>localPortHost>allInterfaces
  - a quick test is pinging your laptop from your mobile...
- Then, the main issue is that we use features only available in "secure contexts". To make it work, there are multiple options:
  - disable checking for this, for instance in Chrome: https://stackoverflow.com/questions/34878749/in-androids-google-chrome-how-to-set-unsafely-treat-insecure-origin-as-secure
  - connect via USB and forward localhost:3000 to your laptop. To do this, set up dev mode on your device, install adb, and call adb reverse tcp:3000 tcp:3000
  - get a self-signed certificate for your dev server (google "nextjs ssl dev"...)
  - play with DNS to have dev.localhost resolve to your laptop's IP:
    - A. on your device, try to edit /system/etc/hosts, but it is hard without a rooted phone...
    - B. fall back to apps that create a fake local VPN that manipulates DNS queries, such as personalDNSfilter. But somehow *.localhost (only!) queries are not always (?) resolved by the app, so we may be out of luck.