A private, automated, cloud-enabled price tracker.
Make a file called config.yaml
and store it somewhere safe, like ~/pricekeeper/config.yaml
.
Use this as a jumping off point:
storage:
type: azure
account: mystorageaccountname
key: abcde............................67890==
scheduler:
spreadtime: 30
rules:
- name: Complete Guide to Docker for Beginners
category: Amazon
selector: '.a-price .a-offscreen'
url: https://www.amazon.com/dp/B08BW5Y73D/
hours: '*'
- name: Learning Python 5th Edition
category: Amazon
selector: '.a-price .a-offscreen'
template: amazon
url: https://www.amazon.com/dp/1449355730/
hours: 0,3,6,9,12,15,18,21
- name: Pixel 7 Pro
category: Google
regex:
- meta itemprop="lowPrice" content="(.+?)"
- meta itemprop="price" content="(.+?)"
url: https://store.google.com/product/pixel_7_pro
hours: '0,12'
- name: Pixel 7 Pro Case
category: Google
regex:
- meta itemprop="lowPrice" content="(.+?)"
- meta itemprop="price" content="(.+?)"
url: https://store.google.com/product/casemate_tough_clear_cases
hours: '0,12'
For a complete list of configuration options, see below.
Item | Description |
---|---|
Image Name | qjake/pricekeeper:latest |
Port(s) | 9600 |
Mount(s) | /app/config.yaml |
Volume(s) | None |
Environment Vars | PKAPP_PORT (sets port number, default=9600 )PKAPP_LISTEN (sets listen addr, default=0.0.0.0 ) |
docker run -d -p 9600:9600 -v /path/to/your/config.yaml:/app/config.yaml --name pricekeeper qjake/pricekeeper:latest
Sample docker-compose.yml
:
version: "3.9"
services:
pricekeeper:
container_name: pricekeeper
restart: unless-stopped
image: qjake/pricekeeper:latest
volumes:
- /path/to/your/config.yaml:/app/config.yaml
ports:
- "9600:9600"
# Optional, if you want to include a health check
healthcheck:
test: curl -s --fail http://127.0.0.1:9600/_health || exit 1
interval: 10s
retries: 3
start_period: 5s
timeout: 5s
Type: object
Key | Type | Required | Value |
---|---|---|---|
type |
string | β | azure (Currently, only Azure Tables are supported.) |
account |
string | β | Azure Storage account name (do not include ".table.core.windows.net") |
key |
string | β | Azure Storage Accout access key (not a SAS, and not the connection string) |
table |
string | Optionally, the prefix name to give to the tables that are created. See the tables documentation below for more info. |
Type: object
Key | Type | Required | Value |
---|---|---|---|
spreadtime |
int | Number of random seconds to add to a job to spread multiple jobs out that would otherwise run at the same time. (Sometimes referred to as "jitter".) Omit to disable random spread. |
Type: array
Key | Type | Required | Templatable | Variables | Value |
---|---|---|---|---|---|
name |
string | β | A distinct name for this rule. Do not include any special (URL-unsafe) characters. | ||
category |
string | β | β | A category to file this rule under, for visually grouping rules. Rules with the exact same category name are displayed together. (e.g. "Amazon" or "Google") | |
url |
string | β | β | β | The URL to fetch that contains the price. The price must be in the source/response body of the page (dynamic prices rendered via JS will not work or will require a different/creative solution). Doesn't have to be HTML - you can fetch a public API endpoint too. |
link |
string | β | β | If set, will display a button on the UI that will open this link in a new tab. Useful for quickly navigating to the store page. | |
hours |
string | β | β | A cron-like expression for which hours to run the price fetch job. (e.g. '*' or '9,12,15' or '*/3' ) |
|
mins |
string | β | A cron-like expression for which minutes to run the price fetch job. Defaults to '0' . |
||
selector |
string | array[string] | β
(or regex ) |
β | β | One or more CSS selectors to look for a price value inside. The text value of all of the matched elements is taken as a single string, and a decimal value is extracted from the text. If multiple selectors are specified, they are executed from top to bottom until a price is found. |
regex |
string | array[string] | β
(or selector ) |
β | β | One or more regular expressions to look for a price value. The first capture group should contain a price-like value. (If you need other capture groups in your expression, make sure they are non-capture groups [(?:...) ].) If multiple expressions are specified, they are executed from top to bottom until a price is found. |
divide |
bool | β | true if the price is in a non-decimal format like 2000 but should be interpreted as $20.00 . The value will be divided by 100. |
||
referer |
string | β | β | If set, will send an HTTP Referer header with this value. |
|
template |
string | If set, will apply a template to this rule. See the template section below for more information. |
Type: object
You may want to track prices from a public API endpoint, or a page with multiple prices on it. Cache calls to a specific URL to improve performance.
The key is the URL and the value is the amount of seconds to cache that URL for.
For example:
cache:
https://example.org/store/prices: 600 # 5 mins
https://example.com/product/listing: 120 # 2 mins
Type: object
Templates are used to apply common attributes to multiple rules. For example, the selector '.a-price .a-offscreen'
can apply to most, if not all, Amazon listings. So instead of duplicating the rule many times, you can apply a template to it instead.
Define a template by giving it a name, as the key of a dictionary.
The properties inside the template definition will be copied to each rule that inherits from this template.
Variables can be defined on a rule, and applied to various properties on that rule (or on the parent template).
For example, if you have 10 rules for a fictional store such as https://mystore.example.com/product/123
, you can put the URL into the parent template like so:
templates:
...
mystore:
url: https://mystore.example.com/product/{id}
And then, instead of duplicating each URL, simply define the variable for the product ID in the rule:
rules:
- name: My Product
template: mystore
vars:
id: 123
You can define as many variables as you want. Refer to the table above for which proeprties support variable replacement.
The original config.yaml
example at the top of this Readme contains duplicated information. We can rewrite this example using templates to avoid duplication:
storage:
type: azure
account: mystorageaccountname
key: abcde............................67890==
scheduler:
spreadtime: 30
rules:
- name: Complete Guide to Docker for Beginners
template: amazon
vars:
productId: 'B08BW5Y73D'
- name: Learning Python 5th Edition
template: amazon
vars:
productId: '1449355730'
- name: Pixel 7 Pro
template: google
url: https://store.google.com/product/pixel_7_pro
- name: Pixel 7 Pro Case
template: google
url: https://store.google.com/product/casemate_tough_clear_cases
templates:
amazon:
category: Amazon
hours: '*'
selector: '.a-price .a-offscreen'
url: https://www.amazon.com/dp/{productId}/
link: https://www.amazon.com/dp/{productId}/
google:
category: Google
hours: '0,12'
regex:
- meta itemprop="lowPrice" content="(.+?)"
- meta itemprop="price" content="(.+?)"
PriceKeeper creates the following tables:
prices
- Used to store historical pricing datacurrent
- Used to store current pricing data (only for quick retrieval of summary data)sparklines
- Used to store the generated sparkline images of the price history for the past 48 hours (only one image is kept per rule)
If the optional config.storage.table
key is present, the tables will be prefixed by this value separated by an underscore. For example, if the configuration is:
config:
storage:
...
table: myvalue
Then the prices table would be created in Azure as myvalue_prices
instead of the default prices
.