Website: https://indeed-scraper-react.netlify.app/
The website (frontend) is hosted on Netlify and the backend is hosted on Heroku.
Scraping API (backend): https://indeed-scraper-react.herokuapp.com/api/scrape
The API was developed using Flask, requests and BeautifulSoup. Currently, only one endpoint is supported (the link above). It takes a JSON object with the parameters described under documentation and returns a JSON object whose result array holds one JSON object (dict) per job listing. The website works for all current Indeed worldwide domains (List).
link: https://indeed-scraper-react.herokuapp.com/api/scrape
The API takes a JSON object with the following parameters: title [1], location [2], pages [3], country [4], distance [5], date [6]. All parameters are mandatory.
headers: {'Content-Type': 'application/json'}
[1] title (string): Job posting title ex: software developer, baker, cashier, ...
[2] location (string): city of job listing, ex: Toronto, new york, miami
[3] pages (int): Number of pages of job listings to scrape. One page contains 15 job postings.
[4] country: ISO country code (for the United States, use 'www' instead of 'us'). Ex: Canada: ca.
[5] distance: Job search radius based on input location.
- Default (no distance): 'Distance in KM'
- Exact: 'exact'
- 5 KM : '5'
- 10 KM : '10'
- 15 KM : '15'
- 20 KM : '20'
- 50 KM : '50'
- 100 KM : '100'
[6] date: How recently the job listing was posted, ex: '3' shows only job listings posted in the last 3 days.
- Default (no preference): 'D'
- 24 hrs : '24'
- 3 days: '3'
- 7 days: '7'
- 14 days: '14'
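The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API itself: `build_payload`, `VALID_DISTANCES` and `VALID_DATES` are hypothetical names, and the default arguments simply reuse the documented "no preference" values.

```python
# Allowed values copied from the distance and date lists above.
VALID_DISTANCES = {"Distance in KM", "exact", "5", "10", "15", "20", "50", "100"}
VALID_DATES = {"D", "24", "3", "7", "14"}


def build_payload(title, location, pages, country,
                  distance="Distance in KM", date="D"):
    """Return a request body dict, raising ValueError on invalid input.

    Hypothetical helper: the API does not ship this function.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(pages, bool) or not isinstance(pages, int) or pages < 1:
        raise ValueError("pages must be a positive integer")
    if distance not in VALID_DISTANCES:
        raise ValueError(f"unsupported distance: {distance!r}")
    if date not in VALID_DATES:
        raise ValueError(f"unsupported date: {date!r}")
    return {
        "title": title,
        "location": location,
        "pages": pages,
        "country": country,  # 'www' for the United States
        "distance": distance,
        "date": date,
    }


payload = build_payload("baker", "new york", 1, "www", distance="10")
```

All six keys are always present in the returned dict, since every parameter is mandatory.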
Ex:
{
"title": "software developer",
"location": "Toronto",
"pages": 2,
"country": "ca",
"distance": "Distance in KM",
"date" : "14"
}
Ex:
{
"title": "baker",
"location": "new york",
"pages": 1,
"country": "www",
"distance": "10",
"date" : "D"
}
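A request body like the ones above could be sent from Python roughly as follows. This is a sketch under assumptions: the HTTP method is not stated in this document, so POST is assumed because the API takes a JSON body; `build_request` and `scrape` are hypothetical helper names. Only the standard library is used.

```python
import json
import urllib.request

API_URL = "https://indeed-scraper-react.herokuapp.com/api/scrape"


def build_request(payload):
    """Wrap the payload in a request object with the documented header."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the method is not documented here
    )


def scrape(payload, timeout=30):
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(build_request(payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


payload = {
    "title": "software developer",
    "location": "Toronto",
    "pages": 2,
    "country": "ca",
    "distance": "Distance in KM",
    "date": "14",
}
request = build_request(payload)  # built only; nothing is sent yet
```

Calling `scrape(payload)` performs the actual request. Scraping several pages may take a while, so a generous timeout is a reasonable choice.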
The API returns a JSON object with an array of objects, one per job posting. Each job listing object contains: key [1], title [2], company [3], location [4], type [5], salary [6], job link [7], summary [8] and date [9].
- [1] key: unique key for each job listing object
- [2] title: job listing title
- [3] company: company who posted the listing
- [4] location: location of the posting
- [5] type: full time or part time (when not mentioned on the posting, it might return the salary)
- [6] salary: salary if mentioned, or else an empty string
- [7] job link (key: jobLink): link to the posting
- [8] summary: a few summary points mentioned on the card
- [9] date: when it was posted, such as 1 day ago, 2 days ago, 30+ days, ...
Ex:
{
"result": [
{
"key": 1,
"title":"Bread Baker",
"company":"Forno Cultura",
"location": "Toronto, ON",
"type":"full-time",
"salary": "$42,000–$48,000 a year",
"date":"",
"jobLink":"https://ca.indeed.com/viewjob?jk=2bf8b793b6662d31"
},
{
"key": 2,
"title":"Baker",
"company":"Phipps Desserts",
"location": "Toronto, ON",
"type":"full-time",
"salary": "From $16.50 an hour",
"date": "4 days ago",
"jobLink":"https://ca.indeed.com/viewjob?jk=621fb0c19055bb94"
},
...
]
}
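Once decoded, the response is an ordinary dict, so the listings can be read by iterating over its result array. The sketch below assumes the response shape shown in the example above; `format_listings` is a hypothetical helper, and the placeholder strings for empty fields are an illustrative choice.

```python
def format_listings(response):
    """Return one human-readable line per job listing in the response."""
    lines = []
    for job in response.get("result", []):
        # salary and date may be empty strings, so substitute a placeholder.
        salary = job.get("salary") or "salary not listed"
        posted = job.get("date") or "date unknown"
        lines.append(
            f'{job["key"]}. {job["title"]} at {job["company"]} '
            f'({job["location"]}): {salary}, {posted}'
        )
    return lines


# Sample data taken from the example response above.
sample = {
    "result": [
        {"key": 1, "title": "Bread Baker", "company": "Forno Cultura",
         "location": "Toronto, ON", "type": "full-time",
         "salary": "$42,000–$48,000 a year", "date": "",
         "jobLink": "https://ca.indeed.com/viewjob?jk=2bf8b793b6662d31"},
        {"key": 2, "title": "Baker", "company": "Phipps Desserts",
         "location": "Toronto, ON", "type": "full-time",
         "salary": "From $16.50 an hour", "date": "4 days ago",
         "jobLink": "https://ca.indeed.com/viewjob?jk=621fb0c19055bb94"},
    ]
}

for line in format_listings(sample):
    print(line)
```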