This is a proof of concept showing how Beacons can expose Schema.org markup and how these metadata annotations can be harvested to automatically keep a beacon registry up to date.
For this proof of concept to work you will need:
- Homebrew installed (see its installation guide).
- Python 3 with virtual environment support installed (see its installation guide).
- Scrapy installed (see its documentation).
- Extruct installed (see its installation guide).
- A web server such as XAMPP to host the modified beacon pages.
- Google Schema Validator, to test your web page modifications.
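Before going further, it can help to confirm that the Python prerequisites are importable from the environment you plan to use. The following is a minimal sketch, assuming the packages are imported under the names `scrapy` and `extruct`; the helper name `missing_modules` is only illustrative and is not part of this repository.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Python packages this proof of concept relies on.
REQUIRED = ["scrapy", "extruct"]

missing = missing_modules(REQUIRED)
print("Python version:", sys.version.split()[0])
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If a package is reported as missing, install it inside the virtual environment described in the steps further down before running the spider.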
For this mockup we chose the Schema.org DataCatalog type and just a few of its properties to prove the concept.
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Beacon Sample Structure</title>
</head>
<body>
<div vocab="http://schema.org/" typeof="DataCatalog" class="main-container">
<h1 property="name">Beacon's Name</h1>
<span property="description">Here you put the Beacon's description</span>
<select typeof="Dataset" property="dataset">
<option property="name" value="dataset_name_1">dataset_name_1</option>
<option property="name" value="dataset_name_2">dataset_name_2</option>
<option property="name" value="dataset_name_3">dataset_name_3</option>
<option property="name" value="dataset_name_4">dataset_name_4</option>
</select>
<span property="version">API version 0.0.0</span>
</div>
</body>
</html>
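To illustrate what harvesting this markup amounts to, here is a minimal sketch that walks a trimmed copy of the sample page and collects the text of every element carrying a `property` attribute, together with the nearest enclosing `typeof`. It uses only the Python standard library and is a simplified stand-in, not the Extruct-based extraction used in this proof of concept; the names `PropertyCollector` and `harvest` are hypothetical, and the sketch ignores most of the RDFa specification.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Trimmed copy of the sample beacon page (two datasets instead of four).
SAMPLE_PAGE = """
<div vocab="http://schema.org/" typeof="DataCatalog" class="main-container">
<h1 property="name">Beacon's Name</h1>
<span property="description">Here you put the Beacon's description</span>
<select typeof="Dataset" property="dataset">
<option property="name" value="dataset_name_1">dataset_name_1</option>
<option property="name" value="dataset_name_2">dataset_name_2</option>
</select>
<span property="version">API version 0.0.0</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect (type, property, text) triples from RDFa-style attributes."""

    def __init__(self):
        super().__init__()
        self.triples = []
        # One entry per open element: (its property, the typeof in scope).
        self._stack = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        enclosing_type = self._stack[-1][1] if self._stack else None
        self._stack.append(
            (attrs.get("property"), attrs.get("typeof", enclosing_type))
        )

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            prop, item_type = self._stack[-1]
            if prop:
                self.triples.append((item_type, prop, text))


def harvest(html):
    """Return the triples found in one HTML page."""
    collector = PropertyCollector()
    collector.feed(html)
    collector.close()
    return collector.triples


triples = harvest(SAMPLE_PAGE)
for triple in triples:
    print(triple)
```

Each triple pairs a value with the type it belongs to, so the beacon's own name and the names of its datasets stay distinguishable even though both use the `name` property.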
Process we followed to integrate metadata from several beacon mockups
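As a rough illustration of the integration step, the sketch below merges metadata harvested from several beacon pages into one list of registry records. It assumes each page has already been reduced to `(type, property, text)` triples, which is a hypothetical intermediate format chosen for this example; the URLs, the function names and the record layout are likewise assumptions and do not come from the spider in this repository.

```python
CATALOG_FIELDS = ("name", "description", "version")


def build_record(url, triples):
    """Turn the triples harvested from one beacon page into a registry record."""
    record = {"url": url, "name": "", "description": "", "version": "",
              "datasets": []}
    for item_type, prop, text in triples:
        kind = (item_type or "").lower()
        if kind == "datacatalog" and prop in CATALOG_FIELDS:
            record[prop] = text
        elif kind == "dataset" and prop == "name":
            record["datasets"].append(text)
    return record


def build_registry(harvested):
    """Build one record per page and sort the records by beacon name."""
    records = [build_record(url, triples) for url, triples in harvested.items()]
    return sorted(records, key=lambda record: record["name"].lower())


# Hypothetical harvest results for two locally hosted mock beacons.
HARVESTED = {
    "http://localhost/beacons_pages_schema/beacon_b.html": [
        ("DataCatalog", "name", "Beacon B"),
        ("Dataset", "name", "dataset_name_1"),
    ],
    "http://localhost/beacons_pages_schema/beacon_a.html": [
        ("DataCatalog", "name", "Beacon A"),
        ("DataCatalog", "version", "API version 0.0.0"),
        ("Dataset", "name", "dataset_name_1"),
        ("Dataset", "name", "dataset_name_2"),
    ],
}

registry = build_registry(HARVESTED)
for record in registry:
    print(record["name"], record["datasets"])
```

Keeping one record per page URL means that re-harvesting a beacon simply replaces its record, which is one way a registry could stay up to date.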
Steps to reproduce this proof of concept
- Download the beacon pages with schema markup, folder: beacons_pages_schema.
- Host this folder in the htdocs directory of your local XAMPP installation.
- Download the Python 3 virtual environment folder: python_code.
- In the python_code directory, create a folder called beacon_registry: mkdir beacon_registry
- Open a terminal or console and create a virtual environment called beacon_registry with the following command: pyvenv beacon_registry (pyvenv was removed in Python 3.8; on newer versions use python3 -m venv beacon_registry instead).
- Go to the python_code folder and activate the environment: source beacon_registry/bin/activate
- Run the spider beacon_spider.py, located at bioschemas-beacons-mockup/python_code/beacon_spyder/beacon_spyder/spiders/, using the following command: $ scrapy crawl beacons_links
- Open the index.html file created in the bioschemas-beacons-mockup/results_page/ directory.
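The last step opens a results page generated from the harvested metadata. As a minimal sketch of how such a page could be produced, the example below renders a list of beacon records as an HTML table. The record layout, the sample values and the function names `render_registry` and `write_registry` are assumptions made for illustration; the actual spider may build index.html differently.

```python
from html import escape
from pathlib import Path

# Hypothetical harvested records; a real run would fill these from the crawl.
RECORDS = [
    {"name": "Example Beacon A", "description": "First mock beacon",
     "version": "API version 0.0.0",
     "datasets": ["dataset_name_1", "dataset_name_2"]},
    {"name": "Example Beacon B", "description": "Second <mock> beacon",
     "version": "API version 0.0.0", "datasets": []},
]


def render_registry(records):
    """Return an HTML page with one table row per beacon record."""
    rows = []
    for record in records:
        cells = [
            record.get("name", ""),
            record.get("description", ""),
            record.get("version", ""),
            ", ".join(record.get("datasets", [])),
        ]
        # Escape every value so harvested text cannot break the page markup.
        rows.append(
            "<tr>" + "".join("<td>{}</td>".format(escape(c)) for c in cells) + "</tr>"
        )
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n"
        "<title>Beacon Registry</title>\n</head>\n<body>\n<table>\n"
        "<tr><th>Name</th><th>Description</th><th>Version</th>"
        "<th>Datasets</th></tr>\n"
        + "\n".join(rows)
        + "\n</table>\n</body>\n</html>\n"
    )


def write_registry(records, path):
    """Write the rendered registry page to `path` (the directory must exist)."""
    Path(path).write_text(render_registry(records), encoding="utf-8")


page = render_registry(RECORDS)
print(page)
```

Calling `write_registry(RECORDS, "results_page/index.html")` would save the page to disk, assuming a results_page directory exists in the current working directory.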
For further information, contact @guicalman.