data-restification-kit

Boilerplate code for dataset restification, i.e. exposing raw datasets through a REST API.

The restification procedure consists of the following steps:

1. Register the dataset and retrieve its schema.
2. Create a REST service that supports the identified schema.
3. Use the service to populate a MongoDB database with the raw data and expose it through an API.

Dataset registration

To register a dataset, follow these steps:

1. Create a folder named after the dataset inside the datasets/ directory (e.g. metrics) and place all the raw data files in it.
2. Create the data schema, which the restification service requires. You can do so with the following code:

# Initialize datasets handler
dH = datasetsHandler('datasets')

# Get the data files of the first dataset
data_files_path = dH.get_datasets_files_path(dH.datasets_to_import[0])

# Create the schema and store it in the schemas directory
df = dH.read_data(data_files_path[0], columns='all')
dH.schema_extractor(df, 'schemas/class-metrics.schema')
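To make the schema step more concrete, here is a minimal sketch of how a schema could be inferred from raw tabular data. It is not the kit's `schema_extractor`; the functions `infer_type` and `extract_schema`, the sample columns, and the choice of CSV input are all assumptions made for illustration. The output uses the `{"field": {"type": ...}}` layout that Eve resource schemas follow.

```python
import csv
import io
import json


def infer_type(values):
    """Return the narrowest type name (integer, float, string) fitting all non-empty values."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return "string"
    for type_name, cast in (("integer", int), ("float", float)):
        try:
            for value in non_empty:
                cast(value)
            return type_name
        except ValueError:
            continue
    return "string"


def extract_schema(csv_text):
    """Build an Eve-style schema dict from CSV text that has a header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    return {
        column: {"type": infer_type([row[column] for row in rows])}
        for column in rows[0]
    }


# Hypothetical sample data, not taken from the repository.
sample = "class_name,loc,complexity\nFoo,120,3.5\nBar,87,2.0\n"
schema = extract_schema(sample)
print(json.dumps(schema, indent=2))
```

A real extractor would likely also need to handle dates, booleans, and missing values, which this sketch treats as plain strings or ignores.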

Run REST Service

To run the REST service with the schemas you created, run service.py. The configuration options are documented in settings.py. The service is built on Eve, a Python REST API framework.
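For orientation, an Eve settings module typically looks something like the sketch below. This is not the repository's settings.py: the database name, the `metrics` resource, and the field names are assumptions. The setting names themselves (`MONGO_HOST`, `URL_PREFIX`, `API_VERSION`, `RESOURCE_METHODS`, `ITEM_METHODS`, `DOMAIN`) are standard Eve configuration keys.

```python
# Hypothetical resource schema; in practice it would come from the schemas directory.
metrics_schema = {
    "class_name": {"type": "string", "required": True},
    "loc": {"type": "integer"},
    "complexity": {"type": "float"},
}

# MongoDB connection (assumed local defaults and an assumed database name).
MONGO_HOST = "localhost"
MONGO_PORT = 27017
MONGO_DBNAME = "restification"

# Expose the API under /api/v1/.
URL_PREFIX = "api"
API_VERSION = "v1"

# Allow listing and inserting on resource endpoints; items are read-only.
RESOURCE_METHODS = ["GET", "POST"]
ITEM_METHODS = ["GET"]

# One Eve resource per registered dataset.
DOMAIN = {
    "metrics": {"schema": metrics_schema},
}
```

With Eve installed and MongoDB running, a module like this can be handed to the application via `Eve(settings=...)`, which accepts either a file path or a dictionary.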

Import data

Once the REST service is running, you can import the data using the following code:

# Initialize the datasets handler
dH = datasetsHandler('datasets')

# Initialize the data importer
dI = dataImporter.dataImporter(dH)

# Import data (change the URL according to your configuration)
dI.import_data('http://localhost:5000/api/v1/')
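As a rough illustration of what an import against an Eve endpoint involves, the sketch below batches records and prepares one JSON POST request per batch (Eve accepts a JSON list as a bulk insert by default). It is not the kit's `dataImporter`; the helpers `chunked` and `build_post`, the `metrics` resource name, and the sample records are assumptions. The requests are only constructed here, not sent.

```python
import json
import urllib.request


def chunked(records, size):
    """Yield successive lists of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def build_post(base_url, resource, documents):
    """Create (but do not send) a JSON POST request for a batch of documents."""
    url = base_url.rstrip("/") + "/" + resource
    body = json.dumps(documents).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical records standing in for rows read from the raw data files.
records = [{"class_name": f"C{i}", "loc": i * 10} for i in range(5)]
requests_to_send = [
    build_post("http://localhost:5000/api/v1/", "metrics", batch)
    for batch in chunked(records, 2)
]

# To actually import, send each request while the service is running:
# for request in requests_to_send:
#     urllib.request.urlopen(request)
```

Batching keeps individual request bodies small, which can matter for large datasets; the batch size of 2 is only for demonstration.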
