Yin Qu (屈垠) edited this page Jul 18, 2016 · 7 revisions

BigSemantics Roadmap

This document outlines the roadmap for BigSemantics.

Alpha 1 release

  • Features:
    • Web service for getting meta-metadata and extracting metadata, on Node.js
    • PhantomJS-based metadata extraction, with unit test coverage
    • Ability to use a distributed downloader network to improve throughput
  • Documentation:
    • Design doc
    • Code style guide
    • Service API doc
  • Merge the current dev branch into master. Use master as the main branch for collaborative development. Use branches only for releases and for bug fixes to previous releases.

Alpha 2 release

  • Address remaining extractor issues. This includes:
    • Migrate to a fully asynchronous version
    • Filter extracted URLs using location filters. This can only be done correctly with a fully asynchronous version of the extractor
    • Fix the inherited XPath issue (the fix eventually goes back into the Java inheritance code)
  • Web service + dashboard interface for viewing logs (by task / worker) and statistics
  • Cache integration
  • Distributed downloader network performance tuning

Beta release

  • Meta-metadata repository hosting and inheritance as a web service
  • Extractor pipeline integration, including microdata extractor
  • Continuous integration and deployment tools
  • Development support for:
    • BigSemantics users: API doc, examples.
    • Wrapper authors
    • BigSemantics internal developers

Future releases

  • Migrate BigSemantics core (currently /BSJS/bsjsCore) to TypeScript
  • Migrate BigSemanticsJava to TypeScript