Storing the output of Spark on Neo4j #32

d34th4ck3r · 2015-03-10T08:41:40Z

Presently, Mazerunner can perform graph analysis on data already stored in Neo4j Server. However, one important missing feature is the ability to store data streaming out of Spark into Neo4j in real time, and to perform operations on that data.

One example of such a use case: http://stackoverflow.com/questions/28896898/using-neo4j-with-apache-spark

kbastani · 2015-03-11T20:55:51Z

Can you please provide an example of how this integration might work? What is your input to Spark? What is the output? What are the acceptance criteria for this feature?

ojairob · 2015-04-14T16:31:56Z

Hello, an example might be the following. I have terabytes of data in HDFS, consisting of ad impressions, ad clicks, and ROI events driven by interactions with those impressions and clicks. The concepts involved are Browser, Ad, Impression, Click, and ROI event, with timestamps and IDs for everything. Using a Spark job, I would like to create a Neo4j graph at scale. I have tried to investigate how to scale the creation and insertion of the Neo4j data. It seems Mazerunner can take the output of a GraphX job and resubmit it via some queue. It also seems like Mazerunner could build a graph from a basic Spark / GraphX query. Finally, I looked into the batch-import project, which seems really fast at creating the necessary Neo4j files. After that, it would be great to be able to batch in new data as it arrives.
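To make the use case above more concrete, here is a minimal sketch of how flat ad events might be split into node rows and relationship rows before a bulk load. Everything named here is an assumption for illustration: the field names (`type`, `event_id`, `browser_id`, `ad_id`, `timestamp`), the relationship types `TRIGGERED` and `FOR_AD`, and the CSV headers are hypothetical and would need to match the real data and whichever importer is used. It uses only the Python standard library and does not call Spark, Mazerunner, or Neo4j.

```python
import csv
import io


def node_id(label, raw_id):
    # Prefix IDs with their label so that, for example, an Ad and a Browser
    # sharing the same raw ID do not collide (an assumption about the data).
    return f"{label}:{raw_id}"


def events_to_graph_rows(events):
    """Split flat event records into deduplicated node rows and relationship rows."""
    nodes = {}  # node id -> (label, timestamp); the dict keeps each node once
    rels = []   # (start id, relationship type, end id)
    for e in events:
        browser = node_id("Browser", e["browser_id"])
        ad = node_id("Ad", e["ad_id"])
        event = node_id(e["type"], e["event_id"])
        nodes[browser] = ("Browser", "")
        nodes[ad] = ("Ad", "")
        nodes[event] = (e["type"], e["timestamp"])
        # Hypothetical relationship types, chosen only for this sketch.
        rels.append((browser, "TRIGGERED", event))
        rels.append((event, "FOR_AD", ad))
    node_rows = [(nid, label, ts) for nid, (label, ts) in nodes.items()]
    return node_rows, rels


def to_csv(header, rows):
    """Render a header and rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    sample = [
        {"type": "Impression", "event_id": "i1", "browser_id": "b1",
         "ad_id": "a1", "timestamp": 100},
        {"type": "Click", "event_id": "c1", "browser_id": "b1",
         "ad_id": "a1", "timestamp": 105},
    ]
    node_rows, rel_rows = events_to_graph_rows(sample)
    print(to_csv(["id", "label", "timestamp"], node_rows))
    print(to_csv(["start", "type", "end"], rel_rows))
```

In a Spark job, a function along these lines could plausibly be applied per partition, with node deduplication repeated across partitions afterwards, but that part is not shown here and would depend on the setup.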

kbastani added the enhancement label Mar 11, 2015

kbastani self-assigned this Mar 11, 2015
