Merge version 0.1.1 (#36)

* Refactor structure (#1)
  * Restructure directories on high-level concepts.
  * Fix cross references from docs.
* Copy bdr-data-science-stack contents into the data-science-box to start off with. (#2)
* Merge basics of CentOS setup and Anaconda (#3)
  * CentOS 7 with VirtualBox shared folder
  * Added starting point for JupyterHub in data science box
* Basic single-user Jupyter working (#4)
  * JupyterHub with sudospawner working
  * Change command for single-user server to standalone notebook and run as root on port 80 for simplification (no need to run as a separate user since it is host-only anyway).
* Spark clients installation module (incl. Java 8) (#6)
  * Update README
* Spark kernels added + Conda pre-installed environments. (#7)
  * Quick fix for notebook extension updates not working on initial vagrant up
* Add PYSPARK_PYTHON to kernel (#8)
  * Overwrite kernel files with new values
* Mount bdr-infra-stack's parent dir as notebook root instead of the data-science-box dir. (#9)
* Update README.md (#13)
  Extremely useful tip included.
* Disabled requiretty in sudoers to fix the sudo spawner as a service (#14)
* Extracted spark_client_kernel from spark_client (#16)
* Refactor to conform to variable conventions (#17)
* Init data science hub (#18)
* Add basic Travis CI for box and hub (#19)
  Travis will now run the entire box and hub playbook from scratch on every push. This takes approximately 9 minutes to complete. We can think about optimising this later, trading off full integration testing against smaller role-specific tests.
* Add build status for develop
* Correct build status
* Elasticsearch box (#23)
  * Refactor to match bdr-infra style
  * Add search-box to Travis CI
* Added single-node data science cluster box with Kafka, Spark and ZooKeeper (#22)
  * Merged the spark_client tasks from the cluster into common components
  * Added Travis check for the new data science cluster box
  * Added IPs to the Travis Docker containers
  * User-defined network test for Travis
  * Added subnet for Travis
  * Ignoring .pyc files
  * Removed Python compiled file from git
* Ensure UTF-8 locale enabled (#24)
* Configure Elasticsearch to be accessible from outside (#25)
* Install Octave and octave-kernel for Jupyter (#26)
  Looking great. Thanks for the contribution!
* Feature/travis integration (#27)
  * Add Slack notification
  * Try disabling yum update because of build time
* Feature/cql box (#28)
  * Added cql-box
  * Fixed sudo rights in cql-box tasks
  * Updated Cassandra version in cql-box
  * Simplified setting up the cql box
  * Added cql-box to Travis
* Fixed CSV, Avro and XML support for PySpark: moved the CSV package import before pyspark-shell execution (it was previously being ignored); added Avro and XML support.
* Typo update
* Speed up Travis build by using git diff to see which modules changed (#30)
* WIP: Feature/embedded execution layer (#31)
* Docker Flow Proxy for hosting multiple microservices under one HTTP endpoint (#32)
  * Base for gateway or Docker Flow Proxy.
  * Change default overlay subnet so it does not conflict with the default AWS subnet
  * Use rsync folder because of Guest Additions failures
  * Add data science API deployment script
* Quick-and-dirty local Docker registry working (#33)
* Update README.md
* Added VirtualBox folder syncing instead of default rsync (#34)
  Now also works on Windows.
* Packer build for data-science-box (#35)
  * Global box
  * Ensure Jupyter is always started after a provision
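The Travis speed-up in #30 boils down to deriving the set of changed top-level module directories from a git diff and only running those playbooks. A minimal sketch, assuming a `develop` target branch and one directory per module (the file names below are illustrative, not the repository's actual layout):

```shell
#!/bin/sh
# Hypothetical sketch of the git-diff trick from #30: reduce a list of
# changed file paths to the unique top-level module directories, so CI
# can run only the playbooks for modules that actually changed.
changed_modules() {
  # Keep the first path component of each line, deduplicated.
  cut -d/ -f1 | sort -u
}

# In CI the input would come from something like:
#   git diff --name-only origin/develop...HEAD
printf '%s\n' \
  'data-science-box/tasks/main.yml' \
  'search-box/README.md' \
  'data-science-box/vars.yml' | changed_modules
# Prints:
#   data-science-box
#   search-box
```

The triple-dot range compares the branch against its merge base with `develop`, so unrelated commits merged into `develop` afterwards do not re-trigger every module.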
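For context on #8: in a Jupyter kernel spec, `PYSPARK_PYTHON` is set through the `env` map of the kernel's `kernel.json`, so Spark executors use the intended Python interpreter. The paths and display name below are illustrative assumptions, not the values from this repository:

```json
{
  "display_name": "PySpark",
  "language": "python",
  "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "env": {
    "PYSPARK_PYTHON": "/opt/anaconda/bin/python",
    "SPARK_HOME": "/usr/lib/spark"
  }
}
```

Overwriting the kernel files with new values (the second commit in #8) is needed because Jupyter only reads `kernel.json` when the kernel starts, so stale specs keep the old environment.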