fix kernel generation for Spark Yarn // TOREE-97 #141

Open
ribamar-santarosa wants to merge 1 commit into master

Conversation


@ribamar-santarosa commented Sep 12, 2017

It looks like TOREE-97 -- support for Spark on Yarn -- was closed without a definitive solution (or something went wrong along the way). Toree does support it, but it won't work unless the user manually adds the HADOOP_CONF_DIR env var to their kernel.json definition. Without that env var, Spark doesn't know what to do with the --master=yarn option (set in __TOREE_SPARK_OPTS__). It would be desirable to have it set by default, and this patch provides that.
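For reference, this is roughly what the manual workaround looks like today: a kernel.json with the env var added by hand. This is only a sketch -- all paths are illustrative, and the run.sh location depends on where the kernelspec was installed:

    {
      "display_name": "Apache Toree - Scala",
      "language": "scala",
      "argv": [
        "/usr/local/share/jupyter/kernels/apache_toree_scala/bin/run.sh",
        "--profile",
        "{connection_file}"
      ],
      "env": {
        "SPARK_HOME": "/usr/local/spark",
        "__TOREE_SPARK_OPTS__": "--master=yarn",
        "HADOOP_CONF_DIR": "/etc/hadoop/conf"
      }
    }

With HADOOP_CONF_DIR exported like this, Spark can resolve --master=yarn against the right cluster.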

This is probably not the nicest way to solve the problem, because it just hard-codes more vars into the JSON file -- ideally there would be an interface to add or remove env vars from those files. However, HADOOP_CONF_DIR and SPARK_CONF_DIR seem basic enough to export, and even for a Spark Standalone deployment HADOOP_CONF_DIR won't hurt. So here go our 2 cents to improve the situation a bit.

I cloned TOREE-97 into TOREE-438 to track this fix.

@ribamar-santarosa
Author

There is a failure on the CI that doesn't look related to this patch:

failed to register layer: Error processing tar file(exit status 1): write /opt/conda/envs/python2/lib/python2.7/site-packages/Cython/Compiler/Code.so: no space left on device

Unless the paths of those two env vars are so long that they're consuming all the storage! =)

@lammic
Contributor

lammic commented Sep 12, 2017

It would also be useful to export JAVA_HOME, for the case where I don't want to use the default one but a specific release.
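For example, a hypothetical env block carrying the JVM choice alongside the Spark settings (the JDK path is illustrative):

    "env": {
      "JAVA_HOME": "/usr/lib/jvm/java-8-openjdk",
      "SPARK_HOME": "/usr/local/spark",
      "__TOREE_SPARK_OPTS__": "--master=yarn"
    }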

@lresende
Member

What is the difference between this and setting HADOOP_CONF_DIR in $SPARK_HOME/conf/spark-env.sh?
More generally, why manage system-wide configuration in the kernelspec?

@ribamar-santarosa
Author

Good question. It's clear that if the Spark configuration files aren't in the default location, the user needs a way to tell Spark where they are -- that's what SPARK_CONF_DIR is for.

HADOOP_CONF_DIR is a bit trickier, because the standard assumption is that Spark on Yarn is tied to a single Hadoop instance -- in which case spark-env.sh suffices. But like anything in computing, somebody will try to expand a 1-1 relationship into 1-N: we can run module load another_instance_of_hadoop, which dynamically overwrites HADOOP_CONF_DIR, and then install a Toree kernel for that particular Hadoop-Spark pair.

@lresende
Member

Maybe I am misunderstanding this. While with vanilla Toree you have the option to start your own local Spark (e.g. local[...]), in an enterprise environment with a large Spark cluster managed by Yarn you just need to connect to it, so all this configuration is managed directly by the Spark and Hadoop configuration files. Also, most enterprise deployments are based on some distribution that bundles many other components, and we don't want to, and don't need to, make Toree aware of them.

Anyway, what is the scenario you are trying to accomplish with these changes?

@ribamar-santarosa
Author

", thus all these configuration being managed by spark and Hadoop configuration files directly." sure, but how do you tell Hadoop and Spark where to find the env.sh file, if they're not in the default location? With SPARK_CONF_DIR and HADOOP_CONF_DIR.

The scenario is very simple: in Bright Cluster Manager, users can have many Hadoop instances and many Spark instances, and they can connect any of those Spark instances to the Yarn of any of those Hadoop instances. If there are many instances, there are many configuration files, so they cannot all be in the default location, right? We could have one Jupyter/JupyterHub/Toree deployment per Spark instance (like we do with other tools), but things are much simpler than that: we just need one kernel.json per Spark instance, each with a different SPARK_CONF_DIR and HADOOP_CONF_DIR. If those variables weren't there, all of those kernel.json files would be identical -- and then how would you tell which one to use?
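To illustrate with hypothetical paths, the env blocks of two such kernel.json files would differ only in those two variables:

    "env": {
      "SPARK_CONF_DIR": "/cm/shared/apps/spark/instance-1/conf",
      "HADOOP_CONF_DIR": "/cm/shared/apps/hadoop/instance-1/etc/hadoop"
    }

    "env": {
      "SPARK_CONF_DIR": "/cm/shared/apps/spark/instance-2/conf",
      "HADOOP_CONF_DIR": "/cm/shared/apps/hadoop/instance-2/etc/hadoop"
    }

Drop those two variables and the files become indistinguishable.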

Indeed, this PR is not really a requirement for us -- during our integration process we already update the JSON files to contain those variables, so our users already benefit from having a Toree kernel per Spark instance. We are just trying to give back our 2 cents for the vanilla Toree users trying to achieve the same without our product.

    help='''Specify where the hadoop config files can be found.'''
)
spark_conf_dir = Unicode(os.getenv(SPARK_CONF_DIR, '/usr/local/spark'), config=True,
    help='''Specify where the spark config files can be found.'''
)
Member


Should the default value actually be /usr/local/spark/conf?

Author


Sure, it has to be the place where spark-env.sh is found.

@@ -57,6 +59,12 @@ class ToreeInstall(InstallKernelSpec):
spark_home = Unicode(os.getenv(SPARK_HOME, '/usr/local/spark'), config=True,
Member


In general, do we actually want default values here? I am assuming we don't really have a standard default place to deploy Spark/Hadoop, so maybe it would be better to use the env variables if they are available, otherwise ignore them -- or, for required ones, throw an error?

Author


If they're set to empty, it's just as if they're not set at all -- so it's true that they're probably better left empty.

@lresende
Member

@ribamar-santarosa Thanks for the explanation, I believe I understand the scenario now and have just minor comments on the changes.
