Spark over YARN
I don't know what draws so many people to Spark in the first place, but integrating it with YARN is very easy.
Deploy your YARN and HDFS as usual; let's say the configuration folder is at `$HADOOP_HOME/conf`, where `core-site.xml` and the other files can be found. Unzip your Spark binary distribution and set `$HADOOP_CONF_DIR` correctly in `conf/spark-env.sh`:

```shell
export HADOOP_HOME=/path/to/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
```

Yeah, configuration completed. Easy, isn't it?
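Before submitting anything, it's worth a quick sanity check that `$HADOOP_CONF_DIR` really points at the client configs. A minimal sketch, with `/path/to/hadoop` standing in for your actual install location:

```shell
# /path/to/hadoop is a placeholder; point it at your real Hadoop install.
export HADOOP_HOME=/path/to/hadoop
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"

# spark-submit reads these files from HADOOP_CONF_DIR to find the
# ResourceManager and HDFS, so complain early if either is missing.
for f in core-site.xml yarn-site.xml; do
  if [ -f "$HADOOP_CONF_DIR/$f" ]; then
    echo "ok: $f"
  else
    echo "missing: $f"
  fi
done
```

If either file shows up as missing, Spark will fall back to local defaults and your jobs won't reach the cluster at all.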
To test it, run `spark-submit` with the example jar that ships with Spark:
```shell
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn-cluster \
    --num-executors 3 \
    --driver-memory 4g \
    --executor-memory 2g \
    --executor-cores 1 \
    lib/spark-examples*.jar \
    10
```
It is really nice and easy!
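In `yarn-cluster` mode the driver runs inside the cluster, so SparkPi's result ends up in the YARN application logs rather than your terminal. If you'd rather see the output locally while experimenting, the same example can be launched in client mode; a sketch, assuming the same directory layout as above:

```shell
# yarn-client mode: the driver stays on your machine, so the computed
# value of Pi prints straight to your console.
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn-client \
    --num-executors 3 \
    --executor-memory 2g \
    lib/spark-examples*.jar \
    10
```

Client mode is handy for debugging; for production jobs, cluster mode keeps the driver co-located with the executors.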
https://rug.al/2015/2015-04-14-spark-over-yarn/