Spark over YARN
I don’t know why so many people are starting to use Spark, but integrating it with YARN is very easy.
Deploy your YARN and HDFS as usual; let’s say the configuration folder is at $HADOOP_HOME/etc/hadoop, where core-site.xml and the other files can be found. Unzip your Spark binary distribution and set the $HADOOP_CONF_DIR variable correctly in conf/spark-env.sh:

export HADOOP_HOME=/path/to/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
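If you want a quick sanity check before submitting anything, something like the following should work (the paths are just the ones assumed above, so adjust them to your install):

# Confirm the variable points at a real Hadoop config directory
echo $HADOOP_CONF_DIR
ls $HADOOP_CONF_DIR/core-site.xml $HADOOP_CONF_DIR/yarn-site.xml

# Confirm YARN is up and has live NodeManagers
yarn node -list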
Yeah, configuration complete. Easy, isn’t it?
To test it, use spark-submit with the example jar that ships with Spark:
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn-cluster \
    --num-executors 3 \
    --driver-memory 4g \
    --executor-memory 2g \
    --executor-cores 1 \
    lib/spark-examples*.jar \
    10
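One thing worth knowing: in yarn-cluster mode the driver runs inside the cluster, so the “Pi is roughly …” output ends up in the driver container’s log rather than in your terminal. Assuming log aggregation is enabled on your cluster, you can fetch it with the application id that spark-submit prints (the id below is just a placeholder):

yarn application -list                                    # find your application id
yarn logs -applicationId application_XXXXXXXXXXXXX_NNNN | grep "Pi is roughly"

(On newer Spark versions the same thing is spelled --master yarn --deploy-mode cluster.)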
It is really nice and easy!