spark/run

#!/bin/bash

SCALA_VERSION=2.9.2

# Figure out where the Scala framework is installed
FWDIR="$(cd "$(dirname "$0")"; pwd)"

# Export this as SPARK_HOME
export SPARK_HOME="$FWDIR"

# Load environment variables from conf/spark-env.sh, if it exists
if [ -e "$FWDIR/conf/spark-env.sh" ] ; then
  . "$FWDIR/conf/spark-env.sh"
fi

if [ "$SPARK_LAUNCH_WITH_SCALA" == "1" ]; then
  if command -v scala >/dev/null 2>&1; then
    RUNNER="scala"
  else
    if [ -z "$SCALA_HOME" ]; then
      echo "SCALA_HOME is not set" >&2
      exit 1
    fi
    RUNNER="${SCALA_HOME}/bin/scala"
  fi
else
  if command -v java >/dev/null 2>&1; then
    RUNNER="java"
  else
    if [ -z "$JAVA_HOME" ]; then
      echo "JAVA_HOME is not set" >&2
      exit 1
    fi
    RUNNER="${JAVA_HOME}/bin/java"
  fi
  if [ -z "$SCALA_LIBRARY_PATH" ]; then
    if [ -z "$SCALA_HOME" ]; then
      echo "SCALA_HOME is not set" >&2
      exit 1
    fi
    SCALA_LIBRARY_PATH="$SCALA_HOME/lib"
  fi
fi

# Figure out how much memory to use per executor and set it as an environment
# variable so that our process sees it and can report it to Mesos
if [ -z "$SPARK_MEM" ] ; then
  SPARK_MEM="512m"
fi
export SPARK_MEM

# Set JAVA_OPTS to be able to load native libraries and to set heap size
JAVA_OPTS="$SPARK_JAVA_OPTS"
JAVA_OPTS+=" -Djava.library.path=$SPARK_LIBRARY_PATH"
JAVA_OPTS+=" -Xms$SPARK_MEM -Xmx$SPARK_MEM"
# Load extra JAVA_OPTS from conf/java-opts, if it exists
if [ -e "$FWDIR/conf/java-opts" ] ; then
  JAVA_OPTS+=" $(cat "$FWDIR/conf/java-opts")"
fi
export JAVA_OPTS

CORE_DIR="$FWDIR/core"
REPL_DIR="$FWDIR/repl"
EXAMPLES_DIR="$FWDIR/examples"
BAGEL_DIR="$FWDIR/bagel"

# Build up classpath
CLASSPATH="$SPARK_CLASSPATH"
CLASSPATH+=":$FWDIR/conf"
CLASSPATH+=":$CORE_DIR/target/scala-$SCALA_VERSION/classes"
if [ -n "$SPARK_TESTING" ] ; then
  CLASSPATH+=":$CORE_DIR/target/scala-$SCALA_VERSION/test-classes"
fi
CLASSPATH+=":$CORE_DIR/src/main/resources"
CLASSPATH+=":$REPL_DIR/target/scala-$SCALA_VERSION/classes"
CLASSPATH+=":$EXAMPLES_DIR/target/scala-$SCALA_VERSION/classes"
if [ -e "$FWDIR/lib_managed" ]; then
  for jar in `find "$FWDIR/lib_managed/jars" -name '*jar'`; do
    CLASSPATH+=":$jar"
  done
  for jar in `find "$FWDIR/lib_managed/bundles" -name '*jar'`; do
    CLASSPATH+=":$jar"
  done
fi
for jar in `find "$REPL_DIR/lib" -name '*jar'`; do
  CLASSPATH+=":$jar"
done
for jar in `find "$REPL_DIR/target" -name 'spark-repl-*-shaded-hadoop*.jar'`; do
  CLASSPATH+=":$jar"
done
CLASSPATH+=":$BAGEL_DIR/target/scala-$SCALA_VERSION/classes"
export CLASSPATH # Needed for spark-shell

# Figure out whether to run our class with java or with the scala launcher.
# In most cases, we'd prefer to execute our process with java because scala
# creates a shell script as the parent of its Java process, which makes it
# hard to kill the child with stuff like Process.destroy(). However, for
# the Spark shell, the wrapper is necessary to properly reset the terminal
# when we exit, so we allow it to set a variable to launch with scala.
if [ "$SPARK_LAUNCH_WITH_SCALA" == "1" ]; then
  EXTRA_ARGS=""     # Java options will be passed to scala as JAVA_OPTS
else
  CLASSPATH+=":$SCALA_LIBRARY_PATH/scala-library.jar"
  CLASSPATH+=":$SCALA_LIBRARY_PATH/scala-compiler.jar"
  CLASSPATH+=":$SCALA_LIBRARY_PATH/jline.jar"
  # The JVM doesn't read JAVA_OPTS by default so we need to pass it in
  EXTRA_ARGS="$JAVA_OPTS"
fi

exec "$RUNNER" -cp "$CLASSPATH" $EXTRA_ARGS "$@"
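
The way the script assembles JVM options can be tried in isolation. The following is only a sketch, not part of the script: `build_java_opts` is a hypothetical function name, and `-verbose:gc` and `/opt/native` are made-up sample values. It assumes the same 512m default and the same option order as the script above.

```bash
# Hypothetical helper (not in the original script): print the JVM options
# that would be assembled for the current environment.
build_java_opts() {
  mem="${SPARK_MEM:-512m}"    # fall back to 512m when SPARK_MEM is unset or empty
  opts="$SPARK_JAVA_OPTS"
  opts="$opts -Djava.library.path=$SPARK_LIBRARY_PATH"
  opts="$opts -Xms$mem -Xmx$mem"
  printf '%s\n' "$opts"
}

# Each call runs in a subshell so the sample settings do not leak out.
DEFAULT_OPTS="$(unset SPARK_MEM; SPARK_JAVA_OPTS="-verbose:gc"; SPARK_LIBRARY_PATH="/opt/native"; build_java_opts)"
CUSTOM_OPTS="$(SPARK_MEM=2g; SPARK_JAVA_OPTS=""; SPARK_LIBRARY_PATH=""; build_java_opts)"

echo "$DEFAULT_OPTS"   # -verbose:gc -Djava.library.path=/opt/native -Xms512m -Xmx512m
echo "$CUSTOM_OPTS"
```

As in the script, an empty `SPARK_JAVA_OPTS` leaves a leading space in the result. That is harmless because the script expands `$EXTRA_ARGS` unquoted on its `exec` line, so the shell splits the string into separate arguments.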
Initial commit 2010-03-30 03:17:55 +04:00			`#!/bin/bash`

Update Scala version dependency to 2.9.2 2012-09-25 01:12:48 +04:00			`SCALA_VERSION=2.9.2`
Update run to work with SBT managed dependencies and the newly introduced repl module. 2011-05-27 13:20:34 +04:00
Initial commit 2010-03-30 03:17:55 +04:00			`# Figure out where the Scala framework is installed`
Set absolute path for SPARK_HOME 2010-10-16 23:18:02 +04:00			FWDIR="$(cd `dirname $0`; pwd)"
Initial commit 2010-03-30 03:17:55 +04:00
Added code so that Spark jobs can be launched from outside the Spark directory by setting SPARK_HOME and locating the executor relative to that. Entries on SPARK_CLASSPATH and SPARK_LIBRARY_PATH are also passed along to worker nodes. 2010-10-16 06:42:26 +04:00			`# Export this as SPARK_HOME`
			`export SPARK_HOME="$FWDIR"`

Made it possible to set various Spark options and environment variables in general through a conf/spark-env.sh script. 2010-07-20 05:00:30 +04:00			`# Load environment variables from conf/spark-env.sh, if it exists`
			`if [ -e $FWDIR/conf/spark-env.sh ] ; then`
			`. $FWDIR/conf/spark-env.sh`
			`fi`

Tweaked run file to live more happily with typesafe's debian package 2012-10-23 00:10:47 +04:00			`if [ "$SPARK_LAUNCH_WITH_SCALA" == "1" ]; then`
			if [ `command -v scala` ]; then
			`RUNNER="scala"`
			`else`
			`if [ -z "$SCALA_HOME" ]; then`
			`echo "SCALA_HOME is not set" >&2`
			`exit 1`
			`fi`
			`RUNNER="${SCALA_HOME}/bin/scala"`
			`fi`
			`else`
			if [ `command -v java` ]; then
			`RUNNER="java"`
			`else`
			`if [ -z "$JAVA_HOME" ]; then`
			`echo "JAVA_HOME is not set" >&2`
			`exit 1`
			`fi`
			`RUNNER="${JAVA_HOME}/bin/java"`
			`fi`
			`if [ -z "$SCALA_LIBRARY_PATH" ]; then`
			`if [ -z "$SCALA_HOME" ]; then`
			`echo "SCALA_HOME is not set" >&2`
			`exit 1`
			`fi`
			`SCALA_LIBRARY_PATH="$SCALA_HOME/lib"`
			`fi`
More work to allow Spark to run on the standalone deploy cluster. 2012-07-09 01:00:04 +04:00			`fi`

Further fixes to how Mesos is found and used 2012-03-18 00:39:14 +04:00			`# Figure out how much memory to use per executor and set it as an environment`
			`# variable so that our process sees it and can report it to Mesos`
More work to allow Spark to run on the standalone deploy cluster. 2012-07-09 01:00:04 +04:00			`if [ -z "$SPARK_MEM" ] ; then`
Updated to newest Mesos API, which includes better memory accounting by specifying per-executor memory. 2011-08-02 00:54:48 +04:00			`SPARK_MEM="512m"`
Made it possible to set various Spark options and environment variables in general through a conf/spark-env.sh script. 2010-07-20 05:00:30 +04:00			`fi`
Further fixes to how Mesos is found and used 2012-03-18 00:39:14 +04:00			`export SPARK_MEM`
Made it possible to set various Spark options and environment variables in general through a conf/spark-env.sh script. 2010-07-20 05:00:30 +04:00
			`# Set JAVA_OPTS to be able to load native libraries and to set heap size`
			`JAVA_OPTS="$SPARK_JAVA_OPTS"`
Further fixes to how Mesos is found and used 2012-03-18 00:39:14 +04:00			`JAVA_OPTS+=" -Djava.library.path=$SPARK_LIBRARY_PATH"`
Made it possible to set various Spark options and environment variables in general through a conf/spark-env.sh script. 2010-07-20 05:00:30 +04:00			`JAVA_OPTS+=" -Xms$SPARK_MEM -Xmx$SPARK_MEM"`
			`# Load extra JAVA_OPTS from conf/java-opts, if it exists`
Initial commit 2010-03-30 03:17:55 +04:00			`if [ -e $FWDIR/conf/java-opts ] ; then`
			JAVA_OPTS+=" `cat $FWDIR/conf/java-opts`"
			`fi`
			`export JAVA_OPTS`

Further fixes to how Mesos is found and used 2012-03-18 00:39:14 +04:00			`CORE_DIR="$FWDIR/core"`
			`REPL_DIR="$FWDIR/repl"`
			`EXAMPLES_DIR="$FWDIR/examples"`
			`BAGEL_DIR="$FWDIR/bagel"`
Made examples and core subprojects 2011-02-02 02:11:08 +03:00
			`# Build up classpath`
Further fixes to how Mesos is found and used 2012-03-18 00:39:14 +04:00			`CLASSPATH="$SPARK_CLASSPATH"`
			`CLASSPATH+=":$FWDIR/conf"`
			`CLASSPATH+=":$CORE_DIR/target/scala-$SCALA_VERSION/classes"`
Made run script add test-classes onto the classpath only if SPARK_TESTING is set; fixes #216 2012-10-07 08:19:16 +04:00			`if [ -n "$SPARK_TESTING" ] ; then`
			`CLASSPATH+=":$CORE_DIR/target/scala-$SCALA_VERSION/test-classes"`
			`fi`
More work on deploy code (adding Worker class) 2012-07-01 03:43:27 +04:00			`CLASSPATH+=":$CORE_DIR/src/main/resources"`
Further fixes to how Mesos is found and used 2012-03-18 00:39:14 +04:00			`CLASSPATH+=":$REPL_DIR/target/scala-$SCALA_VERSION/classes"`
			`CLASSPATH+=":$EXAMPLES_DIR/target/scala-$SCALA_VERSION/classes"`
Make "run" script work with Maven builds 2012-12-11 03:12:59 +04:00			`if [ -e "$FWDIR/lib_managed" ]; then`
			for jar in `find "$FWDIR/lib_managed/jars" -name '*jar'`; do
			`CLASSPATH+=":$jar"`
			`done`
			for jar in `find "$FWDIR/lib_managed/bundles" -name '*jar'`; do
			`CLASSPATH+=":$jar"`
			`done`
			`fi`
			for jar in `find "$REPL_DIR/lib" -name '*jar'`; do
Further fixes to how Mesos is found and used 2012-03-18 00:39:14 +04:00			`CLASSPATH+=":$jar"`
Update run to work with SBT managed dependencies and the newly introduced repl module. 2011-05-27 13:20:34 +04:00			`done`
Make "run" script work with Maven builds 2012-12-11 03:12:59 +04:00			for jar in `find "$REPL_DIR/target" -name 'spark-repl-*-shaded-hadoop*.jar'`; do
Further fixes to how Mesos is found and used 2012-03-18 00:39:14 +04:00			`CLASSPATH+=":$jar"`
Initial commit 2010-03-30 03:17:55 +04:00			`done`
More work to allow Spark to run on the standalone deploy cluster. 2012-07-09 01:00:04 +04:00			`CLASSPATH+=":$BAGEL_DIR/target/scala-$SCALA_VERSION/classes"`
Changed printlns to log statements and fixed a bug in run that was causing it to fail on a Mesos cluster 2010-09-29 10:22:07 +04:00			`export CLASSPATH # Needed for spark-shell`
Initial commit 2010-03-30 03:17:55 +04:00
More work to allow Spark to run on the standalone deploy cluster. 2012-07-09 01:00:04 +04:00			`# Figure out whether to run our class with java or with the scala launcher.`
			`# In most cases, we'd prefer to execute our process with java because scala`
			`# creates a shell script as the parent of its Java process, which makes it`
			`# hard to kill the child with stuff like Process.destroy(). However, for`
			`# the Spark shell, the wrapper is necessary to properly reset the terminal`
			`# when we exit, so we allow it to set a variable to launch with scala.`
			`if [ "$SPARK_LAUNCH_WITH_SCALA" == "1" ]; then`
Fixed SPARK_MEM not being passed when runner is java 2012-07-29 06:53:31 +04:00			`EXTRA_ARGS="" # Java options will be passed to scala as JAVA_OPTS`
Initial commit 2010-03-30 03:17:55 +04:00			`else`
Tweaked run file to live more happily with typesafe's debian package 2012-10-23 00:10:47 +04:00			`CLASSPATH+=":$SCALA_LIBRARY_PATH/scala-library.jar"`
			`CLASSPATH+=":$SCALA_LIBRARY_PATH/scala-compiler.jar"`
			`CLASSPATH+=":$SCALA_LIBRARY_PATH/jline.jar"`
Fixed SPARK_MEM not being passed when runner is java 2012-07-29 06:53:31 +04:00			`# The JVM doesn't read JAVA_OPTS by default so we need to pass it in`
			`EXTRA_ARGS="$JAVA_OPTS"`
Initial commit 2010-03-30 03:17:55 +04:00			`fi`

Fixed SPARK_MEM not being passed when runner is java 2012-07-29 06:53:31 +04:00			`exec "$RUNNER" -cp "$CLASSPATH" $EXTRA_ARGS "$@"`
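
The classpath loops in the script iterate over the unquoted output of `find`, which splits jar paths that contain whitespace. One possible alternative is sketched below. It is not the script's own method: `append_jars` is a hypothetical helper, the demo directory layout is invented, and the pattern is narrowed from `*jar` to `*.jar` as an assumption about the intent.

```bash
# Hypothetical helper (not in the original script).
# $1: directory to scan recursively, $2: classpath built so far.
# Prints the classpath with ":<jar>" appended for every jar found.
append_jars() {
  if [ -d "$1" ]; then
    # find hands each path to printf as a single argument, so spaces survive
    printf '%s%s\n' "$2" "$(find "$1" -name '*.jar' -exec printf ':%s' {} +)"
  else
    # A missing directory leaves the classpath unchanged
    printf '%s\n' "$2"
  fi
}

# Demo against a throwaway directory whose name contains a space
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/lib managed/jars/org"
touch "$demo_dir/lib managed/jars/a.jar" "$demo_dir/lib managed/jars/org/b.jar"

DEMO_CLASSPATH="$(append_jars "$demo_dir/lib managed/jars" "conf")"
DEMO_CLASSPATH="$(append_jars "$demo_dir/missing" "$DEMO_CLASSPATH")"
echo "$DEMO_CLASSPATH"
```

The order in which `find` reports the jars is not guaranteed, so the sketch makes no promise about entry order beyond the prefix coming first.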