mirror of https://github.com/Azure/aztk.git
Removed r
This commit is contained in:
Parent
9a6ba14915
Commit
648a05aabc
|
@ -5,7 +5,7 @@ import os
|
|||
"""
|
||||
CLI_EXE = 'aztk'
|
||||
|
||||
DEFAULT_DOCKER_REPO = "jiata/aztk-vanilla:0.1.0-spark2.2.0"
|
||||
DEFAULT_DOCKER_REPO = "jiata/aztk-base:0.1.0-spark2.2.0"
|
||||
DOCKER_SPARK_CONTAINER_NAME = "spark"
|
||||
|
||||
# DOCKER
|
||||
|
|
|
@ -15,7 +15,7 @@ size: 2
|
|||
username: spark
|
||||
|
||||
# docker_repo: <name of docker image repo (for more information, see https://github.com/Azure/aztk/blob/master/docs/12-docker-image.md)>
|
||||
docker_repo: jiata/aztk:0.1.0-spark2.2.0-python3.5.4
|
||||
docker_repo: jiata/aztk-base:0.1.0-spark2.2.0
|
||||
|
||||
# # optional custom scripts to run on the Spark master, Spark worker or all nodes in the cluster
|
||||
# custom_scripts:
|
||||
|
|
|
@ -3,9 +3,9 @@
|
|||
# This custom script only works on images where jupyter is pre-installed on the Docker image
|
||||
#
|
||||
# This custom script has been tested to work on the following docker images:
|
||||
# - jiata/aztk-python:0.1.0-spark2.2.0-anaconda3-5.0.0 (python3.6.2)
|
||||
# - jiata/aztk-python:0.1.0-spark2.1.0-anaconda3-5.0.0 (python3.6.2)
|
||||
# - jiata/aztk-python:0.1.0-spark1.6.3-anaconda3-5.0.0 (python3.6.2)
|
||||
# - jiata/aztk-python:0.1.0-spark2.2.0-python3.6.2
|
||||
# - jiata/aztk-python:0.1.0-spark2.1.0-python3.6.2
|
||||
# - jiata/aztk-python:0.1.0-spark1.6.3-python3.6.2
|
||||
|
||||
if [ "$IS_MASTER" = "1" ]; then
|
||||
|
||||
|
|
|
@ -10,7 +10,7 @@ On top of that, we also provide two flavors of Spark images, one geared towards
|
|||
|
||||
Docker Image | Image Type | User Language(s) | What's Included?
|
||||
:-- | :-- | :-- | :--
|
||||
[aztk-vanilla](https://hub.docker.com/r/jiata/aztk-vanilla/) | Vanilla | Java, Scala | `Spark`
|
||||
[aztk-base](https://hub.docker.com/r/jiata/aztk-base/) | Base | Java, Scala | `Spark`
|
||||
[aztk-python](https://hub.docker.com/r/jiata/aztk-python/) | Pyspark | Python | `Anaconda`</br>`Jupyter Notebooks` </br> `PySpark`
|
||||
[aztk-r](https://hub.docker.com/r/jiata/aztk-r/) | SparklyR | R | `CRAN`</br>`RStudio Server`</br>`SparklyR and SparkR`
|
||||
|
||||
|
@ -22,15 +22,15 @@ Today, all the AZTK images are hosted on Docker Hub under [jiata](https://hub.do
|
|||
|
||||
Docker Repo (hosted on Docker Hub) | Spark Version | Python Version | R Version
|
||||
:-- | :-- | :-- | :--
|
||||
jiata/aztk-vanilla:0.1.0-spark2.2.0 __(default)__ | v2.2.0 | -- | --
|
||||
jiata/aztk-vanilla:0.1.0-spark2.1.0 | v2.1.0 | -- | --
|
||||
jiata/aztk-vanilla:0.1.0-spark1.6.3 | v1.6.3 | -- | --
|
||||
jiata/aztk-base:0.1.0-spark2.2.0 __(default)__ | v2.2.0 | -- | --
|
||||
jiata/aztk-base:0.1.0-spark2.1.0 | v2.1.0 | -- | --
|
||||
jiata/aztk-base:0.1.0-spark1.6.3 | v1.6.3 | -- | --
|
||||
jiata/aztk-python:0.1.0-spark2.2.0-anaconda3-5.0.0 | v2.2.0 | v3.6.2 | --
|
||||
jiata/aztk-python:0.1.0-spark2.1.0-anaconda3-5.0.0 | v2.1.0 | v3.6.2 | --
|
||||
jiata/aztk-python:0.1.0-spark1.6.3-anaconda3-5.0.0 | v1.6.3 | v3.6.2 | --
|
||||
jiata/aztk-r:0.1.0-spark2.2.0-r3.4.1 | v2.2.0 | -- | v3.4.1
|
||||
jiata/aztk-r:0.1.0-spark2.1.0-r3.4.1 | v2.1.0 | -- | v3.4.1
|
||||
jiata/aztk-r:0.1.0-spark1.6.3-r3.4.1 | v1.6.3 | -- | v3.4.1
|
||||
[coming soon] jiata/aztk-r:0.1.0-spark2.2.0-r3.4.1 | v2.2.0 | -- | v3.4.1
|
||||
[coming soon] jiata/aztk-r:0.1.0-spark2.1.0-r3.4.1 | v2.1.0 | -- | v3.4.1
|
||||
[coming soon] jiata/aztk-r:0.1.0-spark1.6.3-r3.4.1 | v1.6.3 | -- | v3.4.1
|
||||
|
||||
If you would like another image added to the list of supported images, please file a GitHub issue.
|
||||
|
||||
|
|
|
@ -1,7 +1,7 @@
|
|||
# Python
|
||||
This Dockerfile is used to build the __aztk-python__ Docker image used by this toolkit. This image uses Anaconda, providing access to a wide range of popular python packages.
|
||||
|
||||
You can modify these Dockerfiles to build your own image. However, in most cases, building on top of the __aztk-vanilla__ image is recommended.
|
||||
You can modify these Dockerfiles to build your own image. However, in most cases, building on top of the __aztk-base__ image is recommended.
|
||||
|
||||
NOTE: If you plan to use Jupyter Notebooks with your Spark cluster, we recommend using this image as Jupyter Notebook comes pre-installed with Anaconda.
|
||||
|
||||
|
|
|
@ -6,15 +6,7 @@ ARG ANACONDA_VERSION=anaconda3-5.0.0
|
|||
|
||||
# install user-specified version of anaconda
|
||||
RUN pyenv install -f $ANACONDA_VERSION \
|
||||
&& pyenv global $ANACONDA_VERSION \
|
||||
&& apt-get install unzip \
|
||||
# Fetch h2o_pysparkling
|
||||
&& pip install http://h2o-release.s3.amazonaws.com/h2o/rel-weierstrass/7/Python/h2o-3.14.0.7-py2.py3-none-any.whl \
|
||||
&& pip install h2o_pysparkling_2.2 \
|
||||
# Install Sparkling water 2.2.2
|
||||
&& cd /home \
|
||||
&& wget http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.2/2/sparkling-water-2.2.2.zip \
|
||||
&& unzip sparkling-water-2.2.2.zip
|
||||
&& pyenv global $ANACONDA_VERSION
|
||||
|
||||
# set env vars
|
||||
ENV USER_PYTHON_VERSION $ANACONDA_VERSION
|
||||
|
|
|
@ -6,18 +6,9 @@ ARG ANACONDA_VERSION=anaconda3-5.0.0
|
|||
|
||||
# install user-specified version of anaconda
|
||||
RUN pyenv install -f $ANACONDA_VERSION \
|
||||
&& pyenv global $ANACONDA_VERSION \
|
||||
&& apt-get install unzip \
|
||||
# Fetch h2o_pysparkling
|
||||
&& pip install http://h2o-release.s3.amazonaws.com/h2o/rel-weierstrass/7/Python/h2o-3.14.0.7-py2.py3-none-any.whl \
|
||||
&& pip install h2o_pysparkling_2.2 \
|
||||
# Install Sparkling water 2.2.2
|
||||
&& cd /home \
|
||||
&& wget http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.2/2/sparkling-water-2.2.2.zip \
|
||||
&& unzip sparkling-water-2.2.2.zip
|
||||
&& pyenv global $ANACONDA_VERSION
|
||||
|
||||
# set env vars
|
||||
ENV SPARKLING_WATER /home/sparkling-water-2.2.2/assembly/build/libs/sparkling-water-assembly_2.11-2.2.2-all.jar
|
||||
ENV USER_PYTHON_VERSION $ANACONDA_VERSION
|
||||
|
||||
CMD ["/bin/bash"]
|
||||
|
|
|
@ -1,21 +0,0 @@
|
|||
# R
|
||||
This Dockerfile is used to build the __aztk-r__ Docker image used by this toolkit. This image uses R and RStudio Server, providing access to a wide range of popular R packages.
|
||||
|
||||
You can modify these Dockerfiles to build your own image. However, in most cases, building on top of the __aztk-vanilla__ image is recommended.
|
||||
|
||||
## How to build this image
|
||||
This Dockerfile takes two variables at build time that let you specify the RStudio Server and R versions you want: **RSTUDIO_SERVER_VERSION** and **R_VERSION**.
|
||||
|
||||
By default, we set **R_VERSION=3.4.2** and **RSTUDIO_SERVER_VERSION=1.1.383**.
|
||||
|
||||
For example, if I wanted to use Rstudio Server v1.1.383 and R 3.2.1 with Spark v2.1.0, I would select the appropriate Dockerfile and build the image as follows:
|
||||
```sh
|
||||
# spark2.1.0/Dockerfile
|
||||
docker build \
|
||||
--build-arg RSTUDIO_SERVER_VERSION=1.1.383 \
|
||||
--build-arg R_VERSION=3.2.1 \
|
||||
-t <my_image_tag> .
|
||||
```
|
||||
|
||||
**R_VERSION** is used to set the version of R for your cluster.
|
||||
**RSTUDIO_SERVER_VERSION** is used to set the version of RStudio Server for your cluster.
|
|
@ -1,150 +0,0 @@
|
|||
FROM jiata/aztk-base:0.1.0-spark1.6.3
|
||||
|
||||
ARG R_VERSION=3.4.1
|
||||
ARG RSTUDIO_SERVER_VERSION=1.1.383
|
||||
ARG BUILD_DATE
|
||||
ENV BUILD_DATE ${BUILD_DATE:-}
|
||||
ENV R_VERSION=${R_VERSION:-3.4.2} \
|
||||
LC_ALL=en_US.UTF-8 \
|
||||
LANG=en_US.UTF-8 \
|
||||
TERM=xterm
|
||||
|
||||
ADD bootstrap.sh /bootstrap.sh
|
||||
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y --no-install-recommends \
|
||||
bash-completion \
|
||||
ca-certificates \
|
||||
file \
|
||||
fonts-texgyre \
|
||||
g++ \
|
||||
gfortran \
|
||||
gsfonts \
|
||||
libbz2-1.0 \
|
||||
libcurl3 \
|
||||
libopenblas-dev \
|
||||
libpangocairo-1.0-0 \
|
||||
libpcre3 \
|
||||
libpng16-16 \
|
||||
libtiff5 \
|
||||
liblzma5 \
|
||||
locales \
|
||||
make \
|
||||
unzip \
|
||||
zip \
|
||||
zlib1g \
|
||||
libcurl4-openssl-dev \
|
||||
libxml2-dev \
|
||||
libapparmor1 \
|
||||
gdebi-core \
|
||||
lsb-release \
|
||||
psmisc \
|
||||
git \
|
||||
libssl-dev \
|
||||
sudo \
|
||||
wget \
|
||||
&& echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
|
||||
&& locale-gen en_US.utf8 \
|
||||
&& /usr/sbin/update-locale LANG=en_US.UTF-8 \
|
||||
&& BUILDDEPS="curl \
|
||||
default-jdk \
|
||||
libbz2-dev \
|
||||
libcairo2-dev \
|
||||
libcurl4-openssl-dev \
|
||||
libpango1.0-dev \
|
||||
libjpeg-dev \
|
||||
libicu-dev \
|
||||
libpcre3-dev \
|
||||
libpng-dev \
|
||||
libreadline-dev \
|
||||
libtiff5-dev \
|
||||
liblzma-dev \
|
||||
libx11-dev \
|
||||
libxt-dev \
|
||||
perl \
|
||||
tcl8.6-dev \
|
||||
tk8.6-dev \
|
||||
texinfo \
|
||||
texlive-extra-utils \
|
||||
texlive-fonts-recommended \
|
||||
texlive-fonts-extra \
|
||||
texlive-latex-recommended \
|
||||
x11proto-core-dev \
|
||||
xauth \
|
||||
xfonts-base \
|
||||
xvfb \
|
||||
zlib1g-dev" \
|
||||
&& apt-get install -y --no-install-recommends $BUILDDEPS \
|
||||
&& cd tmp/ \
|
||||
## Download source code
|
||||
&& /bootstrap.sh ${R_VERSION} \
|
||||
## Extract source code
|
||||
&& tar -xf R-${R_VERSION}.tar.gz \
|
||||
&& cd R-${R_VERSION} \
|
||||
## Set compiler flags
|
||||
&& R_PAPERSIZE=letter \
|
||||
R_BATCHSAVE="--no-save --no-restore" \
|
||||
R_BROWSER=xdg-open \
|
||||
PAGER=/usr/bin/pager \
|
||||
PERL=/usr/bin/perl \
|
||||
R_UNZIPCMD=/usr/bin/unzip \
|
||||
R_ZIPCMD=/usr/bin/zip \
|
||||
R_PRINTCMD=/usr/bin/lpr \
|
||||
LIBnn=lib \
|
||||
AWK=/usr/bin/awk \
|
||||
CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
|
||||
CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
|
||||
## Configure options
|
||||
./configure --enable-R-shlib \
|
||||
--enable-memory-profiling \
|
||||
--with-readline \
|
||||
--with-blas="-lopenblas" \
|
||||
--disable-nls \
|
||||
--without-recommended-packages \
|
||||
## Build and install
|
||||
&& make \
|
||||
&& make install \
|
||||
## Add a default CRAN mirror
|
||||
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
|
||||
## Add a library directory (for user-installed packages)
|
||||
&& mkdir -p /usr/local/lib/R/site-library \
|
||||
&& chown root:staff /usr/local/lib/R/site-library \
|
||||
&& chmod g+wx /usr/local/lib/R/site-library \
|
||||
## Fix library path
|
||||
&& echo "R_LIBS_USER='/usr/local/lib/R/site-library'" >> /usr/local/lib/R/etc/Renviron \
|
||||
&& echo "R_LIBS=\${R_LIBS-'/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library'}" >> /usr/local/lib/R/etc/Renviron \
|
||||
## install packages from date-locked MRAN snapshot of CRAN
|
||||
&& [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
|
||||
&& MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
|
||||
&& echo MRAN=$MRAN >> /etc/environment \
|
||||
&& export MRAN=$MRAN \
|
||||
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
|
||||
## Use littler installation scripts
|
||||
&& Rscript -e "install.packages(c('littler', 'docopt'), repo = '$MRAN')" \
|
||||
&& ln -s /usr/local/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
|
||||
&& ln -s /usr/local/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
|
||||
&& ln -s /usr/local/lib/R/site-library/littler/bin/r /usr/local/bin/r \
|
||||
## TEMPORARY WORKAROUND to get more robust error handling for install2.r prior to littler update
|
||||
&& curl -L -o /usr/local/bin/install2.r https://github.com/eddelbuettel/littler/raw/master/inst/examples/install2.r \
|
||||
&& chmod +x /usr/local/bin/install2.r \
|
||||
## Clean up from R source install
|
||||
&& cd / \
|
||||
&& rm -rf /tmp/* \
|
||||
&& apt-get remove --purge -y $BUILDDEPS \
|
||||
&& apt-get autoremove -y \
|
||||
&& apt-get autoclean -y \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
## Downloading and Installing RStudio Server
|
||||
RUN Rscript -e "install.packages(c('tidyverse', 'sparklyr'))" \
|
||||
&& wget https://download2.rstudio.org/rstudio-server-$RSTUDIO_SERVER_VERSION-amd64.deb \
|
||||
&& gdebi rstudio-server-$RSTUDIO_SERVER_VERSION-amd64.deb --non-interactive \
|
||||
&& echo "server-app-armor-enabled=0" | tee -a /etc/rstudio/rserver.conf \
|
||||
&& echo "Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> ${R_HOME}/etc/Rprofile.site \
|
||||
## Preparing default user for Rstudio Server
|
||||
&& set -e \
|
||||
&& useradd -m -d /home/rstudio rstudio \
|
||||
&& echo rstudio:rstudio | chpasswd
|
||||
|
||||
CMD ["R"]
|
||||
EXPOSE 8787
|
|
@ -1,3 +0,0 @@
|
|||
#!/bin/bash
|
||||
IFS='.' read -r -a baseVersion <<< "$1"
|
||||
curl -O https://cran.r-project.org/src/base/R-${baseVersion[0]}/R-$1.tar.gz
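As a side note on the script above: the major version used in the CRAN path can also be derived without a bash array. The following is a minimal sketch and is not part of the commit; the function name `r_source_url` is hypothetical, and it assumes the CRAN URL layout shown in the script.

```sh
# Hypothetical helper (not in the repository): build the CRAN source URL
# for a given R version, assuming the layout .../src/base/R-<major>/R-<version>.tar.gz
r_source_url() {
    version="$1"
    # Drop everything from the first dot onward to get the major version.
    major="${version%%.*}"
    echo "https://cran.r-project.org/src/base/R-${major}/R-${version}.tar.gz"
}

# Example call: print the URL for R 3.4.1.
r_source_url 3.4.1
```

Because this relies only on POSIX parameter expansion, it would also work under `/bin/sh`, which is the default shell for Dockerfile `RUN` instructions.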
|
|
@ -1,148 +0,0 @@
|
|||
FROM jiata/aztk-base:0.1.0-spark2.1.0
|
||||
|
||||
ARG R_VERSION=3.4.1
|
||||
ARG RSTUDIO_SERVER_VERSION=1.1.383
|
||||
ARG BUILD_DATE
|
||||
ENV BUILD_DATE ${BUILD_DATE:-}
|
||||
ENV R_VERSION=${R_VERSION:-3.4.2} \
|
||||
LC_ALL=en_US.UTF-8 \
|
||||
LANG=en_US.UTF-8 \
|
||||
TERM=xterm
|
||||
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y --no-install-recommends \
|
||||
bash-completion \
|
||||
ca-certificates \
|
||||
file \
|
||||
fonts-texgyre \
|
||||
g++ \
|
||||
gfortran \
|
||||
gsfonts \
|
||||
libbz2-1.0 \
|
||||
libcurl3 \
|
||||
libopenblas-dev \
|
||||
libpangocairo-1.0-0 \
|
||||
libpcre3 \
|
||||
libpng16-16 \
|
||||
libtiff5 \
|
||||
liblzma5 \
|
||||
locales \
|
||||
make \
|
||||
unzip \
|
||||
zip \
|
||||
zlib1g \
|
||||
libcurl4-openssl-dev \
|
||||
libxml2-dev \
|
||||
libapparmor1 \
|
||||
gdebi-core \
|
||||
lsb-release \
|
||||
psmisc \
|
||||
git \
|
||||
libssl-dev \
|
||||
sudo \
|
||||
wget \
|
||||
&& echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
|
||||
&& locale-gen en_US.utf8 \
|
||||
&& /usr/sbin/update-locale LANG=en_US.UTF-8 \
|
||||
&& BUILDDEPS="curl \
|
||||
default-jdk \
|
||||
libbz2-dev \
|
||||
libcairo2-dev \
|
||||
libcurl4-openssl-dev \
|
||||
libpango1.0-dev \
|
||||
libjpeg-dev \
|
||||
libicu-dev \
|
||||
libpcre3-dev \
|
||||
libpng-dev \
|
||||
libreadline-dev \
|
||||
libtiff5-dev \
|
||||
liblzma-dev \
|
||||
libx11-dev \
|
||||
libxt-dev \
|
||||
perl \
|
||||
tcl8.6-dev \
|
||||
tk8.6-dev \
|
||||
texinfo \
|
||||
texlive-extra-utils \
|
||||
texlive-fonts-recommended \
|
||||
texlive-fonts-extra \
|
||||
texlive-latex-recommended \
|
||||
x11proto-core-dev \
|
||||
xauth \
|
||||
xfonts-base \
|
||||
xvfb \
|
||||
zlib1g-dev" \
|
||||
&& apt-get install -y --no-install-recommends $BUILDDEPS \
|
||||
&& cd tmp/ \
|
||||
## Download source code
|
||||
&& IFS='.' read -r -a baseVersion <<< "${R_VERSION}" \
|
||||
&& curl -O https://cran.r-project.org/src/base/R-${baseVersion[0]}/R-${R_VERSION}.tar.gz \
|
||||
## Extract source code
|
||||
&& tar -xf R-${R_VERSION}.tar.gz \
|
||||
&& cd R-${R_VERSION} \
|
||||
## Set compiler flags
|
||||
&& R_PAPERSIZE=letter \
|
||||
R_BATCHSAVE="--no-save --no-restore" \
|
||||
R_BROWSER=xdg-open \
|
||||
PAGER=/usr/bin/pager \
|
||||
PERL=/usr/bin/perl \
|
||||
R_UNZIPCMD=/usr/bin/unzip \
|
||||
R_ZIPCMD=/usr/bin/zip \
|
||||
R_PRINTCMD=/usr/bin/lpr \
|
||||
LIBnn=lib \
|
||||
AWK=/usr/bin/awk \
|
||||
CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
|
||||
CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
|
||||
## Configure options
|
||||
./configure --enable-R-shlib \
|
||||
--enable-memory-profiling \
|
||||
--with-readline \
|
||||
--with-blas="-lopenblas" \
|
||||
--disable-nls \
|
||||
--without-recommended-packages \
|
||||
## Build and install
|
||||
&& make \
|
||||
&& make install \
|
||||
## Add a default CRAN mirror
|
||||
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
|
||||
## Add a library directory (for user-installed packages)
|
||||
&& mkdir -p /usr/local/lib/R/site-library \
|
||||
&& chown root:staff /usr/local/lib/R/site-library \
|
||||
&& chmod g+wx /usr/local/lib/R/site-library \
|
||||
## Fix library path
|
||||
&& echo "R_LIBS_USER='/usr/local/lib/R/site-library'" >> /usr/local/lib/R/etc/Renviron \
|
||||
&& echo "R_LIBS=\${R_LIBS-'/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library'}" >> /usr/local/lib/R/etc/Renviron \
|
||||
## install packages from date-locked MRAN snapshot of CRAN
|
||||
&& [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
|
||||
&& MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
|
||||
&& echo MRAN=$MRAN >> /etc/environment \
|
||||
&& export MRAN=$MRAN \
|
||||
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
|
||||
## Use littler installation scripts
|
||||
&& Rscript -e "install.packages(c('littler', 'docopt'), repo = '$MRAN')" \
|
||||
&& ln -s /usr/local/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
|
||||
&& ln -s /usr/local/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
|
||||
&& ln -s /usr/local/lib/R/site-library/littler/bin/r /usr/local/bin/r \
|
||||
## TEMPORARY WORKAROUND to get more robust error handling for install2.r prior to littler update
|
||||
&& curl -L -o /usr/local/bin/install2.r https://github.com/eddelbuettel/littler/raw/master/inst/examples/install2.r \
|
||||
&& chmod +x /usr/local/bin/install2.r \
|
||||
## Clean up from R source install
|
||||
&& cd / \
|
||||
&& rm -rf /tmp/* \
|
||||
&& apt-get remove --purge -y $BUILDDEPS \
|
||||
&& apt-get autoremove -y \
|
||||
&& apt-get autoclean -y \
|
||||
&& rm -rf /var/lib/apt/lists/* \
|
||||
## Downloading and Installing RStudio Server
|
||||
&& Rscript -e "install.packages(c('tidyverse', 'sparklyr'))" \
|
||||
&& wget https://download2.rstudio.org/rstudio-server-$RSTUDIO_SERVER_VERSION-amd64.deb \
|
||||
&& gdebi rstudio-server-$RSTUDIO_SERVER_VERSION-amd64.deb --non-interactive \
|
||||
&& echo "server-app-armor-enabled=0" | tee -a /etc/rstudio/rserver.conf \
|
||||
&& echo "Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> ${R_HOME}/etc/Rprofile.site \
|
||||
## Preparing default user for Rstudio Server
|
||||
&& set -e \
|
||||
&& useradd -m -d /home/rstudio rstudio \
|
||||
&& echo rstudio:rstudio | chpasswd
|
||||
|
||||
CMD ["R"]
|
||||
EXPOSE 8787
|
|
@ -1,148 +0,0 @@
|
|||
FROM jiata/aztk-base:0.1.0-spark2.2.0
|
||||
|
||||
ARG R_VERSION=3.4.1
|
||||
ARG RSTUDIO_SERVER_VERSION=1.1.383
|
||||
ARG BUILD_DATE
|
||||
ENV BUILD_DATE ${BUILD_DATE:-}
|
||||
ENV R_VERSION=${R_VERSION:-3.4.2} \
|
||||
LC_ALL=en_US.UTF-8 \
|
||||
LANG=en_US.UTF-8 \
|
||||
TERM=xterm
|
||||
|
||||
RUN apt-get update \
|
||||
&& apt-get install -y --no-install-recommends \
|
||||
bash-completion \
|
||||
ca-certificates \
|
||||
file \
|
||||
fonts-texgyre \
|
||||
g++ \
|
||||
gfortran \
|
||||
gsfonts \
|
||||
libbz2-1.0 \
|
||||
libcurl3 \
|
||||
libopenblas-dev \
|
||||
libpangocairo-1.0-0 \
|
||||
libpcre3 \
|
||||
libpng16-16 \
|
||||
libtiff5 \
|
||||
liblzma5 \
|
||||
locales \
|
||||
make \
|
||||
unzip \
|
||||
zip \
|
||||
zlib1g \
|
||||
libcurl4-openssl-dev \
|
||||
libxml2-dev \
|
||||
libapparmor1 \
|
||||
gdebi-core \
|
||||
lsb-release \
|
||||
psmisc \
|
||||
git \
|
||||
libssl-dev \
|
||||
sudo \
|
||||
wget \
|
||||
&& echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
|
||||
&& locale-gen en_US.utf8 \
|
||||
&& /usr/sbin/update-locale LANG=en_US.UTF-8 \
|
||||
&& BUILDDEPS="curl \
|
||||
default-jdk \
|
||||
libbz2-dev \
|
||||
libcairo2-dev \
|
||||
libcurl4-openssl-dev \
|
||||
libpango1.0-dev \
|
||||
libjpeg-dev \
|
||||
libicu-dev \
|
||||
libpcre3-dev \
|
||||
libpng-dev \
|
||||
libreadline-dev \
|
||||
libtiff5-dev \
|
||||
liblzma-dev \
|
||||
libx11-dev \
|
||||
libxt-dev \
|
||||
perl \
|
||||
tcl8.6-dev \
|
||||
tk8.6-dev \
|
||||
texinfo \
|
||||
texlive-extra-utils \
|
||||
texlive-fonts-recommended \
|
||||
texlive-fonts-extra \
|
||||
texlive-latex-recommended \
|
||||
x11proto-core-dev \
|
||||
xauth \
|
||||
xfonts-base \
|
||||
xvfb \
|
||||
zlib1g-dev" \
|
||||
&& apt-get install -y --no-install-recommends $BUILDDEPS \
|
||||
&& cd tmp/ \
|
||||
## Download source code
|
||||
&& IFS='.' read -r -a baseVersion <<< "${R_VERSION}" \
|
||||
&& curl -O https://cran.r-project.org/src/base/R-${baseVersion[0]}/R-${R_VERSION}.tar.gz \
|
||||
## Extract source code
|
||||
&& tar -xf R-${R_VERSION}.tar.gz \
|
||||
&& cd R-${R_VERSION} \
|
||||
## Set compiler flags
|
||||
&& R_PAPERSIZE=letter \
|
||||
R_BATCHSAVE="--no-save --no-restore" \
|
||||
R_BROWSER=xdg-open \
|
||||
PAGER=/usr/bin/pager \
|
||||
PERL=/usr/bin/perl \
|
||||
R_UNZIPCMD=/usr/bin/unzip \
|
||||
R_ZIPCMD=/usr/bin/zip \
|
||||
R_PRINTCMD=/usr/bin/lpr \
|
||||
LIBnn=lib \
|
||||
AWK=/usr/bin/awk \
|
||||
CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
|
||||
CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
|
||||
## Configure options
|
||||
./configure --enable-R-shlib \
|
||||
--enable-memory-profiling \
|
||||
--with-readline \
|
||||
--with-blas="-lopenblas" \
|
||||
--disable-nls \
|
||||
--without-recommended-packages \
|
||||
## Build and install
|
||||
&& make \
|
||||
&& make install \
|
||||
## Add a default CRAN mirror
|
||||
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
|
||||
## Add a library directory (for user-installed packages)
|
||||
&& mkdir -p /usr/local/lib/R/site-library \
|
||||
&& chown root:staff /usr/local/lib/R/site-library \
|
||||
&& chmod g+wx /usr/local/lib/R/site-library \
|
||||
## Fix library path
|
||||
&& echo "R_LIBS_USER='/usr/local/lib/R/site-library'" >> /usr/local/lib/R/etc/Renviron \
|
||||
&& echo "R_LIBS=\${R_LIBS-'/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library'}" >> /usr/local/lib/R/etc/Renviron \
|
||||
## install packages from date-locked MRAN snapshot of CRAN
|
||||
&& [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
|
||||
&& MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
|
||||
&& echo MRAN=$MRAN >> /etc/environment \
|
||||
&& export MRAN=$MRAN \
|
||||
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
|
||||
## Use littler installation scripts
|
||||
&& Rscript -e "install.packages(c('littler', 'docopt'), repo = '$MRAN')" \
|
||||
&& ln -s /usr/local/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
|
||||
&& ln -s /usr/local/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
|
||||
&& ln -s /usr/local/lib/R/site-library/littler/bin/r /usr/local/bin/r \
|
||||
## TEMPORARY WORKAROUND to get more robust error handling for install2.r prior to littler update
|
||||
&& curl -L -o /usr/local/bin/install2.r https://github.com/eddelbuettel/littler/raw/master/inst/examples/install2.r \
|
||||
&& chmod +x /usr/local/bin/install2.r \
|
||||
## Clean up from R source install
|
||||
&& cd / \
|
||||
&& rm -rf /tmp/* \
|
||||
&& apt-get remove --purge -y $BUILDDEPS \
|
||||
&& apt-get autoremove -y \
|
||||
&& apt-get autoclean -y \
|
||||
&& rm -rf /var/lib/apt/lists/* \
|
||||
## Downloading and Installing RStudio Server
|
||||
&& Rscript -e "install.packages(c('tidyverse', 'sparklyr'))" \
|
||||
&& wget https://download2.rstudio.org/rstudio-server-$RSTUDIO_SERVER_VERSION-amd64.deb \
|
||||
&& gdebi rstudio-server-$RSTUDIO_SERVER_VERSION-amd64.deb --non-interactive \
|
||||
&& echo "server-app-armor-enabled=0" | tee -a /etc/rstudio/rserver.conf \
|
||||
&& echo "Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> ${R_HOME}/etc/Rprofile.site \
|
||||
## Preparing default user for Rstudio Server
|
||||
&& set -e \
|
||||
&& useradd -m -d /home/rstudio rstudio \
|
||||
&& echo rstudio:rstudio | chpasswd
|
||||
|
||||
CMD ["R"]
|
||||
EXPOSE 8787
|
|
@ -1,25 +1,25 @@
|
|||
# Docker
|
||||
Azure Distributed Data Engineering Toolkit runs Spark on Docker.
|
||||
|
||||
Supported Azure Distributed Data Engineering Toolkit images are hosted publicly on [Docker Hub](https://hub.docker.com/r/jiata/aztk/tags).
|
||||
Supported Azure Distributed Data Engineering Toolkit images are hosted publicly on [Docker Hub](https://hub.docker.com/r/jiata/aztk-base/tags).
|
||||
|
||||
## Versioning with Docker
|
||||
The default image that this package uses is the __aztk-vanilla__ Docker image that comes with **Spark v2.2.0**.
|
||||
The default image that this package uses is the __aztk-base__ Docker image that comes with **Spark v2.2.0**.
|
||||
|
||||
You can use several versions of the __aztk-vanilla__ image:
|
||||
- Spark 2.2.0 - jiata/aztk-vanilla:0.1.0-spark2.2.0 (default)
|
||||
- Spark 2.1.0 - jiata/aztk-vanilla:0.1.0-spark2.1.0
|
||||
- Spark 1.6.3 - jiata/aztk-vanilla:0.1.0-spark1.6.3
|
||||
You can use several versions of the __aztk-base__ image:
|
||||
- Spark 2.2.0 - jiata/aztk-base:0.1.0-spark2.2.0 (default)
|
||||
- Spark 2.1.0 - jiata/aztk-base:0.1.0-spark2.1.0
|
||||
- Spark 1.6.3 - jiata/aztk-base:0.1.0-spark1.6.3
|
||||
|
||||
We also provide two other image types tailored for the Python and R users: __aztk-r__ and __aztk-python__. You can choose between the following:
|
||||
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 2.2.0 - jiata/aztk-python:0.1.0-spark2.2.0-anaconda3-5.0.0
|
||||
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 2.1.0 - jiata/aztk-python:0.1.0-spark2.1.0-anaconda3-5.0.0
|
||||
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 1.6.3 - jiata/aztk-python:0.1.0-spark1.6.3-anaconda3-5.0.0
|
||||
- R 3.4.0 / Spark v2.2.0 - jiata/aztk-r:0.1.0-spark2.2.0-r3.4.0
|
||||
- R 3.4.0 / Spark v2.1.0 - jiata/aztk-r:0.1.0-spark2.1.0-r3.4.0
|
||||
- R 3.4.0 / Spark v1.6.3 - jiata/aztk-r:0.1.0-spark1.6.3-r3.4.0
|
||||
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 2.2.0 - jiata/aztk-python:0.1.0-spark2.2.0-python3.6.2
|
||||
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 2.1.0 - jiata/aztk-python:0.1.0-spark2.1.0-python3.6.2
|
||||
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 1.6.3 - jiata/aztk-python:0.1.0-spark1.6.3-python3.6.2
|
||||
- [coming soon] R 3.4.1 / Spark v2.2.0 - jiata/aztk-r:0.1.0-spark2.2.0-r3.4.1
|
||||
- [coming soon] R 3.4.1 / Spark v2.1.0 - jiata/aztk-r:0.1.0-spark2.1.0-r3.4.1
|
||||
- [coming soon] R 3.4.1 / Spark v1.6.3 - jiata/aztk-r:0.1.0-spark1.6.3-r3.4.1
|
||||
|
||||
*Today, these supported images are hosted on Docker Hub under the repo ["jiata/aztk-vanilla/r/python:<tag>"](https://hub.docker.com/r/jiata).*
|
||||
*Today, these supported images are hosted on Docker Hub under the repo ["jiata/aztk-base/r/python:<tag>"](https://hub.docker.com/r/jiata).*
|
||||
|
||||
To select an image other than the default, you can set your Docker image at cluster creation time with the optional **--docker-repo** parameter:
|
||||
|
||||
|
@ -29,7 +29,7 @@ aztk spark cluster create ... --docker-repo <name_of_docker_image_repo>
|
|||
|
||||
For example, if I am using the image version 0.1.0, and wanted to use Spark v1.6.3, I could run the following cluster create command:
|
||||
```sh
|
||||
aztk spark cluster create ... --docker-repo jiata/aztk:0.1.0-spark1.6.3
|
||||
aztk spark cluster create ... --docker-repo jiata/aztk-base:0.1.0-spark1.6.3
|
||||
```
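The image names listed above follow a common pattern: `jiata/aztk-<type>:<image version>-spark<spark version>`. As a rough sketch only, a small shell helper can compose the value passed to **--docker-repo**. The function name `aztk_image` is hypothetical and not part of the aztk CLI, and the pattern is inferred from the base-image tags in this document; the Python and R images carry an extra suffix that this sketch does not handle.

```sh
# Hypothetical helper (not part of aztk): compose a Docker repo string
# from the image type, the image version, and the Spark version.
aztk_image() {
    image_type="$1"      # e.g. base
    image_version="$2"   # e.g. 0.1.0
    spark_version="$3"   # e.g. 1.6.3
    echo "jiata/aztk-${image_type}:${image_version}-spark${spark_version}"
}

# Print the command rather than running it, since the remaining
# cluster options ("...") are elided in the documentation.
echo "aztk spark cluster create ... --docker-repo $(aztk_image base 0.1.0 1.6.3)"
```

Printing the command keeps the sketch safe to run without an Azure account; replace the `echo` with the real invocation once the other cluster options are filled in.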
|
||||
|
||||
## Using a custom Docker Image
|
||||
|
|