Mirror of https://github.com/Azure/aztk.git
Feature: refactor docker images (#510)

* add spark2.3.0 hadoop2.8.3 dockerfile
* start update to docker image
* add SPARK_DIST_CLASSPATH to bashrc, source .bashrc in docker run
* add maven install for jars
* docker image update and code fix
* add libthrift (still broken)
* start image refactor, build from source
* add refactor to r base image
* finish refactor r image
* add storage jars and deps
* exclude netty to get rid of dependency conflict
* add miniconda image
* update 2.2.0 base, anaconda image
* remove unused cuda-8.0 image
* start pipenv implementation
* miniconda version arg
* update anaconda and miniconda image
* style
* pivot to virtualenv
* remove virtualenv from path when submitting apps
* flatten layers
* explicit calls to aztk python instead of activating virtualenv
* update base, miniconda, anaconda
* add compatibility version for base aztk images
* typo fix
* update pom
* update environment variable name
* update environment variables
* add anaconda images base & gpu
* update gpu and miniconda base images
* create venv in cluster create
* update base docker files, remove virtualenv
* fix path
* add exclusion to base images
* update r images
* delete python images (in favor of anaconda and miniconda)
* add miniconda gpu images
* update comment
* update aztk_version_compatibility to docker image version
* add a build script
* virtualenv->pipenv, add Pipfile & Pipfile.lock, remove secretstorage
* aztk/staging->aztk/spark
* remove jars, add .null to keep directory
* update pipfile, update jupyter and jupyterlab
* update default images
* update base images to fix hdfs
* update build script with correct path
* add spark1.6.3 anaconda, miniconda, r base and gpu images
* update build script to include spark1.6.3
* mkdir out
* exclude commons lang and slf4j dependencies
* mkdir out
* no fail if dir exists
* update node_scripts
* update env var name
* update env var name
* fix the docker_repo docs
* master->0.7.0
This commit is contained in:

Parent: 47000a5c7d
Commit: 779bffb2da
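The thrust of this refactor is visible in the setup_host.sh and docker_main.sh hunks below: the per-image pyenv installation is replaced by a pipenv-managed, in-project virtual environment that node scripts invoke by absolute path instead of activating. A minimal sketch of that pattern, using the paths that appear in this diff (the script name at the end is a hypothetical stand-in):

```sh
# Sketch of the environment layout this commit introduces (paths from the diff).
export PIPENV_VENV_IN_PROJECT=true             # keep the venv inside .aztk-env/.venv
cd $AZTK_WORKING_DIR/.aztk-env
pipenv install --python /usr/bin/python3.5m    # resolve Pipfile/Pipfile.lock into .venv
# Later calls skip activation entirely and address the interpreter directly:
$AZTK_WORKING_DIR/.aztk-env/.venv/bin/python my_node_script.py   # hypothetical script name
```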
@@ -27,7 +27,7 @@ This toolkit is built on top of Azure Batch but does not require any Azure Batch
    ```
3. Login or register for an [Azure Account](https://azure.microsoft.com), navigate to [Azure Cloud Shell](https://shell.azure.com), and run:
    ```sh
    wget -q https://raw.githubusercontent.com/Azure/aztk/master/account_setup.sh -O account_setup.sh &&
    wget -q https://raw.githubusercontent.com/Azure/aztk/v0.7.0/account_setup.sh -O account_setup.sh &&
    chmod 755 account_setup.sh &&
    /bin/bash account_setup.sh
    ```
@@ -4,7 +4,7 @@ echo "Installing depdendencies..." &&
pip install --force-reinstall --upgrade --user pyyaml==3.12 azure==3.0.0 azure-cli-core==2.0.30 msrestazure==0.4.25 > /dev/null 2>&1 &&
echo "Finished installing depdencies." &&
echo "Getting account setup script..." &&
wget -q https://raw.githubusercontent.com/Azure/aztk/master/account_setup.py -O account_setup.py &&
wget -q https://raw.githubusercontent.com/Azure/aztk/v0.7.0/account_setup.py -O account_setup.py &&
chmod 755 account_setup.py &&
echo "Finished getting account setup script." &&
echo "Running account setup script..." &&
@@ -0,0 +1,17 @@
[[source]]
url = "https://pypi.python.org/simple"
verify_ssl = true
name = "pypi"

[packages]
azure-batch = "==4.1.3"
azure-mgmt-batch = "==5.0.0"
azure-mgmt-storage = "==1.5.0"
azure-storage-blob = "==1.1.0"
pycryptodome = "==3.4.7"
PyYAML = "==3.12"

[dev-packages]

[requires]
python_version = "3.5"
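The Pipfile above is resolved into the Pipfile.lock that follows. If the pinned package set changes, the lock file would be regenerated with standard pipenv commands; a sketch, assuming pipenv is installed and run from the directory containing the Pipfile:

```sh
# Re-resolve the Pipfile into Pipfile.lock after editing a pin (assumed workflow).
cd aztk/node_scripts
pipenv lock
```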
@@ -0,0 +1,291 @@
{
    "_meta": {
        "hash": {
            "sha256": "6ec054e45a39a75baeae8d6c48097a02a4d690c77a48d79a24c4a396b3799565"
        },
        "pipfile-spec": 6,
        "requires": {
            "python_version": "3.5"
        },
        "sources": [
            {
                "name": "pypi",
                "url": "https://pypi.python.org/simple",
                "verify_ssl": true
            }
        ]
    },
    "default": {
        "adal": {
            "hashes": [
                "sha256:83b746883f3bd7216664463af70c05e847abd8e5b259d91eb49d692bec519a24",
                "sha256:dd3ecb2dfb2de9393320d0ed4e6115ed07a6984a28e18adf46499b91d3c3a494"
            ],
            "version": "==0.5.1"
        },
        "asn1crypto": {
            "hashes": [
                "sha256:2f1adbb7546ed199e3c90ef23ec95c5cf3585bac7d11fb7eb562a3fe89c64e87",
                "sha256:9d5c20441baf0cb60a4ac34cc447c6c189024b6b4c6cd7877034f4965c464e49"
            ],
            "version": "==0.24.0"
        },
        "azure-batch": {
            "hashes": [
                "sha256:017be21a9e6db92473d2e33170d5dd445596fc70d706f73552ac9c6b57a6ef1c",
                "sha256:cd71c7ebb5beab174b6225bbf79ae18d6db0c8d63227a7e514da0a75f138364c"
            ],
            "index": "pypi",
            "version": "==4.1.3"
        },
        "azure-common": {
            "hashes": [
                "sha256:4fdc3a6d94d7073a76e04d59435e279decb91022520550ef08f2b6f316b72563",
                "sha256:5124ab76357452356164ef1a10e7fe69f686eaf1647ef57b37c2ede50df2cc02"
            ],
            "version": "==1.1.9"
        },
        "azure-mgmt-batch": {
            "hashes": [
                "sha256:bc8ab35d21a07e17a4007efeb14a607a86315be5577d521fac53239f2270a633",
                "sha256:e83988711449d1ad4fe3db5c88c2b08aede073b113f2c5b423af155b1bd5f944"
            ],
            "index": "pypi",
            "version": "==5.0.0"
        },
        "azure-mgmt-nspkg": {
            "hashes": [
                "sha256:0bd439a8e9529387246c3e335920d6474fb67e12f963e4a40bec54933b347220",
                "sha256:e36488d4f5d7d668ef5cc3e6e86f081448fd60c9bf4e051d06ff7cfc5a653e6f"
            ],
            "version": "==2.0.0"
        },
        "azure-mgmt-storage": {
            "hashes": [
                "sha256:b1fc3a293051dee35dffe12d618f925581d6536c94ca5c05b69461ce941125a1",
                "sha256:d7a60f0675d49f70e74927814e0f1112e6482073c31a95478a55f5bb6e0691db"
            ],
            "index": "pypi",
            "version": "==1.5.0"
        },
        "azure-nspkg": {
            "hashes": [
                "sha256:4bd758e649f57cc188db4f3c64becaca16195e057e4362b6caad56fe1e7934e9",
                "sha256:fe19ee5d8c66ee8ef62557fc7310f59cffb7230f0a94701eef79f6e3191fdc7b"
            ],
            "version": "==2.0.0"
        },
        "azure-storage-blob": {
            "hashes": [
                "sha256:4fdcdc20e36d0f97a58bdffe1b26fc2b8b983c59ff8625e961c188c925891c66",
                "sha256:71d08a195a8cc732cbc0a45a552c7c8d495a2ef3721cbc993d0e586d0493d529"
            ],
            "index": "pypi",
            "version": "==1.1.0"
        },
        "azure-storage-common": {
            "hashes": [
                "sha256:2aad9fdaa6052867f19515a5d0acaa650103532cc50a8a8974b0d76e485525a0",
                "sha256:8c67a4b0ad9ef16c4da3ca050ac7ad2117818797365d7e3bb4f371bdb78040cf"
            ],
            "version": "==1.1.0"
        },
        "azure-storage-nspkg": {
            "hashes": [
                "sha256:4fc4685aef941eab2f7fb53824254cca2e38f2a1bf33cda0c8ae654fe15827d6",
                "sha256:855315c038c0e695868025127e1b3057a1f984af9ccfbaeac4fbfd6c5dd3b466"
            ],
            "version": "==3.0.0"
        },
        "certifi": {
            "hashes": [
                "sha256:13e698f54293db9f89122b0581843a782ad0934a4fe0172d2a980ba77fc61bb7",
                "sha256:9fa520c1bacfb634fa7af20a76bcbd3d5fb390481724c597da32c719a7dca4b0"
            ],
            "version": "==2018.4.16"
        },
        "cffi": {
            "hashes": [
                "sha256:151b7eefd035c56b2b2e1eb9963c90c6302dc15fbd8c1c0a83a163ff2c7d7743",
                "sha256:1553d1e99f035ace1c0544050622b7bc963374a00c467edafac50ad7bd276aef",
                "sha256:1b0493c091a1898f1136e3f4f991a784437fac3673780ff9de3bcf46c80b6b50",
                "sha256:2ba8a45822b7aee805ab49abfe7eec16b90587f7f26df20c71dd89e45a97076f",
                "sha256:3c85641778460581c42924384f5e68076d724ceac0f267d66c757f7535069c93",
                "sha256:3eb6434197633b7748cea30bf0ba9f66727cdce45117a712b29a443943733257",
                "sha256:4c91af6e967c2015729d3e69c2e51d92f9898c330d6a851bf8f121236f3defd3",
                "sha256:770f3782b31f50b68627e22f91cb182c48c47c02eb405fd689472aa7b7aa16dc",
                "sha256:79f9b6f7c46ae1f8ded75f68cf8ad50e5729ed4d590c74840471fc2823457d04",
                "sha256:7a33145e04d44ce95bcd71e522b478d282ad0eafaf34fe1ec5bbd73e662f22b6",
                "sha256:857959354ae3a6fa3da6651b966d13b0a8bed6bbc87a0de7b38a549db1d2a359",
                "sha256:87f37fe5130574ff76c17cab61e7d2538a16f843bb7bca8ebbc4b12de3078596",
                "sha256:95d5251e4b5ca00061f9d9f3d6fe537247e145a8524ae9fd30a2f8fbce993b5b",
                "sha256:9d1d3e63a4afdc29bd76ce6aa9d58c771cd1599fbba8cf5057e7860b203710dd",
                "sha256:a36c5c154f9d42ec176e6e620cb0dd275744aa1d804786a71ac37dc3661a5e95",
                "sha256:ae5e35a2c189d397b91034642cb0eab0e346f776ec2eb44a49a459e6615d6e2e",
                "sha256:b0f7d4a3df8f06cf49f9f121bead236e328074de6449866515cea4907bbc63d6",
                "sha256:b75110fb114fa366b29a027d0c9be3709579602ae111ff61674d28c93606acca",
                "sha256:ba5e697569f84b13640c9e193170e89c13c6244c24400fc57e88724ef610cd31",
                "sha256:be2a9b390f77fd7676d80bc3cdc4f8edb940d8c198ed2d8c0be1319018c778e1",
                "sha256:d5d8555d9bfc3f02385c1c37e9f998e2011f0db4f90e250e5bc0c0a85a813085",
                "sha256:e55e22ac0a30023426564b1059b035973ec82186ddddbac867078435801c7801",
                "sha256:e90f17980e6ab0f3c2f3730e56d1fe9bcba1891eeea58966e89d352492cc74f4",
                "sha256:ecbb7b01409e9b782df5ded849c178a0aa7c906cf8c5a67368047daab282b184",
                "sha256:ed01918d545a38998bfa5902c7c00e0fee90e957ce036a4000a88e3fe2264917",
                "sha256:edabd457cd23a02965166026fd9bfd196f4324fe6032e866d0f3bd0301cd486f",
                "sha256:fdf1c1dc5bafc32bc5d08b054f94d659422b05aba244d6be4ddc1c72d9aa70fb"
            ],
            "markers": "platform_python_implementation != 'pypy'",
            "version": "==1.11.5"
        },
        "chardet": {
            "hashes": [
                "sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae",
                "sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691"
            ],
            "version": "==3.0.4"
        },
        "cryptography": {
            "hashes": [
                "sha256:3f3b65d5a16e6b52fba63dc860b62ca9832f51f1a2ae5083c78b6840275f12dd",
                "sha256:551a3abfe0c8c6833df4192a63371aa2ff43afd8f570ed345d31f251d78e7e04",
                "sha256:5cb990056b7cadcca26813311187ad751ea644712022a3976443691168781b6f",
                "sha256:60bda7f12ecb828358be53095fc9c6edda7de8f1ef571f96c00b2363643fa3cd",
                "sha256:6fef51ec447fe9f8351894024e94736862900d3a9aa2961528e602eb65c92bdb",
                "sha256:77d0ad229d47a6e0272d00f6bf8ac06ce14715a9fd02c9a97f5a2869aab3ccb2",
                "sha256:808fe471b1a6b777f026f7dc7bd9a4959da4bfab64972f2bbe91e22527c1c037",
                "sha256:9b62fb4d18529c84b961efd9187fecbb48e89aa1a0f9f4161c61b7fc42a101bd",
                "sha256:9e5bed45ec6b4f828866ac6a6bedf08388ffcfa68abe9e94b34bb40977aba531",
                "sha256:9fc295bf69130a342e7a19a39d7bbeb15c0bcaabc7382ec33ef3b2b7d18d2f63",
                "sha256:abd070b5849ed64e6d349199bef955ee0ad99aefbad792f0c587f8effa681a5e",
                "sha256:ba6a774749b6e510cffc2fb98535f717e0e5fd91c7c99a61d223293df79ab351",
                "sha256:c332118647f084c983c6a3e1dba0f3bcb051f69d12baccac68db8d62d177eb8a",
                "sha256:d6f46e862ee36df81e6342c2177ba84e70f722d9dc9c6c394f9f1f434c4a5563",
                "sha256:db6013746f73bf8edd9c3d1d3f94db635b9422f503db3fc5ef105233d4c011ab",
                "sha256:f57008eaff597c69cf692c3518f6d4800f0309253bb138b526a37fe9ef0c7471",
                "sha256:f6c821ac253c19f2ad4c8691633ae1d1a17f120d5b01ea1d256d7b602bc59887"
            ],
            "version": "==2.2.2"
        },
        "idna": {
            "hashes": [
                "sha256:2c6a5de3089009e3da7c5dde64a141dbc8551d5b7f6cf4ed7c2568d0cc520a8f",
                "sha256:8c7309c718f94b3a625cb648ace320157ad16ff131ae0af362c9f21b80ef6ec4"
            ],
            "version": "==2.6"
        },
        "isodate": {
            "hashes": [
                "sha256:2e364a3d5759479cdb2d37cce6b9376ea504db2ff90252a2e5b7cc89cc9ff2d8",
                "sha256:aa4d33c06640f5352aca96e4b81afd8ab3b47337cc12089822d6f322ac772c81"
            ],
            "version": "==0.6.0"
        },
        "msrest": {
            "hashes": [
                "sha256:2920c4eee294a901a59480c72e70092ebbac4849bc2237e064cb9feed174deeb",
                "sha256:65bdde2ea8aa3312eb4ce6142d5da65d455f561a7676eee678c1a6e00416f5a0"
            ],
            "version": "==0.4.28"
        },
        "msrestazure": {
            "hashes": [
                "sha256:4e336150730f9a512f1432c4e0c5293d618ffcbf92767c07525bd8a8200fa9d5",
                "sha256:5b33886aaaf068acec17d76127d95290c9eaca7942711184da991cabd3929854"
            ],
            "version": "==0.4.28"
        },
        "oauthlib": {
            "hashes": [
                "sha256:09d438bcac8f004ae348e721e9d8a7792a9e23cd574634e973173344046287f5",
                "sha256:909665297635fa11fe9914c146d875f2ed41c8c2d78e21a529dd71c0ba756508"
            ],
            "version": "==2.0.7"
        },
        "pycparser": {
            "hashes": [
                "sha256:99a8ca03e29851d96616ad0404b4aad7d9ee16f25c9f9708a11faf2810f7b226"
            ],
            "version": "==2.18"
        },
        "pycryptodome": {
            "hashes": [
                "sha256:15ced95a00b55bb2fc22f3dddde1c8d6f270089f35c3af0e07306bc2ba1e1c4e",
                "sha256:18d8dfe31bf0cb53d58694903e526be68f3cf48e6e3c6dfbbc1e7042b1693af7",
                "sha256:2174fa555916b5ae8bcc7747ecfe2a4d5943b42c9dcf4878e269baaae264e85d",
                "sha256:6f64d8b63034fd9289bae4cb48aa8f7049f6b8db702c7af50cb3718821d28147",
                "sha256:8440a35ccd52f0eab0f4ece284bd13a587d86d79bd404d8914f81eda74a66de1",
                "sha256:8851b1e1d85e4fb981048c8a8a8431839103f43ea3c35f1b46bae2e41699f439",
                "sha256:9fc97cd0f6eeec59af736b3df81e5811d836fa646b89a4325672dcaf997250b3",
                "sha256:a9e3e3e9ab0241b0303206656a74d5cd6bd00fcad6f9ffd0ba6b8e35072f74d7",
                "sha256:ec560e62258358afd7a1a3d34c8860fdf478e28c0999173f2d5c618fd2fd60d3",
                "sha256:f0196124f83221f9c5e06a68e247019466395d35d92d4ce4482c835f75302851",
                "sha256:f7befe2249df41e012a3d8079ab3c7089be21969591eb77b21767fa24557a7b7"
            ],
            "index": "pypi",
            "version": "==3.4.7"
        },
        "pyjwt": {
            "hashes": [
                "sha256:bca523ef95586d3a8a5be2da766fe6f82754acba27689c984e28e77a12174593",
                "sha256:dacba5786fe3bf1a0ae8673874e29f9ac497860955c501289c63b15d3daae63a"
            ],
            "version": "==1.6.1"
        },
        "python-dateutil": {
            "hashes": [
                "sha256:3220490fb9741e2342e1cf29a503394fdac874bc39568288717ee67047ff29df",
                "sha256:9d8074be4c993fbe4947878ce593052f71dac82932a677d49194d8ce9778002e"
            ],
            "version": "==2.7.2"
        },
        "pyyaml": {
            "hashes": [
                "sha256:0c507b7f74b3d2dd4d1322ec8a94794927305ab4cebbe89cc47fe5e81541e6e8",
                "sha256:16b20e970597e051997d90dc2cddc713a2876c47e3d92d59ee198700c5427736",
                "sha256:3262c96a1ca437e7e4763e2843746588a965426550f3797a79fca9c6199c431f",
                "sha256:326420cbb492172dec84b0f65c80942de6cedb5233c413dd824483989c000608",
                "sha256:4474f8ea030b5127225b8894d626bb66c01cda098d47a2b0d3429b6700af9fd8",
                "sha256:592766c6303207a20efc445587778322d7f73b161bd994f227adaa341ba212ab",
                "sha256:5ac82e411044fb129bae5cfbeb3ba626acb2af31a8d17d175004b70862a741a7",
                "sha256:5f84523c076ad14ff5e6c037fe1c89a7f73a3e04cf0377cb4d017014976433f3",
                "sha256:827dc04b8fa7d07c44de11fabbc888e627fa8293b695e0f99cb544fdfa1bf0d1",
                "sha256:b4c423ab23291d3945ac61346feeb9a0dc4184999ede5e7c43e1ffb975130ae6",
                "sha256:bc6bced57f826ca7cb5125a10b23fd0f2fff3b7c4701d64c439a300ce665fff8",
                "sha256:c01b880ec30b5a6e6aa67b09a2fe3fb30473008c85cd6a67359a1b15ed6d83a4",
                "sha256:ca233c64c6e40eaa6c66ef97058cdc80e8d0157a443655baa1b2966e812807ca",
                "sha256:e863072cdf4c72eebf179342c94e6989c67185842d9997960b3e69290b2fa269"
            ],
            "index": "pypi",
            "version": "==3.12"
        },
        "requests": {
            "hashes": [
                "sha256:6a1b267aa90cac58ac3a765d067950e7dbbf75b1da07e895d1f594193a40a38b",
                "sha256:9c443e7324ba5b85070c4a818ade28bfabedf16ea10206da1132edaa6dda237e"
            ],
            "version": "==2.18.4"
        },
        "requests-oauthlib": {
            "hashes": [
                "sha256:50a8ae2ce8273e384895972b56193c7409601a66d4975774c60c2aed869639ca",
                "sha256:883ac416757eada6d3d07054ec7092ac21c7f35cb1d2cf82faf205637081f468"
            ],
            "version": "==0.8.0"
        },
        "six": {
            "hashes": [
                "sha256:70e8a77beed4562e7f14fe23a786b54f6296e34344c23bc42f07b15018ff98e9",
                "sha256:832dc0e10feb1aa2c68dcc57dbb658f1c7e65b9b61af69048abc87a2db00a0eb"
            ],
            "version": "==1.11.0"
        },
        "urllib3": {
            "hashes": [
                "sha256:06330f386d6e4b195fbfc736b297f58c5a892e4440e54d294d7004e3a9bbea1b",
                "sha256:cc44da8e1145637334317feebd728bd869a35285b93cbb4cca2577da7e62db4f"
            ],
            "version": "==1.22"
        }
    },
    "develop": {}
}
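Since the lock file carries sha256 digests for every wheel, installs from it are reproducible. A hedged example of consuming it strictly (pipenv's --deploy flag makes the install abort, rather than silently re-lock, when Pipfile.lock is out of date):

```sh
# Assumed verification-style install: fail fast if the lock file is stale.
pipenv install --deploy --python /usr/bin/python3.5m
```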
@@ -3,6 +3,7 @@
# This file is the entry point of the docker container.

set -e
source ~/.bashrc
echo "Initializing spark container"

# --------------------

@@ -25,15 +26,14 @@ done
# ----------------------------
# Run aztk setup python scripts
# ----------------------------
# use python v3.5.4 to run aztk software
# setup docker container
echo "Starting setup using Docker"

$(pyenv root)/versions/$AZTK_PYTHON_VERSION/bin/pip install -r $(dirname $0)/requirements.txt
export PYTHONPATH=$PYTHONPATH:$AZTK_WORKING_DIR
echo 'export PYTHONPATH=$PYTHONPATH:$AZTK_WORKING_DIR' >> ~/.bashrc

echo "Running main.py script"
$(pyenv root)/versions/$AZTK_PYTHON_VERSION/bin/python $(dirname $0)/main.py setup-spark-container
$AZTK_WORKING_DIR/.aztk-env/.venv/bin/python $(dirname $0)/main.py setup-spark-container

# sleep to keep container running
while true; do sleep 1; done
@@ -4,3 +4,4 @@ azure-mgmt-storage==1.5.0
azure-storage-blob==1.1.0
pyyaml==3.12
pycryptodome==3.4.7
@@ -11,12 +11,13 @@ container_name=$1
docker_repo_name=$2

echo "Installing pre-reqs"
apt-get -y install linux-image-extra-$(uname -r) linux-image-extra-virtual
apt-get -y install apt-transport-https
apt-get -y install curl
apt-get -y install ca-certificates
apt-get -y install software-properties-common
apt-get -y install python3-pip python-dev build-essential libssl-dev
apt-get -y update
apt-get install -y --no-install-recommends linux-image-extra-$(uname -r) linux-image-extra-virtual
apt-get install -y --no-install-recommends apt-transport-https
apt-get install -y --no-install-recommends curl
apt-get install -y --no-install-recommends ca-certificates
apt-get install -y --no-install-recommends software-properties-common
apt-get install -y --no-install-recommends python3-pip python3-venv python-dev build-essential libssl-dev
echo "Done installing pre-reqs"

# Install docker

@@ -78,12 +79,25 @@ else

echo "Node python version:"
python3 --version

# set up aztk python environment
export LC_ALL=C.UTF-8
export LANG=C.UTF-8
python3 -m pip install pipenv
mkdir -p $AZTK_WORKING_DIR/.aztk-env
cp $AZTK_WORKING_DIR/aztk/node_scripts/Pipfile $AZTK_WORKING_DIR/.aztk-env
cp $AZTK_WORKING_DIR/aztk/node_scripts/Pipfile.lock $AZTK_WORKING_DIR/.aztk-env
cd $AZTK_WORKING_DIR/.aztk-env
export PIPENV_VENV_IN_PROJECT=true
pipenv install --python /usr/bin/python3.5m
pipenv run pip install --upgrade setuptools wheel #TODO: add pip when pipenv is compatible with pip10

# Install python dependencies
pip3 install -r $(dirname $0)/requirements.txt
$AZTK_WORKING_DIR/.aztk-env/.venv/bin/pip install -r $(dirname $0)/requirements.txt
export PYTHONPATH=$PYTHONPATH:$AZTK_WORKING_DIR

echo "Running setup python script"
python3 $(dirname $0)/main.py setup-node $docker_repo_name
$AZTK_WORKING_DIR/.aztk-env/.venv/bin/python $(dirname $0)/main.py setup-node $docker_repo_name

# wait until container is running
until [ "`/usr/bin/docker inspect -f {{.State.Running}} $container_name`"=="true" ]; do

@@ -94,7 +108,7 @@ else

# wait until container setup is complete
echo "Waiting for spark docker container to setup."
docker exec spark /bin/bash -c 'python $AZTK_WORKING_DIR/aztk/node_scripts/wait_until_setup_complete.py'
docker exec spark /bin/bash -c '$AZTK_WORKING_DIR/.aztk-env/.venv/bin/python $AZTK_WORKING_DIR/aztk/node_scripts/wait_until_setup_complete.py'

# Setup symbolic link for the docker logs
docker_log=$(docker inspect --format='{{.LogPath}}' $container_name)
@@ -101,7 +101,7 @@ def __cluster_install_cmd(zip_resource_file: batch_models.ResourceFile,
        'apt-get -y update',
        'apt-get install --fix-missing',
        'apt-get -y install unzip',
        'unzip $AZ_BATCH_TASK_WORKING_DIR/{0}'.format(
        'unzip -o $AZ_BATCH_TASK_WORKING_DIR/{0}'.format(
            zip_resource_file.file_path),
        'chmod 777 $AZ_BATCH_TASK_WORKING_DIR/aztk/node_scripts/setup_host.sh',
        '/bin/bash $AZ_BATCH_TASK_WORKING_DIR/aztk/node_scripts/setup_host.sh {0} {1}'.format(
@@ -19,9 +19,10 @@ def __app_cmd():
    docker_exec.add_argument("-i")
    docker_exec.add_option("-e", "AZ_BATCH_TASK_WORKING_DIR=$AZ_BATCH_TASK_WORKING_DIR")
    docker_exec.add_option("-e", "AZ_BATCH_JOB_ID=$AZ_BATCH_JOB_ID")
    docker_exec.add_argument("spark /bin/bash >> output.log 2>&1 -c \""\
                             "source ~/.bashrc; "\
                             "python \$AZTK_WORKING_DIR/aztk/node_scripts/job_submission.py\"")
    docker_exec.add_argument("spark /bin/bash >> output.log 2>&1 -c \"" \
                             "source ~/.bashrc; " \
                             "export PYTHONPATH=$PYTHONPATH:\$AZTK_WORKING_DIR; " \
                             "$AZTK_WORKING_DIR/.aztk-env/.venv/bin/python \$AZTK_WORKING_DIR/aztk/node_scripts/job_submission.py\"")
    return docker_exec.to_str()
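For reference, the add_option/add_argument calls above assemble a single shell command string. Roughly (assuming the CommandBuilder joins options and arguments in order when to_str() is called), the new command expands to:

```sh
# Approximate expansion of the new __app_cmd() output (illustrative, not
# generated verbatim by the code above).
docker exec -i \
    -e AZ_BATCH_TASK_WORKING_DIR=$AZ_BATCH_TASK_WORKING_DIR \
    -e AZ_BATCH_JOB_ID=$AZ_BATCH_JOB_ID \
    spark /bin/bash >> output.log 2>&1 -c "source ~/.bashrc; \
    export PYTHONPATH=$PYTHONPATH:\$AZTK_WORKING_DIR; \
    \$AZTK_WORKING_DIR/.aztk-env/.venv/bin/python \$AZTK_WORKING_DIR/aztk/node_scripts/job_submission.py"
```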
@@ -82,10 +82,10 @@ def generate_task(spark_client, container_id, application):
    task_cmd.add_option('-e', 'AZ_BATCH_TASK_WORKING_DIR=$AZ_BATCH_TASK_WORKING_DIR')
    task_cmd.add_option('-e', 'STORAGE_LOGS_CONTAINER={0}'.format(container_id))
    task_cmd.add_argument('spark /bin/bash >> output.log 2>&1')
    task_cmd.add_argument('-c "source ~/.bashrc; '\
    task_cmd.add_argument('-c "source ~/.bashrc; ' \
                          'export PYTHONPATH=$PYTHONPATH:\$AZTK_WORKING_DIR; ' \
                          'cd $AZ_BATCH_TASK_WORKING_DIR; ' \
                          '\$(pyenv root)/versions/\$AZTK_PYTHON_VERSION/bin/python ' \
                          '\$AZTK_WORKING_DIR/aztk/node_scripts/submit.py"')
                          '\$AZTK_WORKING_DIR/.aztk-env/.venv/bin/python \$AZTK_WORKING_DIR/aztk/node_scripts/submit.py"')

    # Create task
    task = batch_models.TaskAddParameter(
@@ -13,13 +13,13 @@ echo "Is master: $AZTK_IS_MASTER"
if [ "$AZTK_IS_MASTER" = "true" ]; then
    pip install jupyter --upgrade
    pip install notebook --upgrade

    PYSPARK_DRIVER_PYTHON="/.pyenv/versions/${USER_PYTHON_VERSION}/bin/jupyter"
    JUPYTER_KERNELS="/.pyenv/versions/${USER_PYTHON_VERSION}/share/jupyter/kernels"

    PYSPARK_DRIVER_PYTHON="/opt/conda/bin/jupyter"
    JUPYTER_KERNELS="/opt/conda/share/jupyter/kernels"

    # disable password/token on jupyter notebook
    jupyter notebook --generate-config --allow-root
    JUPYTER_CONFIG='/.jupyter/jupyter_notebook_config.py'
    JUPYTER_CONFIG='/root/.jupyter/jupyter_notebook_config.py'
    echo >> $JUPYTER_CONFIG
    echo -e 'c.NotebookApp.token=""' >> $JUPYTER_CONFIG
    echo -e 'c.NotebookApp.password=""' >> $JUPYTER_CONFIG
@@ -9,12 +9,12 @@
if [ "$AZTK_IS_MASTER" = "true" ]; then
    conda install -c conda-force jupyterlab

    PYSPARK_DRIVER_PYTHON="/.pyenv/versions/${USER_PYTHON_VERSION}/bin/jupyter"
    JUPYTER_KERNELS="/.pyenv/versions/${USER_PYTHON_VERSION}/share/jupyter/kernels"
    PYSPARK_DRIVER_PYTHON="/opt/conda/bin/jupyter"
    JUPYTER_KERNELS="/opt/conda/share/jupyter/kernels"

    # disable password/token on jupyter notebook
    jupyter lab --generate-config --allow-root
    JUPYTER_CONFIG='/.jupyter/jupyter_notebook_config.py'
    JUPYTER_CONFIG='/root/.jupyter/jupyter_notebook_config.py'
    echo >> $JUPYTER_CONFIG
    echo -e 'c.NotebookApp.token=""' >> $JUPYTER_CONFIG
    echo -e 'c.NotebookApp.password=""' >> $JUPYTER_CONFIG
@@ -11,6 +11,7 @@ if [ "$AZTK_IS_MASTER" = "true" ]; then

    ## Download and install Rstudio Server
    wget https://download2.rstudio.org/rstudio-server-$RSTUDIO_SERVER_VERSION-amd64.deb
    apt-get install -y --no-install-recommends gdebi-core
    gdebi rstudio-server-$RSTUDIO_SERVER_VERSION-amd64.deb --non-interactive
    echo "server-app-armor-enabled=0" | tee -a /etc/rstudio/rserver.conf
    rm rstudio-server-$RSTUDIO_SERVER_VERSION-amd64.deb
@@ -1,2 +1,2 @@
#!/bin/bash
python $DOCKER_WORKING_DIR/plugins/spark_ui_proxy/spark_ui_proxy.py $1 $2 &
python $AZTK_WORKING_DIR/plugins/spark_ui_proxy/spark_ui_proxy.py $1 $2 &
@@ -2,10 +2,10 @@ import os
"""
    DOCKER
"""
DEFAULT_DOCKER_REPO = "aztk/base:latest"
DEFAULT_DOCKER_REPO_GPU = "aztk/gpu:latest"
DEFAULT_SPARK_PYTHON_DOCKER_REPO = "aztk/python:latest"
DEFAULT_SPARK_R_BASE_DOCKER_REPO = "aztk/r-base:latest"
DEFAULT_DOCKER_REPO = "aztk/spark:v0.1.0-spark2.3.0-base"
DEFAULT_DOCKER_REPO_GPU = "aztk/spark:v0.1.0-spark2.3.0-gpu"
DEFAULT_SPARK_PYTHON_DOCKER_REPO = "aztk/spark:v0.1.0-spark2.3.0-miniconda-base"
DEFAULT_SPARK_R_BASE_DOCKER_REPO = "aztk/spark:v0.1.0-spark2.3.0-r-base"
DOCKER_SPARK_CONTAINER_NAME = "spark"

# DOCKER SPARK
@@ -14,7 +14,7 @@ size: 2
# username: <username for the linux user to be created> (optional)
username: spark

# docker_repo: <name of docker image repo (for more information, see https://github.com/Azure/aztk/blob/master/docs/12-docker-image.md)>
# docker_repo: <name of docker image repo (for more information, see https://github.com/Azure/aztk/blob/v0.7.0/docs/12-docker-image.md)>
docker_repo:

# # optional custom scripts to run on the Spark master, Spark worker or all nodes in the cluster
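To pin a cluster to a specific image rather than the default, the docker_repo key shown above can be set explicitly. An assumed example using one of the versioned tags introduced in the constants change earlier in this diff:

```yaml
# cluster.yaml (assumed usage; tag taken from the new defaults in constants.py)
docker_repo: aztk/spark:v0.1.0-spark2.3.0-miniconda-base
```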
Binary data: aztk_cli/config/jars/azure-data-lake-store-sdk-2.0.11.jar (binary file not shown)
Binary data: aztk_cli/config/jars/azure-storage-2.0.0.jar (binary file not shown)
Binary data: aztk_cli/config/jars/hadoop-azure-2.7.3.jar (binary file not shown)
Binary data: aztk_cli/config/jars/hadoop-azure-datalake-3.0.0-alpha2.jar (binary file not shown)
@@ -1,7 +1,7 @@
# Job Configuration
# An Aztk Job is a cluster and an array of Spark applications to run on that cluster
# AZTK Spark Jobs will automatically manage the lifecycle of the cluster
# For more information see the documentation at: https://github.com/Azure/aztk/blob/master/docs/70-jobs.md
# For more information see the documentation at: https://github.com/Azure/aztk/blob/v0.7.0/docs/70-jobs.md

job:
  id:
@@ -1,5 +1,5 @@
# For instructions on creating a Batch and Storage account, see
# Getting Started (https://github.com/Azure/aztk/blob/master/docs/00-getting-started.md)
# Getting Started (https://github.com/Azure/aztk/blob/v0.7.0/docs/00-getting-started.md)
# NOTE - YAML requires a space after the colon. Ex: "batchaccountname: mybatchaccount"

service_principal:
@@ -12,12 +12,12 @@ if [ "$AZTK_IS_MASTER" = "true" ]; then
    pip install jupyter --upgrade
    pip install notebook --upgrade

    PYSPARK_DRIVER_PYTHON="/.pyenv/versions/${USER_PYTHON_VERSION}/bin/jupyter"
    JUPYTER_KERNELS="/.pyenv/versions/${USER_PYTHON_VERSION}/share/jupyter/kernels"
    PYSPARK_DRIVER_PYTHON="/opt/conda/bin/jupyter"
    JUPYTER_KERNELS="/opt/conda/share/jupyter/kernels"

    # disable password/token on jupyter notebook
    jupyter notebook --generate-config --allow-root
    JUPYTER_CONFIG='/.jupyter/jupyter_notebook_config.py'
    JUPYTER_CONFIG='/root/.jupyter/jupyter_notebook_config.py'
    echo >> $JUPYTER_CONFIG
    echo -e 'c.NotebookApp.token=""' >> $JUPYTER_CONFIG
    echo -e 'c.NotebookApp.password=""' >> $JUPYTER_CONFIG
@@ -2,114 +2,3 @@
Azure Distributed Data Engineering Toolkit uses Docker containers to run Spark.

Please refer to the docs for details on [how to select a docker-repo at cluster creation time](../docs/12-docker-image.md).

## Supported Images
By default, this toolkit will use the base Spark image, __aztk/base__. This image contains the bare minimum to get Spark up and running in standalone mode.

On top of that, we also provide additional flavors of Spark images: one geared towards the Python user (PySpark), and the other geared towards the R user (SparklyR or SparkR).

Docker Image | Image Type | User Language(s) | What's Included?
:-- | :-- | :-- | :--
[aztk/base](https://hub.docker.com/r/aztk/base/) | Base | Java, Scala | `Spark`
[aztk/python](https://hub.docker.com/r/aztk/python/) | Pyspark | Python | `Anaconda`</br>`Jupyter Notebooks`</br>`PySpark`
[aztk/r-base](https://hub.docker.com/r/aztk/r-base/) | SparklyR | R | `CRAN`</br>`RStudio Server`</br>`SparklyR and SparkR`

__aztk/gpu__, __aztk/python__ and __aztk/r-base__ images are built on top of the __aztk/base__ image.

All the AZTK images are hosted on Docker Hub under [aztk](https://hub.docker.com/r/aztk).

### Matrix of Supported Container Images:

Docker Repo (hosted on Docker Hub) | Spark Version | Python Version | R Version | CUDA Version | cudNN Version
:-- | :-- | :-- | :-- | :-- | :--
aztk/base:spark2.2.0 __(default)__ | v2.2.0 | -- | -- | -- | --
aztk/base:spark2.1.0 | v2.1.0 | -- | -- | -- | --
aztk/base:spark1.6.3 | v1.6.3 | -- | -- | -- | --
aztk/gpu:spark2.2.0 | v2.2.0 | -- | -- | 8.0 | 6.0
aztk/gpu:spark2.1.0 | v2.1.0 | -- | -- | 8.0 | 6.0
aztk/gpu:spark1.6.3 | v1.6.3 | -- | -- | 8.0 | 6.0
aztk/python:spark2.2.0-python3.6.2-base | v2.2.0 | v3.6.2 | -- | -- | --
aztk/python:spark2.1.0-python3.6.2-base | v2.1.0 | v3.6.2 | -- | -- | --
aztk/python:spark1.6.3-python3.6.2-base | v1.6.3 | v3.6.2 | -- | -- | --
aztk/python:spark2.2.0-python3.6.2-gpu | v2.2.0 | v3.6.2 | -- | 8.0 | 6.0
aztk/python:spark2.1.0-python3.6.2-gpu | v2.1.0 | v3.6.2 | -- | 8.0 | 6.0
aztk/python:spark1.6.3-python3.6.2-gpu | v1.6.3 | v3.6.2 | -- | 8.0 | 6.0
aztk/r-base:spark2.2.0-r3.4.1-base | v2.2.0 | -- | v3.4.1 | -- | --
aztk/r-base:spark2.1.0-r3.4.1-base | v2.1.0 | -- | v3.4.1 | -- | --
aztk/r-base:spark1.6.3-r3.4.1-base | v1.6.3 | -- | v3.4.1 | -- | --

If you have requests to add to the list of supported images, please file a Github issue.

NOTE: Spark clusters that use the __aztk/gpu__, __aztk/python__ or __aztk/r-base__ images take longer to provision because these Docker images are significantly larger than the __aztk/base__ image.

### Gallery of 3rd Party Images
Since this toolkit uses Docker containers to run Spark, users can bring their own images. Here's a list of 3rd party images:
- *coming soon*

(See below for a how-to guide on building your own images for the Azure Distributed Data Engineering Toolkit)

# How do I use my own Docker Image?
Building your own Docker image to use with this toolkit has many advantages for users who want more customization over their environment. For some, this may look like installing specific, and even private, libraries that their Spark jobs require. For others, it may just be setting up a version of Spark, Python or R that fits their particular needs.

This section is for users who want to build their own docker images.

## Building Your Own Docker Image
The Azure Distributed Data Engineering Toolkit supports custom Docker images. To guarantee that your Spark deployment works, we recommend that you build on top of one of our __aztk/base__ images. You can also build on top of our __aztk/python__ or __aztk/r-base__ images, but note that they are also built on top of the __aztk/base__ image.

To build your own image, you can either build _on top_ or _beneath_ one of our supported images _OR_ you can just modify one of the supported Dockerfiles to build your own.

### Building on top
You can build on top of our images by referencing the __aztk/base__ image in the **FROM** keyword of your Dockerfile:
```sh
# Your custom Dockerfile

FROM aztk/base:spark2.2.0
...
```

### Building beneath
To build beneath one of our images, modify one of our Dockerfiles so that the **FROM** keyword pulls from your Docker image's location (as opposed to the default, which is a base Ubuntu image):
```sh
# One of the Dockerfiles that AZTK supports
# Change the FROM statement to point to your hosted image repo

FROM my_username/my_repo:latest
...
```

Please note that for this method to work, your Docker image must have been built on Ubuntu.

## Required Environment Variables
When layering your own Docker image, make sure your image does not interfere with the environment variables set in the __aztk/base__ Dockerfile, otherwise it may not work on AZTK.

Please make sure that the following environment variables are set:
- AZTK_PYTHON_VERSION
- JAVA_HOME
- SPARK_HOME

You also need to make sure that __PATH__ is correctly configured with $SPARK_HOME:
- PATH=$SPARK_HOME/bin:$PATH

By default, these are set as follows:
```sh
ENV AZTK_PYTHON_VERSION 3.5.4
ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
ENV SPARK_HOME /home/spark-current
ENV PATH $SPARK_HOME/bin:$PATH
```

If you are using your own version of Spark, make sure that it is symlinked by "/home/spark-current". **$SPARK_HOME** must also point to "/home/spark-current".

## Hosting your Docker Image
By default, this toolkit assumes that your Docker images are publicly hosted on Docker Hub. However, we also support hosting your images privately.

See [here](https://github.com/Azure/aztk/blob/master/docs/12-docker-image.md#using-a-custom-docker-image-that-is-privately-hosted) to learn more about using privately hosted Docker Images.

## Learn More
The Dockerfiles in this directory are used to build the Docker images used by this toolkit. Please reference the individual directories for more information on each Dockerfile:
- [Base](./base)
- [Python](./python)
- [R](./r)
@@ -0,0 +1,23 @@
FROM aztk/spark:v0.1.0-spark1.6.3-base

ARG ANACONDA_VERSION=Anaconda3-5.1.0

ENV PATH /opt/conda/bin:$PATH
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN apt-get update --fix-missing && apt-get install -y wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 \
    git mercurial subversion \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/archive/${ANACONDA_VERSION}-Linux-x86_64.sh -O ~/anaconda.sh \
    && /bin/bash ~/anaconda.sh -b -p /opt/conda \
    && rm ~/anaconda.sh \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc \
    # reset default python to 3.5
    && rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

CMD ["/bin/bash"]
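Each of the Anaconda Dockerfiles in this commit follows the template above, differing only in the FROM line. A hypothetical local build and smoke test (the output tag is illustrative, not the project's official build script):

```sh
# Build one Anaconda variant from its directory and confirm conda's python is on PATH.
docker build -t myrepo/aztk-anaconda:spark1.6.3 .
docker run --rm myrepo/aztk-anaconda:spark1.6.3 python --version
```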
@@ -0,0 +1,23 @@
FROM aztk/spark:v0.1.0-spark1.6.3-gpu

ARG ANACONDA_VERSION=Anaconda3-5.1.0

ENV PATH /opt/conda/bin:$PATH
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN apt-get update --fix-missing && apt-get install -y wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 \
    git mercurial subversion \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/archive/${ANACONDA_VERSION}-Linux-x86_64.sh -O ~/anaconda.sh \
    && /bin/bash ~/anaconda.sh -b -p /opt/conda \
    && rm ~/anaconda.sh \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc \
    # reset default python to 3.5
    && rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

CMD ["/bin/bash"]
@@ -0,0 +1,23 @@
FROM aztk/spark:v0.1.0-spark2.1.0-base

ARG ANACONDA_VERSION=Anaconda3-5.1.0

ENV PATH /opt/conda/bin:$PATH
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN apt-get update --fix-missing && apt-get install -y wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 \
    git mercurial subversion \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/archive/${ANACONDA_VERSION}-Linux-x86_64.sh -O ~/anaconda.sh \
    && /bin/bash ~/anaconda.sh -b -p /opt/conda \
    && rm ~/anaconda.sh \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc \
    # reset default python to 3.5
    && rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

CMD ["/bin/bash"]
@@ -0,0 +1,23 @@
FROM aztk/spark:v0.1.0-spark2.1.0-gpu

ARG ANACONDA_VERSION=Anaconda3-5.1.0

ENV PATH /opt/conda/bin:$PATH
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN apt-get update --fix-missing && apt-get install -y wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 \
    git mercurial subversion \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/archive/${ANACONDA_VERSION}-Linux-x86_64.sh -O ~/anaconda.sh \
    && /bin/bash ~/anaconda.sh -b -p /opt/conda \
    && rm ~/anaconda.sh \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc \
    # reset default python to 3.5
    && rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

CMD ["/bin/bash"]
@@ -0,0 +1,23 @@
FROM aztk/spark:v0.1.0-spark2.2.0-base

ARG ANACONDA_VERSION=Anaconda3-5.1.0

ENV PATH /opt/conda/bin:$PATH
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN apt-get update --fix-missing && apt-get install -y wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 \
    git mercurial subversion \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/archive/${ANACONDA_VERSION}-Linux-x86_64.sh -O ~/anaconda.sh \
    && /bin/bash ~/anaconda.sh -b -p /opt/conda \
    && rm ~/anaconda.sh \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc \
    # reset default python to 3.5
    && rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

CMD ["/bin/bash"]
@@ -0,0 +1,23 @@
FROM aztk/spark:v0.1.0-spark2.2.0-gpu

ARG ANACONDA_VERSION=Anaconda3-5.1.0

ENV PATH /opt/conda/bin:$PATH
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN apt-get update --fix-missing && apt-get install -y wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 \
    git mercurial subversion \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/archive/${ANACONDA_VERSION}-Linux-x86_64.sh -O ~/anaconda.sh \
    && /bin/bash ~/anaconda.sh -b -p /opt/conda \
    && rm ~/anaconda.sh \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc \
    # reset default python to 3.5
    && rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

CMD ["/bin/bash"]
@@ -0,0 +1,23 @@
FROM aztk/spark:v0.1.0-spark2.3.0-base

ARG ANACONDA_VERSION=Anaconda3-5.1.0

ENV PATH /opt/conda/bin:$PATH
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN apt-get update --fix-missing && apt-get install -y wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 \
    git mercurial subversion \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/archive/${ANACONDA_VERSION}-Linux-x86_64.sh -O ~/anaconda.sh \
    && /bin/bash ~/anaconda.sh -b -p /opt/conda \
    && rm ~/anaconda.sh \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc \
    # reset default python to 3.5
    && rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

CMD ["/bin/bash"]
@@ -0,0 +1,23 @@
FROM aztk/spark:v0.1.0-spark2.3.0-gpu

ARG ANACONDA_VERSION=Anaconda3-5.1.0

ENV PATH /opt/conda/bin:$PATH
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN apt-get update --fix-missing && apt-get install -y wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 \
    git mercurial subversion \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/archive/${ANACONDA_VERSION}-Linux-x86_64.sh -O ~/anaconda.sh \
    && /bin/bash ~/anaconda.sh -b -p /opt/conda \
    && rm ~/anaconda.sh \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc \
    # reset default python to 3.5
    && rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

CMD ["/bin/bash"]
@@ -1,16 +1,22 @@
# Ubuntu 16.04 (Xenial)
FROM ubuntu:16.04

# set version of python required for thunderbolt application
ENV AZTK_PYTHON_VERSION=3.5.4
# set AZTK version compatibility
ENV AZTK_DOCKER_IMAGE_VERSION 0.1.0

# set version of python required for aztk
ENV AZTK_PYTHON_VERSION=3.5.2

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG SPARK_VERSION_KEY=spark-1.6.3-bin-hadoop2.6
ENV SPARK_VERSION_KEY 1.6.3
ENV SPARK_FULL_VERSION spark-${SPARK_VERSION_KEY}-bin-without-hadoop
ENV HADOOP_VERSION 2.8.3
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

# set up env vars for pyenv
ENV HOME /
ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
# set env vars
ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
ENV SPARK_HOME /home/spark-current
ENV PATH $SPARK_HOME/bin:$PATH

RUN apt-get clean \
    && apt-get update -y \

@@ -23,39 +29,130 @@ RUN apt-get clean \
    libbz2-dev \
    libreadline-dev \
    libsqlite3-dev \
    maven \
    wget \
    curl \
    llvm \
    git \
    libncurses5-dev \
    libncursesw5-dev \
    python3-pip \
    python3-venv \
    xz-utils \
    tk-dev \
    && apt-get update -y \
    # install [software-properties-common]
    # so we can use [apt-add-repository] to add the repository [ppa:webupd8team/java]
    # install [software-properties-common]
    # so we can use [apt-add-repository] to add the repository [ppa:webupd8team/java]
    # from which we install Java8
    && apt-get install -y --no-install-recommends software-properties-common \
    && apt-add-repository ppa:webupd8team/java -y \
    && apt-get update -y \
    # install java
    && apt-get install -y --no-install-recommends default-jdk \
    # download pyenv
    && git clone git://github.com/yyuu/pyenv.git .pyenv \
    && git clone https://github.com/yyuu/pyenv-virtualenv.git ~/.pyenv/plugins/pyenv-virtualenv \
    # install & setup pyenv
    && eval "$(pyenv init -)" \
    && echo 'eval "$(pyenv init -)"' >> ~/.bashrc \
    # install aztk required python version
    && env PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install -f $AZTK_PYTHON_VERSION \
    && pyenv global $AZTK_PYTHON_VERSION \
    # install spark & setup symlink to SPARK_HOME
    && curl https://d3kbcqa49mib13.cloudfront.net/$SPARK_VERSION_KEY.tgz | tar xvz -C /home \
    && ln -s /home/$SPARK_VERSION_KEY /home/spark-current

# set env vars
ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
ENV SPARK_HOME /home/spark-current
ENV PATH $SPARK_HOME/bin:$PATH
    # set up user python and aztk python
    && ln -s /usr/bin/python3.5 /usr/bin/python \
    && /usr/bin/python -m pip install --upgrade pip setuptools wheel \
    && apt-get remove -y python3-pip \
    # build and install spark
    && git clone https://github.com/apache/spark.git \
    && cd spark \
    && git checkout tags/v${SPARK_VERSION_KEY} \
    && export MAVEN_OPTS="-Xmx3g -XX:ReservedCodeCacheSize=1024m" \
    && ./make-distribution.sh --name custom-spark --tgz -Phive -Phive-thriftserver -Dhadoop.version=${HADOOP_VERSION} -Phadoop-2.6 -DskipTests \
    && tar -xvzf /spark/spark-${SPARK_VERSION_KEY}-bin-custom-spark.tgz --directory=/home \
    && ln -s "/home/spark-${SPARK_VERSION_KEY}-bin-custom-spark" /home/spark-current \
    && rm -rf /spark \
    # copy azure storage jars and dependencies to $SPARK_HOME/jars
    && echo "<project>" \
        "<modelVersion>4.0.0</modelVersion>" \
        "<groupId>groupId</groupId>" \
        "<artifactId>artifactId</artifactId>" \
        "<version>1.0</version>" \
        "<dependencies>" \
        "<dependency>" \
        "<groupId>org.apache.hadoop</groupId>" \
        "<artifactId>hadoop-azure-datalake</artifactId>" \
        "<version>${HADOOP_VERSION}</version>" \
        "<exclusions>" \
        "<exclusion>" \
        "<groupId>org.apache.hadoop</groupId>" \
        "<artifactId>hadoop-common</artifactId>" \
        "</exclusion>" \
        "</exclusions> " \
        "</dependency>" \
        "<dependency>" \
        "<groupId>org.apache.hadoop</groupId>" \
        "<artifactId>hadoop-azure</artifactId>" \
        "<version>${HADOOP_VERSION}</version>" \
        "<exclusions>" \
        "<exclusion>" \
        "<groupId>org.apache.hadoop</groupId>" \
        "<artifactId>hadoop-common</artifactId>" \
        "</exclusion>" \
        "<exclusion>" \
        "<groupId>com.fasterxml.jackson.core</groupId>" \
        "<artifactId>jackson-core</artifactId>" \
        "</exclusion>" \
        "</exclusions> " \
        "</dependency>" \
        "<dependency>" \
        "<groupId>com.microsoft.sqlserver</groupId>" \
        "<artifactId>mssql-jdbc</artifactId>" \
        "<version>6.4.0.jre8</version>" \
        "</dependency>" \
        "<dependency>" \
        "<groupId>com.microsoft.azure</groupId>" \
        "<artifactId>azure-storage</artifactId>" \
        "<version>2.2.0</version>" \
        "<exclusions>" \
        "<exclusion>" \
        "<groupId>com.fasterxml.jackson.core</groupId>" \
        "<artifactId>jackson-core</artifactId>" \
        "</exclusion>" \
        "<exclusion>" \
        "<groupId>org.apache.commons</groupId>" \
        "<artifactId>commons-lang3</artifactId>" \
        "</exclusion>" \
        "<exclusion>" \
        "<groupId>org.slf4j</groupId>" \
        "<artifactId>slf4j-api</artifactId>" \
        "</exclusion>" \
        "</exclusions>" \
        "</dependency>" \
        "<dependency>" \
        "<groupId>com.microsoft.azure</groupId>" \
        "<artifactId>azure-cosmosdb-spark_2.1.0_2.11</artifactId>" \
        "<version>1.1.1</version>" \
        "<exclusions>" \
        "<exclusion>" \
        "<groupId>org.apache.tinkerpop</groupId>" \
        "<artifactId>tinkergraph-gremlin</artifactId>" \
        "</exclusion>" \
        "<exclusion>" \
        "<groupId>org.apache.tinkerpop</groupId>" \
        "<artifactId>spark-gremlin</artifactId>" \
        "</exclusion>" \
        "<exclusion>" \
        "<groupId>io.netty</groupId>" \
        "<artifactId>*</artifactId>" \
        "</exclusion>" \
        "<exclusion>" \
        "<groupId>com.fasterxml.jackson.core</groupId>" \
        "<artifactId>jackson-annotations</artifactId>" \
        "</exclusion>" \
        "</exclusions> " \
        "</dependency>" \
        "</dependencies>" \
        "</project>" > /tmp/pom.xml \
    && cd /tmp \
    && mvn dependency:copy-dependencies -DoutputDirectory="${SPARK_HOME}/jars/" \
    # cleanup
    && apt-get --purge autoremove -y maven python3-pip \
    && apt-get autoremove -y \
    && apt-get autoclean -y \
    && rm -rf /tmp/* \
    && rm -rf /root/.cache \
    && rm -rf /root/.m2 \
    && rm -rf /var/lib/apt/lists/*

CMD ["/bin/bash"]
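The echo-generated pom.xml trick above lets Maven fetch the Azure storage, SQL Server, and Cosmos DB connector jars (minus the excluded conflicting artifacts) straight into Spark's classpath at $SPARK_HOME/jars. A hypothetical sanity check against a locally built image:

```sh
# Confirm the copied dependencies landed next to Spark's own jars
# (image tag is illustrative; /home/spark-current is $SPARK_HOME in the Dockerfile).
docker run --rm myrepo/aztk-base:spark1.6.3 ls /home/spark-current/jars | grep -i azure
```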
@ -1,16 +1,22 @@
|
|||
# Ubuntu 16.04 (Xenial)
|
||||
FROM ubuntu:16.04
|
||||
|
||||
# set version of python required for thunderbolt application
|
||||
ENV AZTK_PYTHON_VERSION=3.5.4
|
||||
# set AZTK version compatibility
|
||||
ENV AZTK_DOCKER_IMAGE_VERSION 0.1.0
|
||||
|
||||
# set version of python required for aztk
|
||||
ENV AZTK_PYTHON_VERSION=3.5.2
|
||||
|
||||
# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
|
||||
ARG SPARK_VERSION_KEY=spark-2.1.0-bin-hadoop2.7
|
||||
ENV SPARK_VERSION_KEY 2.1.0
|
||||
ENV SPARK_FULL_VERSION spark-${SPARK_VERSION_KEY}-bin-without-hadoop
|
||||
ENV HADOOP_VERSION 2.8.3
|
||||
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
|
||||
|
||||
# set up env vars for pyenv
|
||||
ENV HOME /
|
||||
ENV PYENV_ROOT $HOME/.pyenv
|
||||
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
|
||||
# set env vars
|
||||
ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
|
||||
ENV SPARK_HOME /home/spark-current
|
||||
ENV PATH $SPARK_HOME/bin:$PATH
|
||||
|
||||
RUN apt-get clean \
|
||||
&& apt-get update -y \
|
||||
|
@ -23,39 +29,130 @@ RUN apt-get clean \
|
|||
libbz2-dev \
|
||||
libreadline-dev \
|
||||
libsqlite3-dev \
|
||||
maven \
|
||||
wget \
|
||||
curl \
|
||||
llvm \
|
||||
git \
|
||||
libncurses5-dev \
|
||||
libncursesw5-dev \
|
||||
python3-pip \
|
||||
python3-venv \
|
||||
xz-utils \
|
||||
tk-dev \
|
||||
&& apt-get update -y \
|
||||
# install [software-properties-common]
|
||||
# so we can use [apt-add-repository] to add the repository [ppa:webupd8team/java]
|
||||
# install [software-properties-common]
|
||||
# so we can use [apt-add-repository] to add the repository [ppa:webupd8team/java]
|
||||
# from which we install Java8
|
||||
&& apt-get install -y --no-install-recommends software-properties-common \
|
||||
&& apt-add-repository ppa:webupd8team/java -y \
|
||||
&& apt-get update -y \
|
||||
# install java
|
||||
&& apt-get install -y --no-install-recommends default-jdk \
|
||||
# download pyenv
|
||||
&& git clone git://github.com/yyuu/pyenv.git .pyenv \
|
||||
&& git clone https://github.com/yyuu/pyenv-virtualenv.git ~/.pyenv/plugins/pyenv-virtualenv \
|
||||
# install & setup pyenv
|
||||
&& eval "$(pyenv init -)" \
|
||||
&& echo 'eval "$(pyenv init -)"' >> ~/.bashrc \
|
||||
# install aztk required python version
|
||||
&& env PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install -f $AZTK_PYTHON_VERSION \
|
||||
&& pyenv global $AZTK_PYTHON_VERSION \
|
||||
# install spark & setup symlink to SPARK_HOME
|
||||
&& curl https://d3kbcqa49mib13.cloudfront.net/$SPARK_VERSION_KEY.tgz | tar xvz -C /home \
|
||||
&& ln -s /home/$SPARK_VERSION_KEY /home/spark-current
|
||||
|
||||
# set env vars
|
||||
ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
|
||||
ENV SPARK_HOME /home/spark-current
|
||||
ENV PATH $SPARK_HOME/bin:$PATH
|
||||
# set up user python and aztk python
|
||||
&& ln -s /usr/bin/python3.5 /usr/bin/python \
|
||||
&& /usr/bin/python -m pip install --upgrade pip setuptools wheel \
|
||||
&& apt-get remove -y python3-pip \
|
# build and install spark
    && git clone https://github.com/apache/spark.git \
    && cd spark \
    && git checkout tags/v${SPARK_VERSION_KEY} \
    && export MAVEN_OPTS="-Xmx3g -XX:ReservedCodeCacheSize=1024m" \
    && ./dev/make-distribution.sh --name custom-spark --pip --tgz -Phive -Phive-thriftserver -Dhadoop.version=${HADOOP_VERSION} -DskipTests \
    && tar -xvzf /spark/spark-${SPARK_VERSION_KEY}-bin-custom-spark.tgz --directory=/home \
    && ln -s "/home/spark-${SPARK_VERSION_KEY}-bin-custom-spark" /home/spark-current \
    && rm -rf /spark \
    # copy azure storage jars and dependencies to $SPARK_HOME/jars
    && echo "<project>" \
    "<modelVersion>4.0.0</modelVersion>" \
    "<groupId>groupId</groupId>" \
    "<artifactId>artifactId</artifactId>" \
    "<version>1.0</version>" \
    "<dependencies>" \
    "<dependency>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-azure-datalake</artifactId>" \
    "<version>${HADOOP_VERSION}</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-common</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-azure</artifactId>" \
    "<version>${HADOOP_VERSION}</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-common</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>com.fasterxml.jackson.core</groupId>" \
    "<artifactId>jackson-core</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>com.microsoft.sqlserver</groupId>" \
    "<artifactId>mssql-jdbc</artifactId>" \
    "<version>6.4.0.jre8</version>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>com.microsoft.azure</groupId>" \
    "<artifactId>azure-storage</artifactId>" \
    "<version>2.2.0</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>com.fasterxml.jackson.core</groupId>" \
    "<artifactId>jackson-core</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>org.apache.commons</groupId>" \
    "<artifactId>commons-lang3</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>org.slf4j</groupId>" \
    "<artifactId>slf4j-api</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>com.microsoft.azure</groupId>" \
    "<artifactId>azure-cosmosdb-spark_${SPARK_VERSION_KEY}_2.11</artifactId>" \
    "<version>1.1.1</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>org.apache.tinkerpop</groupId>" \
    "<artifactId>tinkergraph-gremlin</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>org.apache.tinkerpop</groupId>" \
    "<artifactId>spark-gremlin</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>io.netty</groupId>" \
    "<artifactId>*</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>com.fasterxml.jackson.core</groupId>" \
    "<artifactId>jackson-annotations</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "</dependencies>" \
    "</project>" > /tmp/pom.xml \
    && cd /tmp \
    && mvn dependency:copy-dependencies -DoutputDirectory="${SPARK_HOME}/jars/" \
    # cleanup
    && apt-get --purge autoremove -y maven python3-pip \
    && apt-get autoremove -y \
    && apt-get autoclean -y \
    && rm -rf /tmp/* \
    && rm -rf /root/.cache \
    && rm -rf /root/.m2 \
    && rm -rf /var/lib/apt/lists/*

CMD ["/bin/bash"]
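The throwaway pom.xml above exists only so that `mvn dependency:copy-dependencies` can resolve the Azure storage connectors and their transitive dependencies straight into `$SPARK_HOME/jars/`. A quick sanity check after building one of these base images might look like the following sketch (the tag and build directory are illustrative, not part of this commit):

```sh
# build the image from the directory containing this Dockerfile (tag is an example)
docker build -t aztk/spark:v0.1.0-spark2.2.0-base .

# list the Azure and SQL Server jars the maven step copied into $SPARK_HOME/jars
docker run --rm aztk/spark:v0.1.0-spark2.2.0-base \
    sh -c 'ls /home/spark-current/jars | grep -i -e azure -e mssql'
```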
@@ -1,16 +1,22 @@
# Ubuntu 16.04 (Xenial)
FROM ubuntu:16.04

# set version of python required for thunderbolt application
ENV AZTK_PYTHON_VERSION=3.5.4
# set AZTK version compatibility
ENV AZTK_DOCKER_IMAGE_VERSION 0.1.0

# set version of python required for aztk
ENV AZTK_PYTHON_VERSION=3.5.2

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG SPARK_VERSION_KEY=spark-2.2.0-bin-hadoop2.7
ENV SPARK_VERSION_KEY 2.2.0
ENV SPARK_FULL_VERSION spark-${SPARK_VERSION_KEY}-bin-without-hadoop
ENV HADOOP_VERSION 2.8.3
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

# set up env vars for pyenv
ENV HOME /
ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
# set env vars
ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
ENV SPARK_HOME /home/spark-current
ENV PATH $SPARK_HOME/bin:$PATH

RUN apt-get clean \
    && apt-get update -y \

@@ -23,39 +29,129 @@ RUN apt-get clean \
    libbz2-dev \
    libreadline-dev \
    libsqlite3-dev \
    maven \
    wget \
    curl \
    llvm \
    git \
    libncurses5-dev \
    libncursesw5-dev \
    python3-pip \
    python3-venv \
    xz-utils \
    tk-dev \
    && apt-get update -y \
    # install [software-properties-common]
    # so we can use [apt-add-repository] to add the repository [ppa:webupd8team/java]
    # from which we install Java8
    && apt-get install -y --no-install-recommends software-properties-common \
    && apt-add-repository ppa:webupd8team/java -y \
    && apt-get update -y \
    # install java
    && apt-get install -y --no-install-recommends default-jdk \
    # download pyenv
    && git clone git://github.com/yyuu/pyenv.git .pyenv \
    && git clone https://github.com/yyuu/pyenv-virtualenv.git ~/.pyenv/plugins/pyenv-virtualenv \
    # install & setup pyenv
    && eval "$(pyenv init -)" \
    && echo 'eval "$(pyenv init -)"' >> ~/.bashrc \
    # install aztk required python version
    && env PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install -f $AZTK_PYTHON_VERSION \
    && pyenv global $AZTK_PYTHON_VERSION \
    # install spark & setup symlink to SPARK_HOME
    && curl https://d3kbcqa49mib13.cloudfront.net/$SPARK_VERSION_KEY.tgz | tar xvz -C /home \
    && ln -s /home/$SPARK_VERSION_KEY /home/spark-current

# set env vars
ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
ENV SPARK_HOME /home/spark-current
ENV PATH $SPARK_HOME/bin:$PATH
    # set up user python
    && ln -s /usr/bin/python3.5 /usr/bin/python \
    && /usr/bin/python -m pip install --upgrade pip setuptools wheel \
    # build and install spark
    && git clone https://github.com/apache/spark.git \
    && cd spark \
    && git checkout tags/v${SPARK_VERSION_KEY} \
    && export MAVEN_OPTS="-Xmx3g -XX:ReservedCodeCacheSize=1024m" \
    && ./dev/make-distribution.sh --name custom-spark --pip --tgz -Phive -Phive-thriftserver -Dhadoop.version=${HADOOP_VERSION} -DskipTests \
    && tar -xvzf /spark/spark-${SPARK_VERSION_KEY}-bin-custom-spark.tgz --directory=/home \
    && ln -s "/home/spark-${SPARK_VERSION_KEY}-bin-custom-spark" /home/spark-current \
    && rm -rf /spark \
    # copy azure storage jars and dependencies to $SPARK_HOME/jars
    && echo "<project>" \
    "<modelVersion>4.0.0</modelVersion>" \
    "<groupId>groupId</groupId>" \
    "<artifactId>artifactId</artifactId>" \
    "<version>1.0</version>" \
    "<dependencies>" \
    "<dependency>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-azure-datalake</artifactId>" \
    "<version>${HADOOP_VERSION}</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-common</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-azure</artifactId>" \
    "<version>${HADOOP_VERSION}</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-common</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>com.fasterxml.jackson.core</groupId>" \
    "<artifactId>jackson-core</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>com.microsoft.sqlserver</groupId>" \
    "<artifactId>mssql-jdbc</artifactId>" \
    "<version>6.4.0.jre8</version>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>com.microsoft.azure</groupId>" \
    "<artifactId>azure-storage</artifactId>" \
    "<version>2.2.0</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>com.fasterxml.jackson.core</groupId>" \
    "<artifactId>jackson-core</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>org.apache.commons</groupId>" \
    "<artifactId>commons-lang3</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>org.slf4j</groupId>" \
    "<artifactId>slf4j-api</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>com.microsoft.azure</groupId>" \
    "<artifactId>azure-cosmosdb-spark_${SPARK_VERSION_KEY}_2.11</artifactId>" \
    "<version>1.1.1</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>org.apache.tinkerpop</groupId>" \
    "<artifactId>tinkergraph-gremlin</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>org.apache.tinkerpop</groupId>" \
    "<artifactId>spark-gremlin</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>io.netty</groupId>" \
    "<artifactId>*</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>com.fasterxml.jackson.core</groupId>" \
    "<artifactId>jackson-annotations</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "</dependencies>" \
    "</project>" > /tmp/pom.xml \
    && cd /tmp \
    && mvn dependency:copy-dependencies -DoutputDirectory="${SPARK_HOME}/jars/" \
    # cleanup
    && apt-get --purge autoremove -y maven python3-pip \
    && apt-get autoremove -y \
    && apt-get autoclean -y \
    && rm -rf /tmp/* \
    && rm -rf /root/.cache \
    && rm -rf /root/.m2 \
    && rm -rf /var/lib/apt/lists/*

CMD ["/bin/bash"]
@@ -0,0 +1,158 @@
# Ubuntu 16.04 (Xenial)
FROM ubuntu:16.04

# set AZTK version compatibility
ENV AZTK_DOCKER_IMAGE_VERSION 0.1.0

# set version of python required for aztk
ENV AZTK_PYTHON_VERSION=3.5.2

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ENV SPARK_VERSION_KEY 2.3.0
ENV SPARK_FULL_VERSION spark-${SPARK_VERSION_KEY}-bin-without-hadoop
ENV HADOOP_VERSION 2.8.3
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

# set env vars
ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
ENV SPARK_HOME /home/spark-current
ENV PATH $SPARK_HOME/bin:$PATH

RUN apt-get clean \
    && apt-get update -y \
    # install dependency packages
    && apt-get install -y --no-install-recommends \
    make \
    build-essential \
    zlib1g-dev \
    libssl-dev \
    libbz2-dev \
    libreadline-dev \
    libsqlite3-dev \
    maven \
    wget \
    curl \
    llvm \
    git \
    libncurses5-dev \
    libncursesw5-dev \
    python3-pip \
    python3-venv \
    xz-utils \
    tk-dev \
    && apt-get update -y \
    # install [software-properties-common]
    # so we can use [apt-add-repository] to add the repository [ppa:webupd8team/java]
    # from which we install Java8
    && apt-get install -y --no-install-recommends software-properties-common \
    && apt-add-repository ppa:webupd8team/java -y \
    && apt-get update -y \
    # install java
    && apt-get install -y --no-install-recommends default-jdk \
    # set up user python and aztk python
    && ln -s /usr/bin/python3.5 /usr/bin/python \
    && /usr/bin/python -m pip install --upgrade pip setuptools wheel \
    && apt-get remove -y python3-pip \
    # build and install spark
    && git clone https://github.com/apache/spark.git \
    && cd spark \
    && git checkout tags/v${SPARK_VERSION_KEY} \
    && export MAVEN_OPTS="-Xmx3g -XX:ReservedCodeCacheSize=1024m" \
    && ./dev/make-distribution.sh --name custom-spark --pip --tgz -Phive -Phive-thriftserver -Dhadoop.version=${HADOOP_VERSION} -DskipTests \
    && tar -xvzf /spark/spark-${SPARK_VERSION_KEY}-bin-custom-spark.tgz --directory=/home \
    && ln -s "/home/spark-${SPARK_VERSION_KEY}-bin-custom-spark" /home/spark-current \
    && rm -rf /spark \
    # copy azure storage jars and dependencies to $SPARK_HOME/jars
    && echo "<project>" \
    "<modelVersion>4.0.0</modelVersion>" \
    "<groupId>groupId</groupId>" \
    "<artifactId>artifactId</artifactId>" \
    "<version>1.0</version>" \
    "<dependencies>" \
    "<dependency>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-azure-datalake</artifactId>" \
    "<version>${HADOOP_VERSION}</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-common</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-azure</artifactId>" \
    "<version>${HADOOP_VERSION}</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>org.apache.hadoop</groupId>" \
    "<artifactId>hadoop-common</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>com.fasterxml.jackson.core</groupId>" \
    "<artifactId>jackson-core</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>com.microsoft.sqlserver</groupId>" \
    "<artifactId>mssql-jdbc</artifactId>" \
    "<version>6.4.0.jre8</version>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>com.microsoft.azure</groupId>" \
    "<artifactId>azure-storage</artifactId>" \
    "<version>2.2.0</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>com.fasterxml.jackson.core</groupId>" \
    "<artifactId>jackson-core</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>org.apache.commons</groupId>" \
    "<artifactId>commons-lang3</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>org.slf4j</groupId>" \
    "<artifactId>slf4j-api</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "<dependency>" \
    "<groupId>com.microsoft.azure</groupId>" \
    "<artifactId>azure-cosmosdb-spark_2.2.0_2.11</artifactId>" \
    "<version>1.1.1</version>" \
    "<exclusions>" \
    "<exclusion>" \
    "<groupId>org.apache.tinkerpop</groupId>" \
    "<artifactId>tinkergraph-gremlin</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>org.apache.tinkerpop</groupId>" \
    "<artifactId>spark-gremlin</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>io.netty</groupId>" \
    "<artifactId>*</artifactId>" \
    "</exclusion>" \
    "<exclusion>" \
    "<groupId>com.fasterxml.jackson.core</groupId>" \
    "<artifactId>jackson-annotations</artifactId>" \
    "</exclusion>" \
    "</exclusions>" \
    "</dependency>" \
    "</dependencies>" \
    "</project>" > /tmp/pom.xml \
    && cd /tmp \
    && mvn dependency:copy-dependencies -DoutputDirectory="${SPARK_HOME}/jars/" \
    # cleanup
    && apt-get --purge autoremove -y maven python3-pip \
    && apt-get autoremove -y \
    && apt-get autoclean -y \
    && rm -rf /tmp/* \
    && rm -rf /root/.cache \
    && rm -rf /root/.m2 \
    && rm -rf /var/lib/apt/lists/*

CMD ["/bin/bash"]
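One caveat with the version pinning above: the comment says to modify "these ARGs" at build time, but `SPARK_VERSION_KEY`, `SPARK_FULL_VERSION`, and `HADOOP_VERSION` are declared with `ENV`, which `--build-arg` cannot override, so changing versions currently means editing the Dockerfile. A hypothetical sketch (not part of this commit) of how the file could expose them as real build arguments while keeping the same runtime values:

```sh
# hypothetical Dockerfile change:
#   ARG SPARK_VERSION_KEY=2.3.0
#   ARG HADOOP_VERSION=2.8.3
#   ENV SPARK_VERSION_KEY=${SPARK_VERSION_KEY} HADOOP_VERSION=${HADOOP_VERSION}
# which would then allow builds like:
docker build --build-arg SPARK_VERSION_KEY=2.3.0 \
             --build-arg HADOOP_VERSION=2.8.3 \
             --tag myrepo/aztk-spark-base .
```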
@@ -0,0 +1,143 @@
#!/bin/bash

# setup docker to build on /mnt instead of /var/lib/docker
echo '{
    "graph": "/mnt",
    "storage-driver": "overlay"
}' > /etc/docker/daemon.json

service docker restart

mkdir -p out

# base 1.6.3
docker build base/spark1.6.3/ --tag aztk/spark:v0.1.0-spark1.6.3-base > out/base-spark1.6.3.out &&
docker push aztk/spark:v0.1.0-spark1.6.3-base

# base 2.1.0
docker build base/spark2.1.0/ --tag aztk/spark:v0.1.0-spark2.1.0-base > out/base-spark2.1.0.out &&
docker push aztk/spark:v0.1.0-spark2.1.0-base

# base 2.2.0
docker build base/spark2.2.0/ --tag aztk/spark:v0.1.0-spark2.2.0-base > out/base-spark2.2.0.out &&
docker push aztk/spark:v0.1.0-spark2.2.0-base

# base 2.3.0
docker build base/spark2.3.0/ --tag aztk/spark:v0.1.0-spark2.3.0-base > out/base-spark2.3.0.out &&
docker push aztk/spark:v0.1.0-spark2.3.0-base

# miniconda-base 1.6.3
docker build miniconda/spark1.6.3/base/ --tag aztk/spark:v0.1.0-spark1.6.3-miniconda-base > out/miniconda-spark1.6.3.out &&
docker push aztk/spark:v0.1.0-spark1.6.3-miniconda-base

# miniconda-base 2.1.0
docker build miniconda/spark2.1.0/base/ --tag aztk/spark:v0.1.0-spark2.1.0-miniconda-base > out/miniconda-spark2.1.0.out &&
docker push aztk/spark:v0.1.0-spark2.1.0-miniconda-base

# miniconda-base 2.2.0
docker build miniconda/spark2.2.0/base/ --tag aztk/spark:v0.1.0-spark2.2.0-miniconda-base > out/miniconda-spark2.2.0.out &&
docker push aztk/spark:v0.1.0-spark2.2.0-miniconda-base

# miniconda-base 2.3.0
docker build miniconda/spark2.3.0/base/ --tag aztk/spark:v0.1.0-spark2.3.0-miniconda-base > out/miniconda-spark2.3.0.out &&
docker push aztk/spark:v0.1.0-spark2.3.0-miniconda-base

# anaconda-base 1.6.3
docker build anaconda/spark1.6.3/base/ --tag aztk/spark:v0.1.0-spark1.6.3-anaconda-base > out/anaconda-spark1.6.3.out &&
docker push aztk/spark:v0.1.0-spark1.6.3-anaconda-base

# anaconda-base 2.1.0
docker build anaconda/spark2.1.0/base/ --tag aztk/spark:v0.1.0-spark2.1.0-anaconda-base > out/anaconda-spark2.1.0.out &&
docker push aztk/spark:v0.1.0-spark2.1.0-anaconda-base

# anaconda-base 2.2.0
docker build anaconda/spark2.2.0/base/ --tag aztk/spark:v0.1.0-spark2.2.0-anaconda-base > out/anaconda-spark2.2.0.out &&
docker push aztk/spark:v0.1.0-spark2.2.0-anaconda-base

# anaconda-base 2.3.0
docker build anaconda/spark2.3.0/base/ --tag aztk/spark:v0.1.0-spark2.3.0-anaconda-base > out/anaconda-spark2.3.0.out &&
docker push aztk/spark:v0.1.0-spark2.3.0-anaconda-base

# r-base 1.6.3
docker build r/spark1.6.3/base/ --tag aztk/spark:v0.1.0-spark1.6.3-r-base > out/r-spark1.6.3.out &&
docker push aztk/spark:v0.1.0-spark1.6.3-r-base

# r-base 2.1.0
docker build r/spark2.1.0/base/ --tag aztk/spark:v0.1.0-spark2.1.0-r-base > out/r-spark2.1.0.out &&
docker push aztk/spark:v0.1.0-spark2.1.0-r-base

# r-base 2.2.0
docker build r/spark2.2.0/base/ --tag aztk/spark:v0.1.0-spark2.2.0-r-base > out/r-spark2.2.0.out &&
docker push aztk/spark:v0.1.0-spark2.2.0-r-base

# r-base 2.3.0
docker build r/spark2.3.0/base/ --tag aztk/spark:v0.1.0-spark2.3.0-r-base > out/r-spark2.3.0.out &&
docker push aztk/spark:v0.1.0-spark2.3.0-r-base

##################
#      GPU       #
##################

# gpu 1.6.3
docker build gpu/spark1.6.3/ --tag aztk/spark:v0.1.0-spark1.6.3-gpu > out/gpu-spark1.6.3.out &&
docker push aztk/spark:v0.1.0-spark1.6.3-gpu

# gpu 2.1.0
docker build gpu/spark2.1.0/ --tag aztk/spark:v0.1.0-spark2.1.0-gpu > out/gpu-spark2.1.0.out &&
docker push aztk/spark:v0.1.0-spark2.1.0-gpu

# gpu 2.2.0
docker build gpu/spark2.2.0/ --tag aztk/spark:v0.1.0-spark2.2.0-gpu > out/gpu-spark2.2.0.out &&
docker push aztk/spark:v0.1.0-spark2.2.0-gpu

# gpu 2.3.0
docker build gpu/spark2.3.0/ --tag aztk/spark:v0.1.0-spark2.3.0-gpu > out/gpu-spark2.3.0.out &&
docker push aztk/spark:v0.1.0-spark2.3.0-gpu

# miniconda-gpu 1.6.3
docker build miniconda/spark1.6.3/gpu/ --tag aztk/spark:v0.1.0-spark1.6.3-miniconda-gpu > out/miniconda-gpu-spark1.6.3.out &&
docker push aztk/spark:v0.1.0-spark1.6.3-miniconda-gpu

# miniconda-gpu 2.1.0
docker build miniconda/spark2.1.0/gpu/ --tag aztk/spark:v0.1.0-spark2.1.0-miniconda-gpu > out/miniconda-gpu-spark2.1.0.out &&
docker push aztk/spark:v0.1.0-spark2.1.0-miniconda-gpu

# miniconda-gpu 2.2.0
docker build miniconda/spark2.2.0/gpu/ --tag aztk/spark:v0.1.0-spark2.2.0-miniconda-gpu > out/miniconda-gpu-spark2.2.0.out &&
docker push aztk/spark:v0.1.0-spark2.2.0-miniconda-gpu

# miniconda-gpu 2.3.0
docker build miniconda/spark2.3.0/gpu/ --tag aztk/spark:v0.1.0-spark2.3.0-miniconda-gpu > out/miniconda-gpu-spark2.3.0.out &&
docker push aztk/spark:v0.1.0-spark2.3.0-miniconda-gpu

# anaconda-gpu 1.6.3
docker build anaconda/spark1.6.3/gpu/ --tag aztk/spark:v0.1.0-spark1.6.3-anaconda-gpu > out/anaconda-gpu-spark1.6.3.out &&
docker push aztk/spark:v0.1.0-spark1.6.3-anaconda-gpu

# anaconda-gpu 2.1.0
docker build anaconda/spark2.1.0/gpu/ --tag aztk/spark:v0.1.0-spark2.1.0-anaconda-gpu > out/anaconda-gpu-spark2.1.0.out &&
docker push aztk/spark:v0.1.0-spark2.1.0-anaconda-gpu

# anaconda-gpu 2.2.0
docker build anaconda/spark2.2.0/gpu/ --tag aztk/spark:v0.1.0-spark2.2.0-anaconda-gpu > out/anaconda-gpu-spark2.2.0.out &&
docker push aztk/spark:v0.1.0-spark2.2.0-anaconda-gpu

# anaconda-gpu 2.3.0
docker build anaconda/spark2.3.0/gpu/ --tag aztk/spark:v0.1.0-spark2.3.0-anaconda-gpu > out/anaconda-gpu-spark2.3.0.out &&
docker push aztk/spark:v0.1.0-spark2.3.0-anaconda-gpu

# r-gpu 1.6.3
docker build r/spark1.6.3/gpu/ --tag aztk/spark:v0.1.0-spark1.6.3-r-gpu > out/r-gpu-spark1.6.3.out &&
docker push aztk/spark:v0.1.0-spark1.6.3-r-gpu

# r-gpu 2.1.0
docker build r/spark2.1.0/gpu/ --tag aztk/spark:v0.1.0-spark2.1.0-r-gpu > out/r-gpu-spark2.1.0.out &&
docker push aztk/spark:v0.1.0-spark2.1.0-r-gpu

# r-gpu 2.2.0
docker build r/spark2.2.0/gpu/ --tag aztk/spark:v0.1.0-spark2.2.0-r-gpu > out/r-gpu-spark2.2.0.out &&
docker push aztk/spark:v0.1.0-spark2.2.0-r-gpu

# r-gpu 2.3.0
docker build r/spark2.3.0/gpu/ --tag aztk/spark:v0.1.0-spark2.3.0-r-gpu > out/r-gpu-spark2.3.0.out &&
docker push aztk/spark:v0.1.0-spark2.3.0-r-gpu
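A note on the build script above: it has been given a proper `#!/bin/bash` shebang, and the gpu-variant log files have been renamed (e.g. `out/miniconda-gpu-spark1.6.3.out`) so they no longer overwrite the matching base-variant logs. It rewrites /etc/docker/daemon.json and restarts the Docker service, so it needs root, and the `docker push` steps assume you are logged in to an account allowed to push the aztk/spark repository. A minimal invocation, assuming the script is saved as build.sh (name illustrative) in the directory holding the base/, miniconda/, anaconda/, r/, and gpu/ trees:

```sh
# run as root from the docker-image directory
sudo bash build.sh

# each image's build log is captured under out/; tail one to follow progress
tail -f out/base-spark2.3.0.out
```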
@@ -1,4 +1,4 @@
FROM aztk/base:spark1.6.3
FROM aztk/spark:v0.1.0-spark1.6.3-base

LABEL com.nvidia.volumes.needed="nvidia_driver"

@@ -1,4 +1,4 @@
FROM aztk/base:spark2.1.0
FROM aztk/spark:v0.1.0-spark2.1.0-base

LABEL com.nvidia.volumes.needed="nvidia_driver"

@@ -76,4 +76,4 @@ ENV NUMBAPRO_CUDALIB /usr/local/cuda-8.0/targets/x86_64-linux/lib/
# RUN pip install --upgrade tensorflow-gpu

WORKDIR $SPARK_HOME
CMD ["bin/spark-class", "org.apache.spark.deploy.master.Master"]

@@ -1,4 +1,4 @@
FROM aztk/base:spark2.2.0
FROM aztk/spark:v0.1.0-spark2.2.0-base

LABEL com.nvidia.volumes.needed="nvidia_driver"

@@ -1,4 +1,4 @@
FROM aztk/base:latest
FROM aztk/spark:v0.1.0-spark2.3.0-base

LABEL com.nvidia.volumes.needed="nvidia_driver"

@@ -76,4 +76,4 @@ ENV NUMBAPRO_CUDALIB /usr/local/cuda-8.0/targets/x86_64-linux/lib/
# RUN pip install --upgrade tensorflow-gpu

WORKDIR $SPARK_HOME
CMD ["bin/spark-class", "org.apache.spark.deploy.master.Master"]
@@ -0,0 +1,22 @@
FROM aztk/spark:v0.1.0-spark1.6.3-base

ARG MINICONDA_VERSION=Miniconda3-4.4.10

ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH /opt/conda/bin:$PATH

RUN apt-get update --fix-missing \
    && apt-get install -y wget bzip2 ca-certificates curl git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/miniconda/${MINICONDA_VERSION}-Linux-x86_64.sh -O ~/miniconda.sh \
    && /bin/bash ~/miniconda.sh -b -p /opt/conda \
    && rm ~/miniconda.sh \
    && /opt/conda/bin/conda clean -tipsy \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
    # install extras
    # && conda install numba pandas scikit-learn

CMD ["/bin/bash"]

@@ -0,0 +1,22 @@
FROM aztk/spark:v0.1.0-spark1.6.3-gpu

ARG MINICONDA_VERSION=Miniconda3-4.4.10

ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH /opt/conda/bin:$PATH

RUN apt-get update --fix-missing \
    && apt-get install -y wget bzip2 ca-certificates curl git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/miniconda/${MINICONDA_VERSION}-Linux-x86_64.sh -O ~/miniconda.sh \
    && /bin/bash ~/miniconda.sh -b -p /opt/conda \
    && rm ~/miniconda.sh \
    && /opt/conda/bin/conda clean -tipsy \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
    # install extras
    # && conda install numba pandas scikit-learn

CMD ["/bin/bash"]

@@ -0,0 +1,22 @@
FROM aztk/spark:v0.1.0-spark2.1.0-base

ARG MINICONDA_VERSION=Miniconda3-4.4.10

ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH /opt/conda/bin:$PATH

RUN apt-get update --fix-missing \
    && apt-get install -y wget bzip2 ca-certificates curl git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/miniconda/${MINICONDA_VERSION}-Linux-x86_64.sh -O ~/miniconda.sh \
    && /bin/bash ~/miniconda.sh -b -p /opt/conda \
    && rm ~/miniconda.sh \
    && /opt/conda/bin/conda clean -tipsy \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
    # install extras
    # && conda install numba pandas scikit-learn

CMD ["/bin/bash"]

@@ -0,0 +1,22 @@
FROM aztk/spark:v0.1.0-spark2.1.0-gpu

ARG MINICONDA_VERSION=Miniconda3-4.4.10

ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH /opt/conda/bin:$PATH

RUN apt-get update --fix-missing \
    && apt-get install -y wget bzip2 ca-certificates curl git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/miniconda/${MINICONDA_VERSION}-Linux-x86_64.sh -O ~/miniconda.sh \
    && /bin/bash ~/miniconda.sh -b -p /opt/conda \
    && rm ~/miniconda.sh \
    && /opt/conda/bin/conda clean -tipsy \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
    # install extras
    # && conda install numba pandas scikit-learn

CMD ["/bin/bash"]

@@ -0,0 +1,22 @@
FROM aztk/spark:v0.1.0-spark2.2.0-base

ARG MINICONDA_VERSION=Miniconda3-4.4.10

ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH /opt/conda/bin:$PATH

RUN apt-get update --fix-missing \
    && apt-get install -y wget bzip2 ca-certificates curl git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/miniconda/${MINICONDA_VERSION}-Linux-x86_64.sh -O ~/miniconda.sh \
    && /bin/bash ~/miniconda.sh -b -p /opt/conda \
    && rm ~/miniconda.sh \
    && /opt/conda/bin/conda clean -tipsy \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
    # install extras
    # && conda install numba pandas scikit-learn

CMD ["/bin/bash"]

@@ -0,0 +1,22 @@
FROM aztk/spark:v0.1.0-spark2.2.0-gpu

ARG MINICONDA_VERSION=Miniconda3-4.4.10

ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH /opt/conda/bin:$PATH

RUN apt-get update --fix-missing \
    && apt-get install -y wget bzip2 ca-certificates curl git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/miniconda/${MINICONDA_VERSION}-Linux-x86_64.sh -O ~/miniconda.sh \
    && /bin/bash ~/miniconda.sh -b -p /opt/conda \
    && rm ~/miniconda.sh \
    && /opt/conda/bin/conda clean -tipsy \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
    # install extras
    # && conda install numba pandas scikit-learn

CMD ["/bin/bash"]

@@ -0,0 +1,22 @@
FROM aztk/spark:v0.1.0-spark2.3.0-base

ARG MINICONDA_VERSION=Miniconda3-4.4.10

ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH /opt/conda/bin:$PATH

RUN apt-get update --fix-missing \
    && apt-get install -y wget bzip2 ca-certificates curl git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/miniconda/${MINICONDA_VERSION}-Linux-x86_64.sh -O ~/miniconda.sh \
    && /bin/bash ~/miniconda.sh -b -p /opt/conda \
    && rm ~/miniconda.sh \
    && /opt/conda/bin/conda clean -tipsy \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
    # install extras
    # && conda install numba pandas scikit-learn

CMD ["/bin/bash"]

@@ -0,0 +1,22 @@
FROM aztk/spark:v0.1.0-spark2.3.0-gpu

ARG MINICONDA_VERSION=Miniconda3-4.4.10

ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH /opt/conda/bin:$PATH

RUN apt-get update --fix-missing \
    && apt-get install -y wget bzip2 ca-certificates curl git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.continuum.io/miniconda/${MINICONDA_VERSION}-Linux-x86_64.sh -O ~/miniconda.sh \
    && /bin/bash ~/miniconda.sh -b -p /opt/conda \
    && rm ~/miniconda.sh \
    && /opt/conda/bin/conda clean -tipsy \
    && ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh \
    && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
    # install extras
    # && conda install numba pandas scikit-learn

CMD ["/bin/bash"]
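All eight Miniconda images follow the same pattern: install a pinned Miniconda release on top of the matching base or gpu image and put /opt/conda/bin first on the PATH. (The argument name has been corrected here from the original `MINICONDA_VERISON` typo.) Because the installer is selected through the `MINICONDA_VERSION` build argument, a different release can be chosen without editing the Dockerfile, as in this sketch (output tag is illustrative; the value must match an installer published under repo.continuum.io/miniconda):

```sh
docker build miniconda/spark2.3.0/base/ \
    --build-arg MINICONDA_VERSION=Miniconda3-4.4.10 \
    --tag myrepo/spark2.3.0-miniconda-base
```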
@@ -1,23 +0,0 @@
# Python
This Dockerfile is used to build the __aztk-python__ Docker image used by this toolkit. This image uses Anaconda, providing access to a wide range of popular python packages.

You can modify these Dockerfiles to build your own image. However, in most cases, building on top of the __aztk-base__ image is recommended.

NOTE: If you plan to use Jupyter Notebooks with your Spark cluster, we recommend using this image, as Jupyter Notebook comes pre-installed with Anaconda.

## How to build this image
This Dockerfile takes a variable at build time that allows you to specify your desired Anaconda version: **ANACONDA_VERSION**

By default, we set **ANACONDA_VERSION=anaconda3-5.0.0**.

For example, if I wanted to use Anaconda3 v5.0.0 with Spark v2.1.0, I would select the appropriate Dockerfile and build the image as follows:
```sh
# spark2.1.0/Dockerfile
docker build \
    --build-arg ANACONDA_VERSION=anaconda3-5.0.0 \
    -t <my_image_tag> .
```

**ANACONDA_VERSION** is used to set the version of Anaconda for your cluster.

NOTE: Most versions of Python will work. However, when selecting your Python version, please make sure that it is compatible with your selected version of Spark.
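Although this README is deleted here (the python images give way to the anaconda and miniconda variants), the workflow it describes still holds for the new images: build and push a custom image, then point a cluster at it. Assuming aztk's default configuration layout, that selection is made through the `docker_repo` value, roughly as in this sketch (path and tag are illustrative):

```sh
# illustrative: select the custom image for new clusters via cluster.yaml
cat >> .aztk/cluster.yaml <<'EOF'
docker_repo: <my_image_tag>
EOF
```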
@@ -1,14 +0,0 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/base:spark1.6.3

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG ANACONDA_VERSION=anaconda3-5.0.0

# install user specified version of anaconda
RUN pyenv install -f $ANACONDA_VERSION \
    && pyenv global $ANACONDA_VERSION

# set env vars
ENV USER_PYTHON_VERSION $ANACONDA_VERSION

CMD ["/bin/bash"]

@@ -1,14 +0,0 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/gpu:spark1.6.3

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG ANACONDA_VERSION=anaconda3-5.0.0

# install user specified version of anaconda
RUN pyenv install -f $ANACONDA_VERSION \
    && pyenv global $ANACONDA_VERSION

# set env vars
ENV USER_PYTHON_VERSION $ANACONDA_VERSION

CMD ["/bin/bash"]

@@ -1,14 +0,0 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/base:spark2.1.0

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG ANACONDA_VERSION=anaconda3-5.0.0

# install user specified version of anaconda
RUN pyenv install -f $ANACONDA_VERSION \
    && pyenv global $ANACONDA_VERSION

# set env vars
ENV USER_PYTHON_VERSION $ANACONDA_VERSION

CMD ["/bin/bash"]

@@ -1,14 +0,0 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/gpu:spark2.1.0

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG ANACONDA_VERSION=anaconda3-5.0.0

# install user specified version of anaconda
RUN pyenv install -f $ANACONDA_VERSION \
    && pyenv global $ANACONDA_VERSION

# set env vars
ENV USER_PYTHON_VERSION $ANACONDA_VERSION

CMD ["/bin/bash"]

@@ -1,14 +0,0 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/base:spark2.2.0

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG ANACONDA_VERSION=anaconda3-5.0.0

# install user specified version of anaconda
RUN pyenv install -f $ANACONDA_VERSION \
    && pyenv global $ANACONDA_VERSION

# set env vars
ENV USER_PYTHON_VERSION $ANACONDA_VERSION

CMD ["/bin/bash"]

@@ -1,14 +0,0 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/gpu:spark2.2.0

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG ANACONDA_VERSION=anaconda3-5.0.0

# install user specified version of anaconda
RUN pyenv install -f $ANACONDA_VERSION \
    && pyenv global $ANACONDA_VERSION

# set env vars
ENV USER_PYTHON_VERSION $ANACONDA_VERSION

CMD ["/bin/bash"]
@@ -1,126 +1,56 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/base:spark1.6.3
FROM aztk/spark:v0.1.0-spark1.6.3-base

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG R_VERSION=3.4.1
ARG RSTUDIO_SERVER_VERSION=1.1.383
ARG R_VERSION=3.4.4
ARG R_BASE_VERSION=${R_VERSION}-1xenial0
ARG BUILD_DATE

# set env vars
ENV DEBIAN_FRONTEND noninteractive
ENV BUILD_DATE ${BUILD_DATE:-}
ENV RSTUDIO_SERVER_VERSION $RSTUDIO_SERVER_VERSION
ENV R_VERSION $R_VERSION

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
    bash-completion \
    ca-certificates \
    file \
    fonts-texgyre \
    g++ \
    gfortran \
    gsfonts \
    libcurl3 \
    libopenblas-dev \
    libpangocairo-1.0-0 \
    libpng16-16 \
    locales \
    make \
    unzip \
    zip \
    libcurl4-openssl-dev \
    libxml2-dev \
    libapparmor1 \
    gdebi-core \
    lsb-release \
    psmisc \
    sudo \
    && echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
    && locale-gen en_US.utf8 \
    && /usr/sbin/update-locale LANG=en_US.UTF-8 \
    && BUILDDEPS="libcairo2-dev \
    libpango1.0-dev \
    libjpeg-dev \
    libicu-dev \
    libpcre3-dev \
    libpng-dev \
    libtiff5-dev \
    liblzma-dev \
    libx11-dev \
    libxt-dev \
    perl \
    tcl8.6-dev \
    tk8.6-dev \
    texinfo \
    texlive-extra-utils \
    texlive-fonts-recommended \
    texlive-fonts-extra \
    texlive-latex-recommended \
    x11proto-core-dev \
    xauth \
    xfonts-base \
    xvfb" \
    && apt-get install -y --no-install-recommends $BUILDDEPS \
    ## Download source code
    && cd /tmp/ \
    && majorVersion=$(echo $R_VERSION | cut -f1 -d.) \
    && curl -O https://cran.r-project.org/src/base/R-${majorVersion}/R-${R_VERSION}.tar.gz \
    ## Extract source code
    && tar -xf R-${R_VERSION}.tar.gz \
    && cd R-${R_VERSION} \
    ## Set compiler flags
    && R_PAPERSIZE=letter \
    R_BATCHSAVE="--no-save --no-restore" \
    R_BROWSER=xdg-open \
    PAGER=/usr/bin/pager \
    PERL=/usr/bin/perl \
    R_UNZIPCMD=/usr/bin/unzip \
    R_ZIPCMD=/usr/bin/zip \
    R_PRINTCMD=/usr/bin/lpr \
    LIBnn=lib \
    AWK=/usr/bin/awk \
    CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
    CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
    ## Configure options
    ./configure --enable-R-shlib \
    --enable-memory-profiling \
    --with-readline \
    --with-blas="-lopenblas" \
    --disable-nls \
    --without-recommended-packages \
    ## Build and install
    && make \
    && make install \
    ## Add a default CRAN mirror
    && echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
    && apt-get install -y --no-install-recommends apt-transport-https \
    libxml2-dev \
    libcairo2-dev \
    libsqlite-dev \
    libmariadbd-dev \
    libmariadb-client-lgpl-dev \
    libpq-dev \
    libssh2-1-dev \
    libcurl4-openssl-dev \
    locales \
    && apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
    && add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial/' \
    && apt-get update \
    && apt-get install -y --no-install-recommends r-base=${R_BASE_VERSION} r-base-dev=${R_BASE_VERSION}

RUN mkdir -p /usr/lib/R/etc/ \
    && echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/lib/R/etc/Rprofile.site \
    ## Add a library directory (for user-installed packages)
    && mkdir -p /usr/local/lib/R/site-library \
    && mkdir -p /usr/lib/R/site-library \
    ## Fix library path
    && echo "R_LIBS_USER='/usr/local/lib/R/site-library'" >> /usr/local/lib/R/etc/Renviron \
    && echo "R_LIBS=\${R_LIBS-'/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library'}" >> /usr/local/lib/R/etc/Renviron \
    && echo "R_LIBS_USER='/usr/lib/R/site-library'" >> /usr/lib/R/etc/Renviron \
    && echo "R_LIBS=\${R_LIBS-'/usr/lib/R/site-library:/usr/lib/R/library:/usr/lib/R/library'}" >> /usr/lib/R/etc/Renviron \
    ## install packages from date-locked MRAN snapshot of CRAN
    && [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
    && MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
    && echo MRAN=$MRAN >> /etc/environment \
    && export MRAN=$MRAN \
    && echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/local/lib/R/etc/Rprofile.site \
    && echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/lib/R/etc/Rprofile.site \
    ## Use littler installation scripts
    && Rscript -e "install.packages(c('littler', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN')" \
    && chown -R root:staff /usr/local/lib/R/site-library \
    && chmod -R g+wx /usr/local/lib/R/site-library \
    && ln -s /usr/local/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
    && ln -s /usr/local/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
    && ln -s /usr/local/lib/R/site-library/littler/bin/r /usr/local/bin/r \
    ## TEMPORARY WORKAROUND to get more robust error handling for install2.r prior to littler update
    && curl -o /usr/local/bin/install2.r https://github.com/eddelbuettel/littler/raw/master/inst/examples/install2.r \
    && chmod +x /usr/local/bin/install2.r \
    && Rscript -e "install.packages(c('dplyr', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN', dependencies=TRUE)" \
    && chown -R root:staff /usr/lib/R/site-library \
    && chmod -R g+wx /usr/lib/R/site-library \
    && ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
    && ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
    && ln -s /usr/lib/R/site-library/littler/bin/r /usr/local/bin/r \
    ## Clean up from R source install
    && cd / \
    && rm -rf /tmp/* \
    && apt-get autoremove -y \
    && apt-get autoclean -y \
    && rm -rf /var/lib/apt/lists/*
    && apt-get autoclean -y

CMD ["/bin/bash"]

RUN rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
    && locale-gen en_US.utf8 \
    && /usr/sbin/update-locale LANG=en_US.UTF-8

CMD ["/bin/bash"]
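The R images date-lock CRAN installs to an MRAN snapshot: when `BUILD_DATE` is not supplied, the build falls back to the current date in America/Los_Angeles, so builds run on different days can pull different package versions. Pinning the snapshot date makes the image reproducible, as in this sketch (date and tag are examples):

```sh
docker build r/spark1.6.3/base/ \
    --build-arg BUILD_DATE=2018-04-01 \
    --tag myrepo/spark1.6.3-r-base
```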
@@ -1,142 +1,56 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/gpu:spark1.6.3
FROM aztk/spark:v0.1.0-spark1.6.3-gpu

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG R_VERSION=3.4.1
ARG RSTUDIO_SERVER_VERSION=1.1.383
ARG TENSORFLOW_VERSION=tensorflow-gpu
ARG CNTK_VERSION=https://cntk.ai/PythonWheel/GPU/cntk-2.3.1-cp35-cp35m-linux_x86_64.whl
ARG R_VERSION=3.4.4
ARG R_BASE_VERSION=${R_VERSION}-1xenial0
ARG BUILD_DATE

# set env vars
ENV DEBIAN_FRONTEND noninteractive
ENV BUILD_DATE ${BUILD_DATE:-}
ENV RSTUDIO_SERVER_VERSION $RSTUDIO_SERVER_VERSION
ENV R_VERSION $R_VERSION

RUN useradd -m -d /home/rstudio rstudio -G sudo,staff \
    && echo rstudio:rstudio | chpasswd \
    && chmod -R 777 /home/rstudio \
    && chmod -R 777 //.pyenv/

# Setting up rstudio user with Tensorflow and CNTK
USER rstudio
RUN echo "PATH='"$PATH"'" > /home/rstudio/.Renviron \
    && pip3 install \
    $CNTK_VERSION \
    $TENSORFLOW_VERSION \
    keras

USER root
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
    bash-completion \
    ca-certificates \
    file \
    fonts-texgyre \
    g++ \
    gfortran \
    gsfonts \
    libcurl3 \
    libopenblas-dev \
    libpangocairo-1.0-0 \
    libpng16-16 \
    locales \
    make \
    unzip \
    zip \
    libcurl4-openssl-dev \
    libxml2-dev \
    libapparmor1 \
    gdebi-core \
    lsb-release \
    psmisc \
    sudo \
    openmpi-bin \
    && echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
    && locale-gen en_US.utf8 \
    && /usr/sbin/update-locale LANG=en_US.UTF-8 \
    && BUILDDEPS="libcairo2-dev \
    libpango1.0-dev \
    libjpeg-dev \
    libicu-dev \
    libpcre3-dev \
    libpng-dev \
    libtiff5-dev \
    liblzma-dev \
    libx11-dev \
    libxt-dev \
    perl \
    tcl8.6-dev \
    tk8.6-dev \
    texinfo \
    texlive-extra-utils \
    texlive-fonts-recommended \
    texlive-fonts-extra \
    texlive-latex-recommended \
    x11proto-core-dev \
    xauth \
    xfonts-base \
    xvfb" \
    && apt-get install -y --no-install-recommends $BUILDDEPS \
    ## Download source code
    && cd /tmp/ \
    && majorVersion=$(echo $R_VERSION | cut -f1 -d.) \
    && curl -O https://cran.r-project.org/src/base/R-${majorVersion}/R-${R_VERSION}.tar.gz \
    ## Extract source code
    && tar -xf R-${R_VERSION}.tar.gz \
    && cd R-${R_VERSION} \
    ## Set compiler flags
    && R_PAPERSIZE=letter \
    R_BATCHSAVE="--no-save --no-restore" \
    R_BROWSER=xdg-open \
    PAGER=/usr/bin/pager \
    PERL=/usr/bin/perl \
    R_UNZIPCMD=/usr/bin/unzip \
    R_ZIPCMD=/usr/bin/zip \
    R_PRINTCMD=/usr/bin/lpr \
    LIBnn=lib \
    AWK=/usr/bin/awk \
    CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
    CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
    ## Configure options
    ./configure --enable-R-shlib \
    --enable-memory-profiling \
    --with-readline \
    --with-blas="-lopenblas" \
    --disable-nls \
    --without-recommended-packages \
    ## Build and install
    && make \
    && make install \
    ## Add a default CRAN mirror
    && echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
    && apt-get install -y --no-install-recommends apt-transport-https \
    libxml2-dev \
    libcairo2-dev \
    libsqlite-dev \
    libmariadbd-dev \
    libmariadb-client-lgpl-dev \
    libpq-dev \
    libssh2-1-dev \
    libcurl4-openssl-dev \
    locales \
    && apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
    && add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial/' \
    && apt-get update \
    && apt-get install -y --no-install-recommends r-base=${R_BASE_VERSION} r-base-dev=${R_BASE_VERSION}

RUN mkdir -p /usr/lib/R/etc/ \
    && echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/lib/R/etc/Rprofile.site \
    ## Add a library directory (for user-installed packages)
    && mkdir -p /usr/local/lib/R/site-library \
    && mkdir -p /usr/lib/R/site-library \
    ## Fix library path
    && echo "R_LIBS_USER='/usr/local/lib/R/site-library'" >> /usr/local/lib/R/etc/Renviron \
    && echo "R_LIBS=\${R_LIBS-'/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library'}" >> /usr/local/lib/R/etc/Renviron \
    && echo "R_LIBS_USER='/usr/lib/R/site-library'" >> /usr/lib/R/etc/Renviron \
    && echo "R_LIBS=\${R_LIBS-'/usr/lib/R/site-library:/usr/lib/R/library:/usr/lib/R/library'}" >> /usr/lib/R/etc/Renviron \
    ## install packages from date-locked MRAN snapshot of CRAN
    && [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
    && MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
    && echo MRAN=$MRAN >> /etc/environment \
    && export MRAN=$MRAN \
    && echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl');" >> /usr/local/lib/R/etc/Rprofile.site \
    && echo "Sys.setenv(SPARK_HOME ='"$SPARK_HOME"');" >> /usr/local/lib/R/etc/Rprofile.site \
    && Rscript -e "install.packages(c('littler', 'docopt', 'tidyverse', 'sparklyr', 'keras', 'tensorflow'), repo = '$MRAN')" \
    && chown -R root:staff /usr/local/lib/R/site-library \
    && chmod -R g+wx /usr/local/lib/R/site-library \
    && ln -s /usr/local/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
    && ln -s /usr/local/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
    && ln -s /usr/local/lib/R/site-library/littler/bin/r /usr/local/bin/r \
    ## TEMPORARY WORKAROUND to get more robust error handling for install2.r prior to littler update
    && curl -o /usr/local/bin/install2.r https://github.com/eddelbuettel/littler/raw/master/inst/examples/install2.r \
    && chmod +x /usr/local/bin/install2.r \
    && echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/lib/R/etc/Rprofile.site \
    ## Use littler installation scripts
    && Rscript -e "install.packages(c('dplyr', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN', dependencies=TRUE)" \
    && chown -R root:staff /usr/lib/R/site-library \
    && chmod -R g+wx /usr/lib/R/site-library \
    && ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
    && ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
    && ln -s /usr/lib/R/site-library/littler/bin/r /usr/local/bin/r \
    ## Clean up from R source install
    && cd / \
    && rm -rf /tmp/* \
    && apt-get autoremove -y \
    && apt-get autoclean -y \
    && rm -rf /var/lib/apt/lists/*
    && apt-get autoclean -y

RUN rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
    && locale-gen en_US.utf8 \
    && /usr/sbin/update-locale LANG=en_US.UTF-8

CMD ["/bin/bash"]
@@ -1,126 +1,56 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/base:spark2.1.0
FROM aztk/spark:v0.1.0-spark2.1.0-base

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG R_VERSION=3.4.1
ARG RSTUDIO_SERVER_VERSION=1.1.383
ARG R_VERSION=3.4.4
ARG R_BASE_VERSION=${R_VERSION}-1xenial0
ARG BUILD_DATE

# set env vars
ENV DEBIAN_FRONTEND noninteractive
ENV BUILD_DATE ${BUILD_DATE:-}
ENV RSTUDIO_SERVER_VERSION $RSTUDIO_SERVER_VERSION
ENV R_VERSION $R_VERSION

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
    bash-completion \
    ca-certificates \
    file \
    fonts-texgyre \
    g++ \
    gfortran \
    gsfonts \
    libcurl3 \
    libopenblas-dev \
    libpangocairo-1.0-0 \
    libpng16-16 \
    locales \
    make \
    unzip \
    zip \
    libcurl4-openssl-dev \
    libxml2-dev \
    libapparmor1 \
    gdebi-core \
    lsb-release \
    psmisc \
    sudo \
    && echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
    && locale-gen en_US.utf8 \
    && /usr/sbin/update-locale LANG=en_US.UTF-8 \
    && BUILDDEPS="libcairo2-dev \
    libpango1.0-dev \
    libjpeg-dev \
    libicu-dev \
    libpcre3-dev \
    libpng-dev \
    libtiff5-dev \
    liblzma-dev \
    libx11-dev \
    libxt-dev \
    perl \
    tcl8.6-dev \
    tk8.6-dev \
    texinfo \
    texlive-extra-utils \
    texlive-fonts-recommended \
    texlive-fonts-extra \
    texlive-latex-recommended \
    x11proto-core-dev \
    xauth \
    xfonts-base \
    xvfb" \
    && apt-get install -y --no-install-recommends $BUILDDEPS \
    ## Download source code
    && cd /tmp/ \
    && majorVersion=$(echo $R_VERSION | cut -f1 -d.) \
    && curl -O https://cran.r-project.org/src/base/R-${majorVersion}/R-${R_VERSION}.tar.gz \
    ## Extract source code
    && tar -xf R-${R_VERSION}.tar.gz \
    && cd R-${R_VERSION} \
    ## Set compiler flags
    && R_PAPERSIZE=letter \
    R_BATCHSAVE="--no-save --no-restore" \
    R_BROWSER=xdg-open \
    PAGER=/usr/bin/pager \
    PERL=/usr/bin/perl \
    R_UNZIPCMD=/usr/bin/unzip \
    R_ZIPCMD=/usr/bin/zip \
    R_PRINTCMD=/usr/bin/lpr \
    LIBnn=lib \
    AWK=/usr/bin/awk \
    CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
    CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
    ## Configure options
    ./configure --enable-R-shlib \
    --enable-memory-profiling \
    --with-readline \
    --with-blas="-lopenblas" \
    --disable-nls \
    --without-recommended-packages \
    ## Build and install
    && make \
    && make install \
    ## Add a default CRAN mirror
    && echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
    && apt-get install -y --no-install-recommends apt-transport-https \
    libxml2-dev \
    libcairo2-dev \
    libsqlite-dev \
    libmariadbd-dev \
    libmariadb-client-lgpl-dev \
    libpq-dev \
    libssh2-1-dev \
    libcurl4-openssl-dev \
    locales \
    && apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
    && add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial/' \
    && apt-get update \
    && apt-get install -y --no-install-recommends r-base=${R_BASE_VERSION} r-base-dev=${R_BASE_VERSION}

RUN mkdir -p /usr/lib/R/etc/ \
    && echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/lib/R/etc/Rprofile.site \
    ## Add a library directory (for user-installed packages)
    && mkdir -p /usr/local/lib/R/site-library \
    && mkdir -p /usr/lib/R/site-library \
    ## Fix library path
    && echo "R_LIBS_USER='/usr/local/lib/R/site-library'" >> /usr/local/lib/R/etc/Renviron \
    && echo "R_LIBS=\${R_LIBS-'/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library'}" >> /usr/local/lib/R/etc/Renviron \
    && echo "R_LIBS_USER='/usr/lib/R/site-library'" >> /usr/lib/R/etc/Renviron \
    && echo "R_LIBS=\${R_LIBS-'/usr/lib/R/site-library:/usr/lib/R/library:/usr/lib/R/library'}" >> /usr/lib/R/etc/Renviron \
    ## install packages from date-locked MRAN snapshot of CRAN
    && [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
    && MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
    && echo MRAN=$MRAN >> /etc/environment \
    && export MRAN=$MRAN \
    && echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/local/lib/R/etc/Rprofile.site \
    && echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/lib/R/etc/Rprofile.site \
    ## Use littler installation scripts
    && Rscript -e "install.packages(c('littler', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN')" \
    && chown -R root:staff /usr/local/lib/R/site-library \
    && chmod -R g+wx /usr/local/lib/R/site-library \
    && ln -s /usr/local/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
    && ln -s /usr/local/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
    && ln -s /usr/local/lib/R/site-library/littler/bin/r /usr/local/bin/r \
    ## TEMPORARY WORKAROUND to get more robust error handling for install2.r prior to littler update
    && curl -o /usr/local/bin/install2.r https://github.com/eddelbuettel/littler/raw/master/inst/examples/install2.r \
    && chmod +x /usr/local/bin/install2.r \
    && Rscript -e "install.packages(c('dplyr', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN', dependencies=TRUE)" \
    && chown -R root:staff /usr/lib/R/site-library \
    && chmod -R g+wx /usr/lib/R/site-library \
    && ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
    && ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
    && ln -s /usr/lib/R/site-library/littler/bin/r /usr/local/bin/r \
    ## Clean up from R source install
    && cd / \
    && rm -rf /tmp/* \
    && apt-get autoremove -y \
    && apt-get autoclean -y \
    && rm -rf /var/lib/apt/lists/*
    && apt-get autoclean -y

CMD ["/bin/bash"]

RUN rm /usr/bin/python \
    && ln -s /usr/bin/python3.5 /usr/bin/python

RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
    && locale-gen en_US.utf8 \
    && /usr/sbin/update-locale LANG=en_US.UTF-8

CMD ["/bin/bash"]
@@ -1,142 +1,56 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/gpu:spark2.1.0
FROM aztk/spark:v0.1.0-spark2.1.0-gpu

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG R_VERSION=3.4.1
ARG RSTUDIO_SERVER_VERSION=1.1.383
ARG TENSORFLOW_VERSION=tensorflow-gpu
ARG CNTK_VERSION=https://cntk.ai/PythonWheel/GPU/cntk-2.3.1-cp35-cp35m-linux_x86_64.whl
ARG R_VERSION=3.4.4
ARG R_BASE_VERSION=${R_VERSION}-1xenial0
ARG BUILD_DATE

# set env vars
ENV DEBIAN_FRONTEND noninteractive
ENV BUILD_DATE ${BUILD_DATE:-}
ENV RSTUDIO_SERVER_VERSION $RSTUDIO_SERVER_VERSION
ENV R_VERSION $R_VERSION

RUN useradd -m -d /home/rstudio rstudio -G sudo,staff \
&& echo rstudio:rstudio | chpasswd \
&& chmod -R 777 /home/rstudio \
&& chmod -R 777 //.pyenv/

# Setting up rstudio user with Tensorflow and CNTK
USER rstudio
RUN echo "PATH='"$PATH"'" > /home/rstudio/.Renviron \
&& pip3 install \
$CNTK_VERSION \
$TENSORFLOW_VERSION \
keras

USER root
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
bash-completion \
ca-certificates \
file \
fonts-texgyre \
g++ \
gfortran \
gsfonts \
libcurl3 \
libopenblas-dev \
libpangocairo-1.0-0 \
libpng16-16 \
locales \
make \
unzip \
zip \
libcurl4-openssl-dev \
libxml2-dev \
libapparmor1 \
gdebi-core \
lsb-release \
psmisc \
sudo \
openmpi-bin \
&& echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
&& locale-gen en_US.utf8 \
&& /usr/sbin/update-locale LANG=en_US.UTF-8 \
&& BUILDDEPS="libcairo2-dev \
libpango1.0-dev \
libjpeg-dev \
libicu-dev \
libpcre3-dev \
libpng-dev \
libtiff5-dev \
liblzma-dev \
libx11-dev \
libxt-dev \
perl \
tcl8.6-dev \
tk8.6-dev \
texinfo \
texlive-extra-utils \
texlive-fonts-recommended \
texlive-fonts-extra \
texlive-latex-recommended \
x11proto-core-dev \
xauth \
xfonts-base \
xvfb" \
&& apt-get install -y --no-install-recommends $BUILDDEPS \
## Download source code
&& cd /tmp/ \
&& majorVersion=$(echo $R_VERSION | cut -f1 -d.) \
&& curl -O https://cran.r-project.org/src/base/R-${majorVersion}/R-${R_VERSION}.tar.gz \
## Extract source code
&& tar -xf R-${R_VERSION}.tar.gz \
&& cd R-${R_VERSION} \
## Set compiler flags
&& R_PAPERSIZE=letter \
R_BATCHSAVE="--no-save --no-restore" \
R_BROWSER=xdg-open \
PAGER=/usr/bin/pager \
PERL=/usr/bin/perl \
R_UNZIPCMD=/usr/bin/unzip \
R_ZIPCMD=/usr/bin/zip \
R_PRINTCMD=/usr/bin/lpr \
LIBnn=lib \
AWK=/usr/bin/awk \
CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
## Configure options
./configure --enable-R-shlib \
--enable-memory-profiling \
--with-readline \
--with-blas="-lopenblas" \
--disable-nls \
--without-recommended-packages \
## Build and install
&& make \
&& make install \
## Add a default CRAN mirror
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
&& apt-get install -y --no-install-recommends apt-transport-https \
libxml2-dev \
libcairo2-dev \
libsqlite-dev \
libmariadbd-dev \
libmariadb-client-lgpl-dev \
libpq-dev \
libssh2-1-dev \
libcurl4-openssl-dev \
locales \
&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
&& add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial/' \
&& apt-get update \
&& apt-get install -y --no-install-recommends r-base=${R_BASE_VERSION} r-base-dev=${R_BASE_VERSION}

RUN mkdir -p /usr/lib/R/etc/ \
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/lib/R/etc/Rprofile.site \
## Add a library directory (for user-installed packages)
&& mkdir -p /usr/local/lib/R/site-library \
&& mkdir -p /usr/lib/R/site-library \
## Fix library path
&& echo "R_LIBS_USER='/usr/local/lib/R/site-library'" >> /usr/local/lib/R/etc/Renviron \
&& echo "R_LIBS=\${R_LIBS-'/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library'}" >> /usr/local/lib/R/etc/Renviron \
&& echo "R_LIBS_USER='/usr/lib/R/site-library'" >> /usr/lib/R/etc/Renviron \
&& echo "R_LIBS=\${R_LIBS-'/usr/lib/R/site-library:/usr/lib/R/library:/usr/lib/R/library'}" >> /usr/lib/R/etc/Renviron \
## install packages from date-locked MRAN snapshot of CRAN
&& [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
&& MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
&& echo MRAN=$MRAN >> /etc/environment \
&& export MRAN=$MRAN \
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl');" >> /usr/local/lib/R/etc/Rprofile.site \
&& echo "Sys.setenv(SPARK_HOME ='"$SPARK_HOME"');" >> /usr/local/lib/R/etc/Rprofile.site \
&& Rscript -e "install.packages(c('littler', 'docopt', 'tidyverse', 'sparklyr', 'keras', 'tensorflow'), repo = '$MRAN')" \
&& chown -R root:staff /usr/local/lib/R/site-library \
&& chmod -R g+wx /usr/local/lib/R/site-library \
&& ln -s /usr/local/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
&& ln -s /usr/local/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
&& ln -s /usr/local/lib/R/site-library/littler/bin/r /usr/local/bin/r \
## TEMPORARY WORKAROUND to get more robust error handling for install2.r prior to littler update
&& curl -o /usr/local/bin/install2.r https://github.com/eddelbuettel/littler/raw/master/inst/examples/install2.r \
&& chmod +x /usr/local/bin/install2.r \
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/lib/R/etc/Rprofile.site \
## Use littler installation scripts
&& Rscript -e "install.packages(c('dplyr', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN', dependencies=TRUE)" \
&& chown -R root:staff /usr/lib/R/site-library \
&& chmod -R g+wx /usr/lib/R/site-library \
&& ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
&& ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
&& ln -s /usr/lib/R/site-library/littler/bin/r /usr/local/bin/r \
## Clean up from R source install
&& cd / \
&& rm -rf /tmp/* \
&& apt-get autoremove -y \
&& apt-get autoclean -y \
&& rm -rf /var/lib/apt/lists/*
&& apt-get autoclean -y

RUN rm /usr/bin/python \
&& ln -s /usr/bin/python3.5 /usr/bin/python

RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
&& locale-gen en_US.utf8 \
&& /usr/sbin/update-locale LANG=en_US.UTF-8

CMD ["/bin/bash"]
@@ -1,126 +1,56 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/base:spark2.2.0
FROM aztk/spark:v0.1.0-spark2.2.0-base

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG R_VERSION=3.4.1
ARG RSTUDIO_SERVER_VERSION=1.1.383
ARG R_VERSION=3.4.4
ARG R_BASE_VERSION=${R_VERSION}-1xenial0
ARG BUILD_DATE

# set env vars
ENV DEBIAN_FRONTEND noninteractive
ENV BUILD_DATE ${BUILD_DATE:-}
ENV RSTUDIO_SERVER_VERSION $RSTUDIO_SERVER_VERSION
ENV R_VERSION $R_VERSION

RUN apt-get update \
&& apt-get install -y --no-install-recommends \
bash-completion \
ca-certificates \
file \
fonts-texgyre \
g++ \
gfortran \
gsfonts \
libcurl3 \
libopenblas-dev \
libpangocairo-1.0-0 \
libpng16-16 \
locales \
make \
unzip \
zip \
libcurl4-openssl-dev \
libxml2-dev \
libapparmor1 \
gdebi-core \
lsb-release \
psmisc \
sudo \
&& echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
&& locale-gen en_US.utf8 \
&& /usr/sbin/update-locale LANG=en_US.UTF-8 \
&& BUILDDEPS="libcairo2-dev \
libpango1.0-dev \
libjpeg-dev \
libicu-dev \
libpcre3-dev \
libpng-dev \
libtiff5-dev \
liblzma-dev \
libx11-dev \
libxt-dev \
perl \
tcl8.6-dev \
tk8.6-dev \
texinfo \
texlive-extra-utils \
texlive-fonts-recommended \
texlive-fonts-extra \
texlive-latex-recommended \
x11proto-core-dev \
xauth \
xfonts-base \
xvfb" \
&& apt-get install -y --no-install-recommends $BUILDDEPS \
## Download source code
&& cd /tmp/ \
&& majorVersion=$(echo $R_VERSION | cut -f1 -d.) \
&& curl -O https://cran.r-project.org/src/base/R-${majorVersion}/R-${R_VERSION}.tar.gz \
## Extract source code
&& tar -xf R-${R_VERSION}.tar.gz \
&& cd R-${R_VERSION} \
## Set compiler flags
&& R_PAPERSIZE=letter \
R_BATCHSAVE="--no-save --no-restore" \
R_BROWSER=xdg-open \
PAGER=/usr/bin/pager \
PERL=/usr/bin/perl \
R_UNZIPCMD=/usr/bin/unzip \
R_ZIPCMD=/usr/bin/zip \
R_PRINTCMD=/usr/bin/lpr \
LIBnn=lib \
AWK=/usr/bin/awk \
CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
## Configure options
./configure --enable-R-shlib \
--enable-memory-profiling \
--with-readline \
--with-blas="-lopenblas" \
--disable-nls \
--without-recommended-packages \
## Build and install
&& make \
&& make install \
## Add a default CRAN mirror
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
&& apt-get install -y --no-install-recommends apt-transport-https \
libxml2-dev \
libcairo2-dev \
libsqlite-dev \
libmariadbd-dev \
libmariadb-client-lgpl-dev \
libpq-dev \
libssh2-1-dev \
libcurl4-openssl-dev \
locales \
&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
&& add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial/' \
&& apt-get update \
&& apt-get install -y --no-install-recommends r-base=${R_BASE_VERSION} r-base-dev=${R_BASE_VERSION}

RUN mkdir -p /usr/lib/R/etc/ \
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/lib/R/etc/Rprofile.site \
## Add a library directory (for user-installed packages)
&& mkdir -p /usr/local/lib/R/site-library \
&& mkdir -p /usr/lib/R/site-library \
## Fix library path
&& echo "R_LIBS_USER='/usr/local/lib/R/site-library'" >> /usr/local/lib/R/etc/Renviron \
&& echo "R_LIBS=\${R_LIBS-'/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library'}" >> /usr/local/lib/R/etc/Renviron \
&& echo "R_LIBS_USER='/usr/lib/R/site-library'" >> /usr/lib/R/etc/Renviron \
&& echo "R_LIBS=\${R_LIBS-'/usr/lib/R/site-library:/usr/lib/R/library:/usr/lib/R/library'}" >> /usr/lib/R/etc/Renviron \
## install packages from date-locked MRAN snapshot of CRAN
&& [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
&& MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
&& echo MRAN=$MRAN >> /etc/environment \
&& export MRAN=$MRAN \
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/local/lib/R/etc/Rprofile.site \
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/lib/R/etc/Rprofile.site \
## Use littler installation scripts
&& Rscript -e "install.packages(c('littler', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN')" \
&& chown -R root:staff /usr/local/lib/R/site-library \
&& chmod -R g+wx /usr/local/lib/R/site-library \
&& ln -s /usr/local/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
&& ln -s /usr/local/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
&& ln -s /usr/local/lib/R/site-library/littler/bin/r /usr/local/bin/r \
## TEMPORARY WORKAROUND to get more robust error handling for install2.r prior to littler update
&& curl -o /usr/local/bin/install2.r https://github.com/eddelbuettel/littler/raw/master/inst/examples/install2.r \
&& chmod +x /usr/local/bin/install2.r \
&& Rscript -e "install.packages(c('dplyr', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN', dependencies=TRUE)" \
&& chown -R root:staff /usr/lib/R/site-library \
&& chmod -R g+wx /usr/lib/R/site-library \
&& ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
&& ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
&& ln -s /usr/lib/R/site-library/littler/bin/r /usr/local/bin/r \
## Clean up from R source install
&& cd / \
&& rm -rf /tmp/* \
&& apt-get autoremove -y \
&& apt-get autoclean -y \
&& rm -rf /var/lib/apt/lists/*
&& apt-get autoclean -y

CMD ["/bin/bash"]

RUN rm /usr/bin/python \
&& ln -s /usr/bin/python3.5 /usr/bin/python

RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
&& locale-gen en_US.utf8 \
&& /usr/sbin/update-locale LANG=en_US.UTF-8

CMD ["/bin/bash"]
@@ -1,142 +1,56 @@
# Ubuntu 16.04 (Xenial)
FROM aztk/gpu:spark2.2.0
FROM aztk/spark:v0.1.0-spark2.2.0-gpu

# modify these ARGs on build time to specify your desired versions of Spark/Hadoop
ARG R_VERSION=3.4.1
ARG RSTUDIO_SERVER_VERSION=1.1.383
ARG TENSORFLOW_VERSION=tensorflow-gpu
ARG CNTK_VERSION=https://cntk.ai/PythonWheel/GPU/cntk-2.3.1-cp35-cp35m-linux_x86_64.whl
ARG R_VERSION=3.4.4
ARG R_BASE_VERSION=${R_VERSION}-1xenial0
ARG BUILD_DATE

# set env vars
ENV DEBIAN_FRONTEND noninteractive
ENV BUILD_DATE ${BUILD_DATE:-}
ENV RSTUDIO_SERVER_VERSION $RSTUDIO_SERVER_VERSION
ENV R_VERSION $R_VERSION

RUN useradd -m -d /home/rstudio rstudio -G sudo,staff \
&& echo rstudio:rstudio | chpasswd \
&& chmod -R 777 /home/rstudio \
&& chmod -R 777 //.pyenv/

# Setting up rstudio user with Tensorflow and CNTK
USER rstudio
RUN echo "PATH='"$PATH"'" > /home/rstudio/.Renviron \
&& pip3 install \
$CNTK_VERSION \
$TENSORFLOW_VERSION \
keras

USER root
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
bash-completion \
ca-certificates \
file \
fonts-texgyre \
g++ \
gfortran \
gsfonts \
libcurl3 \
libopenblas-dev \
libpangocairo-1.0-0 \
libpng16-16 \
locales \
make \
unzip \
zip \
libcurl4-openssl-dev \
libxml2-dev \
libapparmor1 \
gdebi-core \
lsb-release \
psmisc \
sudo \
openmpi-bin \
&& echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
&& locale-gen en_US.utf8 \
&& /usr/sbin/update-locale LANG=en_US.UTF-8 \
&& BUILDDEPS="libcairo2-dev \
libpango1.0-dev \
libjpeg-dev \
libicu-dev \
libpcre3-dev \
libpng-dev \
libtiff5-dev \
liblzma-dev \
libx11-dev \
libxt-dev \
perl \
tcl8.6-dev \
tk8.6-dev \
texinfo \
texlive-extra-utils \
texlive-fonts-recommended \
texlive-fonts-extra \
texlive-latex-recommended \
x11proto-core-dev \
xauth \
xfonts-base \
xvfb" \
&& apt-get install -y --no-install-recommends $BUILDDEPS \
## Download source code
&& cd /tmp/ \
&& majorVersion=$(echo $R_VERSION | cut -f1 -d.) \
&& curl -O https://cran.r-project.org/src/base/R-${majorVersion}/R-${R_VERSION}.tar.gz \
## Extract source code
&& tar -xf R-${R_VERSION}.tar.gz \
&& cd R-${R_VERSION} \
## Set compiler flags
&& R_PAPERSIZE=letter \
R_BATCHSAVE="--no-save --no-restore" \
R_BROWSER=xdg-open \
PAGER=/usr/bin/pager \
PERL=/usr/bin/perl \
R_UNZIPCMD=/usr/bin/unzip \
R_ZIPCMD=/usr/bin/zip \
R_PRINTCMD=/usr/bin/lpr \
LIBnn=lib \
AWK=/usr/bin/awk \
CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g" \
## Configure options
./configure --enable-R-shlib \
--enable-memory-profiling \
--with-readline \
--with-blas="-lopenblas" \
--disable-nls \
--without-recommended-packages \
## Build and install
&& make \
&& make install \
## Add a default CRAN mirror
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site \
&& apt-get install -y --no-install-recommends apt-transport-https \
libxml2-dev \
libcairo2-dev \
libsqlite-dev \
libmariadbd-dev \
libmariadb-client-lgpl-dev \
libpq-dev \
libssh2-1-dev \
libcurl4-openssl-dev \
locales \
&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
&& add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial/' \
&& apt-get update \
&& apt-get install -y --no-install-recommends r-base=${R_BASE_VERSION} r-base-dev=${R_BASE_VERSION}

RUN mkdir -p /usr/lib/R/etc/ \
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/lib/R/etc/Rprofile.site \
## Add a library directory (for user-installed packages)
&& mkdir -p /usr/local/lib/R/site-library \
&& mkdir -p /usr/lib/R/site-library \
## Fix library path
&& echo "R_LIBS_USER='/usr/local/lib/R/site-library'" >> /usr/local/lib/R/etc/Renviron \
&& echo "R_LIBS=\${R_LIBS-'/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library'}" >> /usr/local/lib/R/etc/Renviron \
&& echo "R_LIBS_USER='/usr/lib/R/site-library'" >> /usr/lib/R/etc/Renviron \
&& echo "R_LIBS=\${R_LIBS-'/usr/lib/R/site-library:/usr/lib/R/library:/usr/lib/R/library'}" >> /usr/lib/R/etc/Renviron \
## install packages from date-locked MRAN snapshot of CRAN
&& [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
&& MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
&& echo MRAN=$MRAN >> /etc/environment \
&& export MRAN=$MRAN \
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl');" >> /usr/local/lib/R/etc/Rprofile.site \
&& echo "Sys.setenv(SPARK_HOME ='"$SPARK_HOME"');" >> /usr/local/lib/R/etc/Rprofile.site \
&& Rscript -e "install.packages(c('littler', 'docopt', 'tidyverse', 'sparklyr', 'keras', 'tensorflow'), repo = '$MRAN')" \
&& chown -R root:staff /usr/local/lib/R/site-library \
&& chmod -R g+wx /usr/local/lib/R/site-library \
&& ln -s /usr/local/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
&& ln -s /usr/local/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
&& ln -s /usr/local/lib/R/site-library/littler/bin/r /usr/local/bin/r \
## TEMPORARY WORKAROUND to get more robust error handling for install2.r prior to littler update
&& curl -o /usr/local/bin/install2.r https://github.com/eddelbuettel/littler/raw/master/inst/examples/install2.r \
&& chmod +x /usr/local/bin/install2.r \
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/lib/R/etc/Rprofile.site \
## Use littler installation scripts
&& Rscript -e "install.packages(c('dplyr', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN', dependencies=TRUE)" \
&& chown -R root:staff /usr/lib/R/site-library \
&& chmod -R g+wx /usr/lib/R/site-library \
&& ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
&& ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
&& ln -s /usr/lib/R/site-library/littler/bin/r /usr/local/bin/r \
## Clean up from R source install
&& cd / \
&& rm -rf /tmp/* \
&& apt-get autoremove -y \
&& apt-get autoclean -y \
&& rm -rf /var/lib/apt/lists/*
&& apt-get autoclean -y

RUN rm /usr/bin/python \
&& ln -s /usr/bin/python3.5 /usr/bin/python

RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
&& locale-gen en_US.utf8 \
&& /usr/sbin/update-locale LANG=en_US.UTF-8

CMD ["/bin/bash"]
@@ -0,0 +1,56 @@
FROM aztk/spark:v0.1.0-spark2.3.0-base

ARG R_VERSION=3.4.4
ARG R_BASE_VERSION=${R_VERSION}-1xenial0
ARG BUILD_DATE

RUN apt-get update \
&& apt-get install -y --no-install-recommends apt-transport-https \
libxml2-dev \
libcairo2-dev \
libsqlite-dev \
libmariadbd-dev \
libmariadb-client-lgpl-dev \
libpq-dev \
libssh2-1-dev \
libcurl4-openssl-dev \
locales \
&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
&& add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial/' \
&& apt-get update \
&& apt-get install -y --no-install-recommends r-base=${R_BASE_VERSION} r-base-dev=${R_BASE_VERSION}

RUN mkdir -p /usr/lib/R/etc/ \
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/lib/R/etc/Rprofile.site \
## Add a library directory (for user-installed packages)
&& mkdir -p /usr/lib/R/site-library \
## Fix library path
&& echo "R_LIBS_USER='/usr/lib/R/site-library'" >> /usr/lib/R/etc/Renviron \
&& echo "R_LIBS=\${R_LIBS-'/usr/lib/R/site-library:/usr/lib/R/library:/usr/lib/R/library'}" >> /usr/lib/R/etc/Renviron \
## install packages from date-locked MRAN snapshot of CRAN
&& [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
&& MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
&& echo MRAN=$MRAN >> /etc/environment \
&& export MRAN=$MRAN \
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/lib/R/etc/Rprofile.site \
## Use littler installation scripts
&& Rscript -e "install.packages(c('dplyr', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN', dependencies=TRUE)" \
&& chown -R root:staff /usr/lib/R/site-library \
&& chmod -R g+wx /usr/lib/R/site-library \
&& ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
&& ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
&& ln -s /usr/lib/R/site-library/littler/bin/r /usr/local/bin/r \
## Clean up from R source install
&& cd / \
&& rm -rf /tmp/* \
&& apt-get autoremove -y \
&& apt-get autoclean -y

RUN rm /usr/bin/python \
&& ln -s /usr/bin/python3.5 /usr/bin/python

RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
&& locale-gen en_US.utf8 \
&& /usr/sbin/update-locale LANG=en_US.UTF-8

CMD ["/bin/bash"]
@@ -0,0 +1,56 @@
FROM aztk/spark:v0.1.0-spark2.3.0-gpu

ARG R_VERSION=3.4.4
ARG R_BASE_VERSION=${R_VERSION}-1xenial0
ARG BUILD_DATE

RUN apt-get update \
&& apt-get install -y --no-install-recommends apt-transport-https \
libxml2-dev \
libcairo2-dev \
libsqlite-dev \
libmariadbd-dev \
libmariadb-client-lgpl-dev \
libpq-dev \
libssh2-1-dev \
libcurl4-openssl-dev \
locales \
&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
&& add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial/' \
&& apt-get update \
&& apt-get install -y --no-install-recommends r-base=${R_BASE_VERSION} r-base-dev=${R_BASE_VERSION}

RUN mkdir -p /usr/lib/R/etc/ \
&& echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/lib/R/etc/Rprofile.site \
## Add a library directory (for user-installed packages)
&& mkdir -p /usr/lib/R/site-library \
## Fix library path
&& echo "R_LIBS_USER='/usr/lib/R/site-library'" >> /usr/lib/R/etc/Renviron \
&& echo "R_LIBS=\${R_LIBS-'/usr/lib/R/site-library:/usr/lib/R/library:/usr/lib/R/library'}" >> /usr/lib/R/etc/Renviron \
## install packages from date-locked MRAN snapshot of CRAN
&& [ -z "$BUILD_DATE" ] && BUILD_DATE=$(TZ="America/Los_Angeles" date -I) || true \
&& MRAN=https://mran.microsoft.com/snapshot/${BUILD_DATE} \
&& echo MRAN=$MRAN >> /etc/environment \
&& export MRAN=$MRAN \
&& echo "options(repos = c(CRAN='$MRAN'), download.file.method = 'libcurl'); Sys.setenv(SPARK_HOME ='"$SPARK_HOME"')" >> /usr/lib/R/etc/Rprofile.site \
## Use littler installation scripts
&& Rscript -e "install.packages(c('dplyr', 'docopt', 'tidyverse', 'sparklyr'), repo = '$MRAN', dependencies=TRUE)" \
&& chown -R root:staff /usr/lib/R/site-library \
&& chmod -R g+wx /usr/lib/R/site-library \
&& ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r \
&& ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r \
&& ln -s /usr/lib/R/site-library/littler/bin/r /usr/local/bin/r \
## Clean up from R source install
&& cd / \
&& rm -rf /tmp/* \
&& apt-get autoremove -y \
&& apt-get autoclean -y

RUN rm /usr/bin/python \
&& ln -s /usr/bin/python3.5 /usr/bin/python

RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
&& locale-gen en_US.utf8 \
&& /usr/sbin/update-locale LANG=en_US.UTF-8

CMD ["/bin/bash"]
@@ -14,7 +14,7 @@ The script outputs all of the necessary information to use `aztk`, just copy the
## Usage
Copy and paste the following into an [Azure Cloud Shell](https://shell.azure.com):
```sh
wget -q https://raw.githubusercontent.com/Azure/aztk/master/account_setup.sh &&
wget -q https://raw.githubusercontent.com/Azure/aztk/v0.7.0/account_setup.sh &&
chmod 755 account_setup.sh &&
/bin/bash account_setup.sh
```
@@ -1,36 +1,9 @@
# Docker
Azure Distributed Data Engineering Toolkit runs Spark on Docker.

Supported Azure Distributed Data Engineering Toolkit images are hosted publicly on [Docker Hub](https://hub.docker.com/r/aztk/base/tags).
Supported Azure Distributed Data Engineering Toolkit images are hosted publicly on [Docker Hub](https://hub.docker.com/r/aztk/spark/).

## Versioning with Docker
The default image that this package uses is the __aztk-base__ Docker image, which comes with **Spark v2.2.0**.

You can use several versions of the __aztk-base__ image:
- Spark 2.2.0 - aztk/base:spark2.2.0 (default)
- Spark 2.1.0 - aztk/base:spark2.1.0
- Spark 1.6.3 - aztk/base:spark1.6.3

To enable GPUs, you may use any of the following images, which are based upon the __aztk-base__ images. Each of these images contains CUDA-8.0 and cuDNN-6.0. By default, these images are used if the selected VM type has a GPU (see the example command after this list).
- Spark 2.2.0 - aztk/gpu:spark2.2.0 (default)
- Spark 2.1.0 - aztk/gpu:spark2.1.0
- Spark 1.6.3 - aztk/gpu:spark1.6.3

We also provide two other image types tailored for Python and R users: __aztk-r__ and __aztk-python__. You can choose between the following:
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 2.2.0 - aztk/python:spark2.2.0-python3.6.2-base
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 2.1.0 - aztk/python:spark2.1.0-python3.6.2-base
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 1.6.3 - aztk/python:spark1.6.3-python3.6.2-base
- R 3.4.1 / Spark v2.2.0 - aztk/r-base:spark2.2.0-r3.4.1-base
- R 3.4.1 / Spark v2.1.0 - aztk/r-base:spark2.1.0-r3.4.1-base
- R 3.4.1 / Spark v1.6.3 - aztk/r-base:spark1.6.3-r3.4.1-base

Please note that each of these images also has a GPU-enabled version. To use these versions, replace the "-base" part of the Docker image tag with "-gpu":
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 2.2.0 (GPU) - aztk/python:spark2.2.0-python3.6.2-gpu
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 2.1.0 (GPU) - aztk/python:spark2.1.0-python3.6.2-gpu
- Anaconda3-5.0.0 (Python 3.6.2) / Spark 1.6.3 (GPU) - aztk/python:spark1.6.3-python3.6.2-gpu

*Today, these supported images are hosted on Docker Hub under the repo ["base/gpu/python/r-base:<tag>"](https://hub.docker.com/r/aztk).*
By default, the `aztk/spark:v0.1.0-spark2.3.0-base` image will be used.

To select an image other than the default, you can set your Docker image at cluster creation time with the optional **--docker-repo** parameter:
@@ -38,17 +11,13 @@ To select an image other than the default, you can set your Docker image at cluster creation time with the optional **--docker-repo** parameter:
aztk spark cluster create ... --docker-repo <name_of_docker_image_repo>
```

For example, if I wanted to use Spark v1.6.3, I could run the following cluster create command:
For example, if I wanted to use Spark v2.2.0, I could run the following cluster create command:
```sh
aztk spark cluster create ... --docker-repo aztk/base:spark1.6.3
aztk spark cluster create ... --docker-repo aztk/spark:v0.1.0-spark2.2.0-base
```

## Using a custom Docker Image
What if I wanted to use my own Docker image?

You can build your own Docker image on top of or beneath one of our supported base images _OR_ you can modify the [supported Dockerfile](../docker-image) and build your own image that way.

Please refer to ['../docker-image'](../docker-image) for more information on building your own image.
You can build your own Docker image on top of or beneath one of our supported base images _OR_ you can modify the [supported Dockerfiles](https://github.com/Azure/aztk/tree/v0.7.0/docker-image) and build your own image that way.

Once you have your Docker image built and hosted publicly, you can then use the **--docker-repo** parameter in your **aztk spark cluster create** command to point to it.
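
For example, an end-to-end flow might look like this (a sketch; `my_username/my_repo` is a placeholder for your own Docker Hub repository):
```sh
# Build your custom image from your Dockerfile and host it publicly on Docker Hub
docker build -t my_username/my_repo:latest .
docker push my_username/my_repo:latest

# Point your new cluster at the hosted image
aztk spark cluster create ... --docker-repo my_username/my_repo:latest
```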
@@ -70,3 +39,57 @@ docker:
    password: <mypassword>
    endpoint: <https://my-custom-docker-endpoint.com>
```

### Building Your Own Docker Image
Building your own Docker image provides more customization over your cluster's environment. For some, this may mean installing specific, or even private, libraries that their Spark jobs require. For others, it may just be setting up a version of Spark, Python or R that fits their particular needs.

The Azure Distributed Data Engineering Toolkit supports custom Docker images. To guarantee that your Spark deployment works, we recommend that you build on top of one of our supported images.

To build your own image, you can either build _on top of_ or _beneath_ one of our supported images _OR_ you can just modify one of the supported Dockerfiles to build your own.

### Building on top
You can build on top of our images by referencing the __aztk/spark__ image in the **FROM** keyword of your Dockerfile:
```sh
# Your custom Dockerfile

FROM aztk/spark:v0.1.0-spark2.3.0-base
...
```
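
For example, a minimal custom Dockerfile that layers an extra system package on top of a supported image might look like the following (the `jq` package is purely illustrative):
```sh
# Your custom Dockerfile, building on a supported aztk/spark image
FROM aztk/spark:v0.1.0-spark2.3.0-base

# Install any extras your Spark jobs need (jq is just an example)
RUN apt-get update \
    && apt-get install -y --no-install-recommends jq \
    && rm -rf /var/lib/apt/lists/*
```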

### Building beneath
To build beneath one of our images, modify one of our Dockerfiles so that the **FROM** keyword pulls from your Docker image's location (as opposed to the default, which is a base Ubuntu image):
```sh
# One of the Dockerfiles that AZTK supports
# Change the FROM statement to point to your hosted image repo

FROM my_username/my_repo:latest
...
```

Please note that for this method to work, your Docker image must have been built on Ubuntu.

## Custom Docker Image Requirements
If you are building your own custom image and are __not__ building on top of a supported image, the following requirements apply.

Please make sure that the following environment variables are set:
- AZTK_DOCKER_IMAGE_VERSION
- JAVA_HOME
- SPARK_HOME

You also need to make sure that __PATH__ is correctly configured with $SPARK_HOME:
- PATH=$SPARK_HOME/bin:$PATH

By default, these are set as follows:
```sh
ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
ENV SPARK_HOME /home/spark-current
ENV PATH $SPARK_HOME/bin:$PATH
```

If you are using your own version of Spark, make sure that it is symlinked at "/home/spark-current". **$SPARK_HOME** must also point to "/home/spark-current".
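
For example, if you bundle your own Spark distribution into the image, the relevant Dockerfile lines might look like this (a sketch; the archive name and version are hypothetical):
```sh
# Unpack a custom Spark build and symlink it to /home/spark-current
ADD spark-2.3.0-bin-custom.tgz /home/
RUN ln -s /home/spark-2.3.0-bin-custom /home/spark-current

ENV SPARK_HOME /home/spark-current
ENV PATH $SPARK_HOME/bin:$PATH
```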

## Hosting your Docker Image
By default, this toolkit assumes that your Docker images are publicly hosted on Docker Hub. However, we also support hosting your images privately.

See [here](https://github.com/Azure/aztk/blob/v0.7.0/docs/12-docker-image.md#using-a-custom-docker-image-that-is-privately-hosted) to learn more about using privately hosted Docker Images.
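
At a high level, a privately hosted image is configured through the `docker` section of your `.aztk/secrets.yaml` (a sketch based on the fields shown earlier; the `username` key is assumed here, and `endpoint` is only needed for registries other than Docker Hub):
```
docker:
    username: <myusername>
    password: <mypassword>
    endpoint: <https://my-custom-docker-endpoint.com>
```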

@@ -22,8 +22,8 @@ size: 2
# username: <username for the linux user to be created> (optional)
username: spark

# docker_repo: <name of docker image repo (for more information, see https://github.com/Azure/aztk/blob/master/docs/12-docker-image.html)>
docker_repo: aztk/base:spark2.2.0
# docker_repo: <name of docker image repo (for more information, see https://github.com/Azure/aztk/blob/v0.7.0/docs/12-docker-image.md)>
docker_repo: aztk/spark:v0.1.0-spark2.3.0-base

# custom_script: <path to custom script to run on each node> (optional)