Commit Graph

54 Commits

Author SHA1 Message Date
Timothee Guerin 90aceaefb5 merge master 2018-04-13 10:05:31 -07:00
Jacob Freck 7ef721f0c1
Feature: getting started script (#475)
* initial changes for getting started scripts

* add temp error handling

* rename file - fix typo

* add debug strings

* add handling for existing user

* WIP: wait for subprocess to complete to get exit code

* WIP: handle existing user and refactor code

* WIP: add missing return statements

* WIP: fix typo

* start sdk refactor

* mostly working create

* working happy create path

* handle errors for vnet, aad application

* make account setup interactive

* add prompt

* add docs

* rename account_setup_refac to account_setup

* add some logging

* pip install msrest, azure-cli-core, import issues

* remove in script pip, add shell wrapper program

* ellipsis to period

* update branch name for account_setup.sh

* docstring

* retry resource group creation

* fix typo, update retry

* explicitly set output location

* wget overwrite flag, docs update

* add prompt for multi tenants

* fix bug with batch account creation

* add spinner, print statements, fix formatting bug

* fix param bug
2018-04-11 13:27:55 -07:00
Jacob Freck 44a07654aa
Feature: spark debug tool (#455)
* start implementation of cluster debug utility

* update debug program

* update debug

* fix output directory structure

* cleanup output, add error checking

* sort imports

* start untar

* extract tar

* add debug.py to pylintrc ignore, line too long

* crlf->lf

* add app logs

* call get_spark_app_logs, typos

* add docs

* remove debug.py from pylintrc ignore

* added debug.py back to pylint ignore

* change pylint ignore

* remove commented log

* update cluster_run

* refactor cluster_copy

* update debug, add spinner for run and copy

* make new sdk cluster_download endpoint
2018-04-09 15:02:43 -07:00
Jacob Freck 1eaa1b6e42
Feature: add internal flag to node commands (#482)
* add internal ssh flag

* add --internal flag to cluster get

* cluster run internal flag

* fix add command back

* cluster copy internal

* fix method params

* fix method params

* add debug statement

* fix params

* remove debug statement

* fixes

* add debug statement

* remove debug statement

* add hostname to /etc/hosts

* remove hostname from /etc/hosts

* add sdk docs for internal switch in cluster run and copy
2018-04-06 15:59:13 -07:00
Jacob Freck 4ef3dd09df
Bug: add spark.history.fs.logDirectory to required keys (#456)
* add spark.history.fs.logDirectory to required keys

* add spark_event_log_enabled_key to required_keys

* docs, add history server config to spark-defaults.conf

* fix bad logic

* crlf->lf
2018-04-05 14:11:35 -07:00
Timothee Guerin 7f1e77ec15 Added types 2018-04-05 09:38:31 -07:00
Timothee Guerin 0f06d69397 Define plugin docs 2018-04-05 09:28:22 -07:00
Jacob Freck 82ad0296af
Bug: add gitattributes file (#470)
Bug: line endings, add gitattributes file
2018-04-04 13:44:26 -07:00
Jacob Freck 8aa1843f23
Feature: managed storage for clusters and jobs (#443)
* add in storage management for clusters, jobs

* add warning logs on cli delete

* whitespace

* add keep-logs flag

* add docs on storage lifetime
2018-03-20 10:45:49 -07:00
Dmitry Stratiychuk 4be5ac2f44 Fix job configuration option for `aztk spark job submit` command (#435)
`--job-conf` option mentioned in the docs wasn't working.

CLI help was showing that option is named `--configuration-c`
which seems to be a result of a missing comma in option definition.
2018-03-13 11:07:48 -07:00
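The missing-comma failure described in the entry above can be reproduced with a minimal sketch. It assumes the CLI is built on Python's `argparse`, which the log itself does not state; the `register` helper and the `job_conf` destination name are hypothetical and exist only for illustration. Adjacent string literals in Python are concatenated at compile time, so dropping the comma between two flag names silently produces one combined flag.

```python
import argparse


def register(parser, *flags):
    """Register one option (hypothetical helper) and return the flag names argparse stored."""
    action = parser.add_argument(*flags, dest="job_conf")
    return action.option_strings


# Missing comma: the two literals are joined into the single string "--configuration-c".
broken = register(argparse.ArgumentParser(), "--configuration" "-c")

# With the comma, argparse receives two separate flag names.
fixed = register(argparse.ArgumentParser(), "--configuration", "-c")

print(broken)
print(fixed)
```

Comparing the two printed lists shows why the help text advertised a single oddly named option: the broken call registers one flag, while the fixed call registers a long and a short form.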
Jacob Freck 17755e0ffa
Feature: Basic Cluster and Job Submission SDK Tests (#344)
* add initial cluster tests

* add cluster tests, add simple job submission test scenario

* sort imports

* fix job tests

* fix job tests

* remove pytest from travis build

* cluster per test, parallel pytest plugin

* delete cluster after tests, wait until deleted

* fix bugs

* catch right error, change cluster_id to base_cluster_id

* fix test name

* fixes

* move tests to integration_tests dir

* update travis to run non-integration tests

* directory structure, decoupled job tests

* fix job tests, issue with submit_job

* fix bug

* add test docs

* add cluster and job delete to finally clause
2018-02-22 14:06:16 -08:00
Jacob Freck 6b1c86195d
Feature: SDK support for file-like configuration objects (#373)
* add support for filelike objects for configuration files

* fix custom scripts

* remove os.pathlike

* merge error
2018-02-21 16:55:26 -08:00
Jacob Freck 85e444ce34
Feature: Cluster Run and Copy (#304)
* start implementation of cluster run

* fix cluster_run

* start debug sequential user add and delete

* parallelize user creation and deletion, start implementation of cluster scp

* continue cluster_scp implementation

* debug statements, disconnect error: permission denied

* untested paramiko implementation of clus_run

* continue debugging user creation bug

* fix bug with pool user creation, start concurrent implementation

* start fix of paramiko cluster_run and cluster_copy

* working paramiko cluster_run implementation, start cluster_scp

* fix cluster_scp command

* update requirements, rename cluster_run function

* remove unused shell functions

* parallelize run and scp, add container_name, create logs wrapper

* change scp to copy, clean up

* sort imports

* remove asyncssh from node requirements

* remove old import

* remove bad error handling

* make cluster user management methods private

* remove comment

* remove accidental commit

* fix merge, move delete to finally clause

* add docs

* formatting
2018-02-07 16:19:33 -08:00
Jacob Freck 1a15245eb4
Feature: spark init docker repo customization (#358)
* customize docker_repo based on init args

* whitespace

* add some docs

* r-base to r

* case insensitive r flag, typo fix
2018-02-07 10:56:20 -08:00
Jacob Freck 748a1269fa
Feature: Spark mixed mode support (#350)
* add support for aad creds for storage on node

* add mixed mode support

* add docs

* switch error order

* add dedicated to get_cluster

* remove mixed mode in print_cluster_conf
2018-02-07 10:41:51 -08:00
Emlyn Corrin 63fb81b2b6 Allow submitting jobs into a VNET (#365)
* Add subnet_id to job submission cluster config

* add some docs
2018-02-05 15:43:13 -08:00
Jacob Freck f490f9643f
Feature: AAD tutorial docs (#353)
* aad tutorial docs

* typo, specify application type
2018-01-25 13:58:59 -08:00
Jacob Freck 1aae3b5a8b
Feature: expose exit_code (#320)
* expose the batch task exit_code in ApplicationLog

* add exit_code to job list-apps output

* add execution info to applicationlog object

* whitespace
2018-01-19 16:53:45 -05:00
Jacob Freck 882418a928
Feature: Spark job submission (#278)
* refactor submit, initial job-submission commit

* fix merge conflicts

* comment out gpu_enabled

* update schedule, user job manager task, wait until container ready

* wait until docker container running

* add get_job_log

* add job id

* start multitask job submission

* start multitask implementation

* wait for node setup to complete before completing start task

* fix incorrect logic

* add environment variable, fix loading task definition, fix waiting for container

* fix early job completion, autokill pool, cleanup code

* include debug print error

* define job submission sdk stubs

* Job and JobConfiguration models

* remove timestamp, implement job function stubs, add Application model

* add delete_job, bug fixes

* add wait_until_job_finished, wait_until_all_jobs_finished

* fix storage output logs location, function names

* whitespace

* add cli file stubs

* start job cli implementation

* better output for job cli, added job.yaml configuration file

* error catch for missing blob, add sdk docs, models updates

* start Jobs tutorial doc

* template for rest of job tutorial doc

* rename app_id to application_name, add job sdk docs

* docs fix

* add support for low pri nodes

* better formatting for job.yaml

* remove unnecessary comments

* add defaults and comments for job.yaml, fix spark_configuration file loading

* fix spark_configuration file paths

* fix app_arg issue

* clean up code, address comments

* rename to cluster_submit_helper

* update get job print

* add list-apps, update help text for job id

* add application metadata to job, update job get print

* better error for get_application_log, add working example to job.yaml, add print_application

* add warning about master selection

* update list_applications return value

* whitespace

* Add link to docs in job.yaml, validation for job.yaml

* convert to commandbuilder, remove print

* fix submit exit code, fix no app_args bug, set autoscale interval

* wait until custom scripts are completed

* add missed import in get_app_logs, whitespace

* use correct python version for on node app submit

* update get output format
2018-01-19 13:37:58 -05:00
Timothee Guerin e193ed30dd
Feature: VNet support (#324)
* VNet support
* Azure active directory authentication
2018-01-17 11:14:56 -08:00
Jacob Freck 7e9bf9c383
Feature: Spark retry job (#318) 2018-01-08 12:31:47 -08:00
JS 6091b1d390
Docs: update (#263)
* Update README.md

streamline and update main readme.md

* Update README.md

* Update README.md

* Update 13-configuration.md

* Update 12-docker-image.md

* Update 12-docker-image.md

* Update README.md

* Create README.md

* Update README.md

* Update 10-clusters.md
2017-12-11 17:02:51 -08:00
Jacob Freck 40bd2d62f3
Bug: fix wrong path for global secrets (#265)
* fix wrong path for global secrets

* load spark_conf files correctly

* docker-image docs fix

* docker-image docs fix

* move load_aztk_spark_config function to config.py
2017-12-11 15:06:08 -05:00
JS c12ecebad2
Update 60-gpu.md (#253)
* Update 60-gpu.md

make sure is available in region

* Update 60-gpu.md
2017-12-07 14:11:53 -08:00
Jacob Freck 6c26943819
Feature: update docker image doc (#251)
* update docker-image readme with new images

* update docs
2017-12-06 14:04:07 -08:00
Jacob Freck 8a060a2f78
Feature: Spark GPU (#206)
* conditionally install and use nvidia-docker

* status statements, and -y flag for install

* add example, remove unnecessary ppa

* rename custom script, remove print statement, update example

* add Dockerfile

* fix path in Dockerfile

* update Docker images to use service account

* updated docs, changed default docker repo for gpu skus

* make timing statements more verbose

* remove unnecessary script

* added gpu docs

* fix up docs and numba example
2017-12-04 13:28:05 -08:00
Jacob Freck d74ceee3f5
Feature: Rename SDK (#231)
* initial refactor

* rename cli_fe to cli

* add docs for sdk client

* typo

* remove conflict

* fix zip node scripts bug, add sdk_example program

* start models docs

* add ClusterConfiguration docs, fix merge bug

* Application docs update

* added Application and SparkConfiguration docs

* whitespace

* rename cli.py and spark/cli

* add docstring for load_spark_client
2017-12-01 13:42:55 -08:00
Pablo Selem cabcc29b3c
Feature: Azure Files (#241)
* initial take on installing azure files

* fix cluster.yaml parsing of files shares

* remove test code

* add docs for Azure Files
2017-11-30 14:16:53 -08:00
JS b983d12419 update 10-clusters.md - rm jupyter ref (for now) (#222) 2017-11-27 09:37:15 -08:00
Ian McDonald 7c59567bec Fix a typo in link (#235) 2017-11-27 08:56:54 -08:00
Matt Scanlon e50cf8c52c Spellchecking (#233)
aztk was misspelled as aztb, amended to correct spelling.
2017-11-27 08:35:02 -08:00
Matt Scanlon ef0acbabbe Amended documentation to correctly obtain logs (#234)
Documentation amended: Use of aztk spark app logs results in an error. Correct usage is aztk spark cluster app-logs. Document amended to reflect this.
2017-11-27 08:31:17 -08:00
Jacob Freck 60cae3b8dd
Feature: HDFS plugin (#215)
* hdfs plugin initial

* fix passwordless ssh key, allow raw ip for datanodes

* remove debug statement

* description of forwarded ports

* add pycryptodome to requirements.txt

* fix file copy bug

* add namenode ui to ssh command, add docs
2017-11-22 14:51:11 -08:00
JS 65a56dd657
Feature/python container (#210)
* added python container, jupyter install script, vanilla container

* Update README.md

* Create README.md

* Create README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update jupyter.sh

* Update README.md

* Update README.md

* python

* readme update

* docker updates

* Update README.md

* Update 12-docker-image.md

* Update constants.py

* add image files for wiki

* update imageS

* .

* dockerfile typo

* dockerfiles

* Removed r

* readme

* update constants.py

* update readme

* readme updates

* readme updates
2017-11-09 00:43:32 -08:00
Pablo Selem dbc1adddcd Feature: Azure Data Lake Store support (#170)
* initial adls work

* initial native support for ADLS connector

* add core-site.xml and azure storage/adls jars to project

* enable custom jars and core-site.xml changes for custom storage connectors

* remove unused ADL credentials from environment

* documentation feedback updates

* documentation feedback

* PR feedback
2017-10-19 09:16:14 -07:00
Jacob Freck 9e84949980 Feature: global flag for init (#149)
* add-user accept ssh-key path and prompt for password

* added global flag for init

* global flag only initializes home environment

* remove unnecessary print statement

* add docs

* fix global path and change constant names

* add help message for global flag

* whitespace
2017-10-18 10:23:30 -07:00
Jacob Freck 8cfb7b2967 Bug: replace dash with underscore in docs (#165) 2017-10-16 15:42:02 -07:00
JS f15d18fd5d Update docs (#148)
* Update 11-custom-scripts.md

* Update 11-custom-scripts.md

* Update 12-docker-image.md

* Update 13-configuration.md

* Update 20-spark-submit.md
2017-10-05 10:13:26 -07:00
Jacob Freck dd5a5c938a Feature: ssh directly into container (#142)
* add-user accept ssh-key path and prompt for password

* change ssh command to connect directly to container

* add --host flag for ssh

* update docs

* cleaner checking of config options
2017-10-04 15:51:22 -07:00
Timothee Guerin f51f613fe7 Enable docker authentication for private images (#92)
* Docker login working

* fix docker image ignored

* Fix job need to wait for pool to be created

* Fix

* fix

* Fix

* Fix

* Update docs

* Fix cr

* Fix merge issue
2017-10-03 13:30:12 -07:00
Jacob Freck 37c5d873a0 Feature: Support for multiple custom scripts and master-only scripts (#93)
* added IS_MASTER environment variable, moved custom script execution to python

* allow for custom_scripts to check if executing on spark master

* allow for multiple custom scripts

* updated docs, fixed overwrite bug, more precise typing

* added support for script ordering, consolidated file uploading

* rename custom script on upload to prevent conflicts

* removed unnecessary import

* added environment variable IS_MASTER back

* white space

* updated docs

* updated docs

* fix pylint errors

* undo previous commit error

* clearer docs

* removed cli custom script functionality, fixed no custom scripts bug

* fixed pylint errors

* changed location to runOn, removed unused parameters, add error checking

* removed unused parameter

* added information about storage account

* remove unused functions to upload scripts

* changed dtde to aztk
2017-10-02 13:20:11 -07:00
JS f158ceafc5 Docs/update (#128)
* Update README.md

* Update 10-clusters.md

* Update README.md

* Update README.md

* Update README.md
2017-09-30 05:02:25 -04:00
JS 57cfa9dd1d Docs/update (#106)
* Updates to README.md

* typos

* Update 00-getting-started.md

* Update README.md

* Update 30-cloud-storage.md

* faq in readme

* Update 13-configuration.md

* switch to use underscore for consistency

* Update 10-clusters.md

* Update README.md

* Update README.md

* fixes

* fix broken link

* Update 20-spark-submit.md

* update to aztk on README

* aztk fix

* Update 00-getting-started.md

* Update 10-clusters.md

* Update 12-docker-image.md

* Update 13-configuration.md

* Update 20-spark-submit.md

* Update 30-cloud-storage.md

* Update README.md

* Update 10-clusters.md

* vm size link

* dummy commit1

* dummy commit 2

* Update 00-getting-started.md

* Update 13-configuration.md
2017-09-29 22:23:14 -04:00
Jacob Freck c2172b5f64 Bug: add spark conf defaults (#115)
* added default spark conf files to config/

* refactored loading and cleanup of spark conf files

* curated spark conf files and added docs
2017-09-29 13:03:11 -07:00
Jacob Freck 5f12ff66dc Feature: prompt for password (#116)
* prompt for password on cluster create, add secrets validation, add custom ssh priv key

* whitespace

* fixed error

* remove ssh_priv_key from secrets template, add docs

* updated error message, fixed typo

* fixed inaccurate message
2017-09-29 11:16:54 -07:00
Jacob Freck 859c6fa6c8 Feature: ssh configuration file support (#77)
* added ssh.yaml configuration file

* added jupyter support, better format for ssh.yaml instructions

* refactored read_conf_file method

* rename master ui to web ui

* fixed typo in comments

* added documentation for ssh.yaml

* refactored merge and _merge_dict methods

* changed default ssh experience to require --id cli parameter

* renamed conflicting documentation

* improved docs, changed default port forwarding to standard ports

* fix typo
2017-09-21 17:45:57 -07:00
Jacob Freck 4ae20caa14 Feature: added azb spark init command (#79)
* added command azb spark init

* azb spark init no longer overwrites existing files

* added documentation for azb spark init

* init renames secrets template file automatically

* updated docs to reflect changed workflow

* fixed bug where file rename would fail if file exists
2017-09-20 15:36:01 -07:00
Jacob Freck 611cc584cf Feature: config files (#65)
* moved secrets config template to config directory

* updated new default file location for secrets.cfg

* added ConfigObject class

* added template for cluster config file

* updated .gitignore to include cluster.yaml

* updated cluster.yaml.template to include more fields

* added read_config_file method to ConfigObject, modified cluster_create.py to use cluster.yaml configuration by default

* added pyyaml to requirements.txt

* added documentation for cluster.yaml config file

* added constant for default cluster.yaml location

* changed secrets.cfg to secrets.yaml, added SecretsConfig object

* removed secrets.cfg.template

* improved error messages for bad secrets.yaml config

* changed config/ directory to .thunderbolt/

* remove old unused code

* removed unnecessary ignored file

* renamed configuration class to ClusterConfig and added merge method with error checking

* collapsed cluster.yaml file

* updated cluster.yaml.template to new collapsed format

* better descriptions of cluster config settings

* added error checking to cluster config settings

* added back accidentally removed code

* removed template for cluster.yaml, added cluster.yaml

* renamed .thunderbolt/ to config/, added defaults for cluster.yaml

* fixed typo in secrets.yaml.template

* added .thunderbolt/ directory to .gitignore

* fixed bug when adding user with ssh_key

* code cleanup

* updated docs

* removed redundant docs, added instructions to secrets.yaml.template

* fixed typo

* refactored read_config_file method and changed default cluster id

* removed unnecessary ignored file

* changed format for cluster.yaml instructions

* added back docker-repo support

* refactored merge and merge_dict methods
2017-09-19 14:08:26 -07:00
Jacob Freck 43dd8e57f1 Bug: Move commands under app to cluster (#82)
* moved azb spark app commands to azb spark cluster

* fixed docs to reflect new cli structure

* renamed commands for clarity
2017-09-19 08:29:23 -07:00
JS 03baff6a26 Feature: custom docker images (#68)
* symlink /home/spark-version to /home/spark-current

* working checkpoint

* working checkpoint

* docker readme update

* removed deprecated dockerfile

* docker cli + docs for docker

* docs update

* docs update

* remove unneeded comments

* docker version and docs updates, misc fixes
2017-09-15 20:57:16 -07:00