Change Log
v0.5.0-rc.0 (2019-03-28)
Fixed bugs:
- kfctl apply failed for invalid spec.version when installing CRD tfjobs.kubeflow.org #2634
- Wrong service account is used if a project ID has a prefix for its org #2244
- PytorchJob & operator not installed by kfctl for non GCP platforms in 0.4 #2212
- kfctl.sh fails if a user isn't explicitly added to the IAM policy #2186
Closed issues:
- kfctl attribute deleteStorage is appearing in app.yaml when platform is not gcp #2826
- kfctl delete fails when no platform or minikube is specified #2809
- kfctl generate fails "jupyter" is not a component or a module #2796
- kfctl apply fails to create k8s resources; tries to deploy to 127.0.0.1 #2791
- ipName and zone are being written to app.yaml even when no platform is specified #2773
- kfctl Incorrect ksonnet OptionServer is specified when no platform is provided #2761
- kfctl apply should fail early if no client id/secret environment variable #2752
- kfctl presubmit test failure #2725
- Jupyter resource should indicate when notebook is running #2723
- kfctl apply fails trying to deploy to 127.0.0.1 #2722
- make build-kfctl fails during test #2715
- kfctl needs to support passing swagger spec to ks init from file #2709
- kfctl creates invalid IAM policy binding file if email not set #2707
- kfctl tries to call ks init even on ks generate platform #2706
- kfctl go doesn't exit with non-zero exit code on failure #2699
- "make build" under "bootstrap" module fails due to some go modules missing under /tmp inside Docker. #2696
- spark-operator test is flaky #2693
- kfctl config issue #2691
- tf-operator Docker image based on Cent OS image with a security flaw #2688
- Store Azure Secrets during setup #2675
- kfctl could not generate kfApp no hostname #2670
- kfctl doesn't create secret with username and password for basic auth #2661
- kfctl Not actually setting GCP IAM roles #2659
- kfctl apply requires oauth client id and secret even if using basic auth #2658
- kfctl generate won't succeed if first attempt failed #2657
- kfctl generated yaml fails validation #2653
- Central Dashboard Label is spelled incorrectly #2652
- Error in application component #2642
- Inject common config into Jupyter Pods #2641
- kfctl - provide option (default) to statically link ksonnet and kustomize into kfctl #2635
- kfctl apply failed #2622
- Central Dashboard - Rename "JupyterHub" to "Notebooks" #2612
- User inside Notebook is root and not jovyan #2602
- Set the base_url flag in Jupyter Pod #2595
- jupyter turn off token authentication #2593
- Ambassador routes for new jupyter notebooks don't work #2590
- Central dashboard doesn't render; lots of 404s #2582
- Automate image builds of Jupyter UI #2578
- Multi user for Jupyterhub #2577
- presubmit error: ImportError: cannot import name tf_job_client #2574
- clean up role binding after presubmit tests. #2570
- How can I work around "cannot create directory ‘/home/jovyan/work’: Permission denied" #2567
- Move ksonnet, kustomize implementations out of KfApps.Children #2561
- kfctl: Add a show subcommand #2560
- Create a new repo for kustomize packages #2559
- Update TFDV and TFMA versions in Jupyter images #2551
- Create secret in istio namespace when using istio #2549
- kfctl (golang) move creation of test directories to subdirs under bootstrap #2537
- kfctl (golang) Simplify the child/parent relationships in platforms #2536
- kfctl (golang) Remove hard dependency on GOOGLE_APPLICATION_CREDENTIALS #2535
- openshift 3.11 routes ok, but stylesheets and fonts from Web #2524
- Stop hosting mongo in gcr.io/kubeflow-images-public/mongo:3.4 #2520
- kfctl, replace regex edits with yaml modifications of individual fields #2515
- add additional config files to kfctl for iap, basic_auth, default components #2514
- vizier-db and vizier-core cannot run #2513
- Stop deploying JupyterHub #2498
- Update the central dashboard to redirect to the new jupyter web-app #2497
- kfctl go binary should have an option to skip initi project #2491
- bootstrap period test failure #2489
- The CustomResourceDefinition "scheduledworkflows.kubeflow.org" is invalid #2487
- Calling os.makedirs fails even after we have check for directory existence. #2484
- Add Jupyter WebApp to kfctl #2481
- Cloud NFS PVC needs to specify volumeName with k8s 1.11 #2475
- kfctl: support deploy with basic auth #2472
- Provide a list of required docs for the documentation card on the new landing page #2468
- Upgrade to go 1.12, remove 'static', 'plugins' macros in kfctl (golang) #2460
- Change jupyter images to work with the new jupyter CR #2458
- Issues with the current Go notebook controller #2456
- Update Python scripts based on ".style.yapf" config. #2448
- Failed to resolve server nfs-server.kubeflow.svc.cluster.local: Name or service not known for pipelines #2443
- jupyter spawns pod outside of its namespace #2436
- under bootstrap document the golang version required #2430
- Error creating Profile CR: status.observedGeneration in body must be of type int64 #2423
- add docker-for-desktop platform to kfctl (golang) #2420
- make it easy to configure the foo sample plugin as a plugin or statically linked #2414
- kfctl.sh delete all failed on OSX #2411
- presubmit failing: Quota 'DISKS_TOTAL_GB' exceeded. Limit: 16384.0 in region us-east1. #2401
- kfctl (golang) rename 'ks' directory and app to ksonnet and ksonnet.go respectively #2398
- kfctl - Fetch registry automatically #2397
- DM template fails with GPU enabled #2392
- Clean up trailing whitespaces at the end of lines #2389
- Integrate test_flake8.py into CI system #2388
- I am installing kubeflow and got this error: no matches for kind "Application" in version "app.k8s.io/v1beta1", need help #2380
- "pip install --upgrade pip" is usually a bad practice since it's known to have caused issues. #2375
- add the gcp platform to kfctl (golang) #2370
- Replace CfgFile *viper.Viper in KsApp with a struct and move to group.go in kfctl (golang) #2368
- Have the different kfctl (golang) implementations of KfApp hold a map of KfApp children #2367
- Move --namespace parameter from generate to init in kfctl (golang) #2366
- Remove --packages and --components from 'kfctl generate' (golang) #2365
- Re-enable click_deploy_test #2364
- Failed to connect to Hub API at 'http://jupyter-0:8081/hub/api'. Is the Hub accessible at this URL (from host: jupyter-q)? #2350
- Unable to download after pod starts #2345
- TF-Serving: How to add new Param to Service Prototype (running in Field does not exist) #2326
- Is there a Chinese-language community for Kubeflow? #2325
- TF serving new template should support model from NFS volume #2321
- tf-serving how to add volumes/volumeMounts #2319
- Add Azure Version of tf-serving prototype #2317
- studyjob controller stays in CrashLoopbackoff #2308
- Proposal: Kubeflow Serverless Serving CRD #2306
- Sending predictions through TF-Serving returns 404's when using IAP #2302
- Need help deploying simple model with tf-serving and enable Rest API #2292
- "kfctl.sh apply k8s" stuck in ambassador and return error. #2290
- GCP secrets are not created unless oAuth credentials are provided #2284
- Go based implementation of the profiles custom resource? #2270
- Go based implementation of Jupyter CR? #2269
- Argo UI doesn't work when Ambassador is scaled to more than 1 replicas #2222
- Install profiles CR by default #2177
- Profiles resource needs an E2E test #2176
- Jupyter CR E2E test #2174
- Docs: The reference to the deployer app should be higher in the page #2169
- Notebook manager UI should disable New PVC option if no default storage class is defined. #2157
- Jupyter spawner UI should provide a warning if pod isn't backed by durable storage #2156
- Allow in-cluster traffic through ambassador while enable basic username/password login with ambassador #2149
- Cookie based auth for basic username/password login with ambassador #2148
- login page for basic username/password login with ambassador #2147
- Kubebench operator jsonnet unittest #2134
- nodeselector in jupyter spawner not working #2131
- 0.4 blog post #2102
- Support basic username/password login with ambassador #2094
- Central dashboard should be left handed navigating bar #2093
- Color scheme for central dashboard should match Kubeflow pipelines #2092
- Broken links for k8s-model-server docs #2027
- Jupyter notebook manager UI that uses the new CRD #1995
- Use ISTIO to restrict access to Jupyter Notebooks #1994
- Can not deploy kubeflow on docker-for-kubernetes #1991
- Error creating object: Service "vizier-core" #1982
- ksonnet cleanup don't merge "params and env" #1961
- Add e2etest for katib #1946
- feature request: include 'google-api-python-client' in the JH images #1932
- create cluster-admin role for kubeflow:default service account so that it can create a tfjob #1882
- Remove OpenMPI package in 0.5.0 #1859
- kfctl.sh delete doesn't check/change kubectl context before issuing delete #1669
- add jsonnet tests for all libsonnet files #1542
- JupyterHub users can't create pods on GKE #1281
- Pachyderm and Kubeflow integration #151
- Make JupyterHub More Configurable #56
Merged pull requests:
- Cherrypick to 0.5 #2849 (lluunn)
- remove openmpi from registry file #2840 (kunmingg)
- make createResourceFromFile check first, then create #2839 (lluunn)
- update pytorch image #2837 (lluunn)
- fixes 'kfctl attribute deleteStorage is appearing in app.yaml when platform is not gcp' #2827 (kkasravi)
- Cherrypick #2787 into 0.5 branch #2822 (lluunn)
- update ks for 0.5 #2816 (lluunn)
- Use YAML modifications instead of regex replacement #2815 (gabrielwen)
- Add kfctl E2E test on GCP with IAP #2814 (jlewi)
- fixes 'kfctl apply fails to create k8s resources; tries to deploy to 127.0.0.1' #2813 (kkasravi)
- fixes 'kfctl delete fails when no platform or minikube is specified' #2810 (kkasravi)
- Retag jupyter images with tag v0.5.0 #2805 (zabbasi)
- check CLIENT_ID/CLIENT_SECRET before actual Apply #2803 (gabrielwen)
- Add named context to KUBECONFIG #2802 (gabrielwen)
- kfctl interface in gcp.go/ksonnet.go for deploy app #2801 (kunmingg)
- Stop setting JupyterHub parameters and pull config from PR branch. #2798 (jlewi)
- E2E kfctl go test should verify that Kubeflow is deployed. #2795 (jlewi)
- Update the developer guide for kfctl. #2794 (jlewi)
- Add missing alsologtostderr flag to mpi-operator args #2793 (terrytangyuan)
- Add "update" verb to pdb resource of mpi-job #2792 (terrytangyuan)
- Namespace selector now only shows up when on activity page #2788 (avdaredevil)
- notebook CR shows container status #2787 (lluunn)
- Privacy link implemented with CSS changes #2786 (avdaredevil)
- update pipeline system images to 0.1.13 release #2785 (IronPan)
- add spartakus as component #2784 (gabrielwen)
- update pipeline version to v0.1.13 #2783 (IronPan)
- add jupyter image with tf 2.0 #2782 (zabbasi)
- move admission webhook to gcp #2781 (lluunn)
- update dashboard image #2780 (lluunn)
- Obtain build version from ENV #2779 (prodonjs)
- fixes 'ipName and zone are being written to app.yaml even when no platform is specified' #2774 (kkasravi)
- replace download k8s manifest with ksonnet entry #2772 (kunmingg)
- remove openmpi package #2771 (lluunn)
- Adds Kferror interface #2769 (gabrielwen)
- update katib image for 0.5 #2768 (lluunn)
- cache k8s spec in kubeflow so it can be used by ksonnet lib #2765 (kunmingg)
- added notebook-controller and jupyter-web-app as part of kf deployment #2764 (zabbasi)
- fixes 'kfctl Incorrect ksonnet OptionServer is specified when no platform is provided' #2763 (kkasravi)
- kfctl delete for GCP #2762 (gabrielwen)
- ModelDB V2 support in KubeFlow #2757 (mvartakAtVerta)
- Fix error in iap component #2756 (lluunn)
- kfctl install istio #2755 (lluunn)
- fix make build; set IAM policy following read/modify/write #2751 (kunmingg)
- admission webhook manifest #2744 (lluunn)
- added ReadyReplicas status to notebook-controller #2743 (zabbasi)
- Polish changes: #2733 (avdaredevil)
- Add pdb to mpi-operator clusterrole #2732 (stpabhi)
- Added tf 1.13 to the list of jupyter images #2730 (zabbasi)
- Fix presubmit kfctl_go_test #2729 (gabrielwen)
- update fairing@7f7a66687ceab3ed2838ad29e3ff4b04afe351ab #2728 (r2d4)
- Launch trtserver so that CUDA driver compatibility is enabled #2726 (deadeyegoodwin)
- fixes: kfctl go doesn't exit with non-zero exit code on failure #2721 (kkasravi)
- Admission webhook for inserting credential to pod #2720 (lluunn)
- fixes 'kfctl needs to support passing swagger spec to ks init from file' #2719 (kkasravi)
- fixes 'kfctl creates invalid IAM policy binding file if email not set' #2718 (kkasravi)
- fixes Kfctl clientspec #2691 #2714 (kkasravi)
- Move basic auth username/password to ENV #2713 (gabrielwen)
- Update centraldashboard.jsonnet #2711 (prodonjs)
- Added PVC volumeName #2710 (zabbasi)
- Kfctl go E2E test should deploy kubeflow #2705 (jlewi)
- Fix error in setup_backend #2704 (lluunn)
- Fix param set on ipName and hostname #2702 (gabrielwen)
- Attempt to fix spark-operator and argo-deploy test flakiness. #2694 (jlewi)
- moved the Profiles golang controller kustomize manifests around #2692 (swiftdiaries)
- Create secret for basic login #2690 (gabrielwen)
- fixes 'Install profiles CR by default' #2689 (kkasravi)
- Skip check to oauth_id/oauth_secret if using basic auth #2687 (gabrielwen)
- Add ClusterRole and ClusterRole #2684 (prodonjs)
- fix pipeline-viewer-controller access kube-apiserver 403 #2683 (hzxuzhonghu)
- fix IAM policy set #2681 (kunmingg)
- Don't create PVCs if no default StorageClass is set #2679 (kimwnasptd)
- Serve Pytorch v1beta2 and v1beta1 crd #2677 (johnugeorge)
- store azure secrets to cluster during setup #2676 (rakelkar)
- fixes 'kfctl generated yaml fails validation #2653' #2674 (kkasravi)
- fixes 'add kustomize support to kfctl' #2673 (kkasravi)
- Scaffolding for E2E test for the new kfctl go binary. #2672 (jlewi)
- Add pre/post submit jobs for Jupyter UI #2671 (kimwnasptd)
- Update the ROADMAP with more details about how we are approaching 1.0 #2669 (jlewi)
- Add Namespaces and Activities from API server. #2667 (prodonjs)
- Updated TFMA and TFDV versions of Jupyter images #2666 (zabbasi)
- Add GCB scripts for Jupyter UI #2656 (kimwnasptd)
- update DM config file #2655 (kunmingg)
- Fix iap libsonnet: avoid empty object #2654 (lluunn)
- Getting started card completed #2651 (avdaredevil)
- Use multiple projects in the load test for 1-click-deployment #2648 (zhenghuiwang)
- add retry around ks init #2643 (kunmingg)
- update pipeline system images to 0.1.12 release (#2637) #2640 (gaoning777)
- Updated jupyter image paths to point to the latest ones #2638 (zabbasi)
- update pipeline system images to 0.1.12 release #2637 (gaoning777)
- Fix kfctl: didn't pass flag client_id/secret #2632 (lluunn)
- Fix typo in kfctl #2631 (lluunn)
- Setting the base_url flag of Jupyter images #2627 (zabbasi)
- Separate front-end components and add mock activities service. #2626 (prodonjs)
- store non exist parameter differently #2625 (IronPan)
- Fix ambassador route and base_url in notebook controller #2620 (lluunn)
- Add prodonjs to OWNERS #2613 (prodonjs)
- Updated jupyter spawning options to the latest images. #2611 (zabbasi)
- Adding additional printer columns to display status #2609 (johnugeorge)
- Updated jupyter images to allow any origin #2606 (zabbasi)
- Add build using GCB support to profile controller #2603 (lluunn)
- Removing prune for Subresource field in CRD #2598 (johnugeorge)
- Adding backOffLimit to metric collector spec #2597 (johnugeorge)
- Turning off jupyter images authentication #2591 (zabbasi)
- Update the jupyter-web-app Image #2589 (kimwnasptd)
- Semi-working kfctl go binary #2587 (gabrielwen)
- Updating python package imports related to tf-operator #2581 (zabbasi)
- Add ESLint to enforce Google JS style in public/ #2579 (prodonjs)
- Fixed Dashboard v0.0.1 w/ Polymer3 + Webpack #2576 (avdaredevil)
- Refactor for Express + Webpack build stack. #2572 (prodonjs)
- Enabling EFS on AWS for Tensorboard #2569 (jeremievallee)
- new deploy API following Open API format #2564 (kunmingg)
- Add status to notebook #2558 (lluunn)
- Add Azure support for kfctl.sh #2557 (ritazh)
- Temporary change to pin kubeflow to the latest verified tf-operator commit #2555 (zabbasi)
- update katib component #2553 (hougangliu)
- Updated Jupyter image paths for using in notebook controller #2552 (zabbasi)
- fixes 'kfctl (golang) Simplify the child/parent relationships in platforms' #2547 (kkasravi)
- Fixed Version bump #2544 (avdaredevil)
- fix backendconfig name #2542 (lluunn)
- port ksonnet changes to pipeline branchs #2541 (IronPan)
- Update dashboard image to point to newest build #2540 (avdaredevil)
- fix command in bootstrap dev guide #2538 (lluunn)
- update pipeline image version to v0.1.10 #2534 (IronPan)
- Golang profile controller #2533 (lluunn)
- Pytorch v1beta2 support #2532 (johnugeorge)
- Update dev guide for bootstrapper #2527 (lluunn)
- Fix typo #2523 (ib-steffen)
- Fix periodic test #2519 (lluunn)
- Added a windows equivalent script for the makefile for central-dashboard #2518 (avdaredevil)
- Change jupyter images to work with the new jupyter CR, issue# 2458 #2517 (zabbasi)
- Added unit tests for GetUpdatedPolicy at gcpUtils.go - bootstrap/app #2516 (mwarzynski)
- Add optional parameter ambassadorNodePort #2511 (iankoulski)
- Small fix for notebook controller #2506 (lluunn)
- create permanent storage from click-to-deploy app #2504 (IronPan)
- name the PD generically and assign to minio #2503 (IronPan)
- Add myself to Central Dashboard owners #2502 (avdaredevil)
- Bump up pipeline version to 0.1.10 (#2496) #2501 (hongye-sun)
- Add jupyter-web-app to kfctl #2500 (kimwnasptd)
- Readme added for Kubeflow Central-Dashboard Component #2499 (avdaredevil)
- Bump up pipeline version to 0.1.10 #2496 (hongye-sun)
- Add kubebench-operator jsonnet unit test #2495 (andreyvelich)
- add missing fields #2494 (kunmingg)
- update jupyter-web-app image path #2493 (gabrielwen)
- fix workflow name #2492 (lluunn)
- Update Jupyter WebApp redirect URL and libsonnet #2490 (kimwnasptd)
- Add build with GCB support to notebook controller #2486 (lluunn)
- pass username & password hash as k8s secret instead of ks parameters #2485 (kunmingg)
- Wrap os.makedir calls inside try,catch block #2483 (ashahba)
- add config file entries for deployment on GCP #2482 (kunmingg)
- Add script to upgrade kubeflow pipeline #2479 (IronPan)
- Refactor the Kubeflow Pipeline ksonnet to better support upgrade #2478 (IronPan)
- fix test image #2477 (lluunn)
- Support multiple CRD versions for TFJob #2474 (richardsliu)
- Fix typo in profile CR #2473 (lluunn)
- cherrypick pipeline deployment changes to pipelines branch #2470 (IronPan)
- add selector for minio deploy #2466 (IronPan)
- Reduce NFS disk size from 200 -> 20 GB #2464 (IronPan)
- Notebook controller fixes #2463 (lluunn)
- fix test name on testgrid #2462 (lluunn)
- add basic auth as alternative of iap access to deploy app; #2455 (kunmingg)
- Skipped context check on OSX #2453 (svalleru)
- bootstrapper e2e test with istio #2451 (lluunn)
- Fix testgrid test name #2450 (lluunn)
- fix artifact dir #2449 (lluunn)
- Update python code styles based on what's provided in .style.yapf #2447 (ashahba)
- Deploy NFS only when explicitly set the nfs parameter #2446 (IronPan)
- Add mpijobs and pods/log in kubebench rbac #2442 (Jeffwan)
- part of 'add the gcp platform to kfctl (golang)' #2440 (kkasravi)
- NB controller fix #2439 (lluunn)
- ingress entry for basic auth #2438 (kunmingg)
- make click-deploy-test upload junit artifacts #2437 (lluunn)
- Use command array in mpi job prototype #2434 (Jeffwan)
- fixes 'add docker-for-desktop platform to kfctl (golang)' #2433 (kkasravi)
- Dashboard immediate changes complete #2432 (avdaredevil)
- fix click_to_deploy test #2431 (lluunn)
- Cherry pick storage changes to pipelines branch #2429 (IronPan)
- fixes 'make it easy to configure the foo sample plugin as a plugin or statically linked' #2428 (kkasravi)
- Bump pipeline version to 0.1.9 #2427 (IronPan)
- fixes 'kfctl - Fetch registry automatically' #2426 (kkasravi)
- provision minio using pv or pd directly. only create nfs conditionally #2425 (IronPan)
- Specify CRD version using v1beta1 schema #2424 (neuromage)
- fixes kfctl (golang) rename 'ks' directory and app to ksonnet and ksonnet.go respectively #2419 (kkasravi)
- Deploy viewer CRD controller for Kubeflow Pipelines. #2416 (neuromage)
- port leftover diff from kfapp-ksapp branch after kfctl merge #2410 (ashahba)
- Wait for at most 100s on namespace deletion #2409 (IronPan)
- Create kfctl command #2407 (kkasravi)
- Profiles e2e test #2404 (lluunn)
- Integrated test_flake8.py into CI system #2403 (bikramnehra)
- fix click_deploy_test #2400 (lluunn)
- sync pipelines branch to master #2396 (IronPan)
- add query string to click-to-deploy app to support arbitrary version #2384 (IronPan)
- deploy storage through kfctl.sh #2373 (IronPan)
- upgrade conda and tini to newer versions and stick with pip version 19.0.1 for now #2372 (ashahba)
- Remove all unnecessary white spaces at the end of lines #2371 (ashahba)
- Combine check for kf client dependencies into one utility function check_installed_deps #2369 (ashahba)
- Propagate Disk name to ks application #2363 (IronPan)
- one-click-deploy fix - break if deploy successfully #2362 (IronPan)
- Dashboard v1 UI rewrite done #2361 (avdaredevil)
- Jupyter UI that manages Notebook CRs #2357 (kimwnasptd)
- update component name #2355 (kunmingg)
- Add Deployment manager job to create permanent storage #2353 (IronPan)
- Add annotation to istio ingressgateway to enable IAP #2352 (lluunn)
- use Istio for JWT validation #2348 (lluunn)
- update config and README for local test #2346 (kunmingg)
- Improve the delete_deployment script. #2344 (jlewi)
- upgrade ks app for E2E test to 0.3.0 #2338 (stpabhi)
- Golang notebook controller #2336 (lluunn)
- add condition to create pv and pvc for kubeflow pipeline #2335 (IronPan)
- Integrate Jupyter e2e test into CI #2333 (jlewi)
- use wrong var in katib test #2332 (hougangliu)
- add in-cluster nfs #2328 (IronPan)
- GCP secrets cherrypick #2324 (DanSanche)
- Create an E2E test for the Jupyter custom resource. #2323 (jlewi)
- Clean up - Remove the load sample job #2322 (IronPan)
- basic auth login UI #2316 (kunmingg)
- bootstrapper install istio #2315 (lluunn)
- create secrets before checking for oauth env vars #2313 (DanSanche)
- Added context check before issuing delete #2305 (svalleru)
- increase the number of mktemp template variables in download.sh #2304 (tnthornton)
- Do best practices for kfctl_test Argo workflow #2303 (jlewi)
- update katib studyjob controller manifest #2301 (hougangliu)
- add pipeline team to the owner file #2299 (IronPan)
- Bump ScheduledWorkflow CRD version to v1beta1 #2298 (neuromage)
- add neuromage to owner of pipeline registry #2297 (IronPan)
- Add Katib e2etest for presubmits #2296 (richardsliu)
- Update pipelines version to 0.1.8 #2295 (yebrahim)
- Update seldon to version 0.2.5 #2293 (cliveseldon)
- Add xgboost to image, fix credential-helper #2286 (r2d4)
- adjust param name for v0.3 compatibility #2285 (kunmingg)
- Define signer_email in main to remove flake8 F821 #2229 #2283 (philtremblay)
- add gcp to registry #2280 (kunmingg)
- Update README for newer TRTIS releases #2279 (deadeyegoodwin)
- Update kfctl to export PLATFORM and GKE_API_VERSION #2278 (gabrielwen)
- fixes "add jsonnet tests for all libsonnet files" #2264 (kkasravi)
- basic auth service #2262 (kunmingg)
- Extend Katib documentation to use not in GKE #2260 (andreyvelich)
- Create test_flake8.py #2247 (cclauss)
- Added 'google-api-python-client' to JH images #2241 (svalleru)
- Fix TabError in sync_images.py #2240 (cclauss)
- Add Deprecation Notice to README of openmpi package #2127 (everpeace)
- Add Spark Job #1467 (holdenk)
v0.4.1-rc.4 (2019-01-15)
v0.4.1 (2019-01-15)
Merged pull requests:
v0.4.1-rc.3 (2019-01-15)
Merged pull requests:
v0.4.1-rc.2 (2019-01-15)
Closed issues:
- Suggest KubeFlow to use ksonnet 0.13.1 as minimal version #2250
- Suggest to update the imagePullPolicy to "IfNotPresent" for Pipeline component #2249
Merged pull requests:
- add missing pkg to registry #2277 (kunmingg)
- add missing pkg to registry #2276 (kunmingg)
- Update gitignore for all VIM temp files. #2275 (gabrielwen)
- use istio 1.1 #2259 (lluunn)
- Branch v0.4.0 fix #2255 (cheyang)
- Update image pull policy for pipeline #2251 (jinchihe)
v0.4.1-rc.1 (2019-01-12)
Fixed bugs:
Closed issues:
- Create application CR's per package and enable a dependency tree that points to the top application CR (kubeflow) #2257
- seldon gRPC #2239
- Kfctl requires zone to be specified. Could be potentially automatically inferred. #2209
- ksonnet installation of core giving 404 #2201
- Cut 0.4.0 Release #2098
- Investigate using Istio for jwt validation #1907
- Docs for using ModelDB to track models using the API #1861
- Deploy Model DB with durable storage #1860
- Change the application component to agree with kubernetes-sigs application CRD #1856
- Can't install >v0.1.3 KubeFlow to Microk8s #1854
- Provide an option to deploy an application-controller that will deploy the kubeflow application in-cluster #1831
- Not able to build from source: Makefile:17: recipe for target 'presubmit' failed #1812
- Create a kubeflow application in the GCP Marketplace #1691
- migrate bootstrapper from glide to dep #1520
- Make ambassador number of replicas configurable #1112
- proposal for tooling #250
Merged pull requests:
- remove ghost registries from 0.4 branch #2266 (kunmingg)
- remove ghost registries; make kf version env var on deploy app #2265 (kunmingg)
- use golang:1.11.2 as base image; more ksonnet 0.13.1 API change #2258 (kunmingg)
- remove load sample job from pipeline config #2254 (IronPan)
- bump pipeline sdk version to 0.1.7 #2253 (IronPan)
- Update Pipeline version to v0.1.7 #2252 (IronPan)
- Minor readability modifications in test_jsonnet.py #2246 (cclauss)
- update params following ksonnet v0.13.1; update image build file path #2245 (kunmingg)
- update jupyter hub with customized docker registry #2243 (cheyang)
- split debug case into separate docker file so we don't expose port unnecessary #2242 (kunmingg)
- argo s3 artifact repository configuration added #2238 (pbrzostowski)
- Fix katib repo for the docker registry which doesn't support sub namespace #2237 (cheyang)
- Adding pytorch operator to non gcp platforms #2236 (johnugeorge)
- set jupyterHubAuthenticator explicitly even for default values #2235 (kunmingg)
- Add HorizontalPodAutoscaler to TF-Serving component #2231 (Youki)
- Sync up Kubeflow official docker images to user's registry #2230 (cheyang)
- Undefined name: from six.moves import xrange for line 50 #2228 (cclauss)
- Undefined name: 'false' --> 'False' #2227 (cclauss)
- Adding pytorch operator to non gcp platforms #2225 (johnugeorge)
- Make ambassador number of replicas configurable #1112 #2224 (agemocui)
- Consistent formatting of prototype docs #2221 (TimZaman)
- Profiles resource should support setting resource quotas on namespaces #2182 (stpabhi)
- fixes Change the application component to agree with kubernetes-sigs application CRD #2154 (kkasravi)
v0.4.0 (2019-01-05)
Closed issues:
- kfctl.sh should check the name of the deployment is < 25 characters on GCP #2189
- sync-notebook needs to look for a Deployment kind that is version independent #2185
- Can we scope SeldonCore to a namespace #1452
Merged pull requests:
- Kubebench component update cherrypick #2220 (r2d4)
- Add kubebench-dashboard package #2219 (xyhuang)
- simplify cloud shell; add instruction message #2218 (kunmingg)
- publish v0.4.0 on web app #2217 (kunmingg)
- Update kubebench tags for 0.4.0 release #2216 (richardsliu)
- add status tag to deployments_failure; exclude failure caused by quota or permission issue #2215 (kunmingg)
- Cherrypick v0.4-branch: kfctl: add default zone if not specified (#2210) #2214 (r2d4)
- sync-notebook needs to look for a Deployment kind that is version independent #2211 (stpabhi)
- kfctl: add default zone if not specified #2210 (r2d4)
- sync-notebook needs to look for a Deployment kind that is version independent #2194 (stpabhi)
- Check the name of the deployment is < 25 characters on GCP #2191 (jcastill)
- Update tf-serving.libsonnet for replicas #2113 (Youki)
- Remove Seldon Cluster Roles by default #2039 (cliveseldon)
v0.4.0-rc.3 (2019-01-03)
Closed issues:
- Deploy script fails with unsupported k8s version 1.9.6-gke.6 #2193
- Use consistent naming in release artifacts #2183
- Docs: referenced version of KF is not the latest #2170
- Add timestamps in the log messages of the click to deploy app #2166
- Don't wait for IAP redirect when Skip IAP is selected #2165
- util.sh check_install falls back to check for ks #2162
- Create changelog for 0.4 release #2133
- Can kubeflow run a job using multi-nodes' resources ? #2130
- Reproduce FastAI dawn-benchmark using Kubeflow #2119
- bootstrap golang modules #2022
- Argo UI not working #2021
- Which jupyterhub disks param name to use when assigning GCFS as PV #1934
- Kubeflow on GKE not using the default gcp zone configured in the cloud shell. #1914
- Cherry-pick reduced minikube scope in 0.3 patch #1891
- How to change the TFJob termination policy to waiting all workers to complete #1796
- TFJob will not run if the TFJob CRD includes scope: Namespaced #1606
Merged pull requests:
- Automated cherry pick of #2204: Update katib for v0.4.0 release Cherry pick of #2204 on v0.4-branch. #2204: Update katib for v0.4.0 release #2205 (richardsliu)
- Tag and update katib for v0.4.0 release #2204 (richardsliu)
- Cherrypick - Add retry to loading the sample job (#2199) #2203 (IronPan)
- Fix OWNERs file for gcp package. #2202 (jlewi)
- Add CUJs for multiuser kubeflow to the ROADMAP. #2200 (jlewi)
- Add retry to loading the sample job #2199 (IronPan)
- Add log timestamps, and do not wait for IAP if skipped IAP #2198 (abhi-g)
- kfctl.sh shouldn't fail if a user isn't explicitly in the IAM policy - 0.4 patch. #2188 (jlewi)
- kfctl.sh shouldn't fail if a user isn't explicitly in the IAM policy. #2187 (jlewi)
- Add the TF 1.11 & TF 1.12 Jupyter images to the spawner drop down box. #2181 (jlewi)
- Add the TF 1.11 & TF 1.12 Jupyter images to the spawner drop down box. #2180 (jlewi)
- Add instructions for enabling the periodic tests on the new release branch #2179 (jlewi)
- "argo" should be "Argo" #2168 (suigh)
- Add log timestamps, and do not wait for IAP if skipped IAP #2167 (abhi-g)
- Fix check install to use param for type as well as which check #2163 (holdenk)
- update kf version to 0.3.5; add retry around IAM policy edit #2161 (kunmingg)
- Update ROADMAP with link to CUJ for build/train/deploy from notebook. #2143 (jlewi)
- parameterize mysql and minio image #2109 (IronPan)
- kfctl functional description and changes required #2091 (kkasravi)
- fixes 'bootstrap golang modules' #2090 (kkasravi)
v0.4.0-rc.2 (2018-12-21)
Closed issues:
- Split up core ksonnet package into separate packages #42
Merged pull requests:
- rename kubeflow/core to kubeflow/common #2160 (jlewi)
- Cut 0.4.0 release branch #2150 (r2d4)
- rename kubeflow/core to kubeflow/common #2107 (kkasravi)
- use service account token to make request, set token life time to be 15 mins. #2062 (kunmingg)
v0.3.5 (2018-12-21)
Merged pull requests:
v0.4.0-rc.1 (2018-12-21)
Closed issues:
- Issue installing Kubeflow 0.3 #2128
- Update ksonnet package for Katib #2126
- master head: cloud-endpoints controller not respond to newly created CRD #2125
- Cloud Endpoints Controller not working on master #2120
- Ks package for kubebench operator #2110
- CrashLoopBackOff pods after 'apply k8s' #2077
- no persistent volumes available for this claim and no storage class is set; minio and pipelines #2076
- Issue while deploying kubeflow on GKE using command line- killed message in Cloud Shell #2075
- Ambassador crashing in kops cluster #2074
- deploy.sh should not assume uuidgen is present #2072
- setup-minikube.sh doesn't install ksonnet if missing #2068
- New Jupyter spawner Ui doesn't allow entering a custom image #2060
- add a jobs component #2058
- Upgrade ks to 0.13.1 #2031
- Auto-scaling for Seldon serving? #2029
- Upgrade bootstrapper to use go modules #2023
- Grant kubeflow user service account CMLE permission #2012
- Seems to be an issue in kfctl.sh re existence of dir ${DEPLOYMENT_NAME} #2009
- trying to deploy a component using ks after initial deployment fails #2006
- Move components in core into a GCP specific ksonnet package #1996
- Installation woes - ./kfctl.sh: line 189: env.sh: No such file or directory #1993
- TF-Serving http-proxy error #1979
- TFServing template needs to convert numGpus from string to int #1972
- Expose Istio's grafana dashboard #1969
- Integrate pipelines into Kubeflow click to deploy #1967
- Figure out Istio installation in Kubeflow #1909
- upgrade cloud-endpoints-controller to use metacontroller/metacontroller:v0.3.0 #1824
- Jupyter image with NVIDIA Rapids #1806
- GCP
- On Premises deployment of Kubeflow fails unless ks flags are set v0.3 #1615
- Test Flake git clone fails RPC error #1178
- Test Flake
- Document Serving for PyTorch models #1117
Merged pull requests:
- update pipeline version #2151 (IronPan)
- gcp-deployer
- Fix one click deployer url in README #2138 (yebrahim)
- Fixes initialization script does not correctly detect ks installation #2137 (holdenk)
- Update ksonnet package for Katib #2135 (andreyvelich)
- Create cloud sql database for ml pipeline #2123 (IronPan)
- fixes "Cloud Endpoints Controller not working on master" #2122 (kkasravi)
- Enable stackdriver agents for GKE by default. #2118 (jlewi)
- Fix the modeldb ambassador route. #2117 (jlewi)
- Fix the check if directory named ${DEPLOYMENT_NAME} exists #2115 (jlewi)
- Update kubeflow components to v0.4.0 #2112 (richardsliu)
- Kubebench operator ksonnet package #2111 (andreyvelich)
- Add yebrahim to pipeline ksonnet owner #2108 (IronPan)
- add r2d4 to approvers #2106 (r2d4)
- Replace with std.asciiUpper with supported by required version of jsonnet #2104 (Jeffwan)
- Don't grant cloudservices account IAM admin privileges. #2101 (jlewi)
- ks init: --skip-default-registries + append ${KS_INIT_EXTRA_ARGS} #2100 (doodlesbykumbi)
- Update aws parameters in serving and dashboard #2097 (Jeffwan)
- Update private IP configuration with new GKE API #2085 (IronPan)
- Correct typo, "for ever image" should be "for every image" #2084 (suigh)
- Fix variable in setup-minikube.sh #2082 (andreyvelich)
- Owners file for jupyter ksonnet #2081 (pdmack)
- fixes 'upgrade cloud-endpoints-controller to use metacontroller/metacontroller:v0.3.0' #2080 (kkasravi)
- JH Spawner Enhancements - Fixes #2060 #2079 (ioandr)
- Add job to load pipeline samples #2071 (IronPan)
- Enforce use string type param for S3_USE_HTTPS and S3_VERIFY_SSL #2067 (Jeffwan)
- Temp workaround for RAPIDS shared lib issues #2066 (pdmack)
- Update katib components #2064 (richardsliu)
- update kf version to v0.3.4 #2063 (kunmingg)
- Fix a version number bug and add tf-batch-prediction into kubeflow registry #2061 (yixinshi)
- unify central dashboard layout #2056 (kunmingg)
- Updates for RAPIDS AI v0.4.0 image #2053 (pdmack)
- Add modeldb as a package for kubeflow #2050 (mpvartak)
- refactor monitoring metrics #2048 (kunmingg)
- upgrade central dashboard image to v0.3.4 to include pipeline ui link #2047 (kunmingg)
- Update Pipeline SDK version to v0.1.3 #2046 (IronPan)
- fixes 'Move components in core into a GCP specific ksonnet package' #2043 (kkasravi)
- Easy for code reading #2042 (wangkeqiang123)
- always convert numGpu to int #2034 (lluunn)
- Support running batch prediction by launching a Dataflow job on GCP #2026 (yixinshi)
- Give jupyter-notebook pods/log permission for fairing #2015 (r2d4)
v0.3.4-rc.2 (2018-12-06)
v0.3.4 (2018-12-06)
Closed issues:
- ImagePullBackOff for images on GCR within same GCP project as GKE cluster #2044
- ksonnet runtime error Seldon #2001
- Better testing output for jsonnet #1988
- Install pipelines SDK in Jupyter images #1968
- Deploy pipelines as part of kfctl.sh #1966
- Create a pipelines ksonnet package #1965
- ksonnet env should override params #1924
- Use ksonnet modules to better organize applications #1922
- Create an openvino component #1913
- Kubernetes Engine for Kubeflow Quickstart Guide - Proposed Fixes #1898
- Tooling to convert notebook to docker container #1857
- Fire off TFJob from Jupyter Notebook #1240
- JupyterHub spawner - Highlight PV requirement #541
- Extend KubeSpawner and its UI to handle Persistent Volume Claims #34
Merged pull requests:
- Merge Pipeline integration change to v0.3 #2055 (IronPan)
- More istio manifest, and README #2051 (lluunn)
- Update Pipeline version to v0.1.3 #2045 (IronPan)
- delete config set in gcp-click-to-deploy dir since its already merged with kfctl config #2041 (kunmingg)
- Update OWNERS #2040 (ellis-bigelow)
- Add fairing library to jupyter images #2038 (r2d4)
- add permission to get pod's log #2037 (IronPan)
- Add roles/dataproc.editor for kubeflow user account #2035 (IronPan)
- Grafana routing rule #2025 (lluunn)
- Istio manifest #2020 (lluunn)
- Activate CMLE during kfctl deployment #2018 (IronPan)
- Activate CMLE during one-click deployment #2017 (IronPan)
- grant KF user SA CMLE admin permission #2013 (IronPan)
- Katib 0.3 cherrypick #2011 (texasmichelle)
- Add latest stable DSL SDK to jupyter image #2008 (IronPan)
- fixes 'trying to deploy a component using ks after initial deployment fails' #2007 (kkasravi)
- use backoff module to handle IAM policy retry with randomized wait time #2004 (kunmingg)
- Fix bad merge conflict on v0.3-branch #2003 (r2d4)
- Adding scope for tf job dashboard #2002 (johnugeorge)
- Make it easier to debug jsonnet tests #1989 (jlewi)
- deploy app loadtest, python part #1986 (kunmingg)
- fixes 'ksonnet env should override params' #1939 (kkasravi)
- Support multiple PVCs in default JH UI, add example 3rd-party UI #1918 (ioandr)
- fixes 'Create an openvino component' #1916 (kkasravi)
v0.3.4-rc.1 (2018-11-26)
Closed issues:
- Missing scope in CRD spec? #1985
- Error persistentvolumeclaim "nfs" not found while host model from NFS #1964
- Fix Katib image tags in v0.3-branch #1953
- user-gcp-sa secret not show up with --platform gcp #1950
- Error deploying Kubeflow using kfctl.sh with --platform gcp #1949
- Can TFJob be parameterized by replica index #1943
- Dashboard cannot list tf jobs #1883
- Issues installing 0.3 #1871
- flaky configure_envoy_for_iap.sh #1807
- create a notebook controller that can replace jupyterhub and uses k8 native auth #1769
- Build Jupyter notebook images for TF 1.11 and 1.12 #1740
- kfctl.sh apply platform assumes availability of yaml python library #1739
- Unable to install Kubeflow on an existing 2 node ubuntu cluster; Docs need to be fixed #1711
- gcp
- PyTorch and TFJob v1beta1 API #1584
- Error in the generated YAML #1524
- Exposing service using Nginx Ingress Controller and load balancing question #1214
- Problems upgrading services of type NodePort; spec.clusterIP: Invalid value: "": #1145
- Simplify Image Tag management for releases #1060
- TFServing supports collection of metrics with prometheus #1036
- Add a parameter for clusters without RBAC #1027
- Can not bring up Jupyter Notebook #672
- Clusters created during e2e tests should be GC #560
- ksonnet + openshift picking up wrong k8s version number #521
- How to access notebook on KUBO? #294
- Investigate using Docker to run Kubernetes and Kubeflow locally #218
Merged pull requests:
- Pipelines 0.3.3 cherrypick #1998 (r2d4)
- Add texasmichelle to OWNERS #1992 (texasmichelle)
- Fix CRD scopes. #1987 (jlewi)
- Remove IAM permission from DM service account #1984 (kunmingg)
- add pipeline as part of kf deployment #1981 (IronPan)
- Add IronPan to pipeline ks package owner #1980 (IronPan)
- Adding Pytorch v1beta1 image #1978 (johnugeorge)
- Check for python module pyyaml when using kfctl.sh on platform GCP. #1975 (IMBurbank)
- Add IronPan as Argo owner #1974 (IronPan)
- initial change - add pipeline entry to kubeflow registry #1973 (IronPan)
- Initial version of a Kubeflow Roadmap. #1963 (jlewi)
- Fix dashboard URI and increase poll interval. #1962 (abhi-g)
- Enable node autoprovisioning in v1beta1 clusters #1959 (richardsliu)
- Add prometheus annotation to tf serving service #1958 (lluunn)
- Add docker-for-desktop platform #1954 (rogaha)
- make IAP optional for click-deploy app #1927 (kunmingg)
- miscellaneous updates and enhancements for shell scripts #1923 (ashahba)
- make prober always sleep before execute #1908 (kunmingg)
- fixes 'create a notebook controller that can replace jupyterhub and uses k8 native auth' #1855 (kkasravi)
- Separate logic for setup_backend.sh #1841 (r2d4)
v0.3.3 (2018-11-15)
Closed issues:
- Update spartakus image to v1.1.0 to fix cloud providers' annotation #1940
- Use iam-policy value for EMAIL if case-sensitive #1936
- Following Quickstart Guide - Can't Deploy Kubeflow #1929
- Enable kubeflow running on POWER #1928
- Do we have support in documentation for Inference? #1905
- Update katib manifests #1903
- Deploy same components in click-to-deploy as kfctl #1892
- Kubeflow 0.2.7 Spawning an image with GPU #1890
- ksonnet package for Jupyter/JupyterHub #1886
- Ksonnet package organization for TFJob and PyTorch #1885
- Rollout model with istio and/or ambassador #1844
- create a self-serve component for data-scientists #1842
- TensorFlow Jupyter Notebook images 1.9 and above in gcr.io cannot see GPUs #1828
- tf-notebook
- Create 0.3.1 Release #1761
- TF data validation and tfma are in 1.9 images but not 1.10 #1718
- GCP
- Argo UI doesn't work behind ambassador #1694
- gcp
- kfctl
- Use Istio 1.0 #1309
- TF serving supports request id #1220
- Enable GitOps: Use Weave Flux or Argo CD to manage Kubeflow deployments #971
Merged pull requests:
- Automated cherry pick of #1904: Fix capitalization in katib ID fields #1957 (richardsliu)
- Allow CloudShell origin pattern in Jupyter config #1956 (fdasilva59)
- update katib components #1955 (YujiOshima)
- Add pipeline to centraldashboard #1951 (yupbank)
- Redirect deployer webapp page to Kubeflow dashboard after its ready #1945 (abhi-g)
- Parse all command line options in one place and within function #1942 (ashahba)
- Switch spartakus volunteer image to v1.1.0 #1941 (abhi-g)
- Use iam-policy value for EMAIL if case-sensitive. #1937 (IMBurbank)
- Adding support for Pytorch v1beta1 operator #1930 (johnugeorge)
- Consolidate the ksonnet component update scripts #1926 (richardsliu)
- Update tf-operator component to v1beta1 #1921 (richardsliu)
- Change ksonnet package path for tf-job-operator #1920 (andreyvelich)
- fix typo #1919 (lluunn)
- fixes 'ksonnet package for Jupyter/JupyterHub' #1917 (kkasravi)
- Enable tf serving prometheus metrics #1911 (lluunn)
- make links open in new tab #1910 (kunmingg)
- Envoy config change for istio #1906 (lluunn)
- Fix capitalization in katib ID fields #1904 (texasmichelle)
- Fix skip init project; use spaces consistently for indentation #1902 (jlewi)
- Gkeversion #1901 (kunmingg)
- use backoff module to wrap flaky APIs; avoid using global locks when possible #1899 (kunmingg)
- Fix model rollout with Istio #1897 (lluunn)
- Dm conf #1893 (kunmingg)
- Provide an error message in case GCP ZONE is not set and exit, also fix styles for kfctl.sh #1888 (ashahba)
- Automated cherry pick of #1717 upstream v0.3 branch #1884 (jlewi)
- Add a script that automatically creates cherry-picks #1880 (richardsliu)
- fix iap script #1879 (kunmingg)
- upgrade app version to 0.3.2 #1875 (kunmingg)
- Fixing the conflict with the getting started guide #1873 (connected-bsamadi)
- create a self-serve component for data-scientists #1872 (kkasravi)
- Add base href for correct link in Argo UI #1865 (andreyvelich)
- prober test, kubeflow testing part #1845 (kunmingg)
- Refactor JH integration and UI #1839 (ioandr)
- tf-notebook-image
- cuda
v0.3.2 (2018-10-26)
Closed issues:
- Job not stopping once chief is complete - v0.2.0-rc.1 #1853
- For GKE deployment, change default CPU to Broadwell #1840
- minikube
- metacontroller's compositecontrollers and decoratorcontrollers need to be at Cluster scope #1833
- ERROR in ks apply default -c iap-ingress #1827
- ksonnet namespace should be retrieved from env not params #1825
- issues with 0.3.0 deployment scripts #1767
- Enable TPU in GKE #1766
- TFMA Dependency #1745
- jupyter_console needs update in the latest notebooks #1721
- gcp
- test coverage for click-to-deploy app #1578
- Required JWT token is missing #1495
- A small problem, Error validate kubeflow-core #1462
- Does CentralUi need ClusterScope? #1450
- Remove ksonnet parameter cloud #1227
- Test Flake wait_for_workflow terminated on HTTPSConnectionPool error; why don't we retry #1169
- Application Custom Resource for Kubeflow Deployments #1106
- Create & label P1 issues needed for an initial release of Horovod support #778
- Document Red/Green Model Rollout using ISTIO #667
- Central UI - need release process #527
- Model Management Features #136
Merged pull requests:
- fix iap script #1874 (kunmingg)
- fix iap set up and prometheus config #1847 #1869 (kunmingg)
- remove default version #1868 (kunmingg)
- Run minikube e2e test if core components change #1864 (richardsliu)
- Adds release docs for central UI #1863 (swiftdiaries)
- make zone a dropdown list which has GPU available #1862 (kunmingg)
- use broadwell #1852 (lluunn)
- Fix broken presubmit tests #1851 (richardsliu)
- use TF serving 1.11.1 #1850 (lluunn)
- make alert metrics count type & add service heartbeat #1849 (kunmingg)
- fix iap set up and prometheus config #1847 (kunmingg)
- add gauge metrics for dashboard & alerting; add metrics for invalid a… #1846 (kunmingg)
- Remove argo & katib from default minikube #1838 (texasmichelle)
- Add some latency metrics for GKE & KF deployments #1836 (abhi-g)
- part of fix for 'add jsonnet tests for all libsonnet files' #1835 (kkasravi)
- fixes 'metacontroller's compositecontrollers and decoratorcontrollers need to be at Cluster scope' #1834 (kkasravi)
- Create a simple flask server to redirect http to https #1832 (jlewi)
- fixes 'ksonnet namespace should be retrieved from env not params' #1829 (kkasravi)
- Roll out model with istio #1823 (lluunn)
- Don't add a GPU pool by default #1810 (jlewi)
- Add RapidsAI notebook image #1809 (pdmack)
- Add support for using host network for OpenMPI worker pods #1805 (hmizuma)
- Make the project configurable. #1792 (jlewi)
- Fix typos in markdown #1788 (r2d4)
- components: tf-notebook-image: pin to base image #1784 (r2d4)
- Enable TPU in GKE #1766 #1774 (dsdinter)
- Replace 'cloud' parameter with more direct names like 'params.tfDefaultImage' and 'params.platform' #1742 (ashahba)
- Application crd #1633 (kkasravi)
v0.3.1 (2018-10-19)
Fixed bugs:
- Deploy kfctl.sh apply k8s 'namespaces "kubeflow" not found' #1675
Closed issues:
- add the metacontroller component #1819
- Enabling TFMA jupyter extension on GPU images requires libcuda #1818
- Errors when starting kubeflow with minikube #1802
- add parts attribute to libsonnet for ksonnet incubator/best practices #1798
- Feature Request + Help Wanted: Attach PVs to TFJob components #1782
- Update katib suggestion images #1779
- presubmit failed when build gpu version #1778
- Build machine running out of space - Presubmit test jobs failing for PRs #1775
- Current master version fails when applying platform in GCP #1768
- kubeflow namespace not found using kfctl.sh #1758
- tensorboard generated under kubeflow/tensorboard/prototypes/tensorboard-aws.jsonnet is incomplete #1753
- Error Creating Object: Pod in version "v1" cannot be handled as a Pod #1748
- No deployment directory found in https://github.com/kubeflow/kubeflow/archive/v0.3.0.tar.gz #1743
- the variable of ENVIRONMENT #1741
- docs
- Installation errors with kfctl.sh #1733
- Please patch PR 1716 update GCP credentials filename into v0.2 #1719
- Tiny jupyter lab notebook toolbars #1704
- K8s dashboard showing "no healthy upstream"; remove K8s dashboard links and services #1699
- 1.0 Exit Criterion for TFJob and PyTorch #1683
- Deploy kfctl.sh apply k8s: Failed to pull image #1651
- Support scope for postsubmit #1587
- Kubeflow quick start misses namespace creation #1514
- Automated sync of Kubeflow between GCR and DockerHub #1320
- Deployment script didn't create namespace kubeflow #1274
- Rename TF-Hub to JupyterHub #1223
- TFServing test is failing, blocking submits #1126
- Presubmit failures; Timeout waiting for TFJob v1alpha2 job #974
- Create labels for releases #885
- Create image release workflow for tf operator images #855
- openmpi
- Upgrade ksonnet version to v0.10 for kubeflow. #727
- Liveness/Readiness checks for TF Serving #368
Merged pull requests:
- Tag Jupyter notebook images for v0.3.1 #1830 (richardsliu)
- reorganize tests to be under their respective dirs, add aws,gcp tests for tensorboard #1821 (kkasravi)
- fixes 'add the metacontroller component #1819' #1820 (kkasravi)
- add ssl cert reuse logic in e2e & prober tests #1817 (kunmingg)
- Fix TFMA and TFDV in Jupyter Images #1815 (jlewi)
- Remove root level makefile #1814 (r2d4)
- stop e2e test for deploy app till fix letsencrypt rate limit #1813 (kunmingg)
- Remove K8s dashboard link from central UI #1811 (jlewi)
- Knative build for in-cluster image builds in Kubeflow #1804 (swiftdiaries)
- skip readarray if --all provided to jsonnet fmt script #1800 (r2d4)
- fixes 'add parts attribute to libsonnet for ksonnet incubator/best practices' #1799 (kkasravi)
- Pin tf-serving version in tf-notebook #1797 (r2d4)
- Tf serving with istio 1.0 #1795 (lluunn)
- cherry-pick PR #1754 onto V0.3 branch #1794 (kkasravi)
- remove empty readme #1791 (r2d4)
- remove travis config #1790 (r2d4)
- Add a few basic metrics for deployment service #1787 (abhi-g)
- Cherry pick #1779 #1781 (leoncamel)
- Update katib suggestion images #1779 #1780 (leoncamel)
- Cherry pick #1727 #1777 (leoncamel)
- Presubmits should be triggered if the DM configs are modified. #1776 (jlewi)
- Cleanup the OWNERs files. #1773 (jlewi)
- Enable new beta Stackdriver Kubernetes Monitoring feature if using v1beta1 #1768 #1772 (dsdinter)
- add jupyterhub test #1771 (kkasravi)
- Use new stackdriver agents in DM config #1765 (lluunn)
- pin to minor GKE version in dm config #1763 (r2d4)
- gofmt: run go fmt github.com/kubeflow/kubeflow/bootstrap/cmd/... #1762 (r2d4)
- remove statsd from ambassador deployment #1760 (r2d4)
- Cherry pick #1746 and #1747 #1756 (jlewi)
- fixes tensorboard generated under kubeflow/tensorboard/prototypes/tensorboard-aws.jsonnet is incomplete #1754 (kkasravi)
- Seldon ksonnet refactor #1752 (cliveseldon)
- openmpi
- Ensure namespace exists #1747 (jlewi)
- Pin jupyter-console to 6.0.0 #1746 (jlewi)
- periodic e2e test for click deploy app #1732 (kunmingg)
- change clusterrole to role #1728 (swiftdiaries)
- Add katib's hyperband/bayesianoptimization suggestion images #1464 #1727 (leoncamel)
- gke/deploy.sh: test uuidgen exists before using it #1674 (rabierp)
- Allow folks to have the kubeflow github repo over https + improve error message #1626 (holdenk)
- Improvements to kubeform_spawner.py form UI and others #1551 (tlkh)
v0.2.7 (2018-10-10)
Fixed bugs:
- gcp
- gcp
Closed issues:
- How to submit multiple OpenMPI jobs? #1730
- Permission denied errors when pip (un)installing without --user in new nb images #1722
- GCP
- jsonnet test is failing but no jsonnet files changed in the PR. #1707
- Error while updating iap-ingress for custom domains #1689
- Jupyterlab service account token #1648
- Create 0.3. Release #1541
- upgrade to ksonnet 0.12.0, jsonnet v0.11.2 in the Docker image for our test workers #1540
- Chief worker cannot start #1440
- Web app deploying kubeflow through deployment manager #884
- Proposal: kubeflow-scheduler #68
Merged pull requests:
- Change the default branch in download.sh to v0.3-branch as opposed to master #1734 (jlewi)
- Cherrypick 0.2
- Create an initial CHANGELOG #1723 (jlewi)
- Cherry-pick: Update gcp credentials filename #1716 #1720 (lluunn)
- Fix the TFJobs Dashboard UI #1717 (jlewi)
- Update gcp credentials filename #1716 (texasmichelle)
- TF serving template cleanup #1714 (lluunn)
- Add a notice explaining to users what the app is doing. #1713 (jlewi)
- config & dockerfile update #1705 (kunmingg)
- Update link in README #1640 (kunmingg)
- Merge config for kfctl and click-to-deploy webapp #1594 (lluunn)
- HubSync v1.0 #1548 (TheJaySmith)
v0.3.0 (2018-10-04)
Fixed bugs:
- v0.3.0-rc.1: ERROR no prototype names matched 'pytorch-operator' #1663
Closed issues:
- Accessing custom metrics in our Python model #1681
- v0.3.0-rc.1
- Run tests periodically on the release branch starting with 0.3 #1603
- Kubebench-job default param mainJobConfig points to a wrong path #1596
- Release process for kubebench #1510
Merged pull requests:
- Cherry-pick
- Cherrypick #1700 - TF serving liveness probe #1708 (lluunn)
- Add TOS and Privacy links and some other UI improvements. #1703 (jlewi)
- Fix jsonnet formatting errors in katib #1702 (richardsliu)
- Add tf serving liveness probe #1700 (lluunn)
- Update kubebench-job prototype default parameters #1693 (xyhuang)
- Add PVC to Katib #1687 (inc0)
- fix katib metrics collector bug #1680 (YujiOshima)
- Scope postsubmit jobs by modified directories #1658 (richardsliu)
- Fix cert-manager and iap config to make dashboard accessible #1544 (sambaiz)
v0.3.0-rc.3 (2018-10-02)
Closed issues:
- v0.3.1-rc.1
- v0.3.0-rc.1
- GCP
- Update Katib image in 0.3 branch #1604
- central dashboard image build workflow doesn't work #1575
- Image Auto Release
- Image Auto Release Cron Job is failing #1563
- ack_guide.md out of date #1293
- GKE: can not read from google cloud storage in Jupyter notebook #1249
- Friction log for bootstrapper documentation #927
- Create a minimal release process for our ksonnet configs #215
Merged pull requests:
- cherry pick: upgrade argo version to v2.2.0 #1692 (kunmingg)
- upgrade argo version to v2.2.0 #1690 (kunmingg)
- Tag images for TF notebooks 1.9.0 and 1.10.1 #1688 (richardsliu)
- Build TF notebook images for TF 1.9.0 and 1.10.1 #1686 (richardsliu)
- Cherry-pick #1676 fix centralUI #1685 (swiftdiaries)
- 0.3
- Update image tag for centralUI #1682 (swiftdiaries)
- cherrypick - Remove trailing slash from KUBEFLOW_REPO #1664 #1677 (jlewi)
- Fix for "/" not directing to centralUI #1670. #1676 (swiftdiaries)
v0.3.0-rc.2 (2018-09-30)
Fixed bugs:
- Deploy kfctl.sh apply k8s : Service "ambassador" is invalid ? #1566
Merged pull requests:
- remove prune from kubeflow core which delete required fields #1580 #1667 (jlewi)
- Add swiftdiaries as reviewer #1665 (swiftdiaries)
- Remove trailing slash from KUBEFLOW_REPO #1664 (jlewi)
- Change kubebench 0.3 image #1661 (xyhuang)
- Ankush Signing Out #1652 (ankushagarwal)
v0.3.0-rc.1 (2018-09-28)
Closed issues:
- GCP
- Update CentralUI image used at head and 0.3 branch #1435
- kfctl.sh needs to get initial cluster version based on get-server-config #1359
Merged pull requests:
- Cherrypick #1657 tag image and change libsonnet for centraldashboard image #1660 (swiftdiaries)
- Tag and update centraldashboard image #1657 (swiftdiaries)
- Cherry-pick #1589 to v0.3 #1656 (lluunn)
- Cherrypick #1650; automatically set master version to supported version. #1654 (jlewi)
- Update initial cluster version in cluster.jinja based on what gcloud get-server-config returns #1650 (ashahba)
- Enable periodic prow tests #1649 (richardsliu)
- Build image for centralui part of presubmit #1623 (swiftdiaries)
v0.2.6 (2018-09-28)
Fixed bugs:
- Ambassador Version 0.34.0 causing DNS Issues on Worker Node #945
Closed issues:
- Update document on Tf Serving #1634
- test flake
- tensorboard prototypes should include the optionalParams available in that prototype #1610
- Katib ksonnet component needs an E2E test #1607
- Update Kubebench images on 0.3. branch #1602
- Update Jupyter images on 0.3. branch #1601
- Update PyTorch Job image on 0.3 branch #1600
- Update TFJob on 0.3 Branch #1599
- Update Seldon to 0.2.3 on 0.3 release branch #1598
- Envoy unable to read config #1588
- deploying kubeflow with bootstrapper failed #1586
- How can we modify jupyterhub configuration to use GitHub Authentication? #1585
- Jupyter Image Builds Are failing; Dependency issue related to TFMA? #1576
- bump gke version to 1.10.7-gke.2 #1572
- presubmit build is failing with a quota error #1562
- refactor tf-job-operator to match style-guide of libsonnet #1534
- Test to verify we can deploy Katib #1483
- test flake
- Make TF serving component more readable and extendable #1264
- E2e test of TF Serving using built-in HTTP api #1258
- Discussion
- bootstrapper should support push the ksonnet app to a source repo #912
- Investigate using tensorflow serving's built-in http server #896
Merged pull requests:
- TF Serving changes: enable http server, test cleanup etc #1647 (lluunn)
- Cherry-pick
- GKE 1.9.7-gke.5 is no longer available for master; bump to 1.9.7-gke.6 #1644 (jlewi)
- Add a simple e2e-test for katib #1638 (ankushagarwal)
- Tag release 0.3.0 for Jupyter notebook images #1637 (richardsliu)
- Cherrypick changes from 0.3-branch to master #1636 (ankushagarwal)
- Update katib component to include studyjobcontroller #1632 (ankushagarwal)
- Install tensorflow-data-validation in jupyter notebook #1629 (ankushagarwal)
- Use JupyterNotebook as default instead of JupyterLab #1628 (ankushagarwal)
- Tag and update 0.3.0 release for Kubebench controller #1627 (richardsliu)
- Update katib image URIs #1625 (ankushagarwal)
- Add postsubmits to push centralui and Jupyter notebook images to kubeflow-images-public #1624 (richardsliu)
- Add richardsliu to approvers #1622 (richardsliu)
- make builder image version consistent with glide.lock hash #1621 (kunmingg)
- Tag and update pytorch operator #1619 (richardsliu)
- Update pytorch image on 0.3 release branch #1614 (johnugeorge)
- Update Seldon to 0.2.3 version on 0.3 branch #1592 #1613 (cliveseldon)
- fixes 'tensorboard prototypes should include the optionalParams available in that prototype' #1611 (kkasravi)
- Update title & favicon for the deploy app page. #1609 (abhi-g)
- Update the TFJob image to the latest image and tag 0.3 #1608 (jlewi)
- Update Katib images on 0.3 branch #1605 (jlewi)
- Update release process for Katib. #1597 (jlewi)
- Force update IAM policy; print app url on web UI. #1595 (kunmingg)
- Update instructions about releasing images using prow. #1593 (jlewi)
- Update Seldon to 0.2.3 version #1592 (cliveseldon)
- New TF Serving template #1589 (lluunn)
- fix iam patch logic and update image in config #1582 (kunmingg)
- App frontend stop polling the status after it's done #1581 (lluunn)
- remove prune from kubeflow core which delete required fields #1580 (kunmingg)
- Don't install TFMA for TF versions < 1.9 #1579 (richardsliu)
- bump gke version to 1.10.7-gke.2 #1573 (kkasravi)
- Typo in nvidia inference server README #1569 (cliveseldon)
- Run Jupyter image and centraldashboard image release on postsubmit #1565 (jlewi)
- Adding --skipInitProject to kfctl_test.jsonnet until CI quota is increased #1564 (ashahba)
- Tag and update v0.3.0 release for chainer-operator #1552 (everpeace)
- fixes 'refactor tf-job-operator to match style-guide of libsonnet' #1535 (kkasravi)
- Rename TF-Hub to JupyterHub #1410 (pvsousalima)
4e7f4ed (2018-09-19)
Closed issues:
- kfctl.sh needs to call get-server-config to get GKE version #1570
- Certificate not working #1567
- "Patching IAM bindings" halt during deployment. #1559
- cert-manager missing clusterrole #1554
- Jupyter Notebooks for TF 1.9 and 1.10 #1546
- test_jsonnet is failing in postsubmit #1543
- Directory ${KFAPP} already exists #1530
- Move kubebench package to kubeflow repo #1513
- cloud endpoint prototype breaks on master. #1507
- Error: Failed to apply app: find objects: RUNTIME ERROR: Field does not exist: v1 #1506
- PR shows review is not required to merge #1503
- update tensorboard to use the same pattern as kubeflow/core/prototypes #1500
- No prototype names matched 'kubeflow-core' #1492
- cannot list namespace on tfjob dashboard #1491
- Issue installing on GKE with deploy script #1489
- Installation fails on Amazon EKS #1488
- add ability to only generate parts of a component in the jsonnet file #1486
- Prometheus for seldon models #1484
- Multiple issues with gke/deploy.sh #1481
- Katib StudyJob failed to mount directory #1480
- New image release for pytorch operator #1479
- simplify tensorboard as separate aws, gcp prototypes #1477
- Cut release 0.2.5 #1476
- do you have a performance benchmarks when run Horovod with your openmpi component? #1461
- Review/extend jovyan permissions in TF notebooks #1438
- gcp
- kfctl.sh unable to find component ambassador #1429
- kfctl
- standardize remaining <component>.{jsonnet,libsonnet} files #1414
- Update 0.2 blog with new deployment script #1390
- Update E2E test to use kfctl.sh and delete gke/deploy.sh; #1331
- Docker image building workflows are failing #1135
- bootstrap
- camelCase for some recently fixed params #1050
- Add document on Stackdriver agents #997
- Suggest using simple port forwarding instead of LoadBalancer for cloud deploy in User Guide #860
- Need docs for TFJobs UI #573
- Update docs to mention known issues with ksonnet and windows #501
- Need: User facing website for Kubeflow that details how to choose a stack #213
- Tutorial(s) that correspond to CUJs #85
Merged pull requests:
- Bump GKE version because 1.10.7-gke.1 is no longer valid master version. #1571 (jlewi)
- Fix cert-manager #1568 (lluunn)
- Bug fix for #1559 #1561 (lluunn)
- increase pageSize for service list to avoid truncate #1558 (kunmingg)
- Fix for issue 1050 - camelCase for some recently fixed params. #1556 (ashahba)
- fix cert-manager: add clusterrole back #1555 (kunmingg)
- enable sourcerepo.googleapis.com api if needed #1553 (kunmingg)
- Webapp: Don't use DM for IAM. #1550 (lluunn)
- add chainer-operator to releaser #1549 (everpeace)
- Add version config for TF 1.9,1.10 #1547 (pdmack)
- The minikube test should not be running the jsonnet unittests. #1545 (jlewi)
- Restore missing tf-hub-lb service #1539 (pdmack)
- Prevent ambassador getaddrinfo error logs #1537 (sambaiz)
- edit server address for app config on each request #1536 (kunmingg)
- Update the client ID for webapp #1533 (lluunn)
- Fix kfctl.sh remove the ksonnet environment for the deleted cluster #1532 (sambaiz)
- Add Kubebench package #1531 (xyhuang)
- add chainer-job package to registry.yaml #1529 (everpeace)
- Support specifying registry version in create request #1528 (kunmingg)
- Adds centralui to releaser. #1527 (swiftdiaries)
- Replace logFatal with logError #1526 (lluunn)
- NVIDIA TensorRT Inference Server #1523 (deadeyegoodwin)
- update manifest for webapp #1521 (lluunn)
- catch save-config error #1519 (kunmingg)
- update k8s version for k.libsonnet when do ks init; remove spartakus #1517 (kunmingg)
- edit gke version to 1.10.6 #1516 (kunmingg)
- Integrate with cloud source repos to save ks app config. #1515 (kunmingg)
- Remove unused Azure specific config #1512 (wbuchwalter)
- Adding v1alpha2 as the default PyTorch operator version #1511 (johnugeorge)
- update ks version to 0.12 for webapp backend #1508 (lluunn)
- Add myself as one of the approvers. #1505 (abhi-g)
- Update webapp backend to respond and then finish deployment in the background #1504 (lluunn)
- separate dep and src for bootstrapper #1502 (lluunn)
- fixes update tensorboard to use the same pattern as kubeflow/core/prototypes #1501 (kkasravi)
- Change webapp url #1499 (lluunn)
- fixes kfctl.sh unable to find component ambassador #1498 (kkasravi)
- Update jupyterhub-kubespawner version #1497 (pdmack)
- update gke to 1.10.7-gke.1 #1494 (kkasravi)
- click to deploy: remove default project name #1490 (lluunn)
- fixes add ability to only generate parts of a component in the jsonnet file #1487 (kkasravi)
- Simplify tensorboard #1485 (kkasravi)
- Delete deploy.sh scripts; we use kfctl.sh now. #1482 (jlewi)
- Ensure jovyan has site-packages perms #1470 (pdmack)
- Adding namespace scope to pytorch operator #1465 (johnugeorge)
- standardize remaining <component>.{jsonnet,libsonnet} files #1437 (kkasravi)
- Fix kfctl.sh gcpInitProject is always skipped #1425 (sambaiz)
v0.2.5 (2018-09-04)
Fixed bugs:
- JupyterHub Version Mismatch #1393
Closed issues:
- Error setting up kubeflow in minikube #1459
- Current documentation for setting up Kubeflow in minikube not working #1455
- presubmit failure for jsonnet test: name must be set #1453
- What is image registry.opensource.zalan.do/teapot/external-dns ? #1446
- What's the function of tfReplicaType: Master ? #1442
- Bootstrapper fails in docker-for-desktop #1430
- tf_job_simple_test results not being reported #1426
- deploy.sh should be restart-aware in terms of directory structure #1422
- kfctl.sh should not assume uuidgen is present #1415
- How to spawn the jupyter container as a root user #1412
- Don't use DM for IAM policy management #1401
- Trigger minikube E2E test on presubmit when minikube test is modified #1350
- GKE version "foo" is unsupported. #1348
- CentralDashboard returns 404; Ambassador can't parse the route #1306
- When creating releases we should pin the version of source.tar.gz used in the deploy.sh #1239
- provide the ability to add imagePullSecrets to different ServiceAccounts so that private images can be fetched #1231
- Restrict privilege of Kubeflow services accounts such as tf-job-operator to namespace level #1213
- Minikube deploy script should start minikube #1153
-
gcp
- Make jupyterlab discoverable/default #1124
- Enable test for tf-job-simple prototype for v1alpha2 #1048
- Initial report for spartakus metrics #351
- Batch Prediction using GPUs with local runner #251
Merged pull requests:
- Fix bug with updating an existing deployment. #1475 (jlewi)
- Update Kaggle Dockerfile to add their API package #1469 (pdmack)
- Add storage read scope to VM scopes; v0.2-branch (#1466) #1468 (jlewi)
- Add GCS read only scope to GKE VM scopes #1466 (gindeleo)
- implement gcpUtils which takes care of integrating deployment manager API (part of PR 1458) #1460 (kunmingg)
- make web app stateless & backend security update #1458 (kunmingg)
- Fix presubmit unit test #1456 (lluunn)
- Click to deploy ui: remove ipName and make hostName optional #1449 (lluunn)
- Create webapp manifest dir #1447 (lluunn)
- Fix readme for local run #1445 (lluunn)
- Ensure MOUNT_LOCAL env var is forwarded to kfctl #1443 (abhi-g)
- Enable v1alpha2 for Pytorch operator #1441 (johnugeorge)
- Fix KUBEFLOW_REPO dir pointer. #1439 (abhi-g)
- Weaveflux version 0.1.2. #1434 (TheJaySmith)
- Add imagePullSecret to Seldon Prototypes #1431 (cliveseldon)
- Add logging statements to help figure out why test results aren't reported #1427 (jlewi)
- Add texasmichelle to OWNERS #1424 (texasmichelle)
- Use JupyterLab as default instead of Jupyter Tree interface #1423 (ankushagarwal)
- Use bash ${RANDOM} instead of uuidgen #1418 (ankushagarwal)
- Fix minikube test and trigger on presubmit when modified. #1411 (jlewi)
- Python script to declaratively manage IAM binding patches #1408 (ankushagarwal)
- Remove deploy_gcp.sh and update workflows.libsonnet. #1406 (jlewi)
- Update various scripts to use lowercase clientid/secret #1405 (ankushagarwal)
- Fix use_gcr_for_all_images.sh #1404 (ankushagarwal)
- Update tf-job-operator prototype to support namespace-scoped deployment #1403 (ankushagarwal)
- Fix annotations in ambassador and centraldashboard #1399 (richardsliu)
- Add namespaces to the tf-job-dashboard role #1397 (ankushagarwal)
- Updates to iap component to support private clusters #1396 (ankushagarwal)
- util.libsonnet has methods that allow a <component>.part to be modified #1395 (kkasravi)
- Add mxnet operator (https://github.com/kubeflow/community/issues/136) #1392 (suleisl2000)
- support customized image in ACK and update the document #1388 (cheyang)
- Added a minikube setup script. #1387 (abhi-g)
-
kfctl web app
- Fix iap-ingress certificate creation and use BackendConfig for IAP #1327 (danisla)
v0.2.4-rc.0 (2018-08-21)
v0.2.4 (2018-08-21)
Closed issues:
- The testing/install_minikube.sh script assumes the host OS is Ubuntu. #1383
- Add Argo UI to Ambassador and Central UI #1310
Merged pull requests:
- Cherry-pick: Update TFJob operator to the latest image #1391 (richardsliu)
- Fix typo #1385 (fisache)
- Fixes the install_minikube.sh script for non-Ubuntu OSes #1384 (Ark-kun)
- fix a typo and remove a too restrictive limit for gpu specification #1382 (yixinshi)
- Fix groups for various resources #1380 (activatedgeek)
- Adds argo to ambassador and Central UI #1376 (swiftdiaries)
v0.2.3-rc.0 (2018-08-17)
v0.2.3 (2018-08-17)
Features and improvements:
- Extra packages in jupyterhub-image #1175
- Use GKE auto-scaling when configuring node pools in the provided Deployment Manager configs #1033
Fixed bugs:
Closed issues:
- Jupyter-role error applying kubeflow-core component with ksonnet #1353
- Jupyter notebook Connection failed because Ambassador doesn't enable websockets #1344
- Need to update glide's ksonnet version to ^0.11.0 #1340
- PS still running after tfjob is complete. #1334
- Central UI should include a link to Kubeflow docs website #1318
- Istio integration doc: Point to kubeflow/website documentation for #1315
- Build and debug improvements for bootstrapper #1312
- Fix incorrect links to user_guide in kubeflow.org #1300
- JupyterHub login unauthorized (401) #1296
- Katib apply fail with error: Field does not exist: modeldbDatabaseImage #1291
- ambassador crashing on node with wrong DNS resolver address due to misconfigured kubelet #1289
- Move Katib documentation to kubeflow website #1286
- How can we change the tensorflow image in kubeflow? #1285
-
gcp
- TFJob operator v1alpha2 doesn't work with TF.Estimator API for TF <=1.6 #1283
-
GCP
-
GCP
-
GCP
-
gcp-deployer
- Deploy argo by default; add it to deploy.sh scripts #1268
-
Test Flake
- ERROR no prototype names matched 'kubeflow/core' #1263
- TF Serving GPU test failing #1262
- Keras training in Kubeflow on GKE gets "Killed" #1261
- Better installation guide on kubeflow.org #1257
- Create a batch predict example #1250
- GPU support on GKE not available #1246
- TF-job package missing #1245
- Spawning Jupyter failed; user jovyan does not have permission to write to default storage class #1241
- "Getting Involved" in README.md should point to kubeflow.org #1237
- scripts/gke/deploy.sh fails when kubeflow_deployment_manager_configs/ exists #1233
-
openmpi
-
GCP
- Can't start jupyter-notebook pod in kubeflow version 0.2.1 #1221
-
Test Failure
- Error from server (NotFound): tfjobs.kubeflow.org "mycnnjob" not found #1217
- Getting started error: No such file or directory: 'cluster-kubeflow.yaml' #1206
- Getting started error #1205
- README.md QuickStart should refer to kubeflow.org Getting Start #1202
- "getting-started-gke" installer fails #1201
-
gcp
- deploy.sh is broken; wrong directory for the unpack? #1193
- unable to spawn jupyter notebook - volume name is too long #1177
- Delete old GCP configs #1171
- Test Flake gke teardown failed; insufficient quota #1166
- Create Prometheus Component in Kubeflow Core. #1160
- Jupyter image suitable for running the examples/codelabs #1157
- Create links on kubeflow.org to redirect to ksonnet tarballs and deploy scripts #1156
- Make it easy to customize PVs attached to Jupyter pods #1125
- Split all prototype into separate prototypes #1107
- Support for AVX2 when using deployment manager #1082
-
gcp-click-to-deploy
- Bootstrapper release instructions need to explain how to build at appropriate commit #1053
- Don't check in vendor for bootstrap; it adds 160M which slows down cloning the registry as part of Kubeflow deployment #1051
- unflake TF serving testing #1031
- Support and testing different versions of TF serving images #1005
- Serving path should support logging request input/output #1000
- 'ks delete ${KF_ENV} -c kubeflow-core' doesn't take down user notebook pods #968
- Bootstrapper should support file and http registries in a consistent manner #962
- Click-to-deploy UI upgrade. #959
- Remove "alpha" in deployment manager config when gke-1.10.2 is public #821
- TF Serving test flaky #815
- TFJobs UI doesn't work behind IAP; React APP needs support IAP? #574
- Can not launch TensorFlow Serving because AVX not available on VM #421
- Trigger rebuild of TF serving image in E2E test only when files change #371
- HTTP Proxy and TFServing should not use the same resource defaults #360
- Add script to export inception into SaveModel from checkpoints #229
- ks apply -f tf-job.jsonnet, -f is not a valid flag #201
- e2e test for http-proxy #198
- Make https://hub.docker.com/r/kubeflow/jupyterhub/ a community resource #197
- How to port model developed in Jupyter notebook to TFJobs #110
Merged pull requests:
- Change the default branch used for the configs to v0.2-branch and not master #1378 (jlewi)
- add apply handler; fix k8s auth; reset glide hash #1375 (kunmingg)
- Add Jobs permission to jupyterhub-notebook role for common use cases #1374 (activatedgeek)
- use tf serving image #1370 (lluunn)
- Disable spartakus usage reporting in our ci clusters. #1364 (jlewi)
- Use gcr.io images for argo, ambassador and cert-manager #1362 (ankushagarwal)
- add optionsHandler for OPTIONS request from browser #1361 (kunmingg)
- fix 'kfctl.sh delete platform' doesn't delete platform correctly. #1360 (everpeace)
- Fix a bug in kfctl.sh when just creating a K8s app and not GCP. #1357 (jlewi)
- ks version update to 0.11.0, and bug fixes. #1356 (kunmingg)
- upgrade gke version for v0.2 #1355 (lluunn)
- Add README steps for releasing website #1352 (richardsliu)
- Create ks prototype for tf-batch-predict #1351 (yixinshi)
- fixes issue #1344 #1345 (fyuan1316)
- Add more tests to the subgraph we created to run the tests. #1342 (jlewi)
- Update TFJob operator to the latest image. #1341 (jlewi)
- add bindRoleHandler which bind roles to service accounts #1335 (kunmingg)
- Add TFJob test to the Kfctl test; refactor workflows to start to use #1333 (jlewi)
-
gcp-deployer
- Update ambassador to 0.37.0 #1324 (ankushagarwal)
- adds link to docs in centralui #1322 (swiftdiaries)
- points to the documentation in kubeflow.org #1319 (amsaha)
- Build and debug improvements for bootstrapper #1313 (kkasravi)
- Turn deploy.sh into kfctl.sh #1308 (jlewi)
- revert to k8s 1.9 #1307 (lluunn)
- wording #1305 (pymia)
- Add reasonable default to tf-serving s3 creds #1304 (inc0)
- Fixed broken links (to user_guide) #1301 (amsaha)
- Pointing katib doc to kubeflow/website #1292 (amsaha)
- Make it easier to rerun deploy.sh #1290 (jlewi)
- Update README.md #1287 (pdmack)
- Enable GKE node-pool autoscaling options #1273 (richardsliu)
- Deploy argo in deploy.sh #1272 (lluunn)
- Improve tf serving e2e test #1271 (lluunn)
- Small fix to tf serving component #1270 (lluunn)
- add components dir to tf serving test #1269 (lluunn)
- Fix tfjob test; the simple tfjob test prototype was renamed. #1267 (jlewi)
- Update TFJob image to include the fixes to make the UI work with IAP. #1265 (jlewi)
-
gcp-deployer
- Scripts' code style adjustment #1256 (zacharyzhao)
-
gcp-deployer
- Enable new SD agents by default #1252 (lluunn)
- Deploy click to deploy app #1248 (jlewi)
- Modified README.md to point to Kubeflow.org community page. Issue #1237 #1247 (amsaha)
- Scope kubeflow prow jobs by job type and modified dirs #1244 (richardsliu)
- Fix deploy.sh when re-trying on existing deployment name #1243 (lluunn)
- Adds check to see if Katib is deployed and updates CentralUI accordingly. #1242 (swiftdiaries)
- Update indentation issue in jupyterhub spawner ui #1238 (jainprachi2506)
- Fix link on front page. #1236 (jlewi)
-
gcp-deployer
- Update reviewers so that automatic reviewer assignment works better. #1234 (jlewi)
- ksonnet package for WeaveFlux #1232 (TheJaySmith)
- TF serving support request logging #1229 (lluunn)
- Update the README.md; remove instructions and direct folks to kubeflow.org #1226 (jlewi)
- Improve jupyterhub form interface #1212 (ankushagarwal)
- Speed up gke script by running DM and GCFS creation in parallel #1211 (ankushagarwal)
- fix tf-job-dashboard fails to show logs #1210 (cheyang)
- Seldon 0.2 update #1209 (cliveseldon)
-
gcp-deployer
- Split kubeflow-core into individual prototypes #1207 (ankushagarwal)
- prometheus prototype #1204 (kunmingg)
- change {username}{servername} to {userid} to avoid volume length restrictions of 63 characters #1200 (kkasravi)
- Add a script to cleanup leaked kubeflow-ci ingress resources #1199 (ankushagarwal)
- Extra packages in tensorflow-notebook-image #1198 (wmuizelaar)
- Cherrypick #1194 to master - Fix deploy.sh moving the unpacked repo #1196 (ankushagarwal)
- Delete docs/gke/configs and move tests to scripts/gke #1195 (ankushagarwal)
- Support private GKE clusters in gke deploy.sh script #1192 (ankushagarwal)
- Add doc for TF Serving #1165 (lluunn)
- Create a doc to describe the deployment process. #1159 (jlewi)
- Turn bootstrapper into an RPC server for ksonnet. #1151 (jlewi)
- Add Kaggle notebook Dockerfile #1109 (pdmack)
- Set min-cpu-platform to haswell to support avx2 in deployment manager #1083 (lluunn)
v0.2.2-rc.0 (2018-07-13)
v0.2.2 (2018-07-13)
Closed issues:
- TFMA plots don't render; GET tfma_widget_js.js returns 404 #1130
Merged pull requests:
- Fix deploy.sh moving the unpacked repo. #1194 (jlewi)
- Use PV for /home/jovyan by default #1191 (jlewi)
- GCFS support #1173 (ankushagarwal)
v0.2.1-rc.1 (2018-07-12)
v0.2.1 (2018-07-12)
Closed issues:
- Use PV by default mounted at /home/jovyan #1187
- Central UI image needs to be updated in 0.2.1 release; it is too old. #1147
- metrics_collector should emit K8s events to indicate when Kubeflow is ready #1142
- Jupyter images in 0.2.1 need to be upgraded #1129
Merged pull requests:
- Use PV for /home/jovyan by default #1188 (jlewi)
- Put the full PyTorch prototype in the jsonnet file. (#1119) #1186 (jlewi)
- Grant roles/viewer to kubeflow user service account #1185 (ankushagarwal)
- Fix socket.Error typo #1183 (ankushagarwal)
- Update Central UI image to the v0.2.1 tag. #1181 (jlewi)
- Make metric collector emit k8s event #1161 (kunmingg)
v0.2.1-rc.0 (2018-07-11)
Closed issues:
- Test Flake deploy.sh fails trying to enable the deployment service #1158
- Make downloading our ksonnet registry for getting started efficient #1154
- kubeflow cluster cannot pull image from GCR within same Project. #1139
-
Test Flake
- Ambassador pod failed to run because kube-dns not running #1134
-
Test Flake
- PyTorch job prototype should contain the full job spec #1114
- Finalize release 0.2.0 #1070
-
gcp
-
gcp click-to-deploy
- Cherry-pick for release v0.2.0-rc.1 #1024
- Cut a 0.2 release branch #964
-
gcp
- Include GPU daemonset in GKE configs? #288
- KubeFlow or Kubeflow? #44
- Tooling to manage configuration and deployment #23
Merged pull requests:
- Cherry pick upgrading the central UI to v0.2.1 #1182 (jlewi)
- Make the deploy scripts more efficient and other fixes. (#1174) #1180 (jlewi)
- Cherry Pick: Don't check in bootstrap/vendor. (#1152) #1176 (jlewi)
- Make the deploy scripts more efficient and other fixes. #1174 (jlewi)
- Make GKE VM service account storage.objectViewer to have read access of gcr #1164 #1172 (jlewi)
- Cherry-pick: Tag the latest Jupyter images with v0.2.1 (#1144) #1170 (jlewi)
- Delete kubeflow namespace before deleting the cluster #1167 (ankushagarwal)
- Make GKE VM service account storage.objectViewer #1164 (kunmingg)
- Give KF_USER_NAME service account roles/cloudbuild.builds.editor role #1163 (ankushagarwal)
- Skip project setup during deployment. #1162 (jlewi)
- Don't check in bootstrap/vendor. #1152 (jlewi)
- Make Katib work with Ambassador. (#1103) #1150 (jlewi)
- Fix the makefile so that we tag the image with the commit. #1149 (jlewi)
- Tag the latest Jupyter images with v0.2.1 #1144 (jlewi)
-
gcp-deployer
- Set requests and limits for RAM and CPU in TF notebook image releaser. #1136 (jlewi)
- Make the test robust to test flakes due to problems initializing the ksonnet app #1133 (jlewi)
- Fix TFMA jupyter extensions. #1131 (jlewi)
- Fix typos #1127 (idealhack)
- add service monitor prototype for monitoring deployment status #1123 (kunmingg)
- Add Dockerfile and Makefile to build docker images #1122 (ankushagarwal)
- Pin and fix katib images. (#1113) #1120 (jlewi)
- Put the full PyTorch prototype in the jsonnet file. #1119 (jlewi)
- Pin and fix katib images. #1113 (jlewi)
- Create a deployment script for gke and minikube #1111 (ankushagarwal)
- Add a jupyter-notebook-role and use it for notebooks in jupyterhub #1110 (ankushagarwal)
-
gcp-deployer
- add katib releaser #1102 (YujiOshima)
- Create a script to update a ksonnet app to the latest Kubeflow package #1100 (jlewi)
- Tfjob create fails in tfjob UI #1099 (kkasravi)
v0.2.0 (2018-06-29)
Fixed bugs:
-
gcp click-to-deploy
-
gcp click-to-deploy
Closed issues:
- Wrong comment in setting default CleanPodPolicy #1081
- Deprecate tfserving http-proxy? #1080
- user guide disappear? #1078
- user_guide link is dead #1075
- TFJob prototype's default TFVersion should be v1alpha2 #1049
- TF-Serving 1.8 Images #845
Merged pull requests:
-
gcp-deployer
- cherry pick 3 commits #1104 (kunmingg)
- Make Katib work with Ambassador. #1103 (jlewi)
- add seldon to ks config; pre install all pkg #1098 (kunmingg)
- Create a version of echo-server to echo headers. #1097 (jlewi)
- add chainer-job/chainer-operator ksonnet package #1095 (everpeace)
- Install gpu driver in deployment manager #1094 (lluunn)
- Improvements to deploy.sh #1093 (jlewi)
- Delete the tf-job package. #1091 (jlewi)
- cherry-pick: update tf job default version (#1086) #1087 (kunmingg)
- update tf job default version #1086 (kunmingg)
- Add katib tag to images #1085 (inc0)
- fix for file_cache is unavailable when using oauth2client >= 4.0.0 #1084 (kkasravi)
- add a readme file to the mpi-job ksonnet component #1079 (rongou)
- fix #1075, user_guide link is dead #1077 (theofpa)
- TFJobs UI doesn't work behind IAP #1073 (kkasravi)
- point bootstrapper to v0.2.0 in release branch #1069 (kunmingg)
- Tooling to make it easier to tag images and update the ksonnet prototypes #1066 (jlewi)
- Create a script to update some of the docker images in the prototypes #1063 (jlewi)
-
gcp-deployer
v0.2.0-rc.1 (2018-06-22)
Closed issues:
- Cross-Origin Resource Sharing with TF Jobs Dashboard #1046
- nvidia-smi fails for TFJob's but not for similarly configured Job's #1042
- Jupyter can't start pod; the default spawner image is way too old #1041
- TFJob pods deleted on completion/failure impairing debugging #1039
- Invalid value: "v1alpha2": field is immutable #1029
- Update PyTorch Image to officially released one #1020
- Bootstrapper in 0.2.0-RC 0 doesn't set tfJobsVersion to v1alpha2 by default #1018
- TensorFlow 1.8 not included in stock Jupyter Images #1014
- Investigate supporting of TF serving 1.7 gpu #1009
- Parameter names for Ambassador images improperly named; have extra tf prefix #994
- Bootstrapper: Friction log from minikube on macOS #981
- Enable TFJob v1alpha2 by default in 0.2 release #977
- Central UI sometimes doesn't render if screen too small #957
-
gcp
-
central UI
- Verify Central UI is working #805
- Batch Prediction Beam Library #662
Merged pull requests:
- cherry pick: update tf_operator image to point to tag v0.2.0 (#1062) #1065 (kunmingg)
- cherry pick fixes to release branch #1064 (kunmingg)
- update tf_operator image to point to tag v0.2.0 #1062 (kunmingg)
- Some scripts to support retagging our images as part of our releases. #1061 (jlewi)
- Fix typo in NB image sha #1058 (pdmack)
- Update default notebook image for spawner #1052 (pdmack)
- A bunch of fixes for deploying with private ks registry #1047 (IronPan)
- test-dir-delete should depend upon teardown to complete #1045 (ankushagarwal)
- add mpi operator ksonnet package #1043 (rongou)
- Changing defaults in Pytorch-job #1040 (johnugeorge)
- Delete user_guide.md; it is now part of the website #1038 (johnugeorge)
-
gcp-deployer
- Update examples for 0.2. #1035 (jlewi)
- update pytorch image #1034 (kunmingg)
- release workflow bug fixes and doc update #1032 (kunmingg)
- Update TF serving 1.7 gpu image to use cuda 9.0 #1030 (lluunn)
- fix format error for nightly release config #1026 (kunmingg)
- Merge master commits to release 0.2 branch #1025 (kunmingg)
- Fix param name of ambassador #1023 (lluunn)
- update bootstrapper image to use tfJobsVersion v1alpha2 as default #1021 (kunmingg)
- Pin the central UI image. #1019 (jlewi)
- Add a link to JupyterHub in the CentralUI #1016 (jlewi)
- Include the TF 1.8 image in kubeform_spawner. #1015 (jlewi)
- Fix hard coded "name" variable in example prototypes #1013 (xyhuang)
- Set default version of TFJob operator to be v1alpha2 #1012 (kunmingg)
- Use docs/gke/configs to run e2e tests #1002 (ankushagarwal)
- Fix user guide for the refactored tf-cnn example #995 (xyhuang)
- Fix permissions for jupyter-role #978 (wmuizelaar)
v0.2.0-rc.0 (2018-06-16)
Fixed bugs:
- JupyterHub authenticates login but doesn't redirect to user home #430
Closed issues:
- Bootstrapper should apply components in an order #1006
- Make Katib work with reverse proxy and ambassador #991
- ambassador memory leak statsd:0.30.1 #986
- presubmit failing: couldn't open import "X": no match locally or in the Jsonnet library path #983
- Verify Katib is working #973
- Issues with image release workflow app #970
- ks error: ERROR open /home/jlewi/app.yaml: no such file or directory #966
- Bootstrapper throwing error when deployed on GKE #961
-
gcp
- Nightly builds of TensorFlow notebook images do not have the same commit hash in the tag #943
- Create ksonnet prototype for auto image release #941
- How to configure a local volume for Jupyterhub_spawner.py #933
- Unknown variable error when applying kubeflow-core prototypes #932
- GCP Deployment Manager needs to delete IAM roles when DM is deleted #910
- Unable to use image-releaser for tf-notebook-workflow #909
- Regression Jupyter notebooks no longer include python2 runtime #906
- Deadlocks configuring envoy for IAP #903
- IAP setup script needs permission to list backend services #902
- bootstrapper should not crash loop on error #901
- openmpi-controller:0.0.3 image pull error. #887
- deployment manager should allow setting IAM roles in YAML config #883
- Bootstrapper should support packages from other registries #880
- Deployment manager config should create service accounts and set IAM roles #878
- Deployment manager config should enable Cloud Endpoints API #876
- dev.kubeflow.org envoy; iap sidecar stuck waiting for backend #871
- Presubmit unhealthy? Networking issues #869
- Failed to run TfCnn example from User Guide: ERROR find objects: parse jsonnet snippet: params.libsonnet:22:5-13 Expected a comma before next field. #863
- unknown node type issue related to mycnnjob.jsonnet #858
- Verify PVC for /home/jovyan works #854
- E2E test for TFJob v1alpha2 ksonnet package #852
- Include ssh in notebook image to support authenticated git push #850
- Jupyter image for TensorFlow 1.8 #846
- Some questions about tf-serving on NFS #844
- E2E test verifies Kubeflow installed via Deployment Manager #836
- JupyterHub GitHub OAuth Setup missing manifest/config.yaml files #835
- GCP deployment manager test handle internal errors #833
- bootstrapper error trying to create the PyTorch Operator #832
- Unify and dedup release workflows #830
- Bootstrapper should optionally use config map to specify ksonnet parameters #829
- horovod error #826
- Give release service account access to kubeflow-images-public #824
- Ambassador failed to start up #811
- IAP Envoy route should map / to central dashboard #809
-
Central UI
- Bootstrapper should deploy new Stackdriver agents #807
- deployment manager should disable legacy Stackdriver agents #806
- Invalid envoy config duplicate value IAP #804
- On GCP trigger bootstrapper with deployment manager #802
- TF serving GPU test failing #794
-
openmpi
- Minikube E2E test timeout waiting for VM to be ready. #788
- Spawner options error(401) #784
- jupyterNotebookPVCMount does not work in v0.1.2 #770
- Swift Jupyter Integration #763
- It didn't return [dead loop] when Python 3 is being used. #760
- Support using bootstrapper image as a kube deployment. #758
- Support install kubeflow by Deployment Manager #757
- the tfserving prototype missing label after generation #746
- E2E test for bootstrapper #742
- E2E test for Argo Package #740
- cloud-endpoints fails when using bootstrapper #735
- Use Pod Preset to add environment variables and volume mounts to pods #732
- Bootstrapper fails if no ~/.kube/config is present #722
- openmpi controller exited with error #718
-
openmpi
- Create individual prototypes for just TFJob #669
- Nightly (regular) build of container images #666
- Bootstrapper should enable IAP on GKE #665
- Bootstrapper should check if user has appropriate permissions and if not create cluster role #664
- Bootstrapper should create namespace if it doesn't exist #663
- Tests are failing because we are running out of PD quota in us-east1 #618
- ksonnet packages for Pachyderm #611
- Friction log for TFX Chicago taxi cab example on minikube #594
- TFJob prototype should contain the full TFJob spec so that ks generate is mostly just copying the prototype #564
- Can we put all of /home on PV for Jupyter notebooks #561
- Create gcr.io/kubeflow-images-public #534
- Central UI Ambassador Integration #528
- Have the JupyterHub spawner report issues with spawning the user's server #505
- Recommended minikube setup #502
- TF Serving component logging #495
- JupyterHub Spawner complains notebook is already spawning; upgrade JupyterHub to 0.9 #479
- Build TFServing images for different TF versions #468
-
Discussion
- E2E test for tf-job-simple prototype #462
- Unable to mount volumes for pod "jupyter-... #424
- multiple Matplotlib libraries #423
- ksonnet style guide #403
- Reformat only modified files #395
- Establish a pattern for creating/using secrets used by multiple kubeflow prototypes #372
- ks delete default fails #364
- Python script to set parameters for ksonnet prototypes #322
- Flakes building TF Serving image; problems downloading ubuntu packages #310
- ksonnet prototypes for TensorBoard #297
- Enforce formatting for all jsonnet files #282
- Assess usability on vanilla dockers #267
- Discussion: Eventing model/solution #263
- Recommended Kubeflow Setup on GKE #241
- Recommended setup for different K8s Solutions #240
- Use Ambassador/Envoy as proxy for JupyterHub #239
- ks apply cloud -c newjob silently fails #217
- Tracking Central UI #199
- Kubeflow logo? #187
-
discussion
- Add kubeflow into tensorflow/ecosystem #177
- Central UI #141
- Investigate file server connection errors with Jupyter and IAP #140
- TfJob controllers are not namespace scoped so tests aren't isolated #134
- Add resource request and limit fields to tf-job ksonnet prototype #116
- Figure out ksonnet repo organization #106
- Add Argo package to our ksonnet registry #21
Merged pull requests:
- Add tf serving 1.8 cpu image #1011 (lluunn)
- Add test for tf-job-simple prototype #1010 (ankushagarwal)
- update ks version; run ks apply following config order #1008 (kunmingg)
- fix iam role delete for deployment manager #1001 (kunmingg)
- fixes disappearing elements in Central UI #996 (swiftdiaries)
- Enable iam api and set project flag explicitly #993 (ankushagarwal)
- bootstrapper doc update #992 (kunmingg)
- Release prototype update. #990 (kunmingg)
- Doc for katib #989 (lluunn)
- Retry on ssl.SSLError in vm_util - this resolves test flakiness #987 (ankushagarwal)
- Consolidate GKE deployment script #985 (activatedgeek)
- Remove gpu_model.jsonnet #984 (lluunn)
- config update #982 (kunmingg)
- Fix vizier core #979 (lluunn)
- use ks apply in bootstrapper to avoid creation ordering issue #976 (kunmingg)
- Disable the v1alpha2 test because it is failing because mnist data can't be downloaded. #975 (jlewi)
- Adding pytorch-operator to nightly builds #972 (johnugeorge)
- Delete tf-cnn-benchmarks.jsonnet; there is now an example prototype. #967 (jlewi)
- Source admin role should be granted to KF service accounts. #965 (jlewi)
- Remove gke_setup.md. The instructions are now part of the website. #963 (jlewi)
- A bunch of fixes to the DM configs for GCP to work with latest bootstrapper #956 (jlewi)
- Add aws cloud details #952 (suneeta-mall)
- hide registries config in bootstrapper image to save user effort #951 (kunmingg)
- ksonnet prototype for automated image release #948 (kunmingg)
- Update the notebook images used with Jupyter. #947 (jlewi)
- Downgrade ambassador version to 0.30.1 #946 (ankushagarwal)
- Update instructions for recreating minikube with required resources. #944 (abhi-g)
- update default image for TF serving #940 (lluunn)
- add ambassador route for katib #939 (lluunn)
- Set 15 minute timeout for workflow step #937 (ankushagarwal)
- Remove service account creation in testing deployment manager #936 (ankushagarwal)
- update image for central ui #935 (kunmingg)
- fix vm service account config, doc update #931 (kunmingg)
- fix minor typos #930 (rongou)
- Add AVX to Troubleshooting #929 (pdmack)
- DM config should allow setting users/groups to grant IAP access to. #928 (jlewi)
- Delete test directory after test completes #926 (ankushagarwal)
- Prevent deadlocks trying to setup IAP. #924 (jlewi)
- Update image-releaser README #922 (ankushagarwal)
- keep bootstrapper pod alive when error occurs #921 (kunmingg)
- Run v1alpha1 and v1alpha2 TfJob tests in GKE workflow #918 (ankushagarwal)
- Activate service account before pushing images #917 (ankushagarwal)
- Ambassador to jupyterhub #915 (kkasravi)
- support jupyterhub in alibaba cloud #914 (cheyang)
- copy registry into bootstrap image during e2e test #913 (kunmingg)
- Doc on adding new ksonnet package #911 (lluunn)
- Update default parameters for image-releaser workflows #908 (ankushagarwal)
- Install py2 to global conda env #907 (pdmack)
- Upgrade jupyterhub and kube_spawner #905 (jlewi)
- Improvements and bug fixes in DM config. #904 (jlewi)
- Make bootstrapper support multi ks registry, including private ones #900 (kunmingg)
- Checkin the deployment manager config for running e2e-gke test #899 (ankushagarwal)
- Create wait_for_deployment.py #898 (ankushagarwal)
- Click to deploy Kubeflow web app on GCP #897 (jlewi)
- Restrict jupyterHubRole #895 (wmuizelaar)
- Use Deployment Manager to bring up clusters for running E2E tests #894 (ankushagarwal)
- Use deployment name as the NAME_PREFIX and CLUSTER_NAME #893 (ankushagarwal)
- Merge PVC from spawner and provisioners #892 (pdmack)
- update release doc for images nightly release #890 (kunmingg)
- fix endpoint deploy prototype #889 (kunmingg)
- Update to Ambassador 0.34.0 #888 (kflynn)
- Add tf notebook image 1.8 version #886 (ankushagarwal)
- Restore openssh-client to NB images #882 (pdmack)
- Add GKE Security Features to Deployment Manager config #879 (ankushagarwal)
- Fix a bunch of issues with creating Kubeflow with deployment manager + bootstrapper #875 (jlewi)
- tfjob(v1alpha2): Add validation #874 (gaocegege)
- Katib ksonnet package #873 (lluunn)
- deployment manager and prototype bug fix #872 (kunmingg)
- Update the script so that we don't assume that the remote repository is #867 (jlewi)
- Create a python script to deploy Kubeflow on GCP via deployment manager. #866 (jlewi)
- build_image.py waiting for docker daemon #864 (lluunn)
- Use a config map to configure bootstrapper. #859 (jlewi)
- Fix push_dockerhub in k8s-model-server/images/Makefile #857 (parano)
- build_image.py supports pushing images #856 (lluunn)
- Use yaml config to manage k8s resources deployed by bootstrapper #853 (kunmingg)
- ksonnet changes to support deploying the v1alpha2 TFJob operator. #851 (jlewi)
- params update for release workflow #849 (kunmingg)
- unifying image release process #843 (lluunn)
- Use relative directory path so that build_image.py can be called from another dir #840 (lluunn)
- Add kunmingg as owner #839 (kunmingg)
-
openmpi
- IAP should route to central ui instead of jupyterhub #827 (kkasravi)
- cron job & workflow config for image release #825 (kunmingg)
- Start deploying the bootstrapper via deployment manager. #823 (jlewi)
-
openmpi
- deployment manager uses new stackdriver agents #814 (lluunn)
- add bootstrapper to e2e test #803 (kunmingg)
- add meta api for serving #800 (u2takey)
- seldon load model files from PVC #799 (ogre0403)
- Adds Makefile to build the centraldashboard image #798 (swiftdiaries)
-
openmpi
- Use test_helper to simplify test_jsonnet.py #796 (ankushagarwal)
- Refactor testing package #792 (ankushagarwal)
- Python script to build TF notebook images #791 (lluunn)
- Support TF serving 1.5 image #790 (lluunn)
- Checking that Ambassador started in e2e test #789 (Maerville)
- Initial set of deployment manager configs for creating a GKE cluster to run Kubeflow #787 (jlewi)
- Modify NB Dockerfile and start scripts to support mount of /home/jovyan #786 (pdmack)
-
tf-job
- Create a ksonnet package for injecting credentials through pod presets #782 (ankushagarwal)
- Format go code and fix spelling errors #780 (wgliang)
- Adds ambassador integration for Central UI #779 (swiftdiaries)
- Argo workflow e2e test #777 (ankushagarwal)
- Update seldon ksonnet image versions to 0.1.6 #776 (cliveseldon)
-
openmpi
-
openmpi
-
openmpi
- Python script to set parameters for ksonnet prototypes #769 (Maerville)
- Create an examples package in kubeflow #768 (ankushagarwal)
- Support deploy kubeflow bootstrapper within k8s #766 (kunmingg)
- Support building different version of TF Serving images #765 (lluunn)
-
openmpi
-
openmpi
- Fix bytes literals in Python3 #759 (gangliao)
-
openmpi
-
openmpi
- Move away from mlkube-testing to kubeflow-ci #750 (ankushagarwal)
- Add test to check jsonnet formatting #747 (ankushagarwal)
- Add tf1.8 build to jupyter notebook images #745 (ankushagarwal)
- support IAP setup for bootstrap on GKE #744 (kunmingg)
- Make TFJob operator a standalone prototype #743 (pdmack)
- Fix Serving metadata #741 (sozercan)
- Remove spurious comma #729 (jlewi)
- Argo Ksonnet package - Add Docker image Tag #728 (julienstroheker)
- tf-serving support model on NFS #688 (lqj679ssn)
- Create deploy job #673 (inc0)
- Pytorch ksonnet #649 (jose5918)
- Update README to reflect install experience #617 (ykevinc)
- Reformats only modified files by default #395 #553 (jmsmkn)
v0.1.3 (2018-04-26)
Closed issues:
- all training run on the first worker #721
- Jupyter images pruned too aggressively? #719
- cert-manager in "setting-up-iap-on-gke" not working #709
- New repo for benchmarking #708
- minikube VM unavailable for e2e kubeflow-presubmit check #707
- Kubeflow Jupyter notebook images are large because of all the extra Python libs; can we shrink to improve pull times? #568
Merged pull requests:
- CP to v0.1-branch: restore some important NB pkgs; http_timeout; bool fixes for tf-serving #726 (pdmack)
- update ksonnet version to v0.10.0-alpha.3 #725 (kunmingg)
- Add a few packages to the jupyter notebook image #724 (ankushagarwal)
- add kunming to reviewer #715 (kunmingg)
- controller bug fix #714 (kunmingg)
-
openmpi
- Adding ksonnet components for tensorboard files. Issue #297 #710 (abkosar)
- Point instructions to 0.1.2 release #706 (ankushagarwal)
- CP to v0.1-branch: Update http_timeout to 5 minutes in jupyterhub (#691) #705 (pdmack)
-
openmpi
- openmpi: namespaced resource names should be prefixed with component name #698 (everpeace)
v0.1.2 (2018-04-21)
Fixed bugs:
- Segfault in bootstrapper while processing .kube #657
Merged pull requests:
- CP to v0.1-branch: TF notebook slimming, joyvan pip installs, and new gcr.io locations #703 (pdmack)
v0.1.1 (2018-04-20)
Fixed bugs:
- tf-cnn: no matches for kind "TFJob" in version... #643
Closed issues:
- ks 0.10.0-alpha.2 does not work with install instructions #686
- Bootstrap image not found #682
- Unable to get kubeflow working on hyperkube in a circleci vm #674
- Cannot install pip packages from jupyter notebook #668
- Expose istio dashboards #644
- Timeout waiting for simple-tfjob-gke #636
- Refactor tf-serving image build for multiple TF versions #632
- Some individual tests not showing up in test-grid #631
- User guide sprawl #629
- inception server not working #621
- Add Link to Intel's tutorial #612
- Add auto-generated TOC for all docs #603
- Error occurs when following user guide #598
-
GKE
- kubeflow version #578
- Presubmit failure: No such file or directory XXX/.kube/config #562
- Make IAP config robust to updating the Ingress #550
- TF MPI support #535
- TF serving param deployHttpProxy needs to be transformed from string to bool #531
- Cut a 0.1 release #506
- TF serving component monitoring #496
- TF serving GPU e2e test is flaky #484
- Build more efficient tensorflow-notebook images #472
-
Jupyter/Azure
- Make it easy to get started with Kubeflow #105
- Size of gcr.io/kubeflow/tensorflow-notebook-* #37
Merged pull requests:
- permission bug fix and image tag update #699 (kunmingg)
- Update the hub spawner dropdown for latest NB images #697 (pdmack)
- openmpi: master/workers sync mechanism is replaced with k8s api from redis #696 (everpeace)
- Migrate images to kubeflow-images-public #695 (ankushagarwal)
- Pin instructions to ks 0.9.2 (for now) #694 (pdmack)
- add export to ksonnet github token instructions #693 (mattf)
- openmpi: slots clause should be generated when gpus '> 0' #692 (everpeace)
- Update http_timeout to 5 minutes in jupyterhub #691 (ankushagarwal)
- openmpi: fix failing installing redis-tools in init.sh #690 (everpeace)
- Refactor tensorflow-notebook-image/Dockerfile #689 (ankushagarwal)
-
openmpi
- openmpi: make 'schedulerName' configurable to use custom schedulers. #683 (everpeace)
- create rolebinding within namespace to guarantee permission #680 (kunmingg)
- Support rollout new model with istio #679 (lluunn)
- Add documentation for exposing grafana dashboard #678 (lluunn)
- openmpi package doesn't work on kubernetes cluster having custom dns. #676 (everpeace)
- Add batch support to openmpi package #671 (jiezhang)
- make Dockerfile & Makefile cross platform #670 (kunmingg)
- Update ksonnet_packages.md #661 (pdmack)
- Include tensor2tensor in jupyter notebook image #659 (ankushagarwal)
- Add willingc to reviewers #656 (willingc)
-
Azure
- Bootstraper polish: #654 (kunmingg)
- add link to Intel's tutorial #652 (raddaoui)
-
Azure
- Remove zjj2wry as a reviewer. #648 (jlewi)
- Remove the outdated YAML specs for TFCnn job. #647 (jlewi)
- update user_guide #646 (lluunn)
- Add components to work with GCP #645 (jlewi)
- Change naming from jupyter to jupyterhub when referring to hub #642 (willingc)
- Troubleshooting note on cluster-admin privileges for tf-job-operator #641 (tmckayus)
- Use dashes not underscores in junit file names. #638 (jlewi)
- Update various images in kubeflow to kubeflow-images-public #635 (ankushagarwal)
- Create a kubeflow version file: version.txt and a configmap kubeflow-version #634 (ankushagarwal)
- Cleaned up README.md #628 (ddutta)
- TF Serving + Istio #627 (lluunn)
- Improve user guide wrt exposing notebook in non-cloud setup #625 (xyhuang)
- Add openmpi package #624 (jiezhang)
- Fix setup/teardown of VM for minikube. #620 (jlewi)
- Update and clarify JupyterHub README #616 (willingc)
- images: Add the link #614 (gaocegege)
- Improve documentation #613 (jlewi)
- Initial ksonnet package for Pachyderm. #610 (jlewi)
- Provide documentation about adding ksonnet packages to Kubeflow. #609 (jlewi)
- Upgrade tf-serving version to 1.6.0 #608 (pdmack)
- import of cloud-endpoints component and support in iap-ingress component #605 (danisla)
- docs(TOC): add auto-generated TOC for all docs #604 (DjangoPeng)
- docs(troubleshooting): add a new item into troubleshooting #602 (DjangoPeng)
- Update the docs to point to the v0.1.0 release now that it is cut. #601 (jlewi)
- Create a doc with information about our docker images. #600 (jlewi)
- Typo in UG #599 (pdmack)
- Create a POC of an app to simplify deployment of Kubeflow #595 (jlewi)
- CP: Install graphviz in tensorflow notebook image (#583) #585 (pdmack)
- Enable TF serving gpu test #558 (lluunn)
- Branching and tagging policy for releases #519 (willb)
v0.1.0-rc.4 (2018-04-04)
v0.1.0 (2018-04-04)
Closed issues:
- Explicitly specified version of tensorflow being replaced in the python 2 environment with the latest version from PyPI #571
- Jupyter pod stuck at ContainerCreating when spawning #336
Merged pull requests:
- Cherry pick : Update tensorflow notebook version to v20180403-1f854c44 (#589) #590 (ankushagarwal)
- Update tensorflow notebook version to v20180403-1f854c44 #589 (ankushagarwal)
- README: Add link for tf-operator #588 (gaocegege)
- Clean up minor formatting errors in README #587 (pdmack)
- Correct typo for param defaultHttpProxyImage (#556) #584 (jlewi)
- Install graphviz in tensorflow notebook image #583 (ankushagarwal)
- Disable PVC by default (#577) #582 (inc0)
- Fix the tensorflow version in prebuilt images for python 2.7. #580 (ojarjur)
- Disable PVC by default #577 (inc0)
- Add pdmack to OWNERS #565 (pdmack)
- Correct typo for param defaultHttpProxyImage #556 (pdmack)
- sidecar for envoy pod to keep IAP up to date #552 (danisla)
- Update instructions about releasing TFJob #532 (jlewi)
v0.1.0-rc.3 (2018-04-03)
Closed issues:
Merged pull requests:
- Cherrypick #570 and #572 into v0.1-branch #575 (ankushagarwal)
- Moved OAuth secret from param to named secret #572 (danisla)
- Update tf jupter notebook images to include tfma (from #544) #570 (ankushagarwal)
- Rety upto 3 times while building tensorflow notebook images #559 (ankushagarwal)
- Add the "tensorflow-model-analysis" package to the notebook images #544 (ojarjur)
- add willb to approvers #542 (willb)
- Docs should show how to pull a particular release #524 (jlewi)
- Changes to support running E2E tests on minikube. #523 (jlewi)
v0.1.0-rc.2 (2018-04-02)
Closed issues:
- Tensorflow no longer working in the GPU image (if you build from HEAD) #549
- TFJob UI not included in ksonnet configs #546
- dev.kubeflow.org 502s #545
Merged pull requests:
- Add TfJob dashboard to ksonnet (#548) #555 (jlewi)
- Delete Makefile for tensorflow-notebook-image #551 (ankushagarwal)
- Add TfJob dashboard to ksonnet #548 (wbuchwalter)
- disable gpu test #547 (lluunn)
- Add tf1.7 to tensorflow-notebook images #543 (ankushagarwal)
v0.1.0-rc.1 (2018-03-30)
Closed issues:
- Upgrade ksonnet in our Jupyter images to 0.9.2 #490
- Javascripts widgets don't work in JupyterLab #489
- Option to disable use of PVC for Jupyter #365
- Katacoda Demo Scenario Python incompatibility - Warning #70
Merged pull requests:
- Cherrypick #526 and #533 into v0.1-branch #540 (ankushagarwal)
- Update the TFJob image in preparation for a new RC. #533 (jlewi)
- Fix typos in tensorflow notebook images in spawner #526 (ankushagarwal)
- Adds Centraldashboard (in place of #146) #525 (swiftdiaries)
- Clarify authentication requirements in release process #520 (willb)
- Install jupyter widgets in tensorflow-notebook-image #518 (pdmack)
- Mount pvc #503 (kkasravi)
- Enable tf serving gpu test #497 (lluunn)
v0.1.0-rc.0 (2018-03-27)
Fixed bugs:
- tf-cnn fails to create #458
- Presubmit failure; cannot import k.libsonnet #447
- Presubmit shows succeeded, but some test actually failed. #436
Closed issues:
- Tf serving component should provide default HTTP image #511
- Continuous testing for release branches #507
- construct base object: Failed to filter components; the following components don't exist: [ 'kubeflow-core' ] #481
- Document GITHUB_TOKEN #478
- ks env set default --namespace=kubeflow doesn't change the namespace #477
- Build Jupyter Notebook images for supported versions of TF #467
-
ERROR
- JupyterHub terminates with 500 Internal Server error #433
- Proposal: We should have basic tests for every ksonnet prototype #432
- Remove tensorboard link from Jupyter notebooks (hub?) #428
- jovyan user cannot sudo in terminal #425
- Cannot not run tensorboard from Jupyter notebooks #422
- IPyWidgets not displaying when using a Python 2 kernel #419
- normalize libsonnets so their "all(params)" is only called from kubeflow/core/all.libsonnet #417
- libcublas.so.9.0: cannot open shared object file: No such file or directory #414
- Executing child process 'start-notebook.sh' failed: 'Permission denied' #412
- Use local NFS server as PersitentVolume #410
- Create a ksonnet component to deploy seldon-core models #405
- Replace two envoy containers with one in the envoy pod #404
-
question
- Build Envoy Container with JWT validation #394
-
support
- cnn tfjob status never change to completed #389
- TFServing prototype for using GCS with service account key #385
- Jupyter notebook image should pin TF version #375
- ClusterRole's and ClusterRoleBindings should include namespace in the name #374
- Ambassador can't watch services at the cluster level. #373
- Missing TFServing monitoring features? #369
- Set userid and group id in TF Serving GPU container #367
- Reviewable is blocking automatic merging #356
- Add tensorflow-serving-api to jupyter image #355
- Build http proxy for TF serving as part of our release workflow #353
- Better docs for http proxy #352
- Failed to pull image "gcr.io/kubeflow/tf-benchmarks-gpu:.... #348
- autoformat_jsonnet.sh unknown predicate -E #345
- ambassadors are crashed and cannot be created #344
- How to login the Jupyterhub on the remote server? #343
-
ksonnet
- Build TFServing GPU Docker Image #338
- Automatic PR merge #331
- Add additional links and information about JupyterHub to README #329
- Jupyter pod can't start; jinja exception Encountered unknown tag 'trans'. #325
- Avoid code duplication in Dockerfiles for Jupyter notebook images #321
- Kubeflow not deployed #318
- Investigate notebook container sidecar to enable in-notebook container builds #312
- JupyterHub Spawner should have a UI element that specifies default docker images #309
- Need an OWNERS file #308
-
Discussion
- option naming inconsistencies #303
- Test the TFServing configs with default image #301
- Configure JupyterHub spawner to allow root operations by default #300
- TFServing docs are incorrect; we no longer assign a public IP by default #296
- Add an option to set service type to TFServing deployment #295
- Example/Docs for using IAP with TFServing #293
- TFServing deployment should support GPUs #292
- E2E Test for TFServing with GPUs #291
- TFServing component should add an ambassador route #290
- How to add persistent volume in jupyter_spawner.py file ? #285
- Prow jobs should use a common docker image #276
- TFJob test is failing #273
- Presubmit failing: Time out waiting for Workflows #272
- Better identity management in K8s #266
- JupyterLab not working in latest notebook images #262
- JupyterHub spawner: upstream request timeout #261
-
bug
- IAP component should use cert-manager to get a signed certificate #255
- Run util_test.jsonnet added in #246 as part of E2E tests #254
- tf.transform libraries in our notebooks #244
- user guide - tf-serving: ClusterIP instead of LoadBalancer service #233
- Typo in argo-ui rolebinding #230
- JupyterHub fails to load image properly, but starts a notebook anyway #226
- Move troubleshooting guide into user_guide #223
- Build and publish Docker images using Argo #221
- E2E tests need to verify that we can submit a TFJob #207
- Use kubeflow/testing to run Argo workflows #205
- KubeFlow on AWS - tf-hub-lb not created and cannot change jupyterHubServiceType to loadbalancer #203
- ERROR Server is unable to handle tensorflow.org/v1alpha1, Kind=TFJob #200
- Increase rate limit for installing kubeflow packages #195
- Add GPU Support for k8s-model-server on Kubeflow #194
- TFJob controller cannot terminate job #193
- Remove tensorflow/k8s as a submodule #190
- Delete components/tf-controller #189
- Error from server (Forbidden): error when creating "deploy_crd.yaml": clusterroles.rbac.authorization.k8s.io "tf-job-operator" is forbidden #188
- tf-job.libsonnet is issuing the wrong CRD job type #186
- Test failures #176
- Create a repository for examples: kubeflow/examples #174
- Create a testing repository #173
- Change google/kubeflow links to kubeflow/kubeflow #168
- Setup prow for the new org #165
- Add TFJob CRD creation to ksonnet components #164
- ksonnet component for Seldon.io deployment #159
- E2E Solution for GitHub Issue Summarization On Kubeflow #157
- CreateSession still waiting for response from worker: /job:ps/replica:0/task:0 #153
- Test cluster is unhealthy no ready nodes #150
- ClusterRoleBinding.rbac.authorization.k8s.io "tf-job-operator" is invalid #148
- worker restart result in the TFJOB can not finish expectedly #147
- PVC created but not added to volume list #145
- Code of conduct #143
- Add ksonnet to our Jupyter notebook Docker images #138
- What's up with Tensorflow? #135
- Postsubmits appear to be broken #133
- Make ArgoUI for our tests publicly accessible #131
- tf-cnn template doesn't set TerminationPolicy correctly #129
- Copy Jupyter notebook Dockerfiles into Kubeflow repo #126
- tf-job-operator service account is missing roles #125
- JupyterHub should not open up external IP by default #123
- where is the Dockerfile of gcr.io/kubeflow/tensorflow-notebook-cpu? #122
- Default service type for ModelServer should not be loadbalancer #121
- Add tensorBoard field to tf-job prototype #113
- po/tf-job-operator logs said user cannot get endpoints in the namespace "default" #109
- Typo in tfjob.jsonnet #107
- Problems connecting to Jupyter Kernel with IAP #104
- Create only one service for JupyterHub and make type a parameter #99
- Delete inception.tar.gz #98
- Postsubmits need to use registry components at commit being tested #96
- Presubmits need to pull registry from the PR branch #95
- link to ksonnet might be confusing #91
- release .yaml manifest #90
- Incorporate tf.transform #88
- ProwTestCase library #83
- Set suitable defaults for JupyterHub service type and make it configurable? #80
- if i use kubeflow on local cluster with gpu #79
- Failed to connect to my Hub at http://tf-hub-0:8081/hub/api (attempt 1/5). Is it running? #78
- Support for other Deep Learning Libraries #74
- Setup PR Dashboard #73
- TfServing uber tracking bug #64
- tf-job missing from ksonnet registry yaml #62
- Add missing features to TfJob controller ksonnet component #61
- Support IAP on GKE #60
- Syntax error of juypterhub.yaml for Kubernetes 1.6 #58
- Proposal: Include very basic tracking of usage by default #55
- Should Kubeflow publish and maintain TF Serving Docker Images? #50
- tf-cnn prototype doesn't add GPU resource requests #48
- Remove namespace as a package parameter #43
- Clean up repo after switching to ksonnet #41
- Doc gen for kubeflow ksonnet registry #39
- E2E Testing For Kubeflow. #38
- Proposal: Discuss Kubeflow organization and community #35
- Proposal: our expectation on KubeFlow #33
- Add LICENSE file: Apache License, Version 2.0 #27
- Fault tolerant storage for Jupterhub #19
- Tensorboard support #17
- python model server #15
Merged pull requests:
- Http proxy default image #517 (lluunn)
- Update tensorflow notebook release to v20180327-6bb4058 #516 (ankushagarwal)
- Use ks version 0.9.2 #515 (ankushagarwal)
- Update ambassador and statsd to 0.30.1 as part of 0.1 release #513 (ankushagarwal)
- Pin tf serving version to 1.4 #512 (lluunn)
- update default image for TF serving #510 (lluunn)
- Fix TF serving release process #509 (lluunn)
- Update images for our 0.1 release #508 (jlewi)
- Update the nodejs and ipywidgets dependencies. #498 (ojarjur)
- Fix instructions for tensorflow-notebook release #494 (ankushagarwal)
- Update release instructions to ease onboarding #493 (willb)
- Add google-cloud-storage package to jupyterhub notebook #492 (ankushagarwal)
- Update buildTemplateImage to take workflow_name #488 (ankushagarwal)
- Temporarily disable gpu test since it's flaky #487 (lluunn)
- Use a 10 minute timeout for jupyterhub backend in envoy config #486 (ankushagarwal)
- Enhance tf serving gpu test #485 (lluunn)
- Add Github's rate exceeded error trobuleshooting #482 (inc0)
- Update jsonnet tests so that it tests all files under kubeflow #480 (ankushagarwal)
- Fix ambassador_test.jsonnet #476 (ankushagarwal)
- Tag tensorflow-notebook-images with latest #475 (ankushagarwal)
- Set a large timeout for jupyterhub spawner #474 (ankushagarwal)
- Update tensorflow-notebook-image workflow to pin tf and cuda version #471 (ankushagarwal)
- Update cert-manager component to inherit namespace #466 (ankushagarwal)
- Update docs so that namespace is only set at ks env level #463 (ankushagarwal)
- Fix the TFCnn prototype which was broken by #444. #461 (jlewi)
- Create a script to deploy minikube on a VM. #459 (jlewi)
- Move IAP config to initContainer on the envoy pod #457 (danisla)
- Add jsonnet_path_dirs flag to test_jsonnet script #456 (ankushagarwal)
- Fix for #432 Provide basic tests for every ksonnet prototype. It will… #450 (kkasravi)
- remove the pip install tensorboard #446 (kkasravi)
- Add jsonnet unit tests for iap #445 (ankushagarwal)
- set termination policy to worker 0 by default when no master exists #444 (raddaoui)
- Run jsonnet tests as part of kubeflow-e2e workflow #443 (ankushagarwal)
- E2e test for TF serving with GPU #442 (lluunn)
- Use cert-manager for obtaining valid certificates #441 (danisla)
- Use kubeflow-ci/test-worker as worker image #440 (ankushagarwal)
- Update autoformat script to only format files in kubeflow/ directory #439 (ankushagarwal)
- Run ks upgrade #434 (lluunn)
- Libsonnet cleanup #431 (kkasravi)
- Allow creating ingress with hostname #429 (ankushagarwal)
- Inherit namespace from ksonnet environment #426 (ankushagarwal)
- Fix the import name for seldon/serve-simple.libsonnet #420 (ankushagarwal)
- GCP credential support for TF serving #416 (lluunn)
- Replace azure env by aks || acs-engine #415 (wbuchwalter)
- Create a seldon-serve ksonnet prototype #413 (ankushagarwal)
- Use kubeflow-ci as the test cluster #411 (lluunn)
- Use kubeflow-images-staging to store envoy image instead of kubeflow-dev #409 (ankushagarwal)
- Replace ClusterRole/ClusterRoleBinding for ambassador with Role/RoleBinding #408 (ankushagarwal)
- Replace two envoy containers with a single one #407 (ankushagarwal)
- JWT validation works with GCP IAP - remove outdated docs #406 (ankushagarwal)
- Add HTML5 dropdown for image selector with cpu,gpu defaults #402 (pdmack)
- Use the correct jwt validation config #401 (ankushagarwal)
- use tini to start model server #400 (yupbank)
- Fix 2 issues with s3 config from #387. #399 (jlewi)
- Fix k8s dashboard on Azure #396 (wbuchwalter)
- Troubleshooting section updates #393 (pdmack)
- Add info about authenticate user account #392 (lluunn)
- Update tfserving gpu image to create model-server user #390 (ankushagarwal)
- fix id in OWNER file #388 (lluunn)
- Refactor the TFServing component to better support GPUs and specific clouds #387 (jlewi)
- Build http proxy image in release workflow #386 (lluunn)
- Add pkg install step for argo README #384 (pdmack)
- Add build-image option to TF serving workflow #383 (lluunn)
- Fix tfjob dashboard ambassador route #381 (wbuchwalter)
- change kubeflow.io to kubeflow.org #379 (Jimexist)
- add commands of how to delete a training job #378 (ChanYiLin)
- Consolidate down to a single Dockerfile #366 (ojarjur)
- update http proxy readme #363 (yupbank)
- Create a GPU model deployment to use for E2E testing of serving with GPUs #362 (jlewi)
- Example showing how to do TF serving when IAP is enabled. #361 (lluunn)
- Allow seperate naming of model name vs deployment name for TF Serving. #359 (elsonrodriguez)
- Add tensorflow-serving-api to jupyter image #358 (inc0)
- Remove -E from find command as it is not supported by GNU find #357 (ankushagarwal)
- Refactor the workflow for TF Serving #347 (jlewi)
- Add classify support for http-proxy #341 (yupbank)
- Build the TF Serving GPU image as part of a release workflow. #339 (jlewi)
- Improve the Python 2 kernels for both CPU and GPU instances. #335 (ojarjur)
- Add signature support to http proxy #334 (lluunn)
- Clarify JupyterHub and Jupyter description in component readme #333 (willingc)
- Edit user guide links and wording for Jupyter and JupyterHub #332 (willingc)
- Refine the text and links related to JupyterHub and Jupyter #330 (willingc)
- Fix jupyterlab by upgrading Jupyterlab #328 (jlewi)
- Adding S3 support to model serving. #323 (elsonrodriguez)
- Optimize start.sh #320 (inc0)
- Create an Argo workflow to build the Jupyter images. #317 (jlewi)
- Create an initial owners file. #316 (jlewi)
- Update the Dockerfiles for tensorflow notebook images #315 (ankushagarwal)
- Add support for Python 2 kernels to the tensorflow notebook images. #314 (ojarjur)
- Fix #285 and make few tweaks to notebook Dockerfile #311 (inc0)
- Add ambassador annotation to tf-serving http proxy #307 (lluunn)
- Add seldon to user guide docs as example serving component #304 (cliveseldon)
- add option service_type to set TFServing service type #302 (fxue)
- Update the image used for the TFJob controller. #299 (jlewi)
- Start instructions for creating a release #298 (jlewi)
- Remove unused file build/check_errorf.sh #287 (ankushagarwal)
- Update jsonnet autoformat script #286 (ankushagarwal)
- Refactor the ksonnet configs #284 (jlewi)
- Add names to tfserving service ports #283 (ankushagarwal)
- Integrate tf serving image building with prow #281 (lluunn)
- Fix tf-serving command and args #280 (ankushagarwal)
- Drop imagePullPolicy override #279 (kflynn)
- update image to use prow_config if exists #275 (lluunn)
- Fix TfJob test #274 (jlewi)
- add prow_config.yaml #271 (lluunn)
- Add mount of pvc #270 (inc0)
- Bump Ambassador version to latest (0.26.0) #269 (pdmack)
- Add seldon ksonnet integration #268 (cliveseldon)
- Fix typo in user_guide.md #259 (ankushagarwal)
- Redirect / to /hub for the envoy pods #257 (ankushagarwal)
- Add GPU scheduling instructions to user_guide.md #256 (ankushagarwal)
- Update enable_iap.sh so that it updates the healthcheck path #252 (ankushagarwal)
- Remove vim's .swp files from git #249 (inc0)
- Fix link to prow jobs dashboard #248 (jonas)
- Fix short description typo in tf-job.jsonnet #247 (ankushagarwal)
- Fix jsonnet issue in iap-envoy.jsonnet #246 (ankushagarwal)
- Fix typos in iap.md #245 (ankushagarwal)
- Move spawner to separate file #243 (inc0)
- Fix the docker file so we do not need wrap cmd every time #242 (yupbank)
- Troubleshooting: Gotcha for Docker for Mac kube #238 (pdmack)
- install Ksonnet 0.8.0 in both CPU and GPU dockerfile. #237 (fxue)
- *: Replace tensorflow/k8s with kubeflow/tf-operator #235 (gaocegege)
- Fix userguide issue223 #234 (fxue)
- Fix bug with using LoadBalancer for JupyterHub. #232 (jlewi)
- fix typo for argo-ui role binding #231 (cliveseldon)
- Add ksonnet template for http proxy #228 (yupbank)
- Run TFJob tests as part of Kubeflow tests #227 (jlewi)
- Made TFJob CRD requirement explicit #225 (nkashy1)
- Update README.md to mention OpenShift SCC #224 (pdmack)
- Adds pushing images to dockerhub as well #222 (s1113950)
- update the docker file for http proxy #220 (yupbank)
- Code of conduct reference #219 (ewilderj)
- Delete components/tf-controller #216 (zmhassan)
- update workflow for building TF serving image #214 (lluunn)
- Fixing readme typo #212 (zmhassan)
- feat(model-server): add Dockerfile of model-server with gpu support #210 (DjangoPeng)
- Update tests to use shared code in kubeflow/testing #209 (jlewi)
- Changing tensorflow.org to kubeflow.org. Fixing CRD deployment. #208 (elsonrodriguez)
- add correct boilerplates #204 (mitake)
- Workflow for building and pushing the TF serving image. #196 (lluunn)
- fixed typo TfJob -> TFJob #185 (sfabel)
- add http proxy for tf-serving #183 (yupbank)
- Add tensorflow-notebook-image dockerfiles #182 (dogopupper)
- Model Server Component documentation triage. #181 (elsonrodriguez)
- Argo ksonnet package #178 (jlewi)
- Use spartakus to report usage metrics #175 (jlewi)
- README: Fix link #172 (gaocegege)
- Missing namespace scope for commands in userguide #171 (barney-s)
- *: Add CRD creation for TFJob #170 (gaocegege)
- *: Update link to kubeflow/kubeflow #169 (gaocegege)
- Use ambassador as a reverse proxy #166 (jlewi)
- Fix model path. #162 (jlewi)
- add namespace setup to top-level readme quick-start section #161 (cliveseldon)
- Updating the verbiage in the README to clarify project motivation. #160 (dynamicwebpaige)
- fix prow_artifacts.py #158 (lluunn)
- Fix the instruction for setting up cluster to run test #152 (lluunn)
- Run jsonnet's autoformat tool to autoformat all our jsonnet files. #149 (jlewi)
- fix guide typo #144 (lluunn)
- Update README.md #142 (genome21)
- Changing Jupyter service to ClusterIP by default. #139 (elsonrodriguez)
- Fix datascientists typo #137 (pmangg)
- Make Argo UI available publicly at testing-argo.kubeflow.ui #132 (jlewi)
- Fix TfJob operator roles and TfCNN prototype #130 (jlewi)
- Support IAP on GKE using Envoy as a reverse proxy #128 (jlewi)
- Fix bugs with TfJob operator prototype #127 (jlewi)
- Add missing namespace to command invocation #124 (mindprince)
- Document flow improvements, expanded on Jupyter usage #120 (elsonrodriguez)
- change kjsonnet in tf-cnn to remove not used MASTER #119 (yupbank)
- update tf-cnn example use terminationPolicy to address not using master replica #115 (yupbank)
- Fixing errors and missing variables in user guide. #114 (elsonrodriguez)
- Fix typo #112 (amitkumarj441)
- replace the model from tensorflow/models #111 (yupbank)
- fix #107; args improperly set to the namespace #108 (cwbeitel)
- Fix a typo in README #103 (darthsuogles)
- Update README.md #97 (puneith)
- Updated Quick Start section on README #94 (nkashy1)
- Make TfJob CRD prototype functionally equivalent to the helm chart #93 (jlewi)
- Client script for inception model server #92 (nkashy1)
- Add a section about who should consider using/contributing to Kubeflow. #86 (jlewi)
- Makefile override #84 (nkashy1)
- Some improvements to docs. #82 (jlewi)
- Prow should launch and manage an argo workflow for e2e tests #81 (jlewi)
- Fix a typo in the README #77 (jlewi)
- fix(doc): correct cli instruction #75 (yue9944882)
- Argo workflow to run E2E tests #72 (jlewi)
- A python script to test deploying Kubeflow #71 (jlewi)
- Fix mailing list URL typo #69 (ScorpioCPH)
- Add tf-job to registry.yaml. #66 (jlewi)
- Typo in readme #63 (wydwww)
- Update readme #59 (jlewi)
- Update Jupyter docs to reflect the move to ksonnet #53 (jlewi)
- Fix specifying of GPU resources. #49 (jlewi)
- Add proposal link #47 (ddysher)
- Clean up the k8s-model-server component #46 (jlewi)
- Update instructions and README #45 (jlewi)
- Add a note about TF serving Deployment not working with Minikube's kvm driver #40 (dimpavloff)
- Use ksonnet to configure/deploy Kubeflow #36 (jlewi)
- Update README.md #30 (aronchick)
- Adding licence boilerplate verification #29 (vishh)
- Update Contributing.md, and add license headers #28 (foxish)
- Update README.md #26 (aronchick)
- Add license and contributing.md #25 (foxish)
- Include Tensorboard and Docker Images #18 (foxish)
- Final readme fixes #16 (foxish)
- Cleanup JupyterHub Dockerfile #14 (yuvipanda)
- Delete section on Tensorflow Training Operator which just has a TODO. #10 (jlewi)
- Update the README w.r.t to distributed training. #9 (jlewi)
- Resolve TODO for RBAC on GKE 1.8+ #8 (foxish)
- re-organize manifests and examples #7 (vishh)
- fix jupyterhub config for image selection and defaulting #6 (foxish)
- Move TF Job CRD to a more appropriately named directory #5 (vishh)
- Moving TF Serving from gke-accelerators repo to kubeflow #4 (vishh)
- Update README.md #3 (amygdala)
- Create specs for running distributed training with/without GPUs. #2 (jlewi)
- Updates to the Readme that include mission #1 (aronchick)
* This Change Log was automatically generated by github_changelog_generator