kubeflow/CHANGELOG.md

Change Log

v0.5.0-rc.0 (2019-03-28)

Full Changelog

Fixed bugs:

  • kfctl apply failed for invalid spec.version when installing CRD tfjobs.kubeflow.org #2634
  • Wrong service account is used if a project ID has a prefix for its org #2244
  • PytorchJob & operator not installed by kfctl for non-GCP platforms in 0.4 #2212
  • kfctl.sh fails if a user isn't explicitly added to the IAM policy #2186

Closed issues:

  • kfctl attribute deleteStorage is appearing in app.yaml when platform is not gcp #2826
  • kfctl delete fails when no platform or minikube is specified #2809
  • kfctl generate fails "jupyter" is not a component or a module #2796
  • kfctl apply fails to create k8s resources; tries to deploy to 127.0.0.1 #2791
  • ipName and zone are being written to app.yaml even when no platform is specified #2773
  • kfctl Incorrect ksonnet OptionServer is specified when no platform is provided #2761
  • kfctl apply should fail early if no client id/secret environment variable #2752
  • kfctl
  • kfctl presubmit test failure #2725
  • Jupyter resource should indicate when notebook is running #2723
  • kfctl apply fails trying to deploy to 127.0.0.1 #2722
  • make build-kfctl fails during test #2715
  • kfctl needs to support passing swagger spec to ks init from file #2709
  • kfctl creates invalid IAM policy binding file if email not set #2707
  • kfctl tries to call ks init even on ks generate platform #2706
  • kfctl
  • kfctl go doesn't exit with non-zero exit code on failure #2699
  • "make build" under "bootstrap" module fails due to some go modules missing under /tmp inside Docker. #2696
  • spark-operator test is flaky #2693
  • kfctl config issue #2691
  • tf-operator Docker image based on CentOS image with a security flaw #2688
  • Dashboard v1
  • Store Azure Secrets during setup #2675
  • kfctl could not generate kfApp no hostname #2670
  • kfctl doesn't create secret with username and password for basic auth #2661
  • kfctl Not actually setting GCP IAM roles #2659
  • kfctl apply requires oauth client id and secret even if using basic auth #2658
  • kfctl generate won't succeed if first attempt failed #2657
  • kfctl generated yaml fails validation #2653
  • Central Dashboard Label is spelled incorrectly #2652
  • Error in application component #2642
  • Inject common config into Jupyter Pods #2641
  • kfctl - provide option default to statically link ksonnet and kustomize into kfctl #2635
  • kfctl apply failed #2622
  • 1-click deploy
  • Dashboard v1
  • Dashboard v1
  • Central Dashboard - Rename "JupyterHub" to "Notebooks" #2612
  • GCP
  • GCP
  • User inside Notebook is root and not jovyan #2602
  • Set the base_url flag in Jupyter Pod #2595
  • jupyter turn off token authentication #2593
  • Ambassador routes for new jupyter notebooks don't work #2590
  • Central dashboard doesn't render; lots of 404s #2582
  • click-to-deploy app
  • Automate image builds of Jupyter UI #2578
  • Multi user for Jupyterhub #2577
  • presubmit error: ImportError: cannot import name tf_job_client #2574
  • question
  • clean up role binding after presubmit tests. #2570
  • How can I work around "cannot create directory /home/jovyan/work: Permission denied" #2567
  • Dashboard v1
  • Move ksonnet, kustomize implementations out of KfApps.Children #2561
  • kfctl: Add a show subcommand #2560
  • Create a new repo for kustomize packages #2559
  • Update TFDV and TFMA versions in Jupyter images #2551
  • Create secret in istio namespace when using istio #2549
  • kfctl golang move creation of test directories to subdirs under bootstrap #2537
  • kfctl golang Simplify the child/parent relationships in platforms #2536
  • kfctl golang Remove hard dependency on GOOGLE_APPLICATION_CREDENTIALS #2535
  • openshift 3.11 routes ok, but stylesheets and fonts from Web #2524
  • Stop hosting mongo in gcr.io/kubeflow-images-public/mongo:3.4 #2520
  • kfctl, replace regex edits with yaml modifications of individual fields #2515
  • add additional config files to kfctl for iap, basic_auth, default components #2514
  • vizier-db and vizier-core cannot run #2513
  • Dashboard v1
  • Stop deploying JupyterHub #2498
  • Update the central dashboard to redirect to the new jupyter web-app #2497
  • kfctl go binary should have an option to skip initi project #2491
  • bootstrap period test failure #2489
  • The CustomResourceDefinition "scheduledworkflows.kubeflow.org" is invalid #2487
  • Calling os.makedirs fails even after we have checked for directory existence. #2484
  • Add Jupyter WebApp to kfctl #2481
  • Cloud NFS PVC needs to specify volumeName with k8s 1.11 #2475
  • kfctl: support deploy with basic auth #2472
  • Provide a list of required docs for the documentation card on the new landing page #2468
  • Upgrade to go 1.12, remove 'static', 'plugins' macros in kfctl golang #2460
  • Change jupyter images to work with the new jupyter CR #2458
  • Issues with the current Go notebook controller #2456
  • Update Python scripts based on ".style.yapf" config. #2448
  • Failed to resolve server nfs-server.kubeflow.svc.cluster.local: Name or service not known for pipelines #2443
  • jupyter spawns pod outside of its namespace #2436
  • under bootstrap document the golang version required #2430
  • Error creating Profile CR: status.observedGeneration in body must be of type int64 #2423
  • add docker-for-desktop platform to kfctl golang #2420
  • make it easy to configure the foo sample plugin as a plugin or statically linked #2414
  • kfctl.sh delete all failed on OSX #2411
  • presubmit failing: Quota 'DISKS_TOTAL_GB' exceeded. Limit: 16384.0 in region us-east1. #2401
  • kfctl golang rename 'ks' directory and app to ksonnet and ksonnet.go respectively #2398
  • kfctl - Fetch registry automatically #2397
  • DM template fails with GPU enabled #2392
  • Dashboard v1
  • Clean up trailing whitespaces at the end of lines #2389
  • Integrate test_flake8.py into CI system #2388
  • I am installing kubeflow and got this error: no matches for kind "Application" in version "app.k8s.io/v1beta1"; need help #2380
  • Click To Deploy
  • "pip install --upgrade pip" is usually a bad practice since it's known to have caused issues. #2375
  • add the gcp platform to kfctl golang #2370
  • Replace CfgFile *viper.Viper in KsApp with a struct and move to group.go in kfctl golang #2368
  • Have the different kfctl golang implementations of KfApp hold a map of KfApp children #2367
  • Move --namespace parameter from generate to init in kfctl golang #2366
  • Remove --packages and --components from 'kfctl generate' golang #2365
  • Re-enable click_deploy_test #2364
  • Dashboard v1
  • Failed to connect to Hub API at 'http://jupyter-0:8081/hub/api'. Is the Hub accessible at this URL from host: jupyter-q? #2350
  • Unable to download after pod starts #2345
  • TF-Serving: How to add new Param to Service Prototype running in Field does not exist #2326
  • Is there a Chinese-language Kubeflow community? #2325
  • TF serving new template should support model from NFS volume #2321
  • tf-serving how to add volumes/volumeMounts #2319
  • Add Azure Version of tf-serving prototype #2317
  • studyjob controller stays in CrashLoopbackoff #2308
  • Proposal: Kubeflow Serverless Serving CRD #2306
  • Sending predictions through TF-Serving returns 404's when using IAP #2302
  • Need help deploying simple model with tf-serving and enable Rest API #2292
  • "kfctl.sh apply k8s" stuck in ambassador and return error. #2290
  • GCP secrets are not created unless oAuth credentials are provided #2284
  • Go based implementation of the profiles custom resource? #2270
  • Go based implementation of Jupyter CR? #2269
  • Argo UI doesn't work when Ambassador is scaled to more than 1 replica #2222
  • Install profiles CR by default #2177
  • Profiles resource needs an E2E test #2176
  • Jupyter CR E2E test #2174
  • Docs: The reference to the deployer app should be higher in the page #2169
  • Notebook manager UI should disable New PVC option if no default storage class is defined. #2157
  • Jupyter spawner UI should provide a warning if pod isn't backed by durable storage #2156
  • Allow in-cluster traffic through ambassador while enable basic username/password login with ambassador #2149
  • Cookie based auth for basic username/password login with ambassador #2148
  • login page for basic username/password login with ambassador #2147
  • Kubebench operator jsonnet unittest #2134
  • gcp-click-to-deploy
  • nodeselector in jupyter spawner not working #2131
  • 0.4 blog post #2102
  • gcp-click-to-deploy
  • Support basic username/password login with ambassador #2094
  • Central dashboard should be left handed navigating bar #2093
  • Color scheme for central dashboard should match Kubeflow pipelines #2092
  • Question / friction
  • Broken links for k8s-model-server docs #2027
  • Jupyter notebook manager UI that uses the new CRD #1995
  • Use ISTIO to restrict access to Jupyter Notebooks #1994
  • Can not deploy kubeflow on docker-for-kubernetes #1991
  • Error creating object: Service "vizier-core" #1982
  • ksonnet cleanup don't merge "params and env" #1961
  • Add e2etest for katib #1946
  • feature request: include 'google-api-python-client' in the JH images #1932
  • create cluster-admin role for kubeflow:default service account so that it can create a tfjob #1882
  • Remove OpenMPI package in 0.5.0 #1859
  • kfctl.sh delete doesn't check/change kubectl context before issuing delete #1669
  • add jsonnet tests for all libsonnet files #1542
  • kfctl
  • microk8s
  • JupyterHub users can't create pods on GKE #1281
  • Pachyderm and Kubeflow integration #151
  • Make JupyterHub More Configurable #56

Merged pull requests:

v0.4.1-rc.4 (2019-01-15)

Full Changelog

v0.4.1 (2019-01-15)

Full Changelog

Merged pull requests:

v0.4.1-rc.3 (2019-01-15)

Full Changelog

Merged pull requests:

v0.4.1-rc.2 (2019-01-15)

Full Changelog

Closed issues:

  • Suggest KubeFlow to use ksonnet 0.13.1 as minimal version #2250
  • Suggest to update the imagePullPolicy to "IfNotPresent" for Pipeline component #2249
  • Google

Merged pull requests:

v0.4.1-rc.1 (2019-01-12)

Full Changelog

Fixed bugs:

  • tf-notebook

Closed issues:

  • Create application CR's per package and enable a dependency tree that points to the top application CR kubeflow #2257
  • seldon gRPC #2239
  • Kfctl requires zone to be specified. Could be potentially automatically inferred. #2209
  • ksonnet installation of core giving 404 #2201
  • Cut 0.4.0 Release #2098
  • Investigate using Istio for jwt validation #1907
  • Docs for using ModelDB to track models using the API #1861
  • Deploy Model DB with durable storage #1860
  • Change the application component to agree with kubernetes-sigs application CRD #1856
  • Can't install >v0.1.3 KubeFlow to Microk8s #1854
  • Provide an option to deploy an application-controller that will deploy the kubeflow application in-cluster #1831
  • Not able to build from source: Makefile:17: recipe for target 'presubmit' failed #1812
  • gcp
  • Create a kubeflow application in the GCP Marketplace #1691
  • migrate bootstraper from glide to dep #1520
  • Make ambassador number of replicas configurable #1112
  • proposal for tooling #250

Merged pull requests:

  • remove ghost registries from 0.4 branch #2266 (kunmingg)
  • remove ghost registries; make kf version env var on deploy app #2265 (kunmingg)
  • use golang:1.11.2 as base image; more ksonnet 0.13.1 API change #2258 (kunmingg)
  • remove load sample job from pipeline config #2254 (IronPan)
  • bump pipeline sdk version to 0.1.7 #2253 (IronPan)
  • Update Pipeline version to v0.1.7 #2252 (IronPan)
  • Minor readability modifications in test_jsonnet.py #2246 (cclauss)
  • update params following ksonnet v0.13.1; update image build file path #2245 (kunmingg)
  • update jupyter hub with customized docker registry #2243 (cheyang)
  • split debug case into separate docker file so we don't expose port unnecessarily #2242 (kunmingg)
  • argo s3 artifact repository configuration added #2238 (pbrzostowski)
  • Fix katib repo for the docker registry which doesn't support sub namespace #2237 (cheyang)
  • Adding pytorch operator to non gcp platforms #2236 (johnugeorge)
  • set jupyterHubAuthenticator explicitly even for default values #2235 (kunmingg)
  • Add HorizontalPodAutoscaler to TF-Serving component #2231 (Youki)
  • Sync up Kubeflow official docker images to user's registry #2230 (cheyang)
  • Undefined name: from six.moves import xrange for line 50 #2228 (cclauss)
  • Undefined name: 'false' --> 'False' #2227 (cclauss)
  • Adding pytorch operator to non gcp platforms #2225 (johnugeorge)
  • Make ambassador number of replicas configurable #1112 #2224 (agemocui)
  • Consistent formatting of prototype docs #2221 (TimZaman)
  • Profiles resource should support setting resource quotas on namespaces #2182 (stpabhi)
  • fixes Change the application component to agree with kubernetes-sigs application CRD #2154 (kkasravi)

v0.4.0 (2019-01-05)

Full Changelog

Closed issues:

  • kfctl.sh should check the name of the deployment is < 25 characters on GCP #2189
  • sync-notebook needs to look for a Deployment kind that is version independent #2185
  • Proposal
  • Can we scope SeldonCore to a namespace #1452

Merged pull requests:

  • Kubebench component update cherrypick #2220 (r2d4)
  • Add kubebench-dashboard package #2219 (xyhuang)
  • simplify cloud shell; add instruction message #2218 (kunmingg)
  • publish v0.4.0 on web app #2217 (kunmingg)
  • Update kubebench tags for 0.4.0 release #2216 (richardsliu)
  • add status tag to deployments_failure; exclude failure caused by quota or permission issue #2215 (kunmingg)
  • Cherrypick v0.4-branch: kfctl: add default zone if not specified #2210 #2214 (r2d4)
  • sync-notebook needs to look for a Deployment kind that is version independent #2211 (stpabhi)
  • kfctl: add default zone if not specified #2210 (r2d4)
  • sync-notebook needs to look for a Deployment kind that is version independent #2194 (stpabhi)
  • Check the name of the deployment is < 25 characters on GCP #2191 (jcastill)
  • Update tf-serving.libsonnet for replicas #2113 (Youki)
  • Remove Seldon Cluster Roles by default #2039 (cliveseldon)

v0.4.0-rc.3 (2019-01-03)

Full Changelog

Closed issues:

  • Deploy script fails with unsupported k8s version 1.9.6-gke.6 #2193
  • Use consistent naming in release artifacts #2183
  • Docs: referenced version of KF is not the latest #2170
  • Add timestamps in the log messages of the click to deploy app #2166
  • Don't wait for IAP redirect when Skip IAP is selected #2165
  • util.sh check_install falls back to check for ks #2162
  • Create changelog for 0.4 release #2133
  • Can kubeflow run a job using multi-nodes' resources ? #2130
  • Reproduce FastAI dawn-benchmark using Kubeflow #2119
  • bootstrap golang modules #2022
  • Argo UI not working #2021
  • Which JupyterHub disks param name to use when assigning GCFS as PV #1934
  • Kubeflow on GKE not using the default gcp zone configured in the cloud shell. #1914
  • Cherry-pick reduced minikube scope in 0.3 patch #1891
  • How to change the TFJob termination policy to waiting all workers to complete #1796
  • Discussion
  • TFJob will not run if the TFJob CRD includes scope: Namespaced #1606

Merged pull requests:

  • Automated cherry pick of #2204: Update katib for v0.4.0 release Cherry pick of #2204 on v0.4-branch. #2204: Update katib for v0.4.0 release #2205 (richardsliu)
  • Tag and update katib for v0.4.0 release #2204 (richardsliu)
  • Cherrypick - Add retry to loading the sample job #2199 #2203 (IronPan)
  • Fix OWNERs file for gcp package. #2202 (jlewi)
  • Add CUJs for multiuser kubeflow to the ROADMAP. #2200 (jlewi)
  • Add retry to loading the sample job #2199 (IronPan)
  • Add log timestamps, and do not wait for IAP if skipped IAP #2198 (abhi-g)
  • kfctl.sh shouldn't fail if a user isn't explicitly in the IAM policy - 0.4 patch. #2188 (jlewi)
  • kfctl.sh shouldn't fail if a user isn't explicitly in the IAM policy. #2187 (jlewi)
  • Add the TF 1.11 & TF 1.12 Jupyter images to the spawner drop down box. #2181 (jlewi)
  • Add the TF 1.11 & TF 1.12 Jupyter images to the spawner drop down box. #2180 (jlewi)
  • Add instructions for enabling the periodic tests on the new release branch #2179 (jlewi)
  • "argo" should be "Argo" #2168 (suigh)
  • Add log timestamps, and do not wait for IAP if skipped IAP #2167 (abhi-g)
  • gcp-click-to-deploy
  • Fix check install to use param for type as well as which check #2163 (holdenk)
  • update kf version to 0.3.5; add retry around IAM policy edit #2161 (kunmingg)
  • gcp-deployer
  • Update ROADMAP with link to CUJ for build/train/deploy from notebook. #2143 (jlewi)
  • parameterize mysql and minio image #2109 (IronPan)
  • kfctl functional description and changes required #2091 (kkasravi)
  • fixes 'bootstrap golang modules' #2090 (kkasravi)

v0.4.0-rc.2 (2018-12-21)

Full Changelog

Closed issues:

  • Split up core ksonnet package into separate packages #42

Merged pull requests:

  • rename kubeflow/core to kubeflow/common #2160 (jlewi)
  • Cut 0.4.0 release branch #2150 (r2d4)
  • rename kubeflow/core to kubeflow/common #2107 (kkasravi)
  • use service account token to make request, set token life time to be 15 mins. #2062 (kunmingg)

v0.3.5 (2018-12-21)

Full Changelog

Merged pull requests:

v0.4.0-rc.1 (2018-12-21)

Full Changelog

Closed issues:

  • Issue installing Kubeflow 0.3 #2128
  • Update ksonnet package for Katib #2126
  • master head: cloud-endpoints controller not respond to newly created CRD #2125
  • Cloud Endpoints Controller not working on master #2120
  • Ks package for kubebench operator #2110
  • CrashLoopBackOff pods after 'apply k8s' #2077
  • no persistent volumes available for this claim and no storage class is set; minio and pipelines #2076
  • Issue while deploying kubeflow on GKE using command line- killed message in Cloud Shell #2075
  • Ambassador crashing in kops cluster #2074
  • deploy.sh should not assume uuidgen is present #2072
  • setup-minikube.sh doesn't install ksonnet if missing #2068
  • New Jupyter spawner Ui doesn't allow entering a custom image #2060
  • add a jobs component #2058
  • Upgrade ks to 0.13.1 #2031
  • Auto-scaling for Seldon serving? #2029
  • Upgrade bootstrapper to use go modules #2023
  • Grant kubeflow user service account CMLE permission #2012
  • Seems to be an issue in kfctl.sh re existence of dir ${DEPLOYMENT_NAME} #2009
  • trying to deploy a component using ks after initial deployment fails #2006
  • Move components in core into a GCP specific ksonnet package #1996
  • Installation woes - ./kfctl.sh: line 189: env.sh: No such file or directory #1993
  • TF-Serving http-proxy error #1979
  • TFServing template needs to convert numGpus from string to int #1972
  • Expose Istio's grafana dashboard #1969
  • Integrate pipelines into Kubeflow click to deploy #1967
  • Figure out Istio installation in Kubeflow #1909
  • upgrade cloud-endpoints-controller to use metacontroller/metacontroller:v0.3.0 #1824
  • Jupyter image with NVIDIA Rapids #1806
  • GCP
  • On Premises deployment of Kubeflow fails unless ks flags are set v0.3 #1615
  • Test Flake git clone fails RPC error #1178
  • Test Flake
  • Document Serving for PyTorch models #1117

Merged pull requests:

v0.3.4-rc.2 (2018-12-06)

Full Changelog

v0.3.4 (2018-12-06)

Full Changelog

Closed issues:

  • ImagePullBackOff for images on GCR within same GCP project as GKE cluster #2044
  • ksonnet runtime error Seldon #2001
  • Better testing output for jsonnet #1988
  • Install pipelines SDK in Jupyter images #1968
  • Deploy pipelines as part of kfctl.sh #1966
  • Create a pipelines ksonnet package #1965
  • ksonnet env should override params #1924
  • Use ksonnet modules to better organize applications #1922
  • Create an openvino component #1913
  • Kubernetes Engine for Kubeflow Quickstart Guide - Proposed Fixes #1898
  • Tooling to convert notebook to docker container #1857
  • Fire off TFJob from Jupyter Notebook #1240
  • JupyterHub spawner - Highlight PV requirement #541
  • Extend KubeSpawner and its UI to handle Persistent Volume Claims #34

Merged pull requests:

v0.3.4-rc.1 (2018-11-26)

Full Changelog

Closed issues:

  • Missing scope in CRD spec? #1985
  • Error persistentvolumeclaim "nfs" not found while host model from NFS #1964
  • Fix Katib image tags in v0.3-branch #1953
  • user-gcp-sa secret not show up with --platform gcp #1950
  • Error deploying Kubeflow using kfctl.sh with --platform gcp #1949
  • Can TFJob be parameterized by replica index #1943
  • Dashboard cannot list tf jobs #1883
  • Issues installing 0.3 #1871
  • flaky configure_envoy_for_iap.sh #1807
  • create a notebook controller that can replace jupyterhub and uses k8 native auth #1769
  • Build Jupyter notebook images for TF 1.11 and 1.12 #1740
  • kfctl.sh apply platform assumes availability of yaml python library #1739
  • Unable to install Kubeflow on an existing 2 node ubuntu cluster; Docs need to be fixed #1711
  • gcp
  • PyTorch and TFJob v1beta1 API #1584
  • Error in the generated YAML #1524
  • Exposing service using Nginx Ingress Controller and load balancing question #1214
  • Problems upgrading services of type NodePort; spec.clusterIP: Invalid value: "": #1145
  • Simplify Image Tag management for releases #1060
  • TFServing supports collection of metrics with prometheus #1036
  • Add a parameter for clusters without RBAC #1027
  • Can not bring up Jupyter Notebook #672
  • Clusters created during e2e tests should be GC #560
  • ksonnet + openshift picking up wrong k8s version number #521
  • How to access notebook on KUBO? #294
  • Investigate using Docker to run Kubernetes and Kubeflow locally #218

Merged pull requests:

v0.3.3 (2018-11-15)

Full Changelog

Closed issues:

  • Update spartakus image to v1.1.0 to fix cloud providers' annotation #1940
  • Use iam-policy value for EMAIL if case-sensitive #1936
  • Following Quickstart Guide - Can't Deploy Kubeflow #1929
  • Enable kubeflow running on POWER #1928
  • Do we have support in documentation for Inference? #1905
  • Update katib manifests #1903
  • Deploy same components in click-to-deploy as kfctl #1892
  • Kubeflow 0.2.7 Spawning an image with GPU #1890
  • ksonnet package for Jupyter/JupyterHub #1886
  • Ksonnet package organization for TFJob and PyTorch #1885
  • Rollout model with istio and/or ambassador #1844
  • create a self-serve component for data-scientists #1842
  • TensorFlow Jupyter Notebook images 1.9 and above in gcr.io cannot see GPUs #1828
  • tf-notebook
  • Create 0.3.1 Release #1761
  • TF data validation and tfma are in 1.9 images but not 1.10 #1718
  • GCP
  • Argo UI doesn't work behind ambassador #1694
  • gcp
  • kfctl
  • Use Istio 1.0 #1309
  • TF serving supports request id #1220
  • Enable GitOps: Use Weave Flux or Argo CD to manage Kubeflow deployments #971

Merged pull requests:

v0.3.2 (2018-10-26)

Full Changelog

Closed issues:

  • Job not stopping once chief is complete - v0.2.0-rc.1 #1853
  • For GKE deployment, change default CPU to Broadwell #1840
  • minikube
  • metacontroller's compositecontrollers and decoratorcontrollers need to be at Cluster scope #1833
  • ERROR in ks apply default -c iap-ingress #1827
  • ksonnet namespace should be retrieved from env not params #1825
  • issues with 0.3.0 deployment scripts #1767
  • Enable TPU in GKE #1766
  • TFMA Dependency #1745
  • jupyter_console needs update in the latest notebooks #1721
  • gcp
  • test coverage for click-to-deploy app #1578
  • Required JWT token is missing #1495
  • A small problem, Error validate kubeflow-core #1462
  • Does CentralUi need ClusterScope? #1450
  • Remove ksonnet parameter cloud #1227
  • Test Flake wait_for_workflow terminated on HTTPSConnectionPool error; why don't we retry #1169
  • Application Custom Resource for Kubeflow Deployments #1106
  • Create & label P1 issues needed for an initial release of Horovod support #778
  • Document Red/Green Model Rollout using ISTIO #667
  • Central UI - need release process #527
  • Model Management Features #136

Merged pull requests:

v0.3.1 (2018-10-19)

Full Changelog

Fixed bugs:

  • Deploy kfctl.sh apply k8s 'namespaces "kubeflow" not found' #1675

Closed issues:

  • add the metacontroller component #1819
  • Enabling TFMA jupyter extension on GPU images requires libcuda #1818
  • Errors when starting kubeflow with minikube #1802
  • add parts attribute to libsonnet for ksonnet incubator/best practices #1798
  • Feature Request + Help Wanted: Attach PVs to TFJob components #1782
  • Update katib suggestion images #1779
  • presubmit failed when build gpu version #1778
  • Build machine running out of space - Presubmit test jobs failing for PRs #1775
  • Current master version fails when applying platform in GCP #1768
  • kubeflow namespace not found using kfctl.sh #1758
  • tensorboard generated under kubeflow/tensorboard/prototypes/tensorboard-aws.jsonnet is incomplete #1753
  • Error Creating Object: Pod in version "v1" cannot be handled as a Pod #1748
  • No deployment directory found in https://github.com/kubeflow/kubeflow/archive/v0.3.0.tar.gz #1743
  • the variable of ENVIRONMENT #1741
  • docs
  • Installation errors with kfctl.sh #1733
  • Please patch PR 1716 update GCP credentials filename into v0.2 #1719
  • Tiny jupyter lab notebook toolbars #1704
  • K8s dashboard showing "no healthy upstream"; remove K8s dashboard links and services #1699
  • 1.0 Exit Criterion for TFJob and PyTorch #1683
  • Deploy kfctl.sh apply k8s: Failed to pull image #1651
  • Support scope for postsubmit #1587
  • Kubeflow quick start misses namespace creation #1514
  • Automated sync of Kubeflow between GCR and DockerHub #1320
  • Deployment script didn't create namespace kubeflow #1274
  • Rename TF-Hub to JupyterHub #1223
  • TFServing test is failing, blocking submits #1126
  • Presubmit failures; Timeout waiting for TFJob v1alpha2 job #974
  • Create labels for releases #885
  • Create image release workflow for tf operator images #855
  • openmpi
  • Upgrade ksonnet version to v0.10 for kubeflow. #727
  • Liveness/Readiness checks for TF Serving #368

Merged pull requests:

  • Tag Jupyter notebook images for v0.3.1 #1830 (richardsliu)
  • reorganize tests to be under their respective dirs, add aws,gcp tests for tensorboard #1821 (kkasravi)
  • fixes 'add the metacontroller component #1819' #1820 (kkasravi)
  • add ssl cert reuse logic in e2e & prober tests #1817 (kunmingg)
  • Fix TFMA and TFDV in Jupyter Images #1815 (jlewi)
  • Remove root level makefile #1814 (r2d4)
  • stop e2e test for deploy app till fix letsencrypt rate limit #1813 (kunmingg)
  • Remove K8s dashboard link from central UI #1811 (jlewi)
  • Knative build for in-cluster image builds in Kubeflow #1804 (swiftdiaries)
  • skip readarray if --all provided to jsonnet fmt script #1800 (r2d4)
  • fixes 'add parts attribute to libsonnet for ksonnet incubator/best practices' #1799 (kkasravi)
  • Pin tf-serving version in tf-notebook #1797 (r2d4)
  • Tf serving with istio 1.0 #1795 (lluunn)
  • cherry-pick PR #1754 onto V0.3 branch #1794 (kkasravi)
  • remove empty readme #1791 (r2d4)
  • remove travis config #1790 (r2d4)
  • Add a few basic metrics for deployment service #1787 (abhi-g)
  • Cherry pick #1779 #1781 (leoncamel)
  • Update katib suggestion images #1779 #1780 (leoncamel)
  • Cherry pick #1727 #1777 (leoncamel)
  • Presubmits should be triggered if the DM configs are modified. #1776 (jlewi)
  • Cleanup the OWNERs files. #1773 (jlewi)
  • Enable new beta Stackdriver Kubernetes Monitoring feature if using v1beta1 #1768 #1772 (dsdinter)
  • add jupyterhub test #1771 (kkasravi)
  • Use new stackdriver agents in DM config #1765 (lluunn)
  • pin to minor GKE version in dm config #1763 (r2d4)
  • gofmt: run go fmt github.com/kubeflow/kubeflow/bootstrap/cmd/... #1762 (r2d4)
  • remove statsd from ambassador deployment #1760 (r2d4)
  • Cherry pick #1746 and #1747 #1756 (jlewi)
  • fixes tensorboard generated under kubeflow/tensorboard/prototypes/tensorboard-aws.jsonnet is incomplete #1754 (kkasravi)
  • Seldon ksonnet refactor #1752 (cliveseldon)
  • openmpi
  • Ensure namespace exists #1747 (jlewi)
  • Pin jupyter-console to 6.0.0 #1746 (jlewi)
  • periodic e2e test for click deploy app #1732 (kunmingg)
  • change clusterrole to role #1728 (swiftdiaries)
  • Add katib's hyperband/bayesianoptimization suggestion images #1464 #1727 (leoncamel)
  • gke/deploy.sh: test uuidgen exists before using it #1674 (rabierp)
  • Allow folks to have the kubeflow github repo over https + improve error message #1626 (holdenk)
  • Improvements to kubeform_spawner.py form UI and others #1551 (tlkh)

v0.2.7 (2018-10-10)

Full Changelog

Fixed bugs:

  • gcp
  • gcp

Closed issues:

  • How to submit multiple OpenMPI jobs? #1730
  • Permission denied errors when pip (un)installing without --user in new nb images #1722
  • GCP
  • jsonnet test is failing but no jsonnet files changed in the PR. #1707
  • Error while updating iap-ingress for custom domains #1689
  • Jupyterlab service account token #1648
  • Create 0.3. Release #1541
  • upgrade to ksonnet 0.12.0, jsonnet v0.11.2 in the Docker image for our test workers #1540
  • Chief worker cannot start #1440
  • Web app deploying kubeflow through deployment manager #884
  • Proposal: kubeflow-scheduler #68

Merged pull requests:

v0.3.0 (2018-10-04)

Full Changelog

Fixed bugs:

  • v0.3.0-rc.1: ERROR no prototype names matched 'pytorch-operator' #1663

Closed issues:

  • Accessing custom metrics in our Python model #1681
  • v0.3.0-rc.1
  • Run tests periodically on the release branch starting with 0.3 #1603
  • Kubebench-job default param mainJobConfig points to a wrong path #1596
  • Release process for kubebench #1510

Merged pull requests:

v0.3.0-rc.3 (2018-10-02)

Full Changelog

Closed issues:

  • v0.3.1-rc.1
  • v0.3.0-rc.1
  • GCP
  • Update Katib image in 0.3 branch #1604
  • central dashboard image build workflow doesn't work #1575
  • Image Auto Release
  • Image Auto Release Cron Job is failing #1563
  • ack_guide.md out of date #1293
  • GKE: can not read from google cloud storage in Jupyter notebook #1249
  • Friction log for bootstrapper documentation #927
  • Create a minimal release process for our ksonnet configs #215

Merged pull requests:

v0.3.0-rc.2 (2018-09-30)

Full Changelog

Fixed bugs:

  • Deploy kfctl.sh apply k8s : Service "ambassador" is invalid ? #1566

Merged pull requests:

v0.3.0-rc.1 (2018-09-28)

Full Changelog

Closed issues:

  • GCP
  • Update CentralUI image used at head and 0.3 branch #1435
  • kfctl.sh needs to get initial cluster version based on get-server-config #1359

Merged pull requests:

  • Cherrypick #1657 tag image and change libsonnet for centraldashboard image #1660 (swiftdiaries)
  • Tag and update centraldashboard image #1657 (swiftdiaries)
  • Cherry-pick #1589 to v0.3 #1656 (lluunn)
  • Cherrypick #1650; automatically set master version to supported version. #1654 (jlewi)
  • Update initial cluster version in cluster.jinja based on what gcloud get-server-config returns #1650 (ashahba)
  • Enable periodic prow tests #1649 (richardsliu)
  • Build image for centralui part of presubmit #1623 (swiftdiaries)

v0.2.6 (2018-09-28)

Full Changelog

Fixed bugs:

  • Ambassador Version 0.34.0 causing DNS Issues on Worker Node #945

Closed issues:

  • Update document on Tf Serving #1634
  • test flake
  • tensorboard prototypes should include the optionalParams available in that prototype #1610
  • Katib ksonnet component needs an E2E test #1607
  • Update Kubebench images on 0.3. branch #1602
  • Update Jupyter images on 0.3. branch #1601
  • Update PyTorch Job image on 0.3 branch #1600
  • Update TFJob on 0.3 Branch #1599
  • Update Seldon to 0.2.3 on 0.3 release branch #1598
  • Envoy unable to read config #1588
  • deploying kubeflow with bootstrapper failed #1586
  • How can we modify jupyterhub configuration to use GitHub Authentication? #1585
  • Jupyter Image Builds Are failing; Dependency issue related to TFMA? #1576
  • bump gke version to 1.10.7-gke.2 #1572
  • presubmit build is failing with a quota error #1562
  • refactor tf-job-operator to match style-guide of libsonnet #1534
  • Test to verify we can deploy Katib #1483
  • test flake
  • Make TF serving component more readable and extendable #1264
  • E2e test of TF Serving using built-in HTTP api #1258
  • Discussion
  • bootstrapper should support push the ksonnet app to a source repo #912
  • Investigate using tensorflow serving's built-in http server #896

Merged pull requests:

4e7f4ed (2018-09-19)

Full Changelog

Closed issues:

  • kfctl.sh needs to call get-server-config to get GKE version #1570
  • Certificate not working #1567
  • "Patching IAM bindings" halt during deployment. #1559
  • cert-manager missing clusterrole #1554
  • Jupyter Notebooks for TF 1.9 and 1.10 #1546
  • test_jsonnet is failing in postsubmit #1543
  • Directory ${KFAPP} already exists #1530
  • Move kubebench package to kubeflow repo #1513
  • cloud endpoint prototype breaks on master. #1507
  • Error: Failed to apply app: find objects: RUNTIME ERROR: Field does not exist: v1 #1506
  • PR shows review is not required to merge #1503
  • update tensorboard to use the same pattern as kubeflow/core/prototypes #1500
  • No prototype names matched 'kubeflow-core' #1492
  • cannot list namespace on tfjob dashboard #1491
  • Issue installing on GKE with deploy script #1489
  • Installation fails on Amazon EKS #1488
  • add ability to only generate parts of a component in the jsonnet file #1486
  • Prometheus for seldon models #1484
  • Multiple issues with gke/deploy.sh #1481
  • Katib StudyJob failed to mount directory #1480
  • New image release for pytorch operator #1479
  • simplify tensorboard as separate aws, gcp prototypes #1477
  • Cut release 0.2.5 #1476
  • do you have a performance benchmarks when run Horovod with your openmpi component? #1461
  • Review/extend jovyan permissions in TF notebooks #1438
  • gcp
  • kfctl.sh unable to find component ambassador #1429
  • kfctl
  • standardize remaining <component>.{jsonnet,libsonnet} files #1414
  • Update 0.2 blog with new deployment script #1390
  • Update E2E test to use kfctl.sh and delete gke/deploy.sh; #1331
  • Docker image building workflows are failing #1135
  • bootstrap
  • camelCase for some recently fixed params #1050
  • Add document on Stackdriver agents #997
  • Suggest using simple port forwarding instead of LoadBalancer for cloud deploy in User Guide #860
  • Need docs for TFJobs UI #573
  • Update docs to mention known issues with ksonnet and windows #501
  • Need: User facing website for Kubeflow that details how to choose a stack #213
  • Tutorial(s) that correspond to CUJs #85

Merged pull requests:

v0.2.5 (2018-09-04)

Full Changelog

Fixed bugs:

  • JupyterHub Version Mismatch #1393

Closed issues:

  • Error setting up kubeflow in minikube #1459
  • Current documentation for setting up Kubeflow in minikube not working #1455
  • presubmit failure for jsonnet test: name must be set #1453
  • What is image registry.opensource.zalan.do/teapot/external-dns ? #1446
  • What's the function of tfReplicaType: Master ? #1442
  • Bootstrapper fails in docker-for-desktop #1430
  • tf_job_simple_test results not being report #1426
  • deploy.sh should be restart-aware in terms of directory structure #1422
  • kfctl.sh should not assume uuidgen is present #1415
  • How to spawn the jupyter container as a root user #1412
  • Don't use DM for IAM policy management #1401
  • Trigger minikube E2E test on presubmit when minikube test is modified #1350
  • GKE version "foo" is unsupported. #1348
  • CentralDashboard returns 404; Ambassador can't parse the route #1306
  • When creating releases we should pin the version of source.tar.gz used in the deploy.sh #1239
  • provide the ability to add imagePullSecrets to different ServiceAccounts so that private images can be fetched #1231
  • Restrict privilege of Kubeflow services accounts such as tf-job-operator to namespace level #1213
  • Minikube deploy script should start minikube #1153
  • gcp
  • Make jupyterlab discoverable/default #1124
  • Enable test for tf-job-simple prototype for v1alpha2 #1048
  • Initial report for spartakus metrics #351
  • Batch Prediction using GPUs with local runner #251

Merged pull requests:

v0.2.4-rc.0 (2018-08-21)

Full Changelog

v0.2.4 (2018-08-21)

Full Changelog

Closed issues:

  • The testing/install_minikube.sh script assumes the host OS is Ubuntu. #1383
  • Add Argo UI to Ambassador and Central UI #1310

Merged pull requests:

v0.2.3-rc.0 (2018-08-17)

Full Changelog

v0.2.3 (2018-08-17)

Full Changelog

Features and improvements:

  • Extra packages in jupyterhub-image #1175
  • Use GKE auto-scaling when configuring node pools in the provided Deployment Manager configs #1033

Fixed bugs:

  • apparent issue with 0.2.0 tag re: ks registry #1115
  • Ambassador crashing on minikube #734

Closed issues:

  • Jupyter-role error applying kubeflow-core component with ksonnet #1353
  • Jupyter notebook Connection failed because Ambassador doesn't enable websockets #1344
  • Need to update glide's ksonnet version to ^0.11.0 #1340
  • PS still running after tfjob is complete. #1334
  • Central UI should include a link to Kubeflow docs website #1318
  • Istio integration doc: Point to kubeflow/website documentation for #1315
  • Build and debug improvements for bootstrapper #1312
  • Fix incorrect links to user_guide in kubeflow.org #1300
  • JupyterHub login unauthorized 401 #1296
  • Katib apply fail with error: Field does not exist: modeldbDatabaseImage #1291
  • ambassador crashing on node with wrong DNS resolver address due to misconfigured kubelet #1289
  • Move Katib documentation to kubeflow website #1286
  • How can we change the tensorflow image in kubeflow? #1285
  • gcp
  • TFJob operator v1alpha2 doesn't work with TF.Estimator API for TF <=1.6 #1283
  • GCP
  • GCP
  • GCP
  • gcp-deployer
  • Deploy argo by default; add it to deploy.sh scripts #1268
  • Test Flake
  • ERROR no prototype names matched 'kubeflow/core' #1263
  • TF Serving GPU test failing #1262
  • Keras training in Kubeflow on GKE gets "Killed" #1261
  • Better installation guide on kubeflow.org #1257
  • Create a batch predict example #1250
  • GPU support on GKE not available #1246
  • TF-job package missing #1245
  • Spawning Jupyter failed; user jovyan does not have permission to write to default storage class #1241
  • "Getting Involved" in README.md should point to kubeflow.org #1237
  • scripts/gke/deploy.sh fails when kubeflow_deployment_manager_configs/ exists #1233
  • openmpi
  • GCP
  • Can't start jupyter-notebook pod in kubeflow version 0.2.1 #1221
  • Test Failure
  • Error from server NotFound: tfjobs.kubeflow.org "mycnnjob" not found #1217
  • Getting started error: No such file or directory: 'cluster-kubeflow.yaml' #1206
  • Getting started error #1205
  • README.md QuickStart should refer to kubeflow.org Getting Start #1202
  • "getting-started-gke" installer fails #1201
  • gcp
  • deploy.sh is broken; wrong directory for the unpack? #1193
  • unable to spawn jupyter notebook - volume name is too long #1177
  • Delete old GCP configs #1171
  • Test Flake gke teardown failed; insufficient quota #1166
  • Create Prometheus Component in Kubeflow Core. #1160
  • Jupyter image suitable for running the examples/codelabs #1157
  • Create links on kubeflow.org to redirect to ksonnet tarballs and deploy scripts #1156
  • Make it easy to customizes PVs attached to Jupyter pods #1125
  • Split all prototype into separate prototypes #1107
  • Support for AVX2 when using deployment manager #1082
  • gcp-click-to-deploy
  • Bootstrapper release instructions need to explain how to build at appropriate commit #1053
  • Don't check in vendor for bootstrap; it adds 160M which slows down cloning the registry as part of Kubeflow deployment #1051
  • unflake TF serving testing #1031
  • Support and testing different versions of TF serving images #1005
  • Serving path should support logging request input/output #1000
  • 'ks delete ${KF_ENV} -c kubeflow-core' doesn't take down user notebook pods #968
  • Bootstrapper should support file and http registries in a consistent manner #962
  • Click-to-deploy UI upgrade. #959
  • Remove "alpha" in deployment manager config when gke-1.10.2 is public #821
  • TF Serving test flaky #815
  • TFJobs UI doesn't work behind IAP; React APP needs support IAP? #574
  • Can not launch TensorFlow Serving because AVX not available on VM #421
  • Trigger rebuild of TF serving image in E2E test only when files change #371
  • HTTP Proxy and TFServing should not use the same resource defaults #360
  • Add script to export inception into SaveModel from checkpoints #229
  • ks apply -f tf-job.jsonnet, -f is not a valid flag #201
  • e2e test for http-proxy #198
  • Make https://hub.docker.com/r/kubeflow/jupyterhub/ a community resource #197
  • How to port model developed in Jupyter notebook to TFJobs #110

Merged pull requests:

v0.2.2-rc.0 (2018-07-13)

Full Changelog

v0.2.2 (2018-07-13)

Full Changelog

Closed issues:

  • TFMA plots don't render; GET tfma_widget_js.js returns 404 #1130

Merged pull requests:

v0.2.1-rc.1 (2018-07-12)

Full Changelog

v0.2.1 (2018-07-12)

Full Changelog

Closed issues:

  • Use PV by default mounted at /home/jovyan #1187
  • Central UI image needs to be updated in 0.2.1 release; it is too old. #1147
  • metrics_collector should emit K8s events to indicate when Kubeflow is ready #1142
  • Jupyter images in 0.2.1 need to be upgraded #1129

Merged pull requests:

v0.2.1-rc.0 (2018-07-11)

Full Changelog

Closed issues:

  • Test Flake deploy.sh fails trying to enable the deployment service #1158
  • Make downloading our ksonnet registry for getting started efficient #1154
  • kubeflow cluster cannot pull image from GCR within same Project. #1139
  • Test Flake
  • Ambassador pod failed to run because kube-dns not running #1134
  • Test Flake
  • PyTorch job prototype should contain the full job spec #1114
  • Finalize release 0.2.0 #1070
  • gcp
  • gcp click-to-deploy
  • Cherry-pick for release v0.2.0-rc.1 #1024
  • Cut a 0.2 release branch #964
  • gcp
  • Include GPU daemonset in GKE configs? #288
  • KubeFlow or Kubeflow? #44
  • Tooling to manage configuration and deployment #23

Merged pull requests:

  • Cherry pick upgrading the central UI to v0.2.1 #1182 (jlewi)
  • Make the deploy scripts more efficient and other fixes. #1174 #1180 (jlewi)
  • Cherry Pick: Don't check in bootstrap/vendor. #1152 #1176 (jlewi)
  • Make the deploy scripts more efficient and other fixes. #1174 (jlewi)
  • Make GKE VM service account storage.objectViewer to have read access of gcr #1164 #1172 (jlewi)
  • Cherrypick: Tag the latest Jupyter images with v0.2.1 #1144 #1170 (jlewi)
  • Delete kubeflow namespace before deleting the cluster #1167 (ankushagarwal)
  • Make GKE VM service account storage.objectViewer #1164 (kunmingg)
  • Give KF_USER_NAME service account roles/cloudbuild.builds.editor role #1163 (ankushagarwal)
  • Skip project setup during deployment. #1162 (jlewi)
  • Don't check in bootstrap/vendor. #1152 (jlewi)
  • Make Katib work with Ambassador. #1103 #1150 (jlewi)
  • Fix the makefile so that we tag the image with the commit. #1149 (jlewi)
  • Tag the latest Jupyter images with v0.2.1 #1144 (jlewi)
  • gcp-deployer
  • Set requests and limits for RAM and CPU in TF notebook image releaser. #1136 (jlewi)
  • Make the test robust to test flakes due to problems initializing the ksonnet app #1133 (jlewi)
  • Fix TFMA jupyter extensions. #1131 (jlewi)
  • Fix typos #1127 (idealhack)
  • add service monitor prototype for monitoring deployment status #1123 (kunmingg)
  • Add Dockerfile and Makefile to build docker images #1122 (ankushagarwal)
  • Pin and fix katib images. #1113 #1120 (jlewi)
  • Put the full PyTorch prototype in the jsonnet file. #1119 (jlewi)
  • Pin and fix katib images. #1113 (jlewi)
  • Create a deployment script for gke and minikube #1111 (ankushagarwal)
  • Add a jupyter-notebook-role and use it for notebooks in jupyterhub #1110 (ankushagarwal)
  • gcp-deployer
  • add katib releaser #1102 (YujiOshima)
  • Create a script to update a ksonnet app to the latest Kubeflow package #1100 (jlewi)
  • Tfjob create fails in tfjob UI #1099 (kkasravi)

v0.2.0 (2018-06-29)

Full Changelog

Fixed bugs:

  • gcp click-to-deploy
  • gcp click-to-deploy

Closed issues:

  • Wrong comment in setting default CleanPodPolicy #1081
  • Deprecate tfserving http-proxy? #1080
  • user guide disappear? #1078
  • user_guide link is dead #1075
  • TFJob prototype's default TFVersion should be v1alpha2 #1049
  • TF-Serving 1.8 Images #845

Merged pull requests:

  • gcp-deployer
  • cherry pick 3 commits #1104 (kunmingg)
  • Make Katib work with Ambassador. #1103 (jlewi)
  • add seldon to ks config; pre install all pkg #1098 (kunmingg)
  • Create a version of echo-server to echo headers. #1097 (jlewi)
  • add chainer-job/chainer-operator ksonnet package #1095 (everpeace)
  • Install gpu driver in deployment manager #1094 (lluunn)
  • Improvements to deploy.sh #1093 (jlewi)
  • Delete the tf-job package. #1091 (jlewi)
  • cherry-pick: update tf job default version #1086 #1087 (kunmingg)
  • update tf job default version #1086 (kunmingg)
  • Add katib tag to images #1085 (inc0)
  • fix for file_cache is unavailable when using oauth2client >= 4.0.0 #1084 (kkasravi)
  • add a readme file to the mpi-job ksonnet component #1079 (rongou)
  • fix #1075, user_guide link is dead #1077 (theofpa)
  • TFJobs UI doesn't work behind IAP #1073 (kkasravi)
  • point bootstrapper to v0.2.0 in release branch #1069 (kunmingg)
  • Tooling to make it easier to tag images and update the ksonnet prototypes #1066 (jlewi)
  • Create a script to update some of the docker images in the prototypes #1063 (jlewi)
  • gcp-deployer

v0.2.0-rc.1 (2018-06-22)

Full Changelog

Closed issues:

  • Cross-Origin Resource Sharing with TF Jobs Dashboard #1046
  • nvidia-smi fails for TFJob's but not for similarly configured Job's #1042
  • Jupyter can't start pod; the default spawner image is way too old #1041
  • TFJob pods deleted on completion/failure impairing debugging #1039
  • Invalid value: "v1alpha2": field is immutable #1029
  • Update PyTorch Image to officially released one #1020
  • Bootstrapper in 0.2.0-RC 0 doesn't set tfJobsVersion to v1alpha2 by default #1018
  • TensorFlow 1.8 not included in stock Jupyter Images #1014
  • Investigate supporting of TF serving 1.7 gpu #1009
  • Parameter names for Ambassador images improperly named; have extra tf prefix #994
  • Bootstrapper: Friction log from minikube on macOS #981
  • Enable TFJob v1alpha2 by default in 0.2 release #977
  • Central UI sometimes doesn't render if screen too small #957
  • gcp
  • central UI
  • Verify Central UI is working #805
  • Batch Prediction Beam Library #662

Merged pull requests:

v0.2.0-rc.0 (2018-06-16)

Full Changelog

Fixed bugs:

  • JupyterHub authenticates login but doesn't redirect to user home #430

Closed issues:

  • Bootstrapper should apply components in an order #1006
  • Make Katib work with reverse proxy and ambassador #991
  • ambassador memory leak statsd:0.30.1 #986
  • presubmit failing: couldn't open import "X": no match locally or in the Jsonnet library path #983
  • Verify Katib is working #973
  • Issues with image release workflow app #970
  • ks error: ERROR open /home/jlewi/app.yaml: no such file or directory #966
  • Bootstrapper throwing error when deployed on GKE #961
  • gcp
  • Nightly builds of TensorFlow notebook images do not have the same commit hash in the tag #943
  • Create ksonnet prototype for auto image release #941
  • How to configure a local volume for Jupyterhub_spawner.py #933
  • Unknown variable error when applying kubeflow-core prototypes #932
  • GCP Deployment Manager needs to delete IAM roles when DM is deleted #910
  • Unable to use image-releaser for tf-notebook-workflow #909
  • Regression Jupyter notebooks no longer include python2 runtime #906
  • Deadlocks configuring envoy for IAP #903
  • IAP setup script needs permission to list backend services #902
  • bootstrapper should not crash loop on error #901
  • openmpi-controller:0.0.3 image pull error. #887
  • deployment manager should allow setting IAM roles in YAML config #883
  • Bootstrapper should support packages from other registries #880
  • Deployment manager config should create service accounts and set IAM roles #878
  • Deployment manager config should enable Cloud Endpoints API #876
  • dev.kubeflow.org envoy; iap sidecar stuck waiting for backend #871
  • Presubmit unhealthy? Networking issues #869
  • Failed to run TfCnn example from User Guide: ERROR find objects: parse jsonnet snippet: params.libsonnet:22:5-13 Expected a comma before next field. #863
  • unknown node type issue related to mycnnjob.jsonnet #858
  • Verify PVC for /home/jovyan works #854
  • E2E test for TFJob v1alpha2 ksonnet package #852
  • Include ssh in notebook image to support authenticated git push #850
  • Jupyter image for TensorFlow 1.8 #846
  • Some questions about tf-serving on NFS #844
  • E2E test verifies Kubeflow installed via Deployment Manager #836
  • JupyterHub GitHub OAuth Setup missing manifest/config.yaml files #835
  • GCP deployment manager test handle internal errors #833
  • bootstrapper error trying to create the PyTorch Operator #832
  • Unify and dedup release workflows #830
  • Bootstrapper should optionally use config map to specify ksonnet parameters #829
  • horovod error #826
  • Give release service account access to kubeflow-images-public #824
  • Ambassador failed to start up #811
  • IAP Envoy route should map / to central dashboard #809
  • Central UI
  • Bootstrapper should deploy new Stackdriver agents #807
  • deployment manager should disable legacy Stackdriver agents #806
  • Invalid envoy config duplicate value IAP #804
  • On GCP trigger bootstrapper with deployment manager #802
  • TF serving GPU test failing #794
  • openmpi
  • Minikube E2E test timeout waiting for VM to be ready. #788
  • Spawner options error(401) #784
  • jupyterNotebookPVCMount does not work in v0.1.2 #770
  • Swift Jupyter Integration #763
  • It didn't return [dead loop] when Python 3 is being used. #760
  • Support using bootstrapper image as a kube deployment. #758
  • Support install kubeflow by Deployment Manager #757
  • the tfserving prototype missing label after generation #746
  • E2E test for bootstrapper #742
  • E2E test for Argo Package #740
  • cloud-endpoints fails when using bootstrapper #735
  • Use Pod Preset to add environment variables and volume mounts to pods #732
  • Bootstrapper fails if no ~/.kube/config is present #722
  • openmpi controller exited with error #718
  • openmpi
  • Create individual prototypes for just TFJob #669
  • Nightly regular build of container images #666
  • Bootstrapper should enable IAP on GKE #665
  • Bootstrapper should check if user has appropriate permissions and if not create cluster role #664
  • Bootstrapper should create namespace if it doesn't exist #663
  • Tests are failing because we are running out of PD quota in us-east1 #618
  • ksonnet packages for Pachyderm #611
  • Friction log for TFX Chicago taxi cab example on minikube #594
  • TFJob prototype should contain the full TFJob spec so that ks generate is mostly just copying the prototype #564
  • Can we put all of /home on PV for Jupyter notebooks #561
  • Create gcr.io/kubeflow-images-public #534
  • Central UI Ambassador Integration #528
  • Have the JupyterHub spawner report issues with spawning the user's server #505
  • Recommended minikube setup #502
  • TF Serving component logging #495
  • JupyterHub Spawner complains notebook is already spawning; upgrade JupyterHub to 0.9 #479
  • Build TFServing images for different TF versions #468
  • Discussion
  • E2E test for tf-job-simple prototype #462
  • Unable to mount volumes for pod "jupyter-... #424
  • multiple Matplotlib libraries #423
  • ksonnet style guide #403
  • Reformat only modified files #395
  • Establish a pattern for creating/using secrets used by multiple kubeflow prototypes #372
  • ks delete default fails #364
  • Python script to set parameters for ksonnet prototypes #322
  • Flakes building TF Serving image; problems downloading Ubuntu packages #310
  • ksonnet prototypes for TensorBoard #297
  • Enforce formatting for all jsonnet files #282
  • Assess usability on vanilla dockers #267
  • Discussion: Eventing model/solution #263
  • Recommended Kubeflow Setup on GKE #241
  • Recommended setup for different K8s Solutions #240
  • Use Ambassador/Envoy as proxy for JupyterHub #239
  • ks apply cloud -c newjob silently fails #217
  • Tracking Central UI #199
  • Kubeflow logo? #187
  • discussion
  • Add kubeflow into tensorflow/ecosystem #177
  • Central UI #141
  • Investigate file server connection errors with Jupyter and IAP #140
  • TfJob controllers are not namespace scoped so tests aren't isolated #134
  • Add resource request and limit fields to tf-job ksonnet prototype #116
  • Figure out ksonnet repo organization #106
  • Add Argo package to our ksonnet registry #21

Merged pull requests:

v0.1.3 (2018-04-26)

Full Changelog

Closed issues:

  • all training run on the first worker #721
  • Jupyter images pruned too aggressively? #719
  • cert-manager in "setting-up-iap-on-gke" not working #709
  • New repo for benchmarking #708
  • minikube VM unavailable for e2e kubeflow-presubmit check #707
  • Kubeflow Jupyter notebook images are large because of all the extra Python libs; can we shrink to improve pull times? #568

Merged pull requests:

  • CP to v0.1-branch: restore some important NB pkgs; http_timeout; bool fixes for tf-serving #726 (pdmack)
  • update ksonnet version to v0.10.0-alpha.3 #725 (kunmingg)
  • Add a few packages to the jupyter notebook image #724 (ankushagarwal)
  • add kunming to reviewer #715 (kunmingg)
  • controller bug fix #714 (kunmingg)
  • openmpi
  • Adding ksonnet components for tensorboard files. Issue #297 #710 (abkosar)
  • Point instructions to 0.1.2 release #706 (ankushagarwal)
  • CP to v0.1-branch: Update http_timeout to 5 minutes in jupyterhub #691 #705 (pdmack)
  • openmpi
  • openmpi: namespaced resource names should be prefixed with component name #698 (everpeace)

v0.1.2 (2018-04-21)

Full Changelog

Fixed bugs:

  • Segfault in bootstrapper while processing .kube #657

Merged pull requests:

  • CP to v0.1-branch: TF notebook slimming, jovyan pip installs, and new gcr.io locations #703 (pdmack)

v0.1.1 (2018-04-20)

Full Changelog

Fixed bugs:

  • tf-cnn: no matches for kind "TFJob" in version... #643

Closed issues:

  • ks 0.10.0-alpha.2 does not work with install instructions #686
  • Bootstrap image not found #682
  • Unable to get kubeflow working on hyperkube in a circleci vm #674
  • Cannot install pip packages from jupyter notebook #668
  • Expose istio dashboards #644
  • Timeout waiting for simple-tfjob-gke #636
  • Refactor tf-serving image build for multiple TF versions #632
  • Some individual tests not showing up in test-grid #631
  • User guide sprawl #629
  • inception server not working #621
  • Add Link to Intel's tutorial #612
  • Add auto-generated TOC for all docs #603
  • Error occurs when following user guide #598
  • GKE
  • kubeflow version #578
  • Presubmit failure: No such file or directory XXX/.kube/config #562
  • Make IAP config robust to updating the Ingress #550
  • TF MPI support #535
  • TF serving param deployHttpProxy needs to be transformed from string to bool #531
  • Cut a 0.1 release #506
  • TF serving component monitoring #496
  • TF serving GPU e2e test is flaky #484
  • Build more efficient tensorflow-notebook images #472
  • Jupyter/Azure
  • Make it easy to get started with Kubeflow #105
  • Size of gcr.io/kubeflow/tensorflow-notebook-* #37

Merged pull requests:

  • permission bug fix and image tag update #699 (kunmingg)
  • Update the hub spawner dropdown for latest NB images #697 (pdmack)
  • openmpi: master/workers sync mechanism is replaced with k8s api from redis #696 (everpeace)
  • Migrate images to kubeflow-images-public #695 (ankushagarwal)
  • Pin instructions to ks 0.9.2 for now #694 (pdmack)
  • add export to ksonnet github token instructions #693 (mattf)
  • openmpi: slots clause should be generated when gpus '> 0' #692 (everpeace)
  • Update http_timeout to 5 minutes in jupyterhub #691 (ankushagarwal)
  • openmpi: fix failing installing redis-tools in init.sh #690 (everpeace)
  • Refactor tensorflow-notebook-image/Dockerfile #689 (ankushagarwal)
  • openmpi
  • openmpi: make 'schedulerName' configurable to use custom schedulers. #683 (everpeace)
  • create rolebinding within namespace to guarantee permission #680 (kunmingg)
  • Support rollout new model with istio #679 (lluunn)
  • Add documentation for exposing grafana dashboard #678 (lluunn)
  • openmpi package doesn't work on kubernetes cluster having custom dns. #676 (everpeace)
  • Add batch support to openmpi package #671 (jiezhang)
  • make Dockerfile & Makefile cross platform #670 (kunmingg)
  • Update ksonnet_packages.md #661 (pdmack)
  • Include tensor2tensor in jupyter notebook image #659 (ankushagarwal)
  • Add willingc to reviewers #656 (willingc)
  • Azure
  • Bootstrapper polish: #654 (kunmingg)
  • add link to Intel's tutorial #652 (raddaoui)
  • Azure
  • Remove zjj2wry as a reviewer. #648 (jlewi)
  • Remove the outdated YAML specs for TFCnn job. #647 (jlewi)
  • update user_guide #646 (lluunn)
  • Add components to work with GCP #645 (jlewi)
  • Change naming from jupyter to jupyterhub when referring to hub #642 (willingc)
  • Troubleshooting note on cluster-admin privileges for tf-job-operator #641 (tmckayus)
  • Use dashes not underscores in junit file names. #638 (jlewi)
  • Update various images in kubeflow to kubeflow-images-public #635 (ankushagarwal)
  • Create a kubeflow version file: version.txt and a configmap kubeflow-version #634 (ankushagarwal)
  • Cleaned up README.md #628 (ddutta)
  • TF Serving + Istio #627 (lluunn)
  • Improve user guide wrt exposing notebook in non-cloud setup #625 (xyhuang)
  • Add openmpi package #624 (jiezhang)
  • Fix setup/teardown of VM for minikube. #620 (jlewi)
  • Update and clarify JupyterHub README #616 (willingc)
  • images: Add the link #614 (gaocegege)
  • Improve documentation #613 (jlewi)
  • Initial ksonnet package for Pachyderm. #610 (jlewi)
  • Provide documentation about adding ksonnet packages to Kubeflow. #609 (jlewi)
  • Upgrade tf-serving version to 1.6.0 #608 (pdmack)
  • import of cloud-endpoints component and support in iap-ingress component #605 (danisla)
  • docs(TOC): add auto-generated TOC for all docs #604 (DjangoPeng)
  • docs(troubleshooting): add a new item into troubleshooting #602 (DjangoPeng)
  • Update the docs to point to the v0.1.0 release now that it is cut. #601 (jlewi)
  • Create a doc with information about our docker images. #600 (jlewi)
  • Typo in UG #599 (pdmack)
  • Create a POC of an app to simplify deployment of Kubeflow #595 (jlewi)
  • CP: Install graphviz in tensorflow notebook image #583 #585 (pdmack)
  • Enable TF serving gpu test #558 (lluunn)
  • Branching and tagging policy for releases #519 (willb)

v0.1.0-rc.4 (2018-04-04)

Full Changelog

v0.1.0 (2018-04-04)

Full Changelog

Closed issues:

  • Explicitly specified version of tensorflow being replaced in the python 2 environment with the latest version from PyPI #571
  • Jupyter pod stuck at ContainerCreating when spawning #336

Merged pull requests:

  • Cherry pick: Update tensorflow notebook version to v20180403-1f854c44 #589 #590 (ankushagarwal)
  • Update tensorflow notebook version to v20180403-1f854c44 #589 (ankushagarwal)
  • README: Add link for tf-operator #588 (gaocegege)
  • Clean up minor formatting errors in README #587 (pdmack)
  • Correct typo for param defaultHttpProxyImage #556 #584 (jlewi)
  • Install graphviz in tensorflow notebook image #583 (ankushagarwal)
  • Disable PVC by default #577 #582 (inc0)
  • Fix the tensorflow version in prebuilt images for python 2.7. #580 (ojarjur)
  • Disable PVC by default #577 (inc0)
  • Add pdmack to OWNERS #565 (pdmack)
  • Correct typo for param defaultHttpProxyImage #556 (pdmack)
  • sidecar for envoy pod to keep IAP up to date #552 (danisla)
  • Update instructions about releasing TFJob #532 (jlewi)

v0.1.0-rc.3 (2018-04-03)

Full Changelog

Closed issues:

  • IAP should not store client secret in ksonnet #569
  • other tf-operators? #539

Merged pull requests:

  • Cherrypick #570 and #572 into v0.1-branch #575 (ankushagarwal)
  • Moved OAuth secret from param to named secret #572 (danisla)
  • Update tf jupyter notebook images to include tfma from #544 #570 (ankushagarwal)
  • Retry up to 3 times while building tensorflow notebook images #559 (ankushagarwal)
  • Add the "tensorflow-model-analysis" package to the notebook images #544 (ojarjur)
  • add willb to approvers #542 (willb)
  • Docs should show how to pull a particular release #524 (jlewi)
  • Changes to support running E2E tests on minikube. #523 (jlewi)

v0.1.0-rc.2 (2018-04-02)

Full Changelog

Closed issues:

  • Tensorflow no longer working in the GPU image if you build from HEAD #549
  • TFJob UI not included in ksonnet configs #546
  • dev.kubeflow.org 502s #545

Merged pull requests:

v0.1.0-rc.1 (2018-03-30)

Full Changelog

Closed issues:

  • Upgrade ksonnet in our Jupyter images to 0.9.2 #490
  • Javascripts widgets don't work in JupyterLab #489
  • Option to disable use of PVC for Jupyter #365
  • Katacoda Demo Scenario Python incompatibility - Warning #70

Merged pull requests:

v0.1.0-rc.0 (2018-03-27)

Fixed bugs:

  • tf-cnn fails to create #458
  • Presubmit failure; cannot import k.libsonnet #447
  • Presubmit shows succeeded, but some test actually failed. #436

Closed issues:

  • Tf serving component should provide default HTTP image #511
  • Continuous testing for release branches #507
  • construct base object: Failed to filter components; the following components don't exist: [ 'kubeflow-core' ] #481
  • Document GITHUB_TOKEN #478
  • ks env set default --namespace=kubeflow doesn't change the namespace #477
  • Build Jupyter Notebook images for supported versions of TF #467
  • ERROR
  • JupyterHub terminates with 500 Internal Server error #433
  • Proposal: We should have basic tests for every ksonnet prototype #432
  • Remove tensorboard link from Jupyter notebooks hub? #428
  • jovyan user cannot sudo in terminal #425
  • Cannot run tensorboard from Jupyter notebooks #422
  • IPyWidgets not displaying when using a Python 2 kernel #419
  • normalize libsonnets so their "all(params)" is only called from kubeflow/core/all.libsonnet #417
  • libcublas.so.9.0: cannot open shared object file: No such file or directory #414
  • Executing child process 'start-notebook.sh' failed: 'Permission denied' #412
  • Use local NFS server as PersistentVolume #410
  • Create a ksonnet component to deploy seldon-core models #405
  • Replace two envoy containers with one in the envoy pod #404
  •  question 
  • Build Envoy Container with JWT validation #394
  •  support 
  • cnn tfjob status never change to completed #389
  • TFServing prototype for using GCS with service account key #385
  • Jupyter notebook image should pin TF version #375
  • ClusterRole's and ClusterRoleBindings should include namespace in the name #374
  • Ambassador can't watch services at the cluster level. #373
  • Missing TFServing monitoring features? #369
  • Set userid and group id in TF Serving GPU container #367
  • Reviewable is blocking automatic merging #356
  • Add tensorflow-serving-api to jupyter image #355
  • Build http proxy for TF serving as part of our release workflow #353
  • Better docs for http proxy #352
  • Failed to pull image "gcr.io/kubeflow/tf-benchmarks-gpu:.... #348
  • autoformat_jsonnet.sh unknown predicate -E #345
  • ambassadors are crashed and cannot be created #344
  • How to login the Jupyterhub on the remote server? #343
  • ksonnet
  • Build TFServing GPU Docker Image #338
  • Automatic PR merge #331
  • Add additional links and information about JupyterHub to README #329
  • Jupyter pod can't start; jinja exception Encountered unknown tag 'trans'. #325
  • Avoid code duplication in Dockerfiles for Jupyter notebook images #321
  • Kubeflow not deployed #318
  • Investigate notebook container sidecar to enable in-notebook container builds #312
  • JupyterHub Spawner should have a UI element that specifies default docker images #309
  • Need an OWNERS file #308
  • Discussion
  • option naming inconsistencies #303
  • Test the TFServing configs with default image #301
  • Configure JupyterHub spawner to allow root operations by default #300
  • TFServing docs are incorrect; we no longer assign a public IP by default #296
  • Add an option to set service type to TFServing deployment #295
  • Example/Docs for using IAP with TFServing #293
  • TFServing deployment should support GPUs #292
  • E2E Test for TFServing with GPUs #291
  • TFServing component should add an ambassador route #290
  • How to add persistent volume in jupyter_spawner.py file ? #285
  • Prow jobs should use a common docker image #276
  • TFJob test is failing #273
  • Presubmit failing: Time out waiting for Workflows #272
  • Better identity management in K8s #266
  • JupyterLab not working in latest notebook images #262
  • JupyterHub spawner: upstream request timeout #261
  • bug
  • IAP component should use cert-manager to get a signed certificate #255
  • Run util_test.jsonnet added in #246 as part of E2E tests #254
  • tf.transform libraries in our notebooks #244
  • user guide - tf-serving: ClusterIP instead of LoadBalancer service #233
  • Typo in argo-ui rolebinding #230
  • JupyterHub fails to load image properly, but starts a notebook anyway #226
  • Move troubleshooting guide into user_guide #223
  • Build and publish Docker images using Argo #221
  • E2E tests need to verify that we can submit a TFJob #207
  • Use kubeflow/testing to run Argo workflows #205
  • KubeFlow on AWS - tf-hub-lb not created and cannot change jupyterHubServiceType to loadbalancer #203
  • ERROR Server is unable to handle tensorflow.org/v1alpha1, Kind=TFJob #200
  • Increase rate limit for installing kubeflow packages #195
  • Add GPU Support for k8s-model-server on Kubeflow #194
  • TFJob controller cannot terminate job #193
  • Remove tensorflow/k8s as a submodule #190
  • Delete components/tf-controller #189
  • Error from server Forbidden: error when creating "deploy_crd.yaml": clusterroles.rbac.authorization.k8s.io "tf-job-operator" is forbidden #188
  • tf-job.libsonnet is issuing the wrong CRD job type #186
  • Test failures #176
  • Create a repository for examples: kubeflow/examples #174
  • Create a testing repository #173
  • Change google/kubeflow links to kubeflow/kubeflow #168
  • Setup prow for the new org #165
  • Add TFJob CRD creation to ksonnet components #164
  • ksonnet component for Seldon.io deployment #159
  • E2E Solution for GitHub Issue Summarization On Kubeflow #157
  • CreateSession still waiting for response from worker: /job:ps/replica:0/task:0 #153
  • Test cluster is unhealthy no ready nodes #150
  • ClusterRoleBinding.rbac.authorization.k8s.io "tf-job-operator" is invalid #148
  • worker restart result in the TFJOB can not finish expectedly #147
  • PVC created but not added to volume list #145
  • Code of conduct #143
  • Add ksonnet to our Jupyter notebook Docker images #138
  • What's up with Tensorflow? #135
  • Postsubmits appear to be broken #133
  • Make ArgoUI for our tests publicly accessible #131
  • tf-cnn template doesn't set TerminationPolicy correctly #129
  • Copy Jupyter notebook Dockerfiles into Kubeflow repo #126
  • tf-job-operator service account is missing roles #125
  • JupyterHub should not open up external IP by default #123
  • where is the Dockerfile of gcr.io/kubeflow/tensorflow-notebook-cpu? #122
  • Default service type for ModelServer should not be loadbalancer #121
  • Add tensorBoard field to tf-job prototype #113
  • po/tf-job-operator logs said user cannot get endpoints in the namespace "default" #109
  • Typo in tfjob.jsonnet #107
  • Problems connecting to Jupyter Kernel with IAP #104
  • Create only one service for JupyterHub and make type a parameter #99
  • Delete inception.tar.gz #98
  • Postsubmits need to use registry components at commit being tested #96
  • Presubmits need to pull registry from the PR branch #95
  • link to ksonnet might be confusing #91
  • release .yaml manifest #90
  • Incorporate tf.transform #88
  • ProwTestCase library #83
  • Set suitable defaults for JupyterHub service type and make it configurable? #80
  • if i use kubeflow on local cluster with gpu #79
  • Failed to connect to my Hub at http://tf-hub-0:8081/hub/api attempt 1/5. Is it running? #78
  • Support for other Deep Learning Libraries #74
  • Setup PR Dashboard #73
  • TfServing uber tracking bug #64
  • tf-job missing from ksonnet registry yaml #62
  • Add missing features to TfJob controller ksonnet component #61
  • Support IAP on GKE #60
  • Syntax error of juypterhub.yaml for Kubernetes 1.6 #58
  • Proposal: Include very basic tracking of usage by default #55
  • Should Kubeflow publish and maintain TF Serving Docker Images? #50
  • tf-cnn prototype doesn't add GPU resource requests #48
  • Remove namespace as a package parameter #43
  • Clean up repo after switching to ksonnet #41
  • Doc gen for kubeflow ksonnet registry #39
  • E2E Testing For Kubeflow. #38
  • Proposal: Discuss Kubeflow organization and community #35
  • Proposal: our expectation on KubeFlow #33
  • Add LICENSE file: Apache License, Version 2.0 #27
  • Fault tolerant storage for JupyterHub #19
  • Tensorboard support #17
  • python model server #15

Merged pull requests:

* This Change Log was automatically generated by github_changelog_generator