Kubeflow 1.3: Shiny, Secure and Scalable
Background: I tried Kubeflow 1.0 in May 2020, with a narrow focus on cloud-native ML pipelines.
With the latest Kubeflow 1.3 release, they have streamlined the setup process and improved both security and the user experience. Another improvement is the ability to pick and choose the components you want to install. Even with these updates, there is still a learning curve for non-technical/non-engineering users.
IMO, the ideal use-case is a cross-functional Data Science team with a mix of Platform Engineers, ML Engineers and Data Scientists.
- Setup and Improved UX/UI:
Relatively easy to set up compared to version 1.0, with an easy-to-understand UX and a responsive UI.
I tried this on a GKE cluster with 6 nodes (Total Cluster resources = 12 cores, 60GB).
Oh, they now have ready-to-use JupyterLab, open-source VS Code (Code-Server) and RStudio.
Code-Server (VS Code in the browser)
Compared to version 1.0, the ready-to-use Pipelines and Experiments work seamlessly. There is a lot of customization available, which I did not explore.
Other updates and enhancements
- Katib is now AutoML (for hyperparameter tuning)
- Volumes to create and attach Object storage or PV/PVC
- Tensorboards for Visualizations (I am not sure how widely this will be used)
- KFP (Kubeflow Pipelines) comes in 2 flavors
* KFP with Argo Workflows (the default)
* KFP with Tekton (an additional option, with contributions from IBM and Red Hat)
- Pipeline runs can be started in 2 ways
* Ad hoc manual runs
* Scheduled runs (using Argo Workflows)
- Metrics and Metadata UI have improved as well
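To make the Katib item above a bit more concrete, here is a rough sketch of what a hyperparameter search looks like when written down as a Katib `Experiment` resource. I did not run this as part of my trial; it is an illustration under assumptions. The helper `build_katib_experiment`, the experiment name, the namespace, the metric name and the two parameters (`lr`, `optimizer`) are all made up for the example, and the `trialTemplate` (the job that actually trains the model) is deliberately left out, so this is only the search-space part of a manifest, not something you can apply as-is.

```python
import json


def build_katib_experiment(name, namespace, metric_name, goal,
                           max_trials=12, parallel_trials=3):
    """Return a partial Katib v1beta1 Experiment as a plain dict.

    Hypothetical helper for illustration only. The trialTemplate section
    is omitted on purpose; a real Experiment needs one.
    """
    return {
        "apiVersion": "kubeflow.org/v1beta1",
        "kind": "Experiment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # What to optimize and when to stop early.
            "objective": {
                "type": "maximize",
                "goal": goal,
                "objectiveMetricName": metric_name,
            },
            # Plain random search; Katib supports other algorithms too.
            "algorithm": {"algorithmName": "random"},
            "maxTrialCount": max_trials,
            "parallelTrialCount": parallel_trials,
            # The search space (parameter names are assumptions).
            "parameters": [
                {
                    "name": "lr",
                    "parameterType": "double",
                    "feasibleSpace": {"min": "0.01", "max": "0.05"},
                },
                {
                    "name": "optimizer",
                    "parameterType": "categorical",
                    "feasibleSpace": {"list": ["sgd", "adam"]},
                },
            ],
        },
    }


if __name__ == "__main__":
    experiment = build_katib_experiment(
        name="demo-random-search",   # hypothetical name
        namespace="my-namespace",    # hypothetical namespace
        metric_name="accuracy",      # hypothetical metric
        goal=0.95,
    )
    print(json.dumps(experiment, indent=2))
```

Building the manifest as a dict keeps the example dependency-free; in practice you would more likely write the same structure as YAML, add a `trialTemplate`, and submit it through the AutoML UI or `kubectl`.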
There is renewed interest in Kubeflow from almost every major cloud services company (AWS, GCP, Azure, IBM, Red Hat/OpenShift). In addition to these cloud providers, you can now run Kubeflow as an Operator or on MicroK8s, MiniKF, kind etc.