My MLOps Journey
My totally unplanned and fortunate journey into the fascinating world of MLOps
Background: ICYMI, here is a good blurb about MLOps on Wikipedia.
MLOps is still in its early stages. Different companies use the terms MLOps, ML Platform and ML Engineering interchangeably to describe this nascent field.
I have a slightly different take purely based on my personal experience.
(Yes, every MLOps post needs a mandatory Venn Diagram as below)
MLOps is a highly multi-disciplinary field that calls for expertise in at least 2 of the 3 sub-domains. It's very rare to come across engineers who are good at all 3 sub-domains (I call them MLE unicorns).
My 3-year journey into MLOps
It all started in January 2018 at Discover. I was hired to work in a Data Engineering team. Due to an organizational change, a new team was formed to evaluate different ML solutions/platforms. I was fortunate to be moved into that team, and we started the exciting journey of building our own Cloud-Native Data Science Platform.
As I already had application development experience building web applications, microservices and CI/CD pipelines, I started working on those components of the platform. In parallel, I started shadowing other engineers on our team who had deep expertise in Docker, Kubernetes, Snowflake and AWS.
2019 was all about developing my own knowledge of Docker, Kubernetes, Snowflake and AWS while continuing to work on application development.
As a team, we encountered lots of interesting challenges around the scalability, security and integration of the different components of the platform (Kubernetes/OpenShift, AWS and Snowflake).
After several redesigns, we built a scalable platform as a service called AIR9, with the goal of supporting over 1000 users (data scientists, data analysts and data engineers).
By the start of 2020, I had become fairly comfortable with Docker, Kubernetes/OpenShift, AWS and Snowflake.
I started getting involved in the data science and machine learning side of the platform. I was primarily working on Spark on Kubernetes, Jupyter/Python, RStudio/R and MLflow, and on tuning and troubleshooting data science and machine learning workloads.
We are in the middle of some of the most interesting advances in AI/ML. There is a lot of hype, but the potential for great breakthroughs in AI/ML is undeniable.
Here is my advice to anyone starting, or planning to start, their MLOps/ML Platform journey.
- Get into this field if you are passionate and enjoy complexity.
- Make a plan to develop basic competency in multiple disciplines.
- Start with your area of expertise and learn from teammates with expertise in the other sub-domains.
- There is no substitute for real-world experience. So, seek opportunities in the field or in adjacent fields like data engineering or platform engineering.
- Develop a good work ethic and be willing to collaborate closely with others.