My MLOps Journey
My totally unplanned and fortunate journey into the fascinating world of MLOps
Background: ICYMI, here is a good blurb about MLOps on Wikipedia.
MLOps is still in its early stages. Different companies use the terms MLOps, ML Platform and ML Engineering interchangeably to describe this nascent field.
I have a slightly different take purely based on my personal experience.
(Yes, every MLOps post needs a mandatory Venn Diagram as below)
MLOps is a highly multi-disciplinary field that calls for expertise in at least 2 of the 3 sub-domains. It’s very rare to come across engineers who are good at all 3 sub-domains (I call them MLE unicorns).
My 3-year journey into MLOps
Year 1:
It all started in January 2018 at Discover. I was hired to work in a Data Engineering team. Due to an organizational change, a new team was formed to evaluate different ML solutions/platforms. I was fortunate to be moved into that team, and we started the exciting journey of building our own Cloud-Native Data Science Platform.
As I already had Application development experience building Web applications, microservices and CI/CD pipelines, I started by building those components of the platform. In parallel, I started shadowing other engineers on our team who had deep expertise in Docker, Kubernetes, Snowflake and AWS.
Year 2:
2019 was all about developing my own knowledge of Docker, Kubernetes, Snowflake and AWS while continuing to work on Application development.
As a team, we encountered lots of interesting challenges around Scalability, Security and Integration of the different components of the platform (Kubernetes/OpenShift, AWS and Snowflake).
After several re-designs, we built a scalable platform as a service called AIR9, with the goal of supporting over 1000 users (Data Scientists, Data Analysts & Data Engineers).
Year 3:
By the start of 2020, I had become fairly comfortable with Docker, Kubernetes/OpenShift, AWS and Snowflake.
I started getting involved in the Data Science and Machine Learning side of the platform. I was primarily working on Spark on Kubernetes, Jupyter/Python, RStudio/R and MLflow, and on tuning/troubleshooting Data Science and Machine Learning workloads.
Summary:
We are in the middle of some of the most interesting advances in AI/ML. There is a lot of hype, but the potential for great breakthroughs in AI/ML is undeniable.
Here is my advice to anyone starting or planning to start their MLOps/ML Platform journey.
- Get into this field if you are passionate about it and enjoy complexity.
- Make a plan to develop basic competency in multiple disciplines.
- Start with your area of expertise and learn from teammates with expertise in the other sub-domains.
- There is no substitute for real-world experience. So, seek opportunities in the field or in adjacent fields like Data Engineering or Platform Engineering.
- Develop a good work ethic and be willing to be highly collaborative.