This is a round-up of all the New Tool Tuesdays that have gone into the MLOps community newsletter thus far. Feel free to subscribe to the newsletter and stay up to date here.
No need to waste all your GME & AMC earnings on infra costs anymore. Every now and again something comes along that makes you think, why was this not already a thing?
Today I found out about the new project BudgetML that scratches an itch for a severely under-served market. I was talking to the co-creator Hamza Tahir of ZenML to get the full story on the origin and inspiration of BudgetML.
This is a tool built explicitly to deploy models on a budget. But let's be very transparent about it: by no means is this meant to be a full-fledged production-ready setup. Hamza told me he has been working in ML for a while now, hence why he started ZenML, and for fun, every 6–8 months he and his brother get together and hack on some fun projects. A little brotherly love and bonding time. The intention here isn't to monetize the projects, it's just to stay sharp and put their two areas of expertise to good work (Hamza: backend + ML, his brother: frontend).
Some of the duo's past projects were a photo enhancer called PicHance and you-tldr, which gives you the tl;dr of any YouTube video in seconds. Both of these projects involved deploying models, and spending serious $$$ was out of the question for some play projects put together for a few laughs over the weekend. As good engineers do when they encounter problems, Hamza didn't settle and decided to build something that would fill his need.
“So I clobbered together this solution like 1 year ago for PicHance. It’s shoddy, but it deploys the model, uses spot instances, and always makes sure its up.” -Hamza
Did it work? Short answer: yes. It cut costs drastically. No cluster/GKE costs, and the instance cost dropped by 80%, which is especially impressive since image models need some big-ass instances, as you know. So, these side gigs are cool and all, but they're not the guys' main jobs. All the more reason why they needed to be running these websites on the cheap.
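The "always makes sure it's up" part is the interesting trick: spot/preemptible instances are dramatically cheaper precisely because the cloud provider can reclaim them at any moment, so something has to notice the preemption and restart the box. Here is a minimal sketch of that watchdog idea; all the function names are hypothetical stand-ins for real cloud SDK calls, not BudgetML's actual code:

```python
import time

# Hypothetical sketch: keep a preemptible (spot) instance serving by
# restarting it whenever the cloud provider reclaims it. `get_status` and
# `start_instance` stand in for real cloud SDK calls.

def ensure_running(get_status, start_instance):
    """Restart the instance if it was preempted; return True if restarted."""
    status = get_status()
    # A reclaimed spot instance typically reports a stopped/terminated state.
    if status in ("TERMINATED", "STOPPED"):
        start_instance()
        return True
    return False

def watchdog(get_status, start_instance, checks, interval=0.0):
    """Poll `checks` times, restarting whenever the instance is down."""
    restarts = 0
    for _ in range(checks):
        if ensure_running(get_status, start_instance):
            restarts += 1
        time.sleep(interval)
    return restarts
```

The trade-off is that your service can still be down for the length of one polling interval plus the instance boot time, which is exactly why this is great for side projects and not a full production setup.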
Hamza had been using the best-kept secret of 2020 for over a year before he decided happiness is only real when shared and he was going to unleash it on the world. Over Xmas he prettied it up a bit and built out the API, having become a bit of a pro from all the blood and sweat he sunk into ZenML.
Bam, just like that, another win for the little guys! All you folks out there with side projects who are looking to downsize on infra costs, this bad boy is for you! As I said before, this is an underserved market, and I look forward to seeing how it can help more ML side projects get off the ground without requiring so many resources from the creators.
Last words from Hamza…
“but if you do a New Tool Tuesday on BudgetML, please also talk about ZenML because my co-founders will kill me if you don’t mention it too!”
Cookie Cutter On Steroids
I cannot take credit for the description of this one, it was all Alexey this time. However, I want to chat for a minute about a new tool that just came through the community from Neal Lathia, head of ML at Monzo. The name Neal Lathia might sound familiar if you saw the awesome coffee session we had with him a few months back.
I wanted to get Neal’s take on what it is and why he created this tool called Operator (not to be confused with k8s operators). So Mr. Monzo ML told me….
“Working with ML systems today is a bit like living in two worlds. In one world, I’m thinking about the ML model I’ve trained and how excited I am to ship it. In the other world, I’m reading the AWS docs (and tutorials, blog posts, and stackoverflow questions) about IAM roles and API gateways and how to tie them all together. I’d like to spend less time in that second world, and more time in the first!”
Neal is a proactive dude and his goal was clear. Inspired by a few internal tools he has at Monzo he decided he’d hack together a CLI tool that has two functions:
Create: to get the boilerplate that’s needed to get going.
Deploy: to ship that code to a serverless function on AWS or GCP.
Although he is still in the lab creating a model store that hopefully will be released to the world soon, he wanted to get this out so the world can have a play and hopefully spend more time hanging out in that first world he spoke about earlier. (Although becoming a multi-planetary species is OK too, in case something happens, you know, like global warming.)
Anyway, Lathia finished off our convo by telling me “I’ve open-sourced operator so that anyone else can give it a spin and see the ✅ emojis pop up while they deploy their serverless in two commands. If you have any feedback, you can find me on the ML Ops community slack!”
Personally, I know a few people who have started using it and are loving it (especially the emoji part), but I'd also be really interested to hear what your thoughts are! You can see a quick video of it in action here, otherwise, click the button to go straight to the source.
For those looking at alternatives for running ML workloads on Kubernetes, this new project Bodywork, from community member Alex Ioannides, sounds promising.
I was able to talk with Alex about the project for a minute and gain some nice insights into his motives and reasoning around creating the framework. The tool was born out of too many headaches a few years ago while he was working on an MLOps platform at Oracle. The time suck of having to learn k8s forced him to reevaluate his life (maybe not his exact words, but I gotta put in some drama for the story's sake). He understood the benefits of containerizing and deploying models on Kubernetes, yet he also saw how difficult it was.
Around that same time, he got turned on to the idea of GitOps and realized this might be the cure for his MLOps headaches. After convincing his brother to come and help him build out this new tool they got to work creating GitOps for ML which they labeled ‘Bodywork’.
So just like that, a new open-source ML tool was born. When I asked Alex what exactly it did, he replied:
“Its a fully-functional tool that will automate the execution of multi-stage ML workflows and can also deploy services (e.g. train-and-deploy), in containers, on Kubernetes — without needing to build any images or write any Kubernetes config YAML, giving machine learning engineers full control of their pipelines and API definitions.”
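To make the "train-and-deploy" idea concrete, here is a toy sketch of the two-stage pattern Bodywork executes for you: one stage trains and persists a model, a later stage loads it and answers predictions. The file names and layout here are purely illustrative, not Bodywork's actual project structure or API; in real use each stage would be a script in your Git repo that Bodywork runs in its own container:

```python
import json

# Toy "train-and-deploy" pipeline. Stage 1 fits a one-parameter model
# (y = a * x, least squares) and persists it; stage 2 loads the artifact
# and serves predictions. In a real deployment stage 2 would sit behind
# an HTTP endpoint.

def train_stage(path="model.json"):
    xs, ys = [1.0, 2.0, 3.0], [2.1, 3.9, 6.2]
    a = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    with open(path, "w") as f:
        json.dump({"a": a}, f)
    return a

def serve_stage(x, path="model.json"):
    with open(path) as f:
        model = json.load(f)
    return model["a"] * x
```

The point of a tool like Bodywork is that you only write the stage scripts; the containerization, scheduling, and Kubernetes config around them are handled for you.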
The support so far for the tool has been incredible, which obviously shows there is a need for something like this. I know a big debate in the community is around how much k8s a data scientist/ML engineer should know. Although I don't think it will ever hurt to learn a bit of Kubernetes and the ecosystem of tools that goes along with it, this does appear to be a nice solution for an important piece of the puzzle. If any of you are using Bodywork already, let us know what you think in Slack, I am really curious to hear reviews.
More Version Control For ML
Background: I realize that I have the benefit of meeting and hearing about many tools that are coming out in the ML tooling ecosystem, whether that is from people reaching out to me or just because I spend an absurd amount of time researching this stuff. So I thought I'd create a section of the newsletter for any time I hear about a new tool that I think could help the greater community.
These posts are NOT sponsored and you cannot get me to do a feature on you (so please don’t ask). The whole point is to shine a light on smaller lesser-known tools that I randomly come across. I will not be doing this in every newsletter, just when something grabs my eye.
Replicate.ai caught my attention last week because they are attacking the classic problem of reproducibility within ML. The founding member of this open-source project is Ben Firshman, the same guy that created Docker Compose… so they've got that going for them.
Why tackle this problem, now?
As they put it on their GitHub page, everyone uses version control for software, but it is much less common in machine learning. Why is this? We spent a year talking to people in the ML community and this is what we found out:
- Git doesn’t work well with machine learning. It can’t handle large files, it can’t handle key/value metadata like metrics, and it can’t commit automatically in your training script. There are some solutions for this, but they feel like band-aids.
- It should be open source. There are a number of proprietary solutions, but something so foundational needs to be built by and for the ML community.
- It needs to be small, easy to use, and extensible. We found people struggling to integrate with “AI Platforms”. We want to make a tool that does one thing well and can be combined with other tools to produce the system you need.
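The "commit automatically in your training script" point is the part Git genuinely can't do well, so here is a minimal illustration of the idea: every checkpoint snapshots your metrics plus a content hash of the weights into an append-only log, right from the training loop. To be clear, this is NOT Replicate's actual API, just a hypothetical sketch of the concept the project is built around:

```python
import hashlib
import json
import time

# Hypothetical sketch of "version control from inside the training script":
# each call records the metrics and a sha256 content hash of the current
# weights file in an append-only JSON-lines log, so any result can later be
# traced back to the exact weights that produced it.

def checkpoint(weights_path, metrics, log_path="experiments.jsonl"):
    with open(weights_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    entry = {
        "time": time.time(),
        "weights_sha256": digest,
        "metrics": metrics,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Hashing the large weights file instead of committing it is exactly the kind of thing Git struggles with and a purpose-built tool can do cheaply.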
I am anxious to see if they can set themselves apart from the tools out there that already exist, mainly DVC. I know they aren't the only ones who have tried to create a culture of version control in the ML world, so I wonder if it will stick this time. When I talked to Ben he told me, "the primary thing we are trying to do right now is create community, because we think this thing should be built by a community, not just a single vendor".