This episode is the epitome of why people listen to our podcast. It’s a complete discussion of the technical, organizational, and cultural challenges of building a high-velocity machine learning platform that impacts core business outcomes.
Orr tells us about the focus on automation and platform thinking that has allowed Lemonade’s engineers to make long-term investments that have paid off in terms of efficiency. He tells us the crazy story of how the entire data science team of 20+ people was supported by only 2 ML engineers at one point, demonstrating the leverage their technical strategy has given engineers.
He also makes the case for having dedicated ML engineering relatively early in a company's data science journey.
That was awesome, man. We just talked with Orr – Orr Shilon, for those who want the full name – he's
one of the ML platform engineers at Lemonade. What do you think of the conversation, Vishnu?
I thought it was great. It's always rare, I think, to find the kind of person who is both in the weeds as a
talented individual contributor, but also has to think strategically about company and business needs.
Usually, those kinds of people are either plucked very quickly to the top or are off doing their own thing
nowadays. And it was cool to see someone like Orr, who has stuck around at Lemonade for a while and
has really pushed the kinds of ML platform infrastructure that I think a lot of companies would love to
have. It was a great conversation.
That's true. When he was explaining the platform that they have, and basically, the conversation that we
have coming up for you right now is centered around how Lemonade created their platform and what it
looks like, and what are some things that they keep in mind – this platform thinking mentality. When he
was talking about it, I just kept thinking, “Wow, this is pretty advanced! This is cool. I hope this becomes
the standard, as opposed to the exception.” So hopefully, when everyone's listening to it, you get some
good feedback, or some good nuggets of wisdom that you can bring into your job. I wanted to mention
something else, man. There are some things to call out. At some point, he was with one other platform
engineer, serving 20 data scientists – that's pretty incredible. Now, the team has since grown to four, but
2 to 20 – one platform engineer for every 10 data scientists – that's a pretty big one. Then they have a
Slack bot that they can train different models with? They have a Slack bot that they can push models to
production with? That's pretty wild, too. I've never heard of that.
Yeah, the emphasis on automation, platform thinking, and efficiency was pretty impressive to see.
Because you don't get that level of leverage (in terms of two people being able to support that many
people) very often, unless you've made long term investments in the efficiency and productivity of those
people. The fact that they were able to do that shows that they have really done that for a long period of
time and with a lot of vision. I think everyone listening will learn a lot about how if you make the right
kind of investments up front, over time, they can really pay off in terms of efficiency.
There's one thing that I wanted to point out. We were chatting on Slack, while he was talking about this –
he goes over their five different pillars of their ML platform. The first one he talked about was how they
built their own feature store and it reminded me of when Jesse was on here, like a week or two ago,
and Jesse was talking about this conundrum they have for a lot of these companies who started doing ML
early. And it was exactly the narrative that Jesse laid down for us. If you haven't listened to the
conversation with Jesse, go back and listen to it. It was a fascinating one. Not a lot of people checked it
out. I'm surprised it didn't get as many views or listens as I thought it would, because the
conversation with Jesse was just incredible. But I think we must have messed up on the title or
something, and people weren't clicking on it. It's called “Scaling Biotech” so people probably thought, “Well,
I'm not in healthcare. I'm not doing anything with Biotech, so it doesn't apply to me.” Jesse had amazing
wisdom. So I highly encourage you to go listen to that. But the premise of it, and what Jesse talked about,
was how these companies start building their own tools internally, because there is nothing on the
market. And they all start doing that around the same time, because they start seeing the need and there
are these bottlenecks that are happening. So everyone's kind of building the same internal tools – say like
a feature store – we can use that because that's like the canonical example. But it's not just feature
stores – it could be monitoring or deployment platforms, any of these. But what
happens is, Uber's building their Michelangelo and they have a feature store, and then they spin it out
and it becomes Tecton. And Orr was talking about how they were building a feature store and it's the
same thing. They have this problem now because Tecton wasn't out before they started their whole
journey on that feature store. But now it's so customized – their feature store at Lemonade is so
customized to their needs, that he was saying “I have a really hard time seeing us go out and buy
something off the market because we have something that is so customized to what we do. We're just
going to keep doing it.” And Jesse was talking about that too. He called it exactly how it was. And that's
what made me really appreciate the conversation with Jesse more after hearing Orr talk about this. But
anyways, I think we've talked enough about what you're going to hear and we’ll just get into the
conversation with Orr right now. Any last thoughts from you, Vishnu?
Real quick, do you want to just go through his bio and read it out? I'm happy to do it. Just so that
everybody gets a sense of what he does?
Yeah, that's probably a good idea.
Yeah, let me do that real quick. So, Orr Shilon is an ML Engineering Team Lead at Lemonade, who’s
currently developing a unified ML platform. Trust me, it's really cool. This team's work aims to increase
development velocity, improve accuracy, and promote visibility into machine learning at Lemonade.
Previously, Orr worked at Twiggle on semantic search, at Varonis, and at Intel. He holds a B.Sc. in
computer science and psychology from Tel Aviv University. Orr also enjoys trail running and sometimes
Last thing before we jump into the full conversation – we're looking for people to help us edit these
videos. Basically, you don't have to know anything about editing, you just have to tell us where the
nuggets are – where the gems are – in these conversations so that we can trim it down and get like a 10
minute highlight reel for people to watch on our ML Ops Clips secondary channel. If you're interested in
that – we can't hire a producer, and you're probably thinking, “Well, get somebody that's actually a
pro at this.” The problem is that podcast producers and video producers don't understand machine
learning, so they don't know what's a gem and what's just us talking and it's not that cool. So if there's
somebody out there who is knowledgeable in this space, and passionate, and a listener,
who wants to help out, we would love some volunteers. That's it. Let's get into the conversation.
Hey, Demetrius. How's it going?
What’s up, man? I'm stoked to be here. As always, we've got another incredible guest. What's going on, Orr?
Hey, man. Hey, guys. I'm doing great. Thanks for having me.
It's a pleasure to have you on, Orr. We're really excited to talk about the ML platform at Lemonade. I'm
excited to hear more about what the company does, how you guys have built the team you have, and get
to learn from you in terms of what this really fast-growing and hot company, in a very interesting
industry, is doing from an ML platform standpoint. So to kind of kick things off, can you tell us a little bit about yourself and Lemonade?
Sure. So, I'll start off with Lemonade a little bit. Lemonade is a full-stack insurance carrier. And I guess we're
sort of powered by AI and social good. But we do basically four kinds of insurance. The first is P&C insurance
for homeowners and renters. The second would be pet health insurance. The third is life insurance. And the
fourth is – we recently launched car insurance. We do this mostly in the United States, but in several
countries in Europe as well, depending on which kind of insurance. Like I mentioned before, we're a full-
stack insurance carrier, which means we're not an aggregator – we actually are the underlying insurer.
Got it. It's funny, I'm actually a Lemonade customer. [laughs] I’m here in New York City.
So you’re highly biased. [laughs]
Yeah, definitely. [cross-talk]
That's great to hear. [laughs]
I'm pretty happy with the product. With that in mind – so, you guys are a full stack insurance carrier, you
have all these different lines. Where does machine learning fit into what Lemonade does? Why do you
need machine learning?
I would say it sits, sort of, on a line of tension between two different things. The first would be improving the
product – we want to improve our users’ lives by improving the product. That would be something like –
maybe I'm doing intent prediction for a chatbot for customer service before you actually reach a customer.
You could maybe handle something before reaching a representative. Or maybe doing other things
automatically for customers. Then the second big chunk would be trying to improve our business with
things like maybe predicting lifetime value for users – things like that – which, you know, can be done in any industry.
That makes perfect sense. I think it's always helpful for us to set the context for why machine learning is a
part of the business. And with that in mind, I have seen from the outside a little bit of how this business
has grown and also just how ML engineering at Lemonade has grown – from the job postings you guys
have, from the descriptions of your platform that I see online, and talks. Can you tell us a little bit about
what the end-to-end process (going from data to model in production) looks like at Lemonade on your ML platform?
Yeah. The platform sort of provides what we call “point-in-time data”. Basically it provides data on our data
warehouse directly in Snowflake. And if researchers want to do something like exploratory data analysis,
they can do that directly on the data warehouse on these giant dimension tables with hundreds of features
that have already been created for them, or they can do it on raw data. I think at this point, we have maybe
1,500 features in our feature store. So we're at the point where, hopefully, researchers won't have
to create features for new models, or will maybe have to create only a few of them. They'll start there.
They'll do their modeling in a notebook. Then, the most important step for us, is that we translate modeling
code – both for training and for inference – into our internal platforms framework, which kind of
democratizes training, where anyone can kind of train anyone else's model. We have a Slack bot to be able
to train models with Cloud resources. So we can kind of run training there, and configure periodic training as well.
That's super cool.
Yeah. Then finally, they'll use that same Slack bot to deploy a service to production since we usually do
online inference. Then that service will be exposed to developers, which will integrate with it at Lemonade.
Finally, we use a third party for monitoring. We use Aporia, whose team, I think, has been on here at least
once or twice. I think what we've learned over time is that we really want our data scientists to manually
configure monitors for their machine learning models. We don't want anything automatic out-of-the-box
so we don't get this “alert fatigue” – it's something that needs to be constantly tuned. They'll
configure their data drift monitoring, concept drift monitoring, and performance drift monitoring,
with the help of these very specific alerts per feature, in order to be able to monitor something that
matters. We want high precision, even if the recall isn't that great.
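The manually configured, high-precision monitors Orr describes could be sketched roughly like this; the class, field names, and thresholds below are illustrative assumptions, not Lemonade's or Aporia's actual API:

```python
from dataclasses import dataclass

@dataclass
class FeatureMonitor:
    """A manually tuned, per-feature monitor (hypothetical sketch)."""
    feature: str
    max_null_rate: float   # alert if the fraction of nulls exceeds this
    baseline_mean: float   # expected mean from training data
    max_mean_shift: float  # alert if the observed mean drifts further than this

    def check(self, values):
        """Return alert messages; tight thresholds keep precision high."""
        alerts = []
        null_rate = sum(v is None for v in values) / len(values)
        if null_rate > self.max_null_rate:
            alerts.append(f"{self.feature}: null rate {null_rate:.0%} over limit")
        observed = [v for v in values if v is not None]
        if observed:
            mean = sum(observed) / len(observed)
            if abs(mean - self.baseline_mean) > self.max_mean_shift:
                alerts.append(f"{self.feature}: mean shifted to {mean:.1f}")
        return alerts

monitor = FeatureMonitor("policy_age_days", max_null_rate=0.05,
                         baseline_mean=120.0, max_mean_shift=10.0)
print(monitor.check([118.0, 125.0, 119.0, 122.0]))  # healthy window -> []
print(monitor.check([200.0, None, 199.0, 201.0]))   # both alerts fire
```

Because each threshold is chosen by the data scientist per feature, an alert that fires is very likely meaningful – which is the high-precision, tolerable-recall trade-off described above.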
Can we double-click on that real fast? Because I'm wondering how much of this stuff you can recycle, such
as the features, for example. Then, when it comes to the monitoring – how much of that has to be custom
or bespoke every time? Or how much can you recycle?
Like I mentioned before, we hopefully recycle a lot of features. We have dozens of models running in
production and each one of them has dozens of features. So we definitely recycle features. Then, we
actually have several kinds of monitoring. We'll monitor features separately from models. We’ll monitor
maybe the amount of null values. We'll monitor if features are equal between training and inference time –
this is something the feature store will do completely separately. Also, if a model even uses the feature.
Then when it comes to model monitoring, or like the features within models – we had a time where we
tried to do auto-monitoring to take this burden off of data scientists. And we kind of reached the
conclusion, at least at this point in the platform, that we're not able to provide that service in a very good
way. So we're at the point where data scientists will manually configure monitors per feature for a model
and that's part of our process of reaching production. They'll get these alerts in Slack and whatever. But
that's a big part of the process of reaching production. We do these iterations once in a while where we go
over the monitors and make sure that everything's up to par.
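One of the checks mentioned above – verifying that features are equal between training and inference time – can be sketched as a simple comparison of served vs. recomputed values; the function name, feature names, and tolerance are assumptions for illustration:

```python
# Sketch of a training/serving-equality check: recompute a feature offline
# and compare with what was actually served online at inference time.
def skew_report(served: dict, recomputed: dict, tol: float = 1e-6) -> list:
    """Return the names of features whose online and offline values disagree."""
    mismatched = []
    for name, online_value in served.items():
        offline_value = recomputed.get(name)
        if offline_value is None or abs(online_value - offline_value) > tol:
            mismatched.append(name)
    return mismatched

print(skew_report({"claims_count": 2.0, "ltv": 310.5},
                  {"claims_count": 2.0, "ltv": 309.0}))  # ['ltv']
```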
How did you guys decide on a Slack bot as being the right way to have this model training process work?
Yeah, actually the Slack bot is really intriguing isn't it? That's probably the first time I've heard about a
Slack bot being brought into the whole ML Ops equation or into just the training and machine learning
equation in general.
I think I'm lucky enough to be working at an organization where automation is the top priority. So this Slack
bot – there's a Lemonade Slack bot with blog posts about it. It's called Cooper, like from Sheldon Cooper.
And it's really easy to integrate with this platform – the Slack bots platform. It's kind of standard Lemonade
practice to do things with a Slack bot. So we kind of have our own service and our own commands that
integrate with this platform and do a bunch of different things. Maybe we'll have commands to bring up
a SageMaker notebook, or take it back down. We'll have commands to maybe remind people that their
notebooks are up so we don't spend a lot of money. Then we'll have commands for managing the machine
learning lifecycle – for training models with Cloud resources, for deploying them, for taking things down. I
guess several other small pieces as well.
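A bot like the one Orr describes might dispatch commands to platform actions roughly like this; the registration mechanism and command names are hypothetical, not the actual Cooper implementation:

```python
from typing import Callable

# Registry mapping command names to handlers, the way a chat-ops bot might.
COMMANDS: dict = {}

def command(name: str):
    """Decorator that registers a handler for a bot command."""
    def register(fn: Callable) -> Callable:
        COMMANDS[name] = fn
        return fn
    return register

@command("train")
def train(model: str) -> str:
    # In a real bot this would kick off a cloud training job.
    return f"started training job for '{model}'"

@command("deploy")
def deploy(model: str, version: str) -> str:
    # In a real bot this would roll out a model-serving service.
    return f"deploying '{model}' version {version}"

def handle(text: str) -> str:
    """Parse 'command arg1 arg2 ...' the way a bot message handler might."""
    name, *args = text.split()
    if name not in COMMANDS:
        return f"unknown command: {name}"
    return COMMANDS[name](*args)

print(handle("train churn-model"))      # started training job for 'churn-model'
print(handle("deploy churn-model v3"))  # deploying 'churn-model' version v3
```

The appeal of this pattern is that every lifecycle action lives behind the same chat interface, so anyone on the team can trigger anyone else's training or deployment.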
You said something really interesting there, that I would like to dive deeper into, which is – automation is
a top priority at Lemonade and that you're fortunate to work at an organization that does that. Why do
you think automation is so important to Lemonade in particular? What is it about the company that
makes you guys want to really prioritize that? And why do you feel like you're lucky, in that context?
I think we prioritize automation from a business perspective. One of the top metrics that we try to look at is,
maybe, the amount of customers that we have per employee, or the amount of ARR (we call it IFP) per
employee. That's a very important metric that we look at. So automation is a big part of the company. And I
think I'm lucky to work at a place like that because, I guess, it's a priority for everyone to automate things,
which is really nice. Then we have cool things like Slack bots which deploy machine learning models or train
machine learning models.
That's really interesting to hear that you guys actually apply almost business-level metrics to what is
usually considered just a technical imperative of automation. I think a lot of times when we talk about
automation mindsets and such at different talks and podcasts that we've been in, it was mostly like,
“Well, how do you automate more so that the engineering team can be more efficient internally and can
get rid of bottlenecks?” But to hear that it's such a business focus that trickles down into everything you
guys do – I think that's a very unique approach that clearly yields some pretty interesting results, like a
Slack bot that serves what seems like a lot of customers in the company.
Yeah. I also want to note, when you're looking at that automation side of things, what are some points
where it's gone wrong? Not “gone wrong,” per se, but just where you tried to automate something and it
shouldn't have been automated? You mentioned the monitoring before, where you tried to take that to
automation but then you had to dial it back. Has there been a point where the Slack bot was trying to be
implemented and you realized, “Whoa. Actually, it's not the best use case for a Slack bot. We need people
to be doing this.”?
I don't think I've seen that. I guess maybe not all Slack commands that ended up being implemented are
utilized 100%. I don't think I've seen metrics on it. But I don't think I have a really good story there. I think a
great story would be like something we talked about – about trying to automate machine learning
monitoring, and kind of failing there and resorting to having low precision, high recall, and having dozens of
alerts a day without anyone being interested in them.
Ah. And that's how you knew “This isn't working.” [cross-talk]
Yeah. I mean, I knew that it wasn't working for us to configure monitors automatically.
Yeah. Okay, last question about the Slack bot and then we can move on. But it's just so fascinating to me.
Is there someone that is looking at metrics on which commands are being used with the Slack bot? Do
you have a team that just babysits the Slack bot and decides what to put in there?
I think there's maybe a person. He definitely doesn't babysit the Slack bot. It's a platform. Technically, they
don't even know the commands that we've written to integrate with this platform. I said I'm lucky enough
to work at an organization that has automation as a key priority – platforms are something else that's really big
there. Decentralizing development in every sort of way is also a big priority.
That is fascinating to hear. I think, with that in mind, I want to ask a little bit more about what tools you
guys use in the context of your ML development stack. You mentioned Snowflake. You mentioned
notebooks. Are you using enhancement on top of notebooks? Any kind of managed service? You
mentioned Aporia. We’d just love to hear what your tool universe looks like.
I think you guys have seen this blog post by Ernest Chan, where there's like five main building blocks? So
maybe I'll describe each building block that we have.
That’d be great.
So, we have an internally implemented feature store. We kind of started before Tecton was public – or
before they exited “stealth mode”. The feature store is so customized towards our specific needs that I'm
having a really hard time seeing how we could use any third-party tool, even though people are developing
things that are much better and much more comprehensive. We have some super-specific needs on our
very specific data and use cases. So we have an internally-implemented feature store. It's implemented in
Python. There are different contexts that it runs in. It reads streams with AWS lambda, it serves real-time
features from a Kubernetes service using the FastAPI framework. We recently ported all of our code to
an asynchronous event loop, which has been very successful for us in terms of model-serving latency.
The online feature store is backed by DynamoDB, and offline, like I said, it's backed by Snowflake. Then we
also run different ETLs over Kubernetes, as well. The workflow management that we use is Airflow, which
also does our periodic training if we need and then it'll also manage the different ETLs if we want to do
batch ingestion into the feature store. We use MLflow, both for experiment tracking and as a model
repository. And we're very heavy on infrastructure as code. So we don't use the MLflow feature where
there's a production model, maybe like a Blue-Green deployment, or however they do it there. We actually
literally take the model ID and stick it in our code, like in Git. Everything is versioned in Git at Lemonade.
Like all provisioning code is versioned in Git. So like I said, we're very heavy on platforms and decentralizing
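The online/offline split Orr describes (DynamoDB online, Snowflake offline) can be illustrated with a minimal in-memory stand-in for the online side; all class and feature names here are made up for the sketch:

```python
# Minimal in-memory stand-in for an online feature store: entity key ->
# feature dict, the same shape a DynamoDB-backed store might expose.
class OnlineFeatureStore:
    def __init__(self):
        self._table: dict = {}

    def ingest(self, entity_id: str, features: dict) -> None:
        """Upsert features for an entity (stream or batch ingestion)."""
        self._table.setdefault(entity_id, {}).update(features)

    def get_features(self, entity_id: str, names: list) -> dict:
        """Serve the requested features at inference time; missing -> None."""
        row = self._table.get(entity_id, {})
        return {name: row.get(name) for name in names}

store = OnlineFeatureStore()
store.ingest("user-42", {"claims_count": 2.0, "policy_age_days": 310.0})
print(store.get_features("user-42", ["claims_count", "tenure_months"]))
# {'claims_count': 2.0, 'tenure_months': None}
```

In the real system the `ingest` path would be fed by stream consumers (the AWS Lambda readers mentioned above) and the `get_features` path would sit behind the FastAPI service.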
So I’m… I'm sorry, go ahead.
I was going to keep describing different building blocks, but I'd love to hear a question.
Well, I mean, I'm just sitting here as an engineer, and I'm kind of like – Well, you have all these different
sort of nicely configured – or architected – building blocks. It's pretty clear to me how you guys have
solved all the different natural friction points that a lot of other companies, including my companies, have
faced in building and productionizing machine learning models, and then also monitoring and maintaining
them in production. So what are the areas of friction that still exist? I'm curious. In terms of maybe your
tool selection, or just in terms of putting more ML models into production?
I think maybe the biggest challenge that we face in general is a “people” challenge. The market is very
hot for employees at the moment and I think we try to solve problems where we want to enable data
scientists from all different backgrounds to be able to use this platform. That's still our biggest challenge –
enabling people that have only ever delivered notebooks to reach production, and at the same time, enable
data scientists who have been software engineers for 15 years and are very opinionated on frameworks and
would like to do everything (they want to know what's going on under the hood). The interfaces that we
provide are really the biggest challenge – it's not the underlying code. It's really deciding on those interfaces
and keeping them current and having them work for everyone from the least experienced to the most
Wow, I love that. And I love hearing about how you're serving this whole spectrum – from the data
scientist who has been only a data scientist and loves Jupyter notebooks and doesn't want anything to do
with anything else and then the data scientist who is transitioning from being in the software engineering
world, and is very opinionated. I'm thinking about – when you're opinionated like that – or you as the
platform person, you have to ultimately make some decisions and you have to be opinionated about
some things. How does your choice – how does that look? And are you serving the different users of the
platform? Or is it something where you just say, “Alright, we can't let this happen anymore because
we've seen the downstream effects.”?
I think we're kind of at the point where we've gone full circle. It's kind of where we started with something
that was very open and then the second version of our model-serving framework, which is the main area
where data scientists work – the model training and serving framework, where they’ll sort of implement an
interface basically to fit a model, to predict on the model, and give a list of features. That framework was
very open in the beginning, where you could make all these different decisions. Then the second version
was kind of closed, where it was really good for 80% of the use cases, but did not work very well for about
20%. We've kind of gone full circle in the way that we can sort of allow both at this point. But there's
nothing in the middle. Some people will make all of these decisions about where their features come from,
and specific queries, and they'll maybe write custom queries to bring their features from Snowflake in a
custom way. But they'll kind of have to take responsibility for that. While others would just provide a list of
features and it all happened for them.
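The fit/predict/feature-list interface Orr keeps returning to might look something like the sketch below; the method names and the toy model are assumptions, not the platform's real framework:

```python
from abc import ABC, abstractmethod

class PlatformModel(ABC):
    """Implement these methods and the platform handles the rest
    (training, serving, monitoring) - a hypothetical sketch."""

    @abstractmethod
    def feature_names(self) -> list:
        """Features the platform should fetch from the feature store."""

    @abstractmethod
    def fit(self, rows: list, labels: list) -> None:
        """Train on feature rows (dicts) and labels."""

    @abstractmethod
    def predict(self, row: dict) -> float:
        """Score a single feature row at inference time."""

class MajorityModel(PlatformModel):
    """Trivial example model: predicts the training-set positive rate."""
    def __init__(self):
        self.rate = 0.0

    def feature_names(self) -> list:
        return ["claims_count", "policy_age_days"]

    def fit(self, rows, labels):
        self.rate = sum(labels) / len(labels)

    def predict(self, row):
        return self.rate

model = MajorityModel()
model.fit([{"claims_count": 1.0}, {"claims_count": 3.0}], [0, 1])
print(model.predict({"claims_count": 2.0}))  # 0.5
```

The "open" users Orr mentions would override more of the data-fetching behavior themselves, while the "closed" users would implement only this minimal contract and let the platform do everything else.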
I think your statement about how you think about your platform work, really is the clearest articulation
I've seen of this sort of customer mindset that has to happen for internal platforms to work. You
mentioned, “The hardest problem I have is not picking the tools, it's not setting up the architecture – it's
really figuring out what my customers (in the form of data scientists) need, and it's designing those
interfaces thoughtfully enough that I'm serving all their needs without making my work impossible to
do.” And I think that is the central challenge that we talk about so much on this podcast. It's kind of fun to
hear that framed really elegantly by you in your example.
And… Vishnu, sorry to interrupt there. But another point that I wanted to add was that it sounds like, or
what I understood as well, is that, if you make a really cool platform that someone can have a great time
using and these data scientists enjoy using, you're going to be able to attract better talent because they
enjoy using the platform – they enjoy the problems that they're working on. Is that another piece of it?
Or did I just make that up in my head? [laughs]
I think that's a big piece of it – having people. I like to talk about ownership a lot. I think it's quite obvious
from the way that I've described the platform that we're not a “throw it over the fence” kind of team,
where data scientists have ownership end-to-end. This attracts a certain type of data scientist, but I think
people that have worked on teams that were “throw it over the fence” have experienced the frustrations on
both sides – both with different ownership models and lagging delivery. And I think that's something that's
very important to us, having this clear ownership. There are obviously gray areas in the middle, but there's
clear engineering ownership and clear data science ownership. Data scientists will write the code that runs
in production and that's so important to us.
Do they also get the call at three in the morning if something goes wrong?
They're tagged in Slack, yeah. They’re auto-tagged on their models.
[laughs] Cooper's letting them know.
Yeah, exactly. I mean, we're definitely auto-tagged on more things, but they do get tagged on different
types of alerts on their models, whether it be applicative alerts – just, like, exceptions – or data and concept
drift, null features, things like that.
Going back to something you said before, about the “people” challenge and hiring in general in this global
job market that we're now experiencing – can you tell us, as a team lead, what parts of the hiring
process are so hard right now? Is it finding qualified candidates for this type of work, in terms of ML
platform and the combination of software, data, and machine learning? Or is it closing on candidates?
What are some of the challenges you're facing?
So like you said, I'll speak only to challenges specifically with machine learning platform engineers.
I think we're having more trouble closing qualified candidates. The candidate pipeline is definitely not what
it was two years ago. I see much, much fewer candidates and we have to approach them more than they
would approach us. Lemonade’s engineering brand is quite good. I also think this is like a very interesting
role in general – building a machine learning platform. And yet, we're still having to approach candidates at
this point in time, whereas before, we didn't have to at all.
I have a theory about that. Just before we move on.
I'd love to hear it.
What I think it is, is that there are so many new startups that are just getting an influx of cash. With the
amount of VC money going into everything that has to do with machine learning, machine learning
engineers are being brought on to all of these different startups – maybe they're the first engineer for the
machine learning platform, or they're being tasked with a lot of responsibility. So for all of the talent
that's out there, there's a lot of opportunity, right? If you go back a notch as to why there are so many job
openings for machine learning engineers, it's because the amount of VC money going into startups that
have anything to do with machine learning, or that use machine learning in the core of their product,
has just gone up drastically. So that's why it feels like there's not enough talent out there. But that's my
theory.
No, I completely agree. Like, two years ago, there were maybe 5 tech unicorns in Tel Aviv, and now there are 50.
Wow! I mean, Tel Aviv – I know Tel Aviv is a boomtown. It's really cool to see. But I did not know the scale
was that massive.
Yeah. There's a lot of VC money being poured in at the moment. I don't know what's going on specifically
right now, but in the past year, there was.
Yeah, you're seeing the repercussions of it, now. You're seeing people that have taken jobs maybe three
months ago, or five months ago, because they got the VC money a year ago and now they’ve finally found
someone to take that machine learning engineering position. So it's harder to find those people. But,
Vishnu, I know you had a question and then I cut you off. Orr, sorry – I cut you off with that theory. Tell
us more, if you can remember what you were talking about. [laughs]
Go ahead, Orr.
I was gonna continue with the building blocks, but this conversation is more interesting.
I think one last question I had before maybe we can jump back to the building blocks piece is – you've
described a really interesting process, in terms of what ML looks like at Lemonade. How many people do
you have on your team right now? And how many people do you support? What does that org chart and
headcount look like?
We're currently four people on our team and we support just over 20 data scientists. I think one of the
things that I'm most proud of is that there were several months where we were two people supporting over
20 data scientists. I'm really proud of the engineering to data scientist ratio that we've had there. We still
kind of managed to compartmentalize all this from the organization. It's a big part of the platform thinking
at Lemonade: even our DevOps organization has exposed building blocks for us to use, and it's simple
for us to get up and running with anything open source, even within a medium- to large-size organization
at this point.
First off, I'm glad to hear, number one, what you're proud of – because you're a team lead and a leader on
your platform and in your company, and it's always interesting to hear what leaders celebrate. That tells
you about their values. So that's cool.
And number two – two people supporting 20 data scientists for a company of Lemonade’s size and
customer base, that's pretty remarkable. And it does speak to the quality of the vision behind your
engineering and operations internally. That's really awesome to hear.
Yeah, man. We're really kind of standing on the shoulders of giants. Both cloud computing in general and
then, Lemonade's infrastructure above that.
I'm glad that people are getting to hear about this through this podcast, because I work in the healthcare
sector – I work at an early-stage startup – and I think, for us in the industry that we're in (healthcare), we
deal with a lot of administrative bloat in the US healthcare system by design. There's not as much
emphasis on efficiency and not as much understanding, almost philosophically, of the power of leveraging
tools and infrastructure to make one person the equivalent of two people five years ago. I think
industries and companies like yours are really showing the way to people like mine. As someone who's
been a first machine learning engineer hire or a first data scientist hire, I look to companies like Lemonade,
or Pinterest, or companies that are a little bit further along, in industries where that level of efficiency and
infrastructure leverage, I guess you could call it, is prized. So thanks for sharing that.
Infrastructure leverage – that's a new one. I like that. Might have to coin that.
I don't know if that's quite the word, or the verbiage that I want to go with, but we'll keep that there for now.
Well, let's talk for a minute about the other building blocks – we kind of cut you off and derailed things. We went on a little bit of a tangent. So we got to the first two, right? The feature store, and then MLflow. But there were three more that you mentioned.
I also mentioned Airflow for training orchestration. Then I mentioned monitoring beforehand, for which we use Aporia. The final one is model serving. We have this internal model training and serving framework, and as I mentioned, the interfaces there are some of the most important things that we handle. There's a set of methods such that, if they're implemented, the platform will guarantee a bunch of things, like a highly available service and several different types of monitoring. I mentioned before that we use applicative monitoring, generic monitoring of features, and then specific monitoring for things like data drift and concept drift. We'll get alerts on those things. CI/CD too – you know, all these engineering guarantees – if we implement this set of interfaces. We also get batch inference out of the box, which is kind of nice if people just want to go and do batch inference on a model, even though it's an online model.
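As a rough illustration of the kind of interface contract Orr describes – implement a small set of methods and the platform supplies the guarantees – here is a minimal sketch. All names are hypothetical, not Lemonade's actual framework:

```python
from abc import ABC, abstractmethod


class PlatformModel(ABC):
    """Hypothetical platform interface: implement these methods and the
    platform supplies the guarantees -- highly available serving,
    monitoring, CI/CD, and batch inference out of the box."""

    @abstractmethod
    def load(self) -> None:
        """Load model artifacts, e.g. from a registry such as MLflow."""

    @abstractmethod
    def predict(self, features: dict):
        """Score a single online request."""


def batch_inference(model: PlatformModel, rows: list) -> list:
    """The 'out of the box' batch mode: reuse the same online predict
    method over a list of rows, even for an online model."""
    model.load()
    return [model.predict(row) for row in rows]


class FraudModel(PlatformModel):
    """Toy implementation showing the contract in action."""

    def load(self) -> None:
        self.threshold = 0.5  # would come from a trained artifact in practice

    def predict(self, features: dict):
        return "flag" if features["risk_score"] > self.threshold else "ok"


print(batch_inference(FraudModel(), [{"risk_score": 0.9}, {"risk_score": 0.1}]))
# → ['flag', 'ok']
```

The value of the contract is that the data scientist only writes `load` and `predict`; everything else is the platform's job.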
There's something that has been coming up quite a bit recently in the MLOps community – not only in Slack, but also with the guests that we have on. It's all centered around testing and how to do testing for ML. Specifically, “What kinds of things do you look at?” How have you guys cracked that nut?
I would say that's still one of the biggest challenges that we have ahead of us. On the data side, we do testing. Since we have a feature store, we're able to test feature unification – even if features are generated from different sources, we can still test that they're consistent between training and inference. But on the model side, I would say we're still at a manual phase in this process. It's still something that's done in notebooks.
Yeah, that's a really classic one. I feel like that's probably why – we put out a few videos on testing recently, and they've gotten a ton of traffic. I think it's just because most people are hitting that bottleneck right now and they're saying, “How do I do this? What are some best practices? Who can I learn from?” There's not a lot of literature out there when it comes to testing. Depending on which space you're in and what your use case is, you're going to do testing differently – you need to think about different things and keep certain things in mind when you test. Not to mention, there are a ton of different kinds of tests that you can do, and you have to decide which ones to focus on. So, yeah.
I’m of the opinion that testing is like monitoring: we thought we could do monitoring automatically, and we found out that – at least at this point in our platform – data scientists have to work on it as part of the process. I think we're still at the same point with testing. We can't auto-test something – I don't know what we could auto-test that would make someone feel safer deploying a new model to production. It's still something that people have to take responsibility for, at least currently, on our team.
So, I kind of want to zoom out here from the technology and go back to the big picture. We've talked
about some of the wins that you guys have experienced in terms of efficiency. We talked about some of
the lessons you've had in terms of automation, and its upsides and downsides with monitoring. We
talked a little bit about what the future looks like in the sense that interfaces continue to be a challenge
that you are trying to think through from a platform vision. I want to go back to the mission statement
that is in your bio – you, as the team lead at Lemonade, and the ML platform team at large, focus on
development velocity, improving accuracy, and promoting visibility through machine learning. How did
you guys get such a clear mandate? Can you talk us through, historically, what that looked like? Was it
sort of the CTO kind of saying “This is the way I want the ML platform to work,” or was it a little bit more
dynamic? How did you get to such a clear vision that's then translated into all these results?
I would definitely say that it was dynamic. The head of data science at Lemonade made a good decision by bringing in dedicated engineering quite early in the data science lifecycle at Lemonade. There was a time, obviously, when Lemonade ran models within the service and features were sent to it automatically – that was our first version of running machine learning at Lemonade. I was the third person on the data science team. I think that's quite early to bring in engineering, and I think that decision paid off in how the team was built in general – starting with platform thinking early in the process. These are goals that have developed along the way. Development velocity is the biggest thing – it's very easy to state. Then, improving accuracy is something that a platform can enable. And then visibility, both internally in the team and externally, is a priority for us. We want data scientists to be able to see what other people have done, and we want model training to be democratized, so that anyone can train anyone else's model with different hyperparameters. Visibility external to the organization is also something that's been understood along the way – we need to explain what we're doing organizationally, and the platform is maybe a place to start.
I want to tell you, just real fast – there's a quick funny story I have about the head of data science at
Lemonade. I can't remember his name right now, for the life of me.
In English, it's Nathaniel, but it's not in Hebrew.
Yeah, that's it. So, back in my Dotscience days, when I was in sales, I reached out to him and tried to sell
him the Dotscience platform. And I think I remember just asking something like, “Hey, we do this X, Y and
Z with Dotscience. You want more information?” And he was like, “Yeah.” I was like, “Oh, my God!
Lemonade! The guy said, ‘Yeah.’ This is amazing!” And then he ghosted me for the rest of my time at
Dotscience. So whenever I look at Lemonade, I always remember that. [laughs]
[laughs] Maybe he hired me at the same time.
[laughs] That's probably it. Yeah, it was like 2019 – was it?
Yeah, that's when I started.
[laughs]There we go, man.
[laughs] “Sorry, Demetrius. I have Orr now.”
Yeah. I mean, he made the better choice, to be honest. Dotscience went out of business. But maybe they wouldn't have gone out of business if Lemonade was a customer. Who knows?
Going back to what you just said about coming in early – I think a lot of companies struggle to embrace platform thinking early because they're not sure if they're ready for the expense and the investment. It's definitely one of those things that pays off long term, but you have to be committed to it. You can't pull the plug early. I see that now at the company where I'm at – where I have to advocate a lot for investing our limited time, energy, and resources into building a data platform that'll be really good in a year, instead of focusing on generating a bunch of one-off reports right now about different analytics, insights, or questions. So it's good to hear about a story where this does work, and I'm definitely gonna send this podcast to my boss and say, “Hey, look what happens.”
It kind of feels like this was already in the culture of Lemonade, though, before you got there, Orr.
I agree. Platform thinking is something that's kind of big at Lemonade in general. But it is a gamble, right? Premature optimization is one of the evils of engineering.
Yeah, that's true. I want to close with a quick question about your talk at ApplyCon, which I highly recommend. We're gonna put it in the show notes, so everybody should check it out. It's just 10 minutes – a great overview of how to engineer for real business problems. You had a quote, which I'll cite verbatim: “If you're making an online prediction, consider making the business point in time part of your machine learning platform.” Can you quickly tell us what ‘business point in time’ is and what it's allowed you to do? Why do you think other people should adopt it?
Yeah. I’ll start with the fact that it really depends on the product. There will be companies with products for which this is completely irrelevant, and then companies where it may hit the spot – “Oh, this is perfect!” The ‘business point in time’ is basically when you make most of your predictions. We make a lot of predictions at Lemonade at specific points in our business flow – maybe when a user creates a quote, when they purchase a policy, or when they make a claim, since we're an insurance company. I think it's quite easy to understand this flow. Then we want to know what the data looked like at that specific point in time, because that's when we're making predictions. So this is a training data notion. We found that creating training data for these very specific points in time – instead of an open-ended approach where data scientists can choose any point in time for each data point, which is open to issues just as much as it's open to anything – is what people actually need. People at our company don't need the open-ended version. They want to know how the data looked when someone purchased a policy, and that's all they want to know, in most cases. Providing data this way both allows us to test it very well – to make sure that it's unified between training and inference – and also minimizes the number of mistakes and the amount of engineering that goes into making decisions, because the work is only done once: people create data for this point in time.
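One way to picture the ‘business point in time’ idea (a hypothetical sketch, not Lemonade's implementation): training rows are generated only at the timestamps of business events, using each feature's value as it was at that moment, rather than letting every data scientist pick arbitrary timestamps.

```python
from datetime import datetime

# Hypothetical feature history: (timestamp, value) pairs per entity.
feature_history = {
    "user_1": [(datetime(2021, 1, 1), 0), (datetime(2021, 3, 1), 2)],
}

# Business events where predictions happen (quote created, policy purchased...).
events = [
    ("user_1", datetime(2021, 2, 1)),
    ("user_1", datetime(2021, 4, 1)),
]


def value_as_of(history, ts):
    """Latest feature value at or before ts (point-in-time correctness:
    never leak data from after the moment of prediction)."""
    value = None
    for t, v in history:
        if t <= ts:
            value = v
    return value


# Training rows exist only at business points in time -- no free choice
# of timestamp, so training matches what inference actually saw.
training_rows = [
    {"entity": e, "ts": ts, "claim_count": value_as_of(feature_history[e], ts)}
    for e, ts in events
]
print([row["claim_count"] for row in training_rows])  # → [0, 2]
```

Restricting training data to these event timestamps is what makes the training/inference unification testable: both sides look up the same feature at the same well-defined moment.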
So I couldn't help but notice something there – when you talked about this. It felt like you got a little bit
passionate when you started talking about the way that data scientists want their data, or the different
ways, or the rainbow of choices that they have. What's behind that? Why are you so passionate about
that? Have you seen it go the wrong way?
I think this is just how it worked out organizationally at Lemonade. This is what our customers wanted, and they were very adamant about it. But, I mean, I could totally see why some customers would want to be able to choose data from any point in time. It just depends on when you're making predictions. You have to look at what's going on in most of the use cases.
Amazing. Man, this has been – it blew my expectations out of the water. I knew we were gonna have a
great chat, but I didn't realize it was gonna be this good. I want to thank you for coming on here and just
blowing my mind. Vishnu, as always, a great time hanging out with you, too. And that's about it. You got
any final words, Vishnu?
We're gonna take a second afterwards and think about it and come up with lots and lots of quotes, which
we have from here. But, I think your emphasis on platform thinking, the lessons you shared with us, and
the engineering quality at Lemonade really stand out, Orr. Thanks a lot for joining us.
Thank you guys for having me. I really enjoyed it.
Your team is hiring, right?
Always. [laughs] There we go. If anybody is… is it Israel only? Or anywhere?
For my team, we're hiring only in Tel Aviv.
Okay, Tel Aviv. There are quite a few people in Israel in the community. So if you want to go work with Orr and pick up some of this incredible way of looking at ML platforms, hit him up. You're in the community Slack, and I imagine people can also get ahold of you on LinkedIn and all that good stuff.
In this episode
ML Engineering Team Lead, Lemonade
Orr is an ML Engineering Team Lead at Lemonade, currently working on an ML Platform that empowers Data Scientists to manage the ML lifecycle from research to development and monitoring.
Previously, Orr worked at Twiggle on semantic search, at Varonis on data governance, and at Intel. He holds a B.Sc. in Computer Science and Psychology from Tel Aviv University.
Orr also enjoys trail running and sometimes races competitively.
Demetrios is one of the main organizers of the MLOps community and currently resides in a small town outside Frankfurt, Germany. He is an avid traveller who taught English as a second language to see the world and learn about new cultures. Demetrios fell into the Machine Learning Operations world, and since then has interviewed the leading names in MLOps, Data Science, and ML. Since diving into the nitty-gritty of Machine Learning Operations, he has felt a strong calling to explore the ethical issues surrounding ML. When he is not conducting interviews, you can find him stone stacking with his daughter in the woods or playing the ukulele by the campfire.
Vishnu Rachakonda is the operations lead for the MLOps Community and co-hosts the MLOps Coffee Sessions podcast. He is a machine learning engineer at Tesseract Health, a 4Catalyzer company focused on retinal imaging. In this role, he builds machine learning models for clinical workflow augmentation and diagnostics in on-device and cloud use cases. Since studying bioengineering at Penn, Vishnu has been actively working in the fields of computational biomedicine and MLOps. In his spare time, Vishnu enjoys suspending all logic to watch Indian action movies, playing chess, and writing.