Coffee Sessions #127

Reliable ML

By applying an SRE mindset to machine learning, authors and engineering professionals Cathy Chen, Kranti Parisa, Niall Richard Murphy, D. Sculley, Todd Underwood, and featured guest authors show you how to run an efficient and reliable ML system. Whether you want to increase revenue, optimize decision-making, solve problems, or understand and influence customer behavior, you'll learn how to perform day-to-day ML tasks while keeping the bigger picture in mind. (Book description from O'Reilly)

It's great that they wrote this book at all: the space is new, lots of people are entering it with questions, and the book answers many of them. It's also valuable to have all of their experience documented in one place, where everyone can benefit from it.

Transcript

David: Yeah, you mentioned that it's really good for managers, but I also think it's useful for individual contributors themselves, to get a sense of how to mature and grow. I don't really see myself becoming a manager, but as an individual contributor I do want to think more maturely, and part of that is thinking more broadly. You mentioned in your book, I think it was about staff engineers, that they should be very cross-functional, working across the org instead of being focused on their narrow domain. I'm curious, from your experience, why does that work well? It seems related to my earlier question about getting burnt out by being all over the place. Or maybe in your experience that's not the case; it just naturally works, and there are no issues getting people to collaborate across orgs.

Todd: There absolutely are issues. I think there's an analogy, and I'm not sure it's a perfect analogy, in something Niall said about the introduction of information technology into the enterprise. It's hard for us, but go back in time to the 50s and 60s. People have big businesses, they're moving pieces of paper all over the place, they're sending memos, and they have steno pools full of almost entirely women who type things and mail them. This is how business runs. Then computers start to come in, and no one's really sure what to do with them. The interesting analogy is that businesses pick a few things they're positive computers work really well for and have them do those things. But that's not where it matters. It turns out it matters everywhere, not just in some places. If you look at the 80s, the 90s, and the 2000s, people are trying to decide: do we have technology divisions per business group? Do we have an IT organization? How do we think about it? Is technology a cost center or a profit center? ML is going to do the same thing, and I think the reason is that the data come from everywhere; the data are already everywhere. If you do a thing that matters to your business, that thing produces data, and those data could be used to make that part, and other parts, of your business better. If you sell airline seats, and you know which seats you sell and who you sell them to, that can make the catering division of your airline more efficient, but only if you use the data from the seats you sell to predict the meals you're supposed to be producing. You would have to share those data between those business groups. And similarly for magazines, for car manufacturing, for serving ads on search pages, whatever the heck you're doing. I think that's really where it comes about. David... yeah, go ahead, Niall.

Niall: I was just going to say that I agree with what Todd said there, except that the data is everywhere, rather than the data are everywhere. But anyway.

Todd: That is a plural word, and they bullied me. In this book we use "data" as a collective singular. I was outvoted. I'm half making this mistake on purpose, because Niall is wrong, and Cathy is wrong, and D. Sculley, they're all wrong. Okay, sorry, go ahead.

Niall: Data is everywhere in your organization, even if you don't know it. There are probably places where it doesn't escape the context of the local team or the person who's keeping it. But ML is a way to instrumentalize that data: it's a way to do something with it, a way to surface its actual significance, maybe in different places than it would otherwise appear. To come back to Todd's point about airlines and catering and so on: probably there's some rough rule of thumb where the catering department orders, statistically, about the same number of vegetarian meals that somebody told them to order in 2011, and maybe they change it every year. But being geared on the data, surfacing it every day and processing it, would allow you to make much stronger and more relevant predictions. Maybe you have to spend less, or maybe you can predict with more accuracy, and so on.
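The airline example is concrete enough to sketch. Below is a minimal illustration, with entirely hypothetical booking data, column names, and uptake rates, of how one division's seat-sales data could replace another division's 2011 rule of thumb:

```python
# A toy illustration of the airline example: the sales division's
# booking data drives the catering division's meal forecast.
# All data, column names, and rates here are hypothetical.
import pandas as pd

# Seat sales, owned by the sales division.
bookings = pd.DataFrame({
    "flight": ["EI104", "EI104", "EI142", "EI142", "EI142"],
    "fare_class": ["economy", "business", "economy", "economy", "business"],
    "meal_pref": ["standard", "vegetarian", "standard", "vegetarian", "standard"],
})

# Historical rate at which a recorded preference becomes a meal actually
# consumed (catering's own data; made up for this sketch).
uptake = {"standard": 0.9, "vegetarian": 0.95}

# Cross-division join: forecast meals per flight from seats sold,
# instead of a fixed number someone wrote down years ago.
forecast = (
    bookings.groupby(["flight", "meal_pref"])
    .size()
    .rename("seats_sold")
    .reset_index()
)
forecast["meals_to_load"] = (
    forecast["seats_sold"] * forecast["meal_pref"].map(uptake)
).round().astype(int)

print(forecast)
```

The point isn't the model (this is just counting times a historical rate); it's that even this simple forecast only becomes possible once the two business groups share their data.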
David: That's a good point. Maybe we can finish this train of thought: Todd made the point that tooling can really help, not only to harness the value of data but to let other people collect and share it. I'm thinking of a feature store, this whole idea of cross-functional data: getting people to share data so that it can be reused. What I wanted to get at, maybe when Todd comes back on, is that interesting relationship between tooling and, at the same time, enabling people and productivity, or, as Todd was saying, harnessing the power of what's there. They're very closely related. You mentioned that data is everywhere, and I always feel like machine learning is mostly about data a lot of the time. I think about why that is, how the two are related, and how that shapes the tooling we have and the processes around building that tooling. So I'm curious to hear your and Todd's thoughts on that, because it seems like an important part. A lot of people say it's about the business value, about doing what's best for the business, but implicit underneath that is productivity: being able to really extract value from something that everyone can do, but that only a few people do well, because there are all these complexities around it.
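The feature-store idea David raises can be sketched in miniature. The names below (FeatureStore, register, get) are invented for illustration and are not any particular product's API; real feature stores add storage, versioning, and online/offline consistency on top of this core idea:

```python
# A toy, in-memory stand-in for the feature-store idea: one team
# registers a feature once, and any other team can reuse it.
# FeatureStore, register, and get are invented names, not a real API.
from typing import Callable, Dict

class FeatureStore:
    def __init__(self) -> None:
        self._features: Dict[str, Callable] = {}

    def register(self, name: str, fn: Callable) -> None:
        """Publish a named feature computation for other teams to reuse."""
        self._features[name] = fn

    def get(self, name: str, entity: dict):
        """Compute a registered feature for one entity (e.g. a customer)."""
        return self._features[name](entity)

store = FeatureStore()

# The sales team registers a feature derived from its own data...
store.register("seats_bought_90d", lambda cust: len(cust["bookings"]))

# ...and another team reuses it without re-deriving the pipeline.
customer = {"bookings": ["EI104", "EI142", "EI177"]}
print(store.get("seats_bought_90d", customer))  # -> 3
```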
Demetrios: Nice. This has been perfect, because I appreciate the idea that we need to look at machine learning as being so close to the business side of things, not just the software side of things that, as you said, Todd, can be siloed off to just do one thing. You described it as an octopus whose tentacles are everywhere. The other piece of this is the cultural aspect, which doesn't get talked about enough with MLOps. MLOps borrows heavily from DevOps practices, and DevOps is so much based on the cultural side of things, so I appreciate you bringing both of these pieces up. Now I do want to jump into a few things, because, as David was saying, with staff engineers thinking broader, being able to utilize that in the business, and being able to utilize data in new ways: does this also work for smaller startups? Especially for you, Niall; you're living and breathing this now. You went from gigantic companies, where you had these staff engineers and over a hundred thousand people working in the company, to leading the ship of a small startup. How do you think about the two roles in that respect? The two roles of machine learning and staff engineers... sorry, it's really one role in two different places: the staff engineer, or the machine learning engineer in general, being able to be broad, but at a startup. What are the use cases and the difficulties you're facing there, as opposed to when you were at a Google?

Niall: Sure. There's a lot we could say here; organizational theory has many, many books and column inches written about it. But to my mind there are a couple of differences at a startup. The first thing I'd say is that, as my friend Tanya Reilly writes in her staff engineering book, which is due to be released soon, being a staff engineer in a larger organization is essentially about modeling behavior. There's a human modeling aspect, setting culture and so on, which is of course present in a startup, because it is present wherever there are human beings, but there it's much less a question of scale and much more a question of "can we come to a rough agreement about how this should work, and then let's go." The second piece of what a staff engineer does is this: if you're looking around wondering who should do something about a problem, the answer is probably you, because you're the staff engineer. That's a very similar situation to a startup, where if a thing isn't getting done, and it needs to be done, and usually there's a fairly hard and fast rule that lets you decide whether or not it should be done, then the first person to get there, leaving questions of individual specialization aside, should do it. We should pick up after ourselves.
Todd: I think there's a connection between that point and the point David was making earlier. It is characteristic of the staff engineer to have a broader scope, to be sort of undaunted by organizational boundaries or technical boundaries, which is tough. If you're wondering who's supposed to figure out how that division is supposed to work with this division, and you're not talking to anybody else who's asking the same question, that's probably you. And that's a really nice fit for machine learning, because often the best data, or the best idea, or the best collaboration happens at the interstices of these organizations. Frequently the data that you need to make the prediction, or to give the users the value, isn't in one place yet, because getting it there is expensive, and because it requires a lot of privacy oversight, a lot of governance, and a lot of technology. Most companies can't say, "oh, all the data are right here, here's where they are, we're good, just look at what you want." That sounds nice, but it has some downsides: if all the data are right here, can anybody take them? Are they exposed, or are they locked down and access-restricted in appropriate ways? So both by inertia and by concerns about governance, most organizations have a bunch of different data. Everybody likes to assume the perfectly spherical data lake, all in one place, but nobody has that. That's why these intra-organizational but inter-division roles like the staff engineer are so important for machine learning: they have to look around and say, "look, your data plus my data, that's the map, so let's figure out a way to make this work."

Demetrios: I love that.

David: I had a question, but I'll ask it in a second.

Demetrios: No, I was going to jump into something Todd said last time he was on here, unless you remember what you were going to say, David.

David: Yeah, something I thought about when you were saying how the intersection of these different teams is where a lot of the magic happens: can we talk about that in the context of incident response, and why it's particularly important there that the interaction between these teams goes well, and that they can work together effectively? First off, what does that typically look like at most companies, how do they handle incidents specifically? And maybe some best practices you've learned, things that work well and things that don't, because I'm thinking along the lines of: should we have a meeting? Who goes into that meeting? Who gets assigned what? Again, these interesting questions of who has ownership of something really come up when something fails, and when maybe only one person knows how to fix it. I'm wondering what this looks like for Niall in the startup context versus a larger context, and what someone listening can learn from that.
which I 36:01 choose to believe is true but uh anyway of of Jeff Bezos apparently getting 36:07 annoyed it's um relatively early Amazon meeting where lots of folks are kind of 36:12 standing up managers and PM saying we have to communicate more we have to send 36:17 memos to each other or whatever and Jeff says no this is stupid you should be finding out ways that you don't have to 36:24 talk to each other because that will mean you can go faster within in your local scope and this is the two Pizza 36:30 team and the there shall be nothing except an or PC layer team and so on all of which are fantastic kind of rules of 36:36 thumb for running organizations with kind of well-known characteristics I suppose uh so I would say that my 36:45 experience suggests that yes you do need to put some partitions 36:51 in once you go past a certain scale or once you have pii concerns or regulatory concerns of 36:57 various kinds there are all kinds of reasons to to block various kinds of communication 37:03 flow or having to know stuff but what I see in mainstream business culture is 37:09 that we still assume we have way fewer reasons to talk to each other than we actually do and we're I at least in the 37:18 in the stuff we have written we're trying to get people to realize that actually ML and 37:24 liberalizing the value of data should cause you to think again about that 37:30 question but I think yeah I think all of that is true and in the context of machine 37:36 learning like we wrote this whole chapter on incident response in in the book and it was a fun chapter to write 37:44 um because it like you know like for people who do production engineering like thinking about stuff breaking like that's that's like that's the fun stuff 37:51 like their stories they're fun to read but they're actually interesting uh you learn a Time 37:57 uh you do right and all of those were inspired by real like none of them is an 38:04 actual exact outage that happens somewhere but all of them are inspired by real outages that some of us have 38:09 seen um but I think the challenge with machine learning which is I think this 38:16 is sometimes true in other parts of the business but it's always true and machine learning is there is no works or 38:22 doesn't work I mean if you think about what it is to make a model you're like we're making a model 38:27 and it's better than not having a model you're like but is it right and you're like well that's not a super 38:34 well-defined question is it right I mean it's pretty good for lots of stuff 38:39 sometimes you're like okay but so that's fine when you're thinking about model development and you're thinking about 38:45 like building this into your organization or your application but now when you come to do reliability 38:51 engineering or production engineering you're doing ml Ops you're like the model's less good than it used to be 38:58 is that okay I don't know less good for whom under what circumstances and this makes the incident response super hard I 39:05 think and it's super interesting because you're like do we because sometimes you just have an outage you're like oh we 39:10 loaded the model into serving and all of the serving layer crashed okay well that's an out like that's fine we don't 39:16 have a model but like a lot of the times like we have a new model and in some ways we think it might be somewhat less 39:23 good for certain kinds of people than the old model was okay is that an outage I don't know sounds kind of like an 39:29 outage right how are you 
David: Yeah, that fun part... sometimes it's stressful, when you're trying to fix something and no one really understands why it's gone. But something I thought about is: if those are the types of questions you're asking, I would dig into that. Why do you not know who is responsible for that? Why don't we have a system in place to frame the right questions? It almost seems like framing the questions is the important part, and that seems related to team structure, goals, priorities, and things of that nature.

Niall: You should be aware that Todd uses the word "fun" in the context of someone who has been an SRE for well over a decade, and whose adrenaline glands are therefore close to useless.

Todd: I do think one of the things I appreciate about my current employer is that I've never been on call and felt alone, abandoned. I think that's what creates the stress. If you're in it, and you have people to help you, and you're kind of lost but you know how to ask for help, it's not so bad: yes, this is bad, but we're in this together, and we're going to figure it out. But most of the people I talk to are at an organization that, because of its size or its maturity or its bad culture, says: you're on call, and forget it, man, it's just you, good luck holding off the hordes. And you're thinking, "I just got here, I can barely spell ls, what is happening in this conversation?" In those cases, if you don't know where the dashboard is, you don't know what the model's supposed to do, you don't understand how quality is being evaluated, and you didn't understand the data joining system that went into it, it's time to get some help, and that should be fine. So, one thing that I'm wondering... oh, my bad, Niall, go ahead.

Niall: No, I was just going to say you haven't lost me; I'm still here.

Demetrios: Okay. So, Todd, as someone who has had to write out your title a few times, the last time you were on here and then when you were at the Apply conference: it's a mouthful. Do you think that we, as an MLOps or ML community, are going to see something like the MLRE actually take off?
Todd: Honestly, maybe I just don't understand. This is weird to say having worked at Google for about 13 years, but I don't understand the way big corporate tech deals with job titles and professionalism. And I know that's a little bit ridiculous: I'm a senior director at a giant tech corp, and I'm asserting that I don't understand why we do that or how it works. What I know is that, as a hiring manager, I know the kinds of skills I evaluate for. But I do appreciate that there is this social, professional wave that we are all creating and participating in, and I know it matters, because we saw it with SRE. Before we had SRE, before Google popularized the term and other people adopted it and we started collectively thinking about what site reliability engineering would be versus systems administration versus production engineering versus DevOps, it was harder to communicate what skills we were looking for. So I do think we're going to get to something a little bit closer to that. But I would flip it around on you, Demetrios: why do we think that won't simply be all of production engineering? What does it look like 10 or 15 years out? You don't go back and ask, "do you think we're going to have software skills required to work on computers?" Yes, I think we are, and I think we're going to have some machine learning skills required to work on computers in the future. We're going to have 5 to 15 years of transition, but I don't see a world 15 years from now where people are using computers and the computers aren't configuring themselves, where the systems we deploy aren't fundamentally and inextricably dependent upon machine learning. That just doesn't seem possible to me. In which case, why invent a new word for the thing that everyone does? So maybe the answer is no, you don't get "ML reliability engineer"; you just have reliability engineers who have to know ML. Niall, what do you think?

Niall: Todd's moved to occupy the aspirational high ground, so I'll come in with a little bit of low ground. To the extent that SRE has obtained a good reputation in the industry as a whole, people are going to want a bit of that in their org. Somebody comes up to them and says, "oh, you can have an SRE, except for this ML thing, and SREs are really good, didn't you hear?" "Oh yes, I heard that. We'd better hire an MLRE or equivalent." There's just a kind of linguistic degradation overall, as people attempt to get the good things for themselves. So that would be my estimation of what's going to happen.
Demetrios: I love this, especially because... and this is something you've talked about a few times, Todd, and I really appreciate it: the idea that the hard problems aren't specific machine learning problems; they're distributed systems problems, distributed computing, that kind of thing. And then this vision you have that in the next wave everything will be touched by machine learning, so machine learning won't be such a thing. It won't be as hypey as it is now; it'll just be taken for granted, and potentially what we call software in the future will have machine learning inside of it, with the assumption that machine learning is part of software. The one thing I want to touch on is something you mentioned last time you were on the podcast, Todd, around trustworthiness being a real barrier to implementing machine learning. Over the span of writing the book, I know you all touched on ethics and trustworthiness and things like that, and I'm wondering how your viewpoint has changed, if at all, on the trustworthiness of not only the data, but of the whole undertaking: if someone wants to do machine learning, that's a big risk for the company, and they have to trust that the initiative they're about to embark on is going to work.

Todd: We take these points super seriously, and I think they're some of the most interesting macro-level structural, organizational, and societal questions that we have. There are some narrow technical answers about explainability and about certain kinds of governance, and those are important answers. Explainability is a real thing: many models have certain explainability features, and I think the people doing work in explainability are doing important work. However, I also think they do not see the gap between the answers they give and what societies want. Here's what I find pretty hopeful. I am not a policy expert, and my company will kill me if I say important things about big policy, but look at the data protection rules in the European Union going back five, six, seven years. Those were super bumpy, super problematic; they looked almost impossible. But in the end, I think a lot of people look at them and say: wait, as a person, maybe not as a tech company, but as a person, those are actually not too bad. You're saying I have certain rights to control my data, I have rights to know when you're collecting my data, I can make you get rid of it, and I can know what you're doing with it? That sounds pretty reasonable to me. So that's something that went from unfathomable to tech in the early 2000s to seeming pretty reasonable to those of us here in 2022. I think we're starting to see some of the same things for artificial intelligence, and I think that's good. As a technologist, I'd like to just get to do whatever I want; but as a citizen of a couple of different places, no: I want societies to have a deep sense of trust in, and investment in, these technologies, because they're good, because they matter. If we're going to use these to write loans, we should probably not be racist about it, for example. If we're going to use these to drive cars, we should not run into people, and we should know under what conditions we're better or worse than human drivers. But, Niall, you spent a lot more time in that section of the book than I did. What are your thoughts on this general question about the risks of AI in organizations, and where we're going to go in the next few years?
Niall: I suppose I'd start by saying that the chapter in the book we have about this is more or less geared to organizations that aren't doing much yet, that are putting their first toe in the water. It's more or less a guideline: if you aren't doing anything at all, please consider doing the following things. And they are reasonably sensible things: maybe you should care about who has access to your data; here's a technique for making sure you don't disadvantage this subgroup of a larger set as opposed to that subgroup; and so on. Basically, "what would happen if you took this seriously" kinds of questions. But there's a huge amount to talk about with respect to the larger question. I have no crystal ball here whatsoever, but I see this kind of regionalizing in the world. During the writing of this book, the Chinese government issued its statement about how AI should be used inside companies, and the European Union has not just the GDPR but also its guidelines about how AI should be used, etc. I see more regulation happening, and I see the conflict, if that's the right word, between public policy and machine learning generally, and technical development generally, as very definitely a thing that is going to keep happening for years and years yet. And there's that gap between the conventional public understanding of these things, particularly in government, where politicians often have that famous background in philosophy, politics, and economics. Those decision-makers aren't equipped with the nuance to judge when somebody says "this ML stuff is magic pixie dust, you can sprinkle it on everything and it just makes everything better," or "it makes everything infinitely terrible." Those two messages are both clearly discernible in public discussions of this technology, and my concern is that we will regionalize, simply as a factor of our cultural inertia and the directions the various continents are going in, and we won't ever close that gap between the understanding of data science and so forth and the key decision-makers. That is a gap I grow increasingly concerned about.

Todd: It's interesting you say that, Niall, because I see what you're talking about, but in some ways I perceive the gap between technologists and policy as smaller than it was during the early days of data regulation. Back then it felt like people were dug in, fighting tooth and nail. What I see on the AI side is a lot more constructive engagement. I see people saying: hold on, I get it, I totally understand why you want to do some of these things; can we talk about the details a little more? Maybe I'm wrong; maybe it'll blow up in the same way. But between the US tech companies and the proposed European oversight for AI, I'm seeing a lot more constructive "let's try to understand what you're trying to accomplish, and why don't you try to understand what constraints we have and how flexible we can be, and maybe we can work something out." But maybe I'm wrong.
Niall: I don't think you're wrong, but I do think the separation is between the political layer and decision-making, not necessarily the administrative layer. I see the administrative layers of government staffing up with AI regulation units and people whose job it is to know all about this stuff, so there will be local competence. Is the political layer connected to that in the way it needs to be? Not in all regions, and not all of the time, shall we say.

David: You both bring up some really interesting points, and a question I asked myself is: how do we prepare the next generation of practitioners not to treat this as an afterthought? With tech, that's what it's always felt like: something you think about after the fact, while the real focus is on the tech and the cool stuff. Do you feel like typical computer science programs need a class on the philosophy of technology, the philosophy of AI? I sometimes feel that could help. But it also feels related to something we could learn from the past: the public had a certain perception of a piece of technology, and there was a role for government in shaping that, and the technology shaped it too. It feels like they're all related. The perception of these things, and our competence with them, is also related to what practitioners think: do they value thinking about these things? Right now it feels like they don't; it's an afterthought, relevant only when they're in trouble or under pressure. So I don't know if that's the solution. We talked about philosophy earlier, and it does help you think more broadly, to think cogently about things. How can we do that in this space, when engineers just like to focus on what they like to focus on? It's like herding cats. How do you bridge that gap?

Todd: I do have a thought here, and maybe you do too, Niall. There is a very, very long tradition of engineers being held responsible for the uses of their creations, and it goes back generations and generations. Take a relatively recent example: no one thinks the engineers at Volkswagen and Audi are not culpable for the emissions cheating, and the amount of pollution they caused, particularly in Europe, by programming the cars' computers to recognize, using the GPS and the specific test conditions, when they were being tested at the emissions testing places. As engineers, we say: you knew what you were doing, and you were responsible for this. Now, one of the things that happens with some of these creations is that the consequences are a little bit distant. If I have a model that learns from biased humans, and then the model is biased, is that my fault or humanity's fault? What I'm starting to see is a lot of organizations saying: we'll give you the tools to detect when you've reproduced bias, and then it is your fault, because now you can detect it. Did you train a sexist model? If you trained a sexist model, and we gave you the tools to detect that it was sexist, and you launched it anyway, then you launched a sexist model. Anyway, in the book, one of the things we did that I'm super appreciative of is that we have a whole section on privacy and ethics, an entire chapter written by an independent expert in the field, which is fantastic; but we also incorporated those concerns into every single other chapter, because you don't get to isolate them. To your point, David, it's not just a separate class; it has to be in everything you do. Sorry, Niall, did you have any thoughts on that?
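The "tools to detect when you've reproduced bias" that Todd mentions can start as simply as a per-group rate check. Below is a minimal demographic-parity-style alert with hypothetical groups and numbers; real bias audits use many metrics and careful definitions, so treat this as a prompt for review, not a definition of fairness:

```python
# A minimal version of a bias-detection tool: a demographic-parity
# check on a model's positive (e.g. loan-approval) rate per group.
# Groups, decision data, and the 0.8 ratio rule are illustrative only.
from typing import Dict, List

def approval_rates(decisions: Dict[str, List[int]]) -> Dict[str, float]:
    """Positive-decision rate per group (1 = approved, 0 = denied)."""
    return {group: sum(d) / len(d) for group, d in decisions.items()}

def parity_alert(rates: Dict[str, float], min_ratio: float = 0.8) -> bool:
    """Flag if any group's rate falls below min_ratio of the best group's.

    The 0.8 default echoes the 'four-fifths rule'; it is a trigger for
    human investigation, not a proof of fairness either way.
    """
    best = max(rates.values())
    return any(rate / best < min_ratio for rate in rates.values())

decisions = {  # hypothetical evaluation data
    "group_a": [1, 1, 0, 1, 1, 1, 0, 1],
    "group_b": [1, 0, 0, 1, 0, 0, 1, 0],
}
rates = approval_rates(decisions)
print(rates)                # {'group_a': 0.75, 'group_b': 0.375}
print(parity_alert(rates))  # True -- investigate before launching
```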
Niall: I suppose I'd say two things. The first is that, as you say, there's a long tradition of scientists struggling with the consequences of their actions, everywhere from Oppenheimer and the atomic bomb all the way through to a model that somebody pushed yesterday, possibly with slightly different consequences, or a different scale of consequences. The second thing I'd say, coming back to your suggestion, as the possessor of both a humanities degree and a computer science degree: I would like to say yes, sure, we should expose people to these complicated ideas of what "the good" means and how it should be defined, and so on. But is that actually going to produce a necessary improvement, a monotonic improvement, in what we see today? I'm not convinced. I think the thing that matters is that people see advantage in doing the shitty thing, and they do it because there is some advantage to it. How can we make that not be an advantage? Today the answer is regulation, or jail, or fines, or any one of a series of incentives like that. I think we can get a certain distance by relying on character and culture, and those are both very strong forces, with different scopes, etc. But ultimately, if we're trying to say, "people, you should not use this thing in this way," someone is going to have to say it with a loud voice.

Demetrios: So good. Well said. I could sit here talking to you guys for another couple of hours; sadly, you are very important people, you have lives to lead, and we have got to cut it. But this has been super, super cool. I highly encourage anyone out there: if you have not read the book, go out and get it. You can go on to O'Reilly right now and check it out. And the final push: when is it going to come out officially?

Todd: It should be out electronically before you can even edit this; they're talking about pushing it later this week or early next week.

Demetrios: Perfect. So by the time you hear this, unless you guys are editing it in the background and pushing it later today, people will be able to get the book in print within a week or two, and certainly electronically. Incredible. We're also planning a reading club in the MLOps community, so go to the reading group and check that out; we'll all be reading it together. Todd and Niall are both in the community, and we're so thankful to have you guys in there. If you have questions for them, just hit them up in Slack. Not too many questions, because, like I said, they have things to do. The last thing I was going to ask you both is the best way to get a hold of you. Is it just Slack, or do you prefer Twitter or LinkedIn?
Todd: I hate people... no. Niall and I are both on Twitter and both on LinkedIn. I think LinkedIn has increasingly had, partly because of Chip Huyen's book, which is fantastic (you should get her on, if you haven't already), and partly because of some of the work you've been doing over there, Demetrios, an increasing amount of MLOps content that's been really useful. So I think that's a nice, focused way. There's some on Twitter, but it's been a little more focused on LinkedIn, I think.

Demetrios: Excellent.

Niall: I'd say if you come to me on LinkedIn, you will contend with the number of people who want to hire me. If you come to me on Twitter, you'll contend with the number of people who want to be sarcastic at me. If you come to me on Slack, that's probably faster.

Demetrios: Well, fellas, this was awesome. Thanks again for doing this, and we will hopefully have you back on here sooner than another year, because it is just super cool to talk to you all.

Todd: Yeah, but we're not writing any more books. Forget it.

Niall: The books do not write themselves, let's say.

Demetrios: Awesome, guys. Thank you so much. It was such a pleasure.

Todd: All right, thanks so much.

[Music]

In this episode

Niall Murphy

Co-founder & Consultant, Stanza

Niall Murphy has been interested in Internet infrastructure since the mid-1990s. He has worked with all of the major cloud providers from their Dublin, Ireland offices - most recently at Microsoft, where he was global head of Azure Site Reliability Engineering (SRE). His books have sold approximately a quarter of a million copies world-wide, most notably the award-winning Site Reliability Engineering, and he is probably one of the few people in the world to hold degrees in Computer Science, Mathematics, and Poetry Studies. He lives in Dublin, Ireland, with his wife and two children.

Twitter

LinkedIn

Todd Underwood

Director of Engineering, Google

Todd Underwood is a Director of Engineering at Google, where he has worked for more than a decade on the reliability of production and machine learning systems, and a co-author of Reliable Machine Learning (O'Reilly).

Twitter

LinkedIn

Demetrios Brinkmann

Host

Demetrios is one of the main organizers of the MLOps community and currently resides in a small town outside Frankfurt, Germany. He is an avid traveller who taught English as a second language to see the world and learn about new cultures. Demetrios fell into the Machine Learning Operations world and has since interviewed the leading names in MLOps, Data Science, and ML. Since diving into the nitty-gritty of Machine Learning Operations, he has felt a strong calling to explore the ethical issues surrounding ML. When he is not conducting interviews, you can find him stone stacking with his daughter in the woods or playing the ukulele by the campfire.

David Aponte

Host

David is one of the organizers of the MLOps Community. He is an engineer, teacher, and lifelong student. He loves to build solutions to tough problems and share his learnings with others. He works out of NYC and loves to hike and box for fun. He enjoys meeting new people so feel free to reach out to him!