Somewhere between the media hype and command line interfaces, machine learning and AI do represent the opportunity to build better products.
Looking after this space here at Intercom is Fergal Reid. Fergal joined Intercom after starting his own machine learning startup, then selling that startup to Optimizely, where he led a group focused on the topic. Today he’s our machine learning strategist, where he oversees how this technology can be applied to our own Intercom products and things like our Operator bot.
To help separate the hype from where the potential really lies, and what that means for product builders today, I hosted Fergal on our podcast. Fergal outlines what we can expect from this technology in the next few years, which realities are still far away, and the considerations you must make if you choose to build in this space.
Below is a lightly edited transcript of the conversation. Short on time? Here are five key takeaways:
- There’s a lot of buzz around AI and machine learning right now. In order for any discussion on AI and machine learning to be productive, Fergal believes you have to clearly tease out the definitions.
- Technologists have worked on improving computer vision for a long time, but the process has remained manual. In the last five or so years, we’ve really crossed a threshold in computer vision backed by AI.
- Simulating natural human conversation is a challenge. Fergal believes we’re definitely not at the stage where we have a system that’s intelligent and can hold the context of a conversation.
- Creating a machine learning system that’s fit for users continues to be tough. Often you can build a very powerful machine learning system that will do 90% of a task well enough, but then you’re left with this remaining 10% that prevents it from entering the wild.
- Whenever possible, buy intelligent solutions. These solutions are expensive and time-consuming to build.
Des Traynor: Fergal, welcome to the show. Can you share a little bit about your background, as well as why we’re exploring artificial intelligence and machine learning on the show today?
Fergal Reid: Back in the day, I got my undergraduate degree in computer science. I was always interested in artificial intelligence and machine learning, so I specialized in those areas. I did some general software contracting work, and then I went back to school to get a PhD. In 2009, I started working in network analysis and machine learning applied to network analysis. I saw a lot of applied machine learning projects during the course of my PhD and came to the conclusion that the machine learning industry is very batch-oriented – it doesn’t take into account how we want to deploy machine learning systems in real life.
After I got my PhD, I co-founded a startup where we were working to create a dynamic, adaptive style of machine learning. Our startup was acquired by Optimizely, and we spent a couple of years working in product and machine learning roles there. I then started a family, moved back to Dublin from San Francisco and found myself working here at Intercom.
Des: AI and machine learning have, at times, been criticized for being unnecessarily academic. As a former and failed PhD student, I get defensive when I hear people say the word “academic”, because they don’t mean it in a positive way. How did you jump from the world of a very academic PhD into that of producing live code for the world?
Fergal: When people sign on to do PhDs in computer science these days, relatively few go on to have careers in academia. I may have been unusual in that I came back to academia after a few years of doing software contracting. I didn’t get my PhD to launch an academic career. Instead, I wanted to learn more about machine learning and AI, which have always been areas of interest for me.
I actually had a certain amount of software engineering experience before going back for my PhD and that helped a lot when I was back in school. It was pretty natural for me to leave that and do a startup.
Defining AI and machine learning in the context of today’s technology
Des: It seems like in the past two years, you can’t read the news without hearing about a new AI breakthrough or machine learning capability. What’s going on? Is this purely a media thing or have things actually changed in the past few years?
Fergal: There’s a little bit of both. There is definitely a massive spike in stories on machine learning, which blows things out of proportion sometimes. There are also some interesting things happening that are real and significant. In order for any discussion on AI and machine learning to be productive, you have to tease out some definitions.
Des: What are the key terms here?
Fergal: Machine learning is pretty well defined. It’s basically a branch of applied statistics. You have a problem you’re trying to solve, and you have a system whose performance improves when you give it more training data. It’s like if you had a statistical problem, such as estimating the average height of people in a population. The more data you get, the better your estimate.
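Fergal’s height example can be made concrete with a short sketch (ours, not from the conversation; the population numbers are invented): the average error of a sample mean shrinks as the sample grows, which is the sense in which more training data yields a better system.

```python
import random

random.seed(0)

TRUE_MEAN = 170.0  # invented "true" average height, in cm

def estimate_mean(sample_size):
    """Estimate the population mean height from one random sample."""
    sample = [random.gauss(TRUE_MEAN, 10.0) for _ in range(sample_size)]
    return sum(sample) / sample_size

def avg_error(sample_size, trials=200):
    """Average absolute error of the estimate over repeated samples."""
    return sum(abs(estimate_mean(sample_size) - TRUE_MEAN)
               for _ in range(trials)) / trials

# More data gives a better estimate: the error shrinks roughly as 1/sqrt(n),
# so a 100x larger sample cuts the typical error by about 10x.
print(avg_error(10))
print(avg_error(1000))
```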
Machine learning is a pretty well understood field. It’s been around for a while, and it’s valuable. In recent years, it’s starting to deliver the goods on a range of harder problems that historically weren’t so easy to crack.
Applied AI is a very real thing that’s happening
AI is a more nebulous term. It’s a very old term, and it’s a very aspirational term. The origins of AI come from people talking about general intelligence, trying to make systems that are as smart and as intelligent as people are. This is where things kind of get complicated in today’s world: you’ve got applied AI and general human-level AI. The general human-level AI seems like it’s far off, or at least, we’re definitely not there yet, whereas the applied stuff is a very real thing that’s happening.
Des: Can you give me some examples of general AI versus applied AI?
Fergal: With general AI, you’re talking about something that is generally intelligent, like a human. Something that can tackle a wide range of problems. It’s like the concept from science fiction – the computer that thinks like a human does. There is work happening in academia and industrial research labs that seems to be making progress on general questions we couldn’t touch before.
It’s really important to be clear what you’re talking about. A lot of confusion in the discussion about AI at the moment comes from people using these two terms (applied AI and human-level AI) interchangeably. I see articles where someone says, “Elon Musk says AI is going to come and take all our jobs, or come and get us”. Somebody else has an AI company solving very specific industry problems, and the two things get conflated. This doesn’t help anybody.
Is machine learning finally reaching its potential?
Des: When it comes to machine learning, you’ve said there are some things that we’re surprisingly good at now and there are problems that are more solvable than they were in, for example, the year 2000.
Fergal: That’s true. It’s very real and exciting. One great example of this is computer vision. For generations, people were coding algorithms almost by hand, manually coding things to detect features of an image. They tried to detect straight lines and edges in a very manually coded way to recognize a bicycle or bird in a picture. The success was never really quite what we wanted. It was always easy to produce a compelling demo, but hard to produce a system that worked, that you could ship and put in the wild.
In the last five or so years, we’ve really crossed a threshold in computer vision. We now have acceptable accuracy. You can ship Google Photos with a built-in object recognizer to 100 million smartphones, and most of the time it works. There are hiccups and problems, but it’s hit this acceptable error bar for the end user. That’s obviously been one huge success story.
Google Photos’ recognition software can now go as far as successfully identifying pets.
Other big success stories have been in audio recognition and natural language translation. What all these success stories have in common is that we’re much better at understanding unstructured data – data where things aren’t nicely labeled and classified, data that looks like a big image full of pixels or a big sound file full of bits and bytes. We’re much better at taking this unstructured data and turning it into structure than we were five years ago.
This is because of something called deep learning, which is a breakthrough machine learning technology. You could also say it’s an old machine learning technology that’s finally come good. We finally have enough computation power and good techniques to realize its potential.
Des: Is there now like a prototypical example of a problem where we’re still struggling? Like if image recognition or vision is going well, is there a corresponding area where we have yet to really make a dent?
Fergal: There are a lot of problems that we haven’t yet cracked. It’s one thing to look at unstructured data, where you have 100 million photos and over time you’ve learned to recognize the objects in them, but there’s a huge number of things we’re not even close to yet.
For example, consider talking to a chatbot, where a chatbot generates fully natural responses like a human. We’re definitely not at the stage where we have a system that’s intelligent and can hold the context of a conversation.
Des: The distinction you’re drawing there is something that generates responses on its own versus something that can make selections from a pre-configured answer bank, right? You’re saying we’re not at a stage where we’ve built a chatbot that can actually create, conceive and return an answer that’s appropriate.
Fergal: Exactly. We’re not yet at the level where we have anything that requires a general understanding of the domain. We have very powerful techniques for taking unstructured data and compressing it down to a simple representation that we can then use to say, “This looks like a cow, or this looks like a dog, or this looks like the word hello.” That’s a very limited, constrained task – not something that requires a contextual understanding.
Basically, there’s a small number of problems for which we have figured out good solutions, and a much, much larger number of problems that we’re not anywhere close to solving.
Shipping a product that’s actually ready for users
Des: What other things do you wish product and startup folk knew about AI or machine learning? Is there common information you wish was out there?
Fergal: The number one thing I wish people knew is that it takes a long time to build a product that’s good enough to put in front of users in a very unsupervised way. Building a prototype is often easy, but building something that handles all the edge cases, that’s mature enough to put into production and deliver value for users without supervision, is really hard.
Building something that handles all the edge cases is really hard.
The number one thing people should watch out for is that it’s relatively easy to develop a slick demo that looks like you can do something cool, but it can take a vastly longer time to handle all the edge cases required to make a compelling product that you can just put in front of users and will deliver value to them.
Des: You say it takes a long time. Is that during the lengthy period while you are pointing your algorithm at some big data source and letting it learn, or are you tweaking by hand? What’s happening during that period?
Fergal: I think it’s the latter. You have to tweak things by hand. Often you can build a very powerful machine learning system that will do 90% of a task well enough, but then you’re left with this remaining 10%. The question is, what happens with this remaining 10%? If you decide you want to build your machine learning system to handle it, you typically face diminishing returns. It can typically take a month to get 90% accuracy, a year to get 95% accuracy and a decade to get 99% accuracy.
What you see in a lot of production use cases is that people don’t get the machine learning system to handle all the edge cases. Rather, they paper over the cracks. Through a process of trial and error during deployment, they find the edge cases the system is bad at, and they bias their training data to handle those cases.
They use this process of iteration gradually until the product is good enough to leave in front of users in an unsupervised way. But this process takes calendar time and engineering time. As a result, a lot of people get to a demo that solves the 90% use case and everybody celebrates. They think they just need to put some UI polish on it and ship it. In practice, getting it to the point where it’s really solving the problem in a reliable way takes all that extra time.
A high tolerance for error
Des: There are some people who might say, “If this works 10% of the time, that’s a win.” There are other cases where you might see some AI get it right 51% of the time, and be blind to the fact that 49% of your customers are now having a horrible experience. There’s a certain point where it’s cost effective for the business to release the AI into the wild; however, I worry that those two bars might be quite far apart in some sense.
Fergal: There’s a product development tactics question here: what products should you choose to ship? If you’re trying to ship a machine learning product, you really want to ship one where there’s a good tolerance for occasionally getting things wrong.
For example, Google recently shipped these smart replies for Gmail. They unobtrusively provide suggested replies at the bottom of your email. If one of the replies isn’t very good, it doesn’t matter. If one of the replies is good, the user clicks on it and it saves some time. That’s a really nice way to deploy a machine learning product. Rather than say, “It’s going to respond on your behalf”, it simply suggests options.
Gmail smart replies, machine learning in action.
Des: It suggests things I should say, and worst case, I won’t use the suggestions.
Fergal: Exactly. A successful machine learning product picks its battles carefully. It’s about choosing to ship something that has a high tolerance for occasional errors baked into the nature of the product. Even if you want to ship something that does something on the user’s behalf, getting manual approval is a sound approach.
What’s the bar for success for this? It depends on the product. A good product manager has to be very thoughtful about trying to ship pieces that have that affordance and the robustness to combat occasional bad behavior.
Des: So if we’re finding a product feature that’s really well positioned to make use of these technologies, a simple requirement would be that the AI should augment, but not replace, anything that exists today. If you can make things easier for the user, simplify things, reduce things to a click, but don’t click on their behalf, that’s a good start.
Fergal: That’s a fair summary. But it depends on the domain. Take self-driving cars. People speculate that there’s a cliff, where if the self-driving cars are good, but not perfect, we’re actually worse off than when we started.
A wealth of data is a wealthy asset
Des: If you were assessing a project to find out if you could help, what are the other things you look for?
Fergal: A wealth of data is always nice if you can get it. Some companies have access to that, and some don’t. There is an emerging trend in startups at the moment, where people are taking models that have been trained on a wealth of data. Maybe that data is open, or maybe it’s a model that’s been provided by a company. People are saying, “How can I take this pre-trained model that’s enabling me to do something I couldn’t do before and adapt it to a particular domain or a particular challenge I have? Maybe I can start delivering something that’s valuable before I amass that data set, and then later, as I amass that data set myself, I can get that last 10% of accuracy by training it on data that’s specific to my use case.”
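The pre-trained-model pattern Fergal describes can be sketched in miniature (a toy illustration with invented data and a stand-in “pre-trained” layer, not any real company’s stack): freeze a feature extractor trained elsewhere, and fit only a small task-specific head on your own domain’s data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained model: a frozen feature extractor.
# In reality this would be a network trained on a large external dataset;
# here it is just a fixed random projection followed by a nonlinearity.
W_pretrained = rng.normal(size=(2, 8))

def extract_features(x):
    """Frozen 'pre-trained' layer: raw inputs -> richer representation."""
    return np.tanh(x @ W_pretrained)

# Small domain-specific dataset: two classes split along the first axis.
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(float)

# Train only a lightweight logistic-regression head on the frozen features.
feats = extract_features(X)
w, b, lr = np.zeros(8), 0.0, 0.5
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))  # predicted probabilities
    grad = p - y                                # gradient of the log loss
    w -= lr * feats.T @ grad / len(y)
    b -= lr * grad.mean()

p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
accuracy = float(((p > 0.5) == (y == 1.0)).mean())
```

Only the eight head weights (plus a bias) are learned here; the point of the pattern is that the expensive representation is reused, so useful accuracy is reachable before you’ve amassed a large dataset of your own.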
People are still figuring out exactly how to build companies around this. Also, an MVP that you can quickly get in front of customers who will pay you money before you’ve cracked your hard machine learning problems is always lovely, if you can get it.
There is definitely a pattern out there where a lot of startups claim to be backed by AI or machine learning. But they’re deferring the hard machine learning problems, the core of what they’re trying to do, until they raise their Series B. They start by putting a simple rules engine into production, with plans to come back and solve those big problems later.
Des: That seems fine, as long as you’re not crossing your fingers for three rounds of funding.
Fergal: Yeah, exactly. It’s a great thing to do if it’s actually solvable.
When to buy versus when to build
Des: As I look around this space, there is obviously a proliferation of both startups and established companies believing they are extremely advanced in AI. Some problems are relatively abstract, though, such that they apply to many companies. For example, I would guess that many email companies are wondering if they should offer their own version of smart reply. There are also survey companies wondering if they should provide further sentiment analysis.
Basically, is this a space where people are likely to build things out themselves, or should startups look to buy first?
Fergal: You should generally buy if you can. That’s a golden rule. This stuff is expensive to build and to develop. Looking back four or five years ago, we saw a lot of people trying to build horizontal companies, and horizontal offerings in this space.
Des: These are the general purpose engines, right?
Fergal: Yes, and keep in mind that my startup, the one acquired by Optimizely, was trying to build a hands-off general machine learning decision engine.
Des: Is that what Watson is?
Fergal: Oh man. That’s a deeper question. Watson, to a third party without a lot of information, seems like a marketing term. IBM caught flak for that. I don’t really know what Watson is. If anybody knows, please send me an email. I’d love to know.
Companies are trying to find very vertical problems.
People are trying to build horizontal platforms, but the industry has stepped away from that a little bit. People realized that getting these horizontal platforms right was hard. In this second wave of machine learning, companies are trying to find very vertical problems, where they really own a narrow vertical.
They get the data necessary for that vertical, and then they organize the data. They figure out a good way that the data can drive action, drive impact for the business, and then they build something simple that solves that, which gradually over time gets more and more powerful. That seems to be the emerging paradigm at the moment – highly vertical applications of AI. It’s relatively easy to build an okay horizontal platform, but then really hard to integrate it to do the feature engineering for a particular task.
Avoiding risk as you create new solutions
Des: I do believe that we will have very successful general machine learning solutions that work across a lot of businesses, but they are very hard to build. The hard problems of gathering and organizing data, as well as acting on the output of your system, are best solved in a very vertical way. People have converged around that idea. But Amazon and Google are gradually building suites of horizontal pieces of technology, and we’re still learning exactly which of those are valuable.
A lot of our listeners are thinking about artificial intelligence, machine learning, or even rules engines. You mentioned that these projects tend to be unpredictable in their output. How do you work with that level of uncertainty? How do we effectively productize it while also not creating this massive uncertainty in our product roadmaps?
Fergal: How do we productize these things? It’s a great question. We’re still figuring it out. I think teams should try to find the machine learning piece and then isolate it. Try and see if there’s a way of delivering the customer value without building the machine learning piece yet.
Des: So back to this augmentation idea?
Fergal: Yes, back to augmentation. We’re back to doing things that don’t scale. Don’t try to take on the technical risk of building the machine learning along with the product, engineering and design risk at the same time. Try to isolate. Try to cut down that risk, and be very thoughtful about the order in which you take on those risks. The wrong thing to do is to start with six months of machine learning work to solve a problem that customers aren’t actually interested in; accommodating that change will take you another six months.
Des: Is there a tension there? I worry that relegates the machine learning components to the sugar on top of the product.
Fergal: I’m talking to startups or mid-size companies that want to add a bit of machine learning. Obviously, there are some products where the machine learning is so central that it is the product.
For example, if you want to ship self-driving cars, you have to make sure the machine learning works – although, Tesla and Google are approaching this problem very differently. Google seems to be working to nail the machine learning part, while Tesla is using machine learning as the sugar on top.
So, there’s a world in which they’re really disappointed, because it turns out in “x” years’ time that trying to do an optical-based self-driving car is way harder than it seemed. You actually do need a giant lidar installation, or a solid-state lidar, or some other new hardware, to solve the problem.
Des: But in that case, they still have a really successful car company anyway, you know?
Fergal: Yes, and obviously, that gets into the specifics of that particular business, but what I’m saying is that there are some large scale projects where people are leveraging these technologies. In the case of Tesla, they’re building the machine learning as a little thing that sits on top. They hope later on it becomes part of the core value proposition of what they’re currently shipping. In a lot of ways, that’s a sensible way to do things, as long as it doesn’t turn out that some path-dependent decision you made very early on ties your hands for later, and you discover you can’t do it.
It really depends on your core value proposition, too. Is machine learning something that is going to add on to your product and going to make everything 10% more efficient, or, is it something that’s core to what you’re building? You should be really, really thoughtful about that. There’s a lot of startups out there building products enhanced by machine learning in order to attract funding. I advise people to be very thoughtful about that. Certainly don’t let the tail wag the dog, in those cases.
Des: If you build your entire brand around one technology, and you name your company accordingly, it could come back to bite you if you can’t deliver.
Fergal: It could. I am bullish about machine learning – I don’t think it’s a flash in the pan. However, I do think we’re in a period of hype. That’s why people need to be more thoughtful than ever about how they build their brands.
Des: This has been a great conversation. Thanks so much for your time today, Fergal.
Fergal: Thanks very much for having me, Des.