Customer support just got smarter: A closer look at our new ChatGPT-powered bot

What if you could have a trustworthy bot that captures the magic of large language models without going off-script? In this webinar, we introduce our newest breakthrough AI bot and answer all your questions about it.

Ever since ChatGPT launched last November, we’ve been asked one question again and again: “Can it answer my customers’ questions?” And up until this point, the answer has been no. We’ve conducted some initial explorations and even released a number of beta GPT-powered features for the Inbox, but we couldn’t build a bot that was suitable for business needs – GPT-3.5 had an unfortunate tendency to make things up when it didn’t know the answer, and you definitely don’t want a bot going rogue when you’re trying to support your customers.

But with the arrival of GPT-4, which is designed to minimize hallucinations, things have changed. We went back to work to see if we could build a bot that would require minimal setup, converse naturally, and answer questions about your business reliably, using information you control. And lo and behold, we think we’ve done just that. Its name is Fin, and we believe it can be a valuable asset that augments your customer support offering.

And so, just last week, we decided to host a webinar to answer all your burning questions about Fin: Does it really work? Will it replace customer service reps? And how should support teams prepare to incorporate this new technology?

In today’s episode, you’ll hear from our own Catherine Brodigan, Des Traynor, Fergal Reid, and Emmet Connolly.

They’ll talk about how ChatGPT is transforming the customer service industry and take a deep dive into the work behind Fin.

Short on time? Here are a few key takeaways:

  • If people have the choice between getting a very good answer immediately or waiting 15 minutes to get a handcrafted answer, most will pick the instant one.
  • Businesses should leverage AI to provide instant support for simple issues, enabling support reps to focus on the more complex issues that drive the most value.
  • To incorporate AI in their workflows, support teams should document their knowledge in a clear and unambiguous way so GPT-powered bots have a robust source to draw answers from.
  • Support is an extension of your brand. One of Fin’s merits lies in its ability to ensure that only information from your knowledge base is shared, avoiding potential brand damage.
  • To drive Fin’s trustworthiness, we not only train it on your knowledge base – it also links to its source material.
  • Fin works seamlessly with the rest of your system. When it doesn’t know something or gets asked a complex question, it will say so and hand the conversation over to a support rep.

If you enjoy our discussion, check out more episodes of our podcast. You can follow on Apple Podcasts, Spotify, YouTube or grab the RSS feed in your player of choice. What follows is a lightly edited transcript of the episode.


Striking the perfect balance

Catherine Brodigan: Welcome everyone, and thank you so much for joining us today. I’m pleased to introduce Des first of all. And Des, I want to throw a quick question to get us started on the general trends we’re seeing in the industry at the moment. I sit on our sales floor in the Dublin office, and there’s just been a ton of excitement over the last week, both from our Sales team and our customers, about Fin. Two things that have really stood out from our customer feedback so far are, number one, how easy it is to set up, and number two, how quickly it gets going with intelligent answers. But as we know, sometimes new technology comes out and seems like it’ll change the game, only to fall short of expectations. So, I’d love to hear from you, what’s convinced you that we’re not on a hype curve and that this is the real deal?

Des Traynor: Seeing the product live is the biggest indicator that this is not smoke and mirrors; this is not vaporware; this is not even VC-pumped-up hype shit like Web3. Generally speaking, hype is something everyone’s trying to generate in order to profit from it. But our customers want this product. No one who gets what it does is in any way in doubt about its value. Even the demo bot, the Staybnb bot live here, gives really good answers to common questions. I personally ran through all the questions I’d ever asked Airbnb, a sort of competitor to the Staybnb product, and for every question I asked, I got at least a 7 out of 10 answer, and in a few cases, I got a 10 out of 10. And I got them instantly.

“At this point, you’d have to be the darkest of skeptics or cynics to call it hype”

If you had the choice between getting a very good answer immediately or waiting 15 minutes for a handcrafted, artisanal answer, most people are picking the instant answer. And that’s good for the business too. If you’re trying to do something and you have to wait 15 minutes before you can get to the next step, that’s not an effective funnel. You’d never design it that way. And here’s the difference between this and, say, a hype train. If you went back, you could look at gamification, you could possibly talk about AR and VR, and you could even say the self-driving car has yet to really land. This is here today: people play with ChatGPT, people play with Bing, people play with Bard, people play with DALL·E… and there are more coming. Everyone is experiencing this in some way, and the reason customers are asking for it is that everyone is seeing it and living it and breathing it, and it feels like every week that passes is a decade of AI progress at the moment.

Even for us, things like Fin went from being, “Nah, that’s probably not going to happen anytime soon” in early November to, “We’re close, but it’s still a good bit away,” which is where I think we were in maybe December, to, “This is happening in January.” And that’s the pace of progress we’re seeing here. So, not only have we taken a gargantuan step forward – we, the industry – but every week, we seem to be taking faster steps as well. At this point, you’d have to be the darkest of skeptics or cynics to call it hype.

Catherine Brodigan: Yeah, I think, as you say, we’ve moved really, really fast here, and as such, there’s a huge mindset shift required of the industry at large, particularly the customer service industry. So, how do you tackle questions about striking a new balance between that human, personal, artisanal answer, this incredible new tech out there, and a macroeconomic climate that’s driving the desire for economic efficiency and tool consolidation? How do you see this changing the game in how businesses can strike that balance?

“You don’t hire a support team to have a bunch of professional apologists or explainers of, ‘click here to reset password.’ That’s not useful”

Des Traynor: I think businesses now have the option to provide top-tier, super-personal service on the issues that need it, and incredibly fast support on the issues where a simple, fast answer will do. On any given day in Intercom, we will deal with, “How do we get a new API key?” And we’ll also deal with some confusion about, “I put a Series live two weeks ago, and I’m just checking in; one of the customers should have received it, but I’m seeing they’re blocked, and how do I unblock?” Blah blah, blah. The first one takes a little bit of time to answer – it takes a bit of our time, and it takes a bit of the customer’s time.

The second one is actually a messy one, and it could take an hour or two to diagnose – that’s an actual, involved piece of work – but the first one drags on our ability to do the second one, and I think every support team has a version of that. They’ve got complicated queries, like, “Hey, I’ve booked a room for seven nights but only need it for five, and I’m checking out halfway through and coming back,” or something like that, and they also have, “Where’s the swimming pool?” And the idea is that by removing much of what I would call undifferentiated transactional support, you really enable and empower support teams to actually deliver high-quality support in the more complicated moments, the more emotionally charged moments, the more urgent moments, the more emotive moments. They’re the ones where the support team really drives business value. You don’t hire a support team to have a bunch of professional apologists or explainers of, “click here to reset password.” That’s not useful. But at the same time, it’s still unavoidable.

I think the balance, for me, is in finding that sweet spot of, “Where does the support team drive the most value, and where are we just frustrating our customers?” It’s not a brand-building opportunity to explain how to reset your password – it’s just a bloody link. That’s where Fin shines. And support teams shine where they know how to shine. That, to me, is the balance.

Catherine Brodigan: It’s, as you say, figuring out what’s automatable and what’s actually going to need that deeper, more meaningful human conversation. The game is going to change again and again and again over the next 3, 6, 9, 12 months, and there’s going to be inherently more and more value for support teams in the future. But if we were to focus on the here and now for a second, what’s the main value-driver for support teams today, and how should customer service teams think about getting ahead of the game and getting ready to use this technology in the best possible way?

“How do you prepare for this world? The short answer is you prepare for it by documenting all the knowledge support teams know”

Des Traynor: Yeah, I think any future-facing support team should start assuming we’re entering a world where AI is going to hugely augment and empower them in their workflows. You have a massive opportunity to deliver world-class support for your company and ultimately give it a competitive edge by being able to say that your support is better than anyone else’s.

Now, how do you prepare for this world? The short answer is you prepare for it by documenting all the knowledge support teams know. Why does that matter? Well, the advancements here are in the realm of large language models that can ultimately consume information and return conversational answers around them. They don’t know things you don’t tell them. You don’t want them to make up facts. You want them to work off things that are known. If, for example, you have a policy upon which you’ll reissue an API key, but that’s not explained anywhere – it’s like tacit knowledge conferred by osmosis around the support team – Fin’s never going to work that out unless it starts scraping through backlogs.

I think the best way to be prepared is to have a clear stance on all of the most common things that support teams do and have that stance written in a pretty clear way that’s easy to parse. Honestly, the thing is so good that it’ll work it out anyway, but for your own sake, you should be clear. To the degree that you document the majority of the things you need to know to be a support agent, that’s the degree to which Fin will become a rockstar member of your team. The best way to prep is to do that. Thankfully, a lot of our customers who use articles already have hundreds of articles explaining all this, so they’re good to go, but if you’re not there yet, now is a great time to invest.

Catherine Brodigan: Got you. The job of help center content writers and content designers has all of a sudden just become much more of a desirable commodity.

Des Traynor: Yeah, and maybe not even a commodity. Yeah, content will play a really important role, and people with great content will be able to deliver world-class support, so I think it’s well worth the investment.

GPT, but make it trustworthy

Catherine Brodigan: Got you. Fergal, I’d love to bring you in for the next question and go a level deeper into Fin. We launched it last week, and I’d love to hear, from a technical perspective, what makes Fin different and what makes Fin powerful.

Fergal Reid: Sure, Catherine, thanks. And thanks, everyone, for coming along. Yeah, look, large language models are the new breakthrough. We’ve had Resolution Bot for years – it uses neural networks and works very well at what it’s good at once you set it up. But the models we use for that are not as good at understanding the complexity of human conversation. So many times in a support exchange, somebody asks a question, they get a piece of information back, and then they ask a clarifying question or a nuanced question. They’re like, “Oh, that’s not exactly what I meant – I actually meant to ask about that.” With Resolution Bot, we tried to build prototypes that understood the complexity of natural language and could never get them to work as well as we wanted on that multi-turn, messy human conversation.

What I think is new here is that large language models built on transformers – GPT – have got that down. If you go and play with Fin – we’ve watched people play with the demo – we’ve just seen so many examples where it does the right thing on a follow-up question, and we think there’s a qualitative change there. It’s like, “Oh, it’s 10% or 20% better at answering a follow-up question,” but that transforms the user experience and makes people think, “Okay, suddenly I can talk about this. I can talk with it, and I can trust it.” That’s new. There’s a fundamental UX change in the quality of a bot that you can build and deliver.

“Over the next few years, we’re all going to learn a lot about bits of our help center articles that are accidentally ambiguous, and we’re going to get them a lot sharper”

And the second big piece is that the language model is better at natural language, so it can understand help center content better. If you give it an article, it is amazingly good at picking out an answer from that article and giving the correct answer, to the point where we’ve had many instances where we were like, “Oh no, it hallucinated” – but actually, no, that is what the article says. We were testing this on articles from a public help center that we’re not experts in, and Fin did a better job of understanding them than we would have.

And again, it’s not perfect, and it requires nuance. To Des’ point earlier, you really want your content to be unambiguously written because we’re trying to design the bot so that it doesn’t give the wrong answer if something’s ambiguous. If you play with Fin, you’ll notice it’s pretty conservative when something written in an article is ambiguous. I bet, over the next few years, we’re all going to learn a lot about bits of our help center articles that are accidentally ambiguous, and we’re going to get them a lot sharper because we’re just going to see those edge cases and iterate on them. That’s what’s new. Those are transformational capabilities.

Catherine Brodigan: Yeah, absolutely. I think it’s fair to say ChatGPT has reset expectations around some of the most common misconceptions about AI. Where would you say you’ve seen the most significant shifts?

Fergal Reid: Obviously, it’s a huge question. Just from seeing people play with our demo, one of the big misconceptions among users at the moment is that you can come along to a bot like this and ask it anything – ask it to help you with your homework. That’s not what Fin is designed for. Fin was designed very explicitly and clearly to stay away from that. It’s going to answer questions about your help center, or it’s just going to say, “Sorry, I can’t help you with that.” There’s definitely an end-user expectation that once the bot has natural language understanding, it’s okay to ask it to help with your homework, or where the capital of Argentina is, or any other question. And I think that misconception will change pretty fast. Everyone has seen ChatGPT, and I think that now we’re going to see the next wave of folks like Intercom asking, “Hey, how can we take our existing tech and setup and marry that with GPT-style tech to make better and more constrained experiences?” Over the next six months or a year, I think the user expectation will change.

There are lots of misconceptions on the technical side as well. These models do really well out-of-the-box, without a great degree of training. You can’t even train them today – at the moment, if you want to go to GPT-4 or any of the other large language models, you can’t train them for your specific business or even your domain. There’s an extent to which they do amazingly well out-of-the-box without that training, and then there are other ways around it, like how we’ve built and engineered Fin – we give it a lot of context on the business as you interact with it. We’re all learning here, and I think the industry is going to have to learn a lot about the parameters of these models and what makes a good user experience.
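To make that “context, not training” idea concrete, here is a rough sketch of the general retrieval-augmented pattern Fergal is describing – not Intercom’s actual implementation. The sample articles, the naive keyword retrieval, and the call_llm stub are all illustrative assumptions; the point is simply that relevant help center passages are fetched at question time and the model is instructed to answer only from them.

```python
# Illustrative sketch (not Intercom's implementation): instead of fine-tuning
# the model on your business, retrieve the most relevant help center passages
# at question time and pass them to the model as context, with instructions
# to answer only from that context.

from dataclasses import dataclass

@dataclass
class Article:
    title: str
    url: str
    body: str

# Hypothetical sample knowledge base for the example.
HELP_CENTER = [
    Article("Resetting your password", "https://example.com/help/reset-password",
            "To reset your password, click 'Forgot password' on the login screen."),
    Article("Booking changes", "https://example.com/help/booking-changes",
            "You can shorten a stay up to 48 hours before check-in."),
]

def retrieve(question: str, articles: list[Article], k: int = 2) -> list[Article]:
    """Naive keyword-overlap retrieval; a real system would use embeddings."""
    q_words = set(question.lower().split())
    scored = [(len(q_words & set((a.title + " " + a.body).lower().split())), a)
              for a in articles]
    return [a for score, a in sorted(scored, key=lambda s: -s[0]) if score > 0][:k]

def build_prompt(question: str, context: list[Article]) -> str:
    """Constrain the model: answer from the supplied articles or admit it can't."""
    sources = "\n\n".join(f"[{a.title}]({a.url})\n{a.body}" for a in context)
    return (
        "Answer the customer's question using ONLY the articles below. "
        "If the answer is not in the articles, reply exactly: I don't know.\n\n"
        f"Articles:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )

def call_llm(prompt: str) -> str:
    """Placeholder for a completion call to a hosted large language model."""
    raise NotImplementedError("Send `prompt` to your LLM provider here.")

if __name__ == "__main__":
    question = "How do I reset my password?"
    print(build_prompt(question, retrieve(question, HELP_CENTER)))
```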

Catherine Brodigan: Absolutely. We all had a good play around with ChatGPT when it came out, and it’s super impressive, but we’re doing more than just that with Fin here, outside of what these LLMs are doing in the market for free. What would you call out as Intercom’s secret sauce here? The thing that’s going to be most impressive for customers and their customers, who are going to come in and get a hold of Fin and take it out into the market.

“With Fin, even if the underlying language model knows the answer from something it learned about your business or a competitor from the internet, if it’s not in your knowledge base, it won’t respond”

Fergal Reid: Weirdly, I think it’s that we’re both doing more and less, in that we feel it’s really important to have a bot that will only respond with curated content from your help desk. Someone can ask it a question that you might not want a bot answering. People will ask questions that could cause brand damage. If you just deployed a more naive ChatGPT-style bot, people would ask it questions about your competitors, and it would talk to them about your competitors. And who knows what it would say – it would say whatever the internet says about your competitors, and you almost certainly don’t want your customers in those conversations. You would not be happy if your support representatives were doing that, and you’re not going to be happy if your bot does that.

What we feel is so exciting about Fin is that it’s limited. It’s limited to your help center, and we’ve put a lot of care and attention into building it, trying to capture the magic of the natural language dialogue while keeping the ability to limit and trust it. As Des alluded to earlier, we’ve had a bit of a rollercoaster here. After ChatGPT launched, we were like, “Oh my God, this is going to disrupt support; this is going to happen really fast.” And then we were like, “Oh no, it’s not – because it’s not trustworthy; it gives very superficially appealing answers.” And I think now we’re like, “As a technology, it has gotten better.” As we’ve learned to use it more, it’s possible to build trustworthy, business-ready tools. And they have limitations.

With Fin, even if the underlying language model knows the answer from something it learned about your business or a competitor from the internet, if it’s not in your knowledge base, it won’t respond. We’ve deliberately engineered it like that, and we have a lot of conviction that’s what customers are going to want. Now we have to deploy it across a few thousand customers, and as always, there will be edge cases and so on, but the initial response from our customers has been very positive.

Catherine Brodigan: Of course. To pull on that thread around applications of this tech, for folks who aren’t aware, back in January, we launched a bunch of features in the Intercom Inbox with AI assist behind them – things like conversation summaries or text expanders – whereas Fin is obviously a customer-facing product. Where do you see AI being most heavily weighted or valuable for support? Will we continue to invest in AI for support agents as well as for that end customer experience?

“For the conversations it can’t address, we’re going to have way faster support reps. We believe very strongly in investing heavily in both”

Fergal Reid: Absolutely, we’ll continue to invest in it. The question of where it’s most important is really hard, and I’m totally bought in on the value of AI here. I’ve been on the machine learning team at Intercom for about five years, but I’ve been skeptical – part of my job has always been to be skeptical. When someone came along and said, “Hey, my bot will resolve 90% of customer queries,” I’ve always been like, “No, it won’t.” I’m much less skeptical now. This next generation of tech is going to be really transformative.

And for the conversations that it can’t address, we’re going to have way faster support reps. We believe very strongly in investing heavily in both, basically. There’s no way that both aren’t going to change radically – the pace of the underlying technology continues to amaze and astonish. Even for people like us, who are very close to it, things are changing month by month. I think it’s going to be a wild few years for customer support and customer service, and we’re really excited about it. We’re determined to be there, turning these advances into valuable features as fast as possible.

Watch Fin in action

Catherine Brodigan: I feel like this is a really good segue into our demo. Emmet Connolly is our VP of Product Design here. Emmet, before we get into the demo, I’d love to get a quick summary from you on what we built and the standout features in Fin.

Emmet Connolly: Des and Fergal provided a lot of background on the technology and the context we’re launching this into. We’ve built and launched Fin, a natural language chatbot, inside our Messenger, which can exist inside your product. We’ve had chatbot functionality in Intercom for years, but Fin introduces big improvements over the state of the art. First of all, it’s excellent at understanding natural language questions – all sorts of questions that get typed at it – actually making sense of them, and providing natural language answers that are generated as a direct answer to the question. Not just a quote from an article or a pre-canned snippet, but a “Yes, you can do that,” or a “No, you cannot do that” – straight answers.

From a language point of view, it can do other things as well. It can hold full English conversations, where there’s a back and forth. You can say, “Oh, can I bring pets when I’m staying in my…?” And it’ll say, “Yes, you can.” And then you can say, “Well, how many can I bring?” And it understands that “how many can I bring?” refers to pets. You get this very natural, back-and-forth flow. It can ask clarifying questions, follow up, and so on.

“The support you provide is an extension of your brand, a key touchpoint, and you don’t want the bot going rogue”

All of this is based on the GPT-4 model, this cutting-edge language model that lots of people have heard about and tried at this stage. For direct use in a product, these things have some problems. They’re trained on all of the content of the web, so anything anyone says about your company on the internet could potentially be piped through. As Fergal said, whatever you would not want your support staff talking about, we would not want the bot to talk about. We would not want it to talk about almost any subject outside of the domain of your company, or provide almost any answer it can find from the wild web.

And then, finally, it has this tendency to hallucinate – to make up very confident-sounding answers that tend not to be true. With Fin, we set out to solve a lot of these problems. First of all, it’s trained on your knowledge base content, so it can talk about, and is willing to answer, questions within that area but not outside of it, and it will actually decline to engage in conversations about other topics. We realize that in some cases, having a bot say, “I don’t know,” or, “I’m not going to talk about that with you,” is actually a feature and something you really want.

We wanted to drive the trustworthiness as much as we could, so we also have it link to its source material. This allows people to get a simple answer from Fin but also click through, read the article, and learn a lot more. We put a lot of effort into that, and part of that is because the support you provide is an extension of your brand, a key touchpoint, and you don’t want the bot going rogue. And it’s trained on your knowledge base, which already exists in many cases, so there’s essentially zero setup required to turn this thing on. You point it at your knowledge base, set it live, and straight away, the bot slurps up all of that information, treats that as its corpus of knowledge, and begins answering questions instantly.
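As a rough illustration of the “point it at your knowledge base” and “link to source material” ideas, here is a minimal, hypothetical sketch of an ingestion step: articles are split into chunks that each remember their source URL, so any answer assembled from them can link back to the article it came from. The data shapes and chunk size are assumptions for the example, not Intercom’s actual pipeline.

```python
# Hypothetical ingestion sketch: split help center articles into chunks that
# keep a pointer to their source article, so answers can cite where they came
# from. Chunk size and data shapes are illustrative, not Intercom's pipeline.

from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source_title: str
    source_url: str

def ingest(articles: list[dict], max_words: int = 120) -> list[Chunk]:
    """Break each article body into word-bounded chunks, preserving its URL."""
    chunks = []
    for article in articles:
        words = article["body"].split()
        for start in range(0, len(words), max_words):
            chunks.append(Chunk(
                text=" ".join(words[start:start + max_words]),
                source_title=article["title"],
                source_url=article["url"],
            ))
    return chunks

def format_answer(answer_text: str, used_chunks: list[Chunk]) -> str:
    """Append 'Sources' links so the customer can click through and read more."""
    links = {f"- {c.source_title}: {c.source_url}" for c in used_chunks}
    return answer_text + "\n\nSources:\n" + "\n".join(sorted(links))

if __name__ == "__main__":
    kb = [{"title": "House rules", "url": "https://example.com/help/house-rules",
           "body": "Pets are welcome in most listings. Check the listing page "
                   "for any limit on the number of pets."}]
    chunks = ingest(kb)
    print(format_answer("Yes, you can bring pets in most listings.", chunks))
```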

One thing that stands out to me, apart from all of these fancy capabilities, is that the barrier to entry for actually adopting the product is so, so low that there’s almost no reason not to give it a go and see how it works for you. We think the cost-benefit ratio of pointing it at any help center and turning it on is ridiculously positive, and that’s a really good reason for people to adopt Fin and give it a go.

“It’s a conversational, trustworthy, zero-setup AI bot that’s going to really complement support teams and work alongside them”

And then, something special and unique about this is it works with the rest of your system. It’s not this standalone chatbot that’s stupidly trying to answer questions and sometimes failing. We were able to build these constraints and safety features around it because we have the rest of Intercom, particularly support teams, that we can pass those queries to. Fin will say, “Look, I don’t know,” or “I’m not at liberty to talk about that topic, but I can pass you on to my support team.” That goes back to what Des was talking about, having the bot answer the questions that it’s good at and allowing support teams to shine where they’re best.

So, in short, it’s a conversational, trustworthy, zero-setup AI bot that’s going to really complement support teams and work alongside them. We’ve even had people saying, “Wow, it feels like having an additional support team member.” In that handover process, it can ask clarifying questions so that the team has a lot more context before they even get the message sent to them. It’s helping the teams rather than simply helping the customers.
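As an illustration of the handover behavior Emmet describes, here is a small, hypothetical sketch: when the bot cannot produce a grounded answer, it says so and passes the conversation, along with whatever context it has gathered, to a human support rep. The function names and the “I don’t know” sentinel are assumptions for the example.

```python
# Hypothetical handover sketch: when the bot can't answer from the knowledge
# base, it admits that and hands the conversation, with the context it has
# gathered, to a human support rep. Names and the sentinel are assumptions.

from dataclasses import dataclass, field

I_DONT_KNOW = "I don't know"

@dataclass
class Conversation:
    customer_id: str
    messages: list[str] = field(default_factory=list)

def route_to_rep(conversation: Conversation) -> str:
    """Stand-in for creating an inbox ticket for the support team."""
    return (f"Handing conversation {conversation.customer_id} to a support rep "
            f"with {len(conversation.messages)} messages of context.")

def handle_turn(conversation: Conversation, question: str, bot_answer: str) -> str:
    """Reply with the bot's grounded answer, or hand over when it has none."""
    conversation.messages.append(f"Customer: {question}")
    if bot_answer.strip() == I_DONT_KNOW:
        conversation.messages.append("Bot: I don't know (escalating).")
        print(route_to_rep(conversation))  # internal side effect: notify the team
        return ("I'm not sure about that one, but I've passed your question to "
                "our support team. Someone will pick this up shortly.")
    conversation.messages.append(f"Bot: {bot_answer}")
    return bot_answer

if __name__ == "__main__":
    convo = Conversation(customer_id="cust_42")
    print(handle_turn(convo, "Can I pay by invoice?", I_DONT_KNOW))
```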

Catherine Brodigan: Got you. It’s rooted in high confidence, knows its limits, and knows what it’s good at. Emmet, thanks a million for that.
