
The 100 answer challenge: How our support team helped train Answer Bot

Eric Fitzgerald
Senior Customer Support Representative, Intercom
@mrericfitz

Leanne Harte
Customer Support Manager, Intercom
@leanne_harte

Main illustration: Roberts Rurans

We recently launched Answer Bot, an intelligent chatbot that provides precise answers to customers, successfully resolving 29% of your most frequently asked questions right there in the Intercom Messenger.

It was the culmination of a huge amount of work by multiple product teams, and vast amounts of research by our machine learning experts.

The development of Answer Bot also involved Intercom’s Customer Support team to a very significant degree. We have long been a valuable source of customer feedback for the Product team, and we also take part in the QA process and beta test new feature rollouts, providing direct feedback and guidance.

This project, however, marked an entirely new level of cooperation. Designing a product like Answer Bot that sets out to improve and optimize the support experience benefited massively from the input of the Customer Support team, as we aim to be product experts and understand our customers’ top questions from helping them day in, day out.

“We provided 100 answers to our most commonly asked questions in order to help train Answer Bot”

From the very early stages of the project, we worked closely with the Product team to provide feedback on the user experience, and to really help shape this new product as it developed.

And central to our involvement was a significant challenge – providing 100 answers to our most commonly asked questions, in order to help train Answer Bot and make sure it was ready to go from day one.

Finding the right answers

One of Answer Bot’s most powerful features is its ability to analyze your past conversations and surface suggestions for your most commonly asked questions.

But starting from scratch, we needed to collate a pool of answers to help train it and gauge its effectiveness.

We aimed to curate 100 answers in such a short period of time in order to gain a strong understanding of the ins and outs of the product and to begin to really understand the data behind these answers – above all, how often they successfully resolved questions automatically. A sample size of 100 answers allowed us to adequately assess the impact that Answer Bot could have for a support team like Intercom’s.

If Answer Bot could resolve even 10% of our customers’ frequently asked questions, this would free up a serious amount of time. As a result, our team could spend less time on frequently asked questions and devote more time to our customers’ tougher problems, resolving the more technical investigations and helping new customers get set up as quickly as possible.

“If Answer Bot could resolve even 10% of our customers’ frequently asked questions, this would free up a serious amount of time”

Our first approach to testing Answer Bot was to build a dedicated group of Customer Support teammates who would create answers within Answer Bot on the main Intercom workspace, test the product to see what worked and more importantly what didn’t, and provide detailed feedback to the product team.

For our dedicated team we enlisted a group of our most tenured team members, spanning four offices around the globe, all with a deep knowledge of Intercom and our products. We worked together to identify and build out our answers, supplying Answer Bot with suggested answers while it was being tested. Communicating regularly, we reviewed the suggested answers to ensure they were firing properly and sending out the correct information. We were, in a sense, helping to train Answer Bot as it became smarter and more useful.

Questioning our intuition

As we collated 100 answers, we discovered something interesting. At first, we relied on our own intuitive sense of what Intercom customers’ most frequently asked questions were. We were the experts in this field, so it seemed like a natural way to start.

“The anecdotal approach provided very different answers from the analytical”

However, it transpired these questions diverged markedly from what the data revealed. The anecdotal approach provided very different answers from the analytical. For instance, teammates who were involved in particular projects were naturally prone to suggesting questions concerning those topics. And it also highlighted some understandable blindspots – one example of a frequently asked question that we missed was “Where do I find my closed conversations?” This is second nature to the support team, and it didn’t occur to us that it might be confusing to a customer.

When our analytics team provided us with conversation “clusters” (conversations that coalesced around similar questions, themes or topics), we were able to identify the most glaring omissions. We used the clusters as a platform to clearly outline questions for which answers were needed. Using hard data as a springboard from which to craft answers super-charged the product, and saw us hit our goal of 100 live answers within two weeks.
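In spirit, clustering like this groups past conversations whose questions share the same content words, so the biggest topics stand out by sheer volume. The sketch below is purely illustrative – a toy stand-in for the analytics team’s clustering, with an invented stopword list and sample questions, not Intercom’s actual pipeline:

```python
from collections import defaultdict

# Toy stand-in for conversation clustering: bucket past questions
# by their content words after stripping common function words.
STOPWORDS = {"how", "do", "i", "can", "where", "my", "a", "an", "the", "to", "find"}

def content_words(question):
    """Lowercase tokens with common function words removed."""
    return frozenset(
        w for w in question.lower().strip("?").split() if w not in STOPWORDS
    )

def cluster_questions(questions):
    """Group questions that share identical content words."""
    clusters = defaultdict(list)
    for q in questions:
        clusters[content_words(q)].append(q)
    return list(clusters.values())

questions = [
    "Where do I find my closed conversations?",
    "How do I find closed conversations?",
    "How do I invite a teammate?",
    "Can I invite a teammate?",
]
for cluster in cluster_questions(questions):
    print(cluster)
```

Each resulting bucket corresponds to one question an answer should be written for; a real system would use far more robust text clustering, but the principle of letting the data nominate the topics is the same.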

 

The next step was to start monitoring the answers we created, optimizing them for quality, and fixing ones that simply weren’t working.

When an answer was misfiring, one option was to create a dedicated answer for the question it was incorrectly firing against. For example, an answer about how to create messages might have fired for a question about creating users, in which case we simply provided a correct answer for that question. Likewise, we found that one of our answers, about how to invite a new teammate, was firing for questions about deleting teammates. As part of the optimization process, we created an answer for how to delete a teammate and ensured it fired for the right question.

To diagnose why some answers misfired, we focused on the supplied examples. Was there enough meaningful variation between the example questions? “Meaningful variation” means having different ways of phrasing important words within a question. The difference between “Can I?” and “How do I?” is not a meaningful variation. By contrast, the difference between “admin” and “teammate” is a very meaningful variation indeed.
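One way to picture the distinction: strip out the filler words and compare what remains. The helper below is our own illustrative sketch (the filler-word list and function names are invented, not part of Answer Bot):

```python
# Illustrative check for "meaningful variation" between two phrasings:
# ignore filler words and compare only the remaining content words.
FILLER = {"can", "could", "how", "do", "does", "i", "we", "you",
          "a", "an", "the", "to", "my", "our"}

def content_words(question):
    """Content words of a question, with filler words removed."""
    return {w for w in question.lower().strip("?").split() if w not in FILLER}

def is_meaningful_variation(a, b):
    """True if the two questions differ in content words, not just filler."""
    return content_words(a) != content_words(b)

# "Can I?" vs "How do I?" rewording only: not meaningful.
print(is_meaningful_variation("Can I add an admin?", "How do I add an admin?"))       # False
# "admin" vs "teammate": very meaningful.
print(is_meaningful_variation("How do I add an admin?", "How do I add a teammate?"))  # True
```

Training examples that differ only in filler add little; examples that vary the important words teach the bot the real boundaries of a question.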

This illustrates how we trained Answer Bot to recognize certain questions on a given issue. We had to ensure that the examples we fed to Answer Bot contained multiple variations in the phrasing of the same issue, while excluding questions that shared words but differed in meaning.

A final tool available to us was trigger words – clearly defining the phrases that must be present in a customer’s question in order for an answer to fire.
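The effect of trigger words can be sketched as a simple gate in front of an answer: the answer is only eligible to fire if one of its defined phrases appears in the question. (The function and sample data below are illustrative, not Answer Bot’s actual API.)

```python
def answer_may_fire(question, trigger_phrases):
    """Gate an answer: eligible to fire only if a trigger phrase is present."""
    q = question.lower()
    return any(phrase in q for phrase in trigger_phrases)

# Hypothetical triggers for a "how to delete a teammate" answer.
delete_triggers = ["delete a teammate", "remove a teammate"]

print(answer_may_fire("How do I delete a teammate?", delete_triggers))      # True
print(answer_may_fire("How do I invite a new teammate?", delete_triggers))  # False
```

A gate like this keeps an answer from firing on questions that merely share a word (here, “teammate”) with its intended topic.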

Cross-functional development

Our work wasn’t confined to the Customer Support team: we also worked with our Product Education and Sales teams, helping them identify and convey the value of Answer Bot to leads and customers.

“Our support team now has more time available to focus on the meatier questions”

By spending time testing Answer Bot during its evolution, the Customer Support team was set up for success in time for the launch, and the benefit was two-fold – we were able to hit the ground running in supporting our customers as they adopted Answer Bot, and we were able to get great value from Answer Bot ourselves.

We have now established a special Answer Bot team to focus on curating, maintaining and optimizing our answers. Using Answer Bot ourselves means our support team now has more time available to focus on the meatier questions and to help our customers in a valuable way.

And that collection of 100 answers we started out with is evolving all the time, continuously optimized and growing in number. Answer Bot, it’s fair to say, is just getting started.

