A team of researchers studied how AI could assist support agents. This is what they found.

Unleashing productivity: Learn how AI boosts customer service team productivity by 14%, according to Stanford and MIT researchers

Audio Content Producer, Intercom

May 10, 2023

What happens when generative AI meets the workplace? In this special episode, we dive into a groundbreaking study as AI-assisted humans redefine the boundaries of productivity.

Generative AI may have captured considerable attention from the public, and yet, as of now, its real-world economic implications remain largely unexplored. Despite promising signals in test scenarios, any immediate benefits from a business point of view have seemed out of reach, until now.

Researchers from Stanford University and MIT conducted a year-long study to measure the real-world impact of generative AI among over 5,000 customer service agents at a Fortune 500 software firm, and the results are in. Customer service worker productivity increased by 14% on average, with a staggering 35% jump among the newest or lowest-performing workers.

The AI system, which combined OpenAI’s GPT large language model with machine learning algorithms, analyzed conversations between customers and the high performers and compared them to those of the low performers. It then generated real-time suggestions on how to respond to customers, which ended up decreasing chat handling time, increasing chat resolution rates, and improving customer satisfaction. In fact, newly hired customer service agents could, with the help of AI, perform just as well as agents with six months of experience working without AI.

In today’s episode, we had the opportunity to chat with MIT Ph.D. candidate Lindsey Raymond, one of the researchers behind the groundbreaking study, about their work and the transformative impact of AI in the workplace.

Short on time? Here are a few key takeaways:

Generative AI thrives on abundant data, which is what makes customer support, with its wealth of text data, a prime industry for the development of AI tools.
The productivity gap between top-performing and bottom-performing support agents and the increasing reliance on contact centers are major drivers for improvements in the customer service industry.
Low-skill workers benefited most from the AI tool, as it helped them adopt best practices they hadn’t figured out on their own yet.
The significant productivity gains enabled by AI, such as improved problem-solving and customer satisfaction, could even support the rise of the four-day workweek.

If you enjoy the discussion, check out more episodes of our podcast. You can follow on Apple Podcasts, Spotify, YouTube or grab the RSS feed in your player of choice. What follows is a lightly edited transcript of the episode.

Making waves in customer support

Liam Geraghty: Hello and welcome to Inside Intercom. I’m Liam Geraghty. It’s kind of crazy to think that OpenAI’s ChatGPT was only launched a couple of months ago. The speed at which AI became a part of our lives is something nobody could have predicted. It’s already starting to transform the customer service and support space.

“A human plus machine is better than a machine, which in turn is better than a human. I think that’s what I see in this world of support”

Intercom co-founder Des Traynor talked about how he believes the future of CS is automation and humans, bots and brains working together, on a recent episode of our podcast.

Des Traynor: A human plus machine is better than a machine, which in turn is better than a human. I think that’s what I see in this world of support. I think you’ll have humans ultimately controlling the intelligence that the AI works off.

Liam Geraghty: Many customer support leaders have dived right into AI and are swimming around in its generative waters. But others, while excited, are only dipping their toes in, feeling a little daunted.

Well, for any of you toe dippers, you might be interested to hear about a new study by researchers at Stanford University and the Massachusetts Institute of Technology, all about generative AI at work, with some really interesting findings. The study was conducted by Erik Brynjolfsson, Danielle Li, and Lindsey Raymond.

Insights from generative AI in the workplace

Lindsey Raymond: I’m Lindsey Raymond. I’m a graduate student at MIT.

Liam Geraghty: Lindsey and her colleagues studied the impact of generative AI tools on productivity at a Fortune 500 company. It’s the first time the impact of these tools on work has been measured outside of a lab setting.

Lindsey Raymond: The idea of generative AI itself is pretty new. In terms of what people have studied, there has been some work on how these tools perform on things like the bar exam.

Liam Geraghty: AI crushed the bar exam.

Lindsey Raymond: Or coding exams, very laboratory-based examinations of the capabilities. And ours is the first that asks what happens when you study what these tools can do in a real workplace and over a long period of time, because our study occurs over the course of a year.

Liam Geraghty: So what exactly was the study about?

Lindsey Raymond: We look at a generative AI-based tool that’s designed to help tech support workers when they’re solving people’s tech support problems.

Liam Geraghty: Sounds familiar!

Lindsey Raymond: Telling them both what to say, how to solve the specific tech support problem, and also guidance on how they should communicate that to the customer.

“Generative AI needs a lot of data to work really well. If you look at a sector of the economy where there’s high-ish penetration relative to everywhere else, customer support is that area”

And we do a difference-in-differences analysis, based on a very slow rollout of this tool across people over time, so we can try to get at the causal effect of the tool. We’re looking at workers providing tech support for a Fortune 500 firm that does small business and accounting software mostly for US-based small businesses.
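For readers unfamiliar with the method Lindsey mentions, here is a minimal sketch of the idea in Python. It is the simplest two-group, two-period version, not the staggered-rollout analysis the researchers ran, and every number in it is invented for illustration. The function name `diff_in_diff` and the "resolutions per hour" metric are assumptions made for this example.

```python
from statistics import mean

# Hypothetical resolutions-per-hour figures, invented for illustration only.
# "Treated" agents get the AI tool between the two periods; "control" agents do not.
treated_before = [2.0, 2.2, 1.8, 2.0]
treated_after = [2.5, 2.7, 2.3, 2.5]
control_before = [2.1, 1.9, 2.0, 2.0]
control_after = [2.2, 2.0, 2.1, 2.1]


def diff_in_diff(t_before, t_after, c_before, c_after):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(t_after) - mean(t_before)
    control_change = mean(c_after) - mean(c_before)
    # Subtracting the control change removes trends that affect everyone,
    # such as seasonality or a product update, leaving the tool's effect.
    return treated_change - control_change


effect = diff_in_diff(treated_before, treated_after, control_before, control_after)
print(f"Estimated effect: {effect:.2f} resolutions per hour")
```

The estimate is only credible if both groups would have followed the same trend without the tool, which is why a gradual rollout, where agents who have not yet received the tool serve as the comparison group, is useful.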

Liam Geraghty: They looked at a lot of different outcomes, like how quickly people resolved calls, how many issues they were able to resolve, and customer satisfaction, as well as some outcomes that relate more to organizational change.

Lindsey Raymond: How does this impact employee turnover? How does this impact how often they talk to each other or with their managers?

Liam Geraghty: You might be wondering why, out of all the potential areas of generative AI, Lindsey and her colleagues chose customer support to focus on.

“There are pretty huge productivity differences between your top-performing customer service agents and your bottom-performing ones”

Lindsey Raymond: Generative AI needs a lot of data to work really well. If you look at a sector of the economy where there’s high-ish penetration relative to everywhere else, customer support is that area. There’s been a surprising amount of activity for the actual rollout and development of these tools. And that’s because there’s just so much data in that area, particularly text data.

A lot of it is just automatically associated with outcomes – how quickly did that worker resolve that problem? And there’s also a lot of room for improvement. It’s a well-known fact that there are pretty huge productivity differences between your top-performing customer service agents and your bottom-performing ones. It’s also an area where there’s been this huge shift to doing more with contact centers over the past couple of years. And so, it’s an area where there’s a big business need to get better at this.

From zero to hero

Liam Geraghty: So, over the course of a year, they studied all of this using data from 5,179 customer support agents. And what they found is intriguing.

Lindsey Raymond: The headline number is that, on average, access to AI improved productivity by 14%, but that hides a lot of heterogeneity. For the least experienced and lowest-skill workers, it actually improved by 35%. The most experienced and productive workers see almost no effect.

Liam Geraghty: So, the gains accrue disproportionately to less experienced and lower-skill workers. Why does that occur?

Lindsey Raymond: I think that is probably the most interesting part of the study. Any machine learning-based tool uses a training data set and looks for patterns in the data. So you, as a programmer, don’t say, “Well, I know this phrase works well, so do this, and I know this is the common solution to this problem, and this is the common solution to that problem,” and you put that information in your program. That’s not how ML works.

“It’s the workers who are very new or at the bottom of the productivity ranking who really benefit from those suggestions because those are the things they haven’t figured out how to do yet”

In our setting, specifically, the tool looks at the conversations of the high performers and compares them to those of the low performers. It looks for differences between what the high and low performers are doing that are associated with successful outcomes. What is the way they greet customers? What are the solutions they propose? How do they start asking diagnostic questions? Then, it takes all of those things and turns those into suggestions that it generates for everyone. The high-skill workers are providing the content for the AI – those are mostly things they’re already doing because that’s where the AI has been learning from. When you have a tool suggesting you do things you’re already doing, you probably are not going to see huge productivity effects from access to that tool. It’s the workers who are very new or at the bottom of the productivity ranking who really benefit from those suggestions because those are the things they haven’t figured out how to do yet. It’s the low-skill workers who change a lot and start moving closer to communicating like high-skill workers.

What we think is happening is this diffusion of best practices enabled by AI. And that’s why we see those really big productivity increases for the low-skill and inexperienced workers and not so much for the high-skill workers. And that, we think, is just a function of the way machine learning works.

“In any study where you see 35% productivity increases, that’s pretty shocking. You could imagine going down to a four-day workweek with those effects”

Liam Geraghty: Were you surprised by the results?

Lindsey Raymond: That’s a great question. In any study where you see 35% productivity increases, that’s pretty shocking. You could imagine going down to a four-day workweek with those effects. I think that was pretty surprising off the bat. The fact that we saw effects not just in the workers handling calls a little bit faster, but also in the share of problems that they solve, which is more of a knowledge-based outcome, suggests the tool is enabling them to solve problems that they weren’t able to solve before. And then, we see pretty big increases in customer satisfaction. Those were, I think, all surprising.
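As a back-of-the-envelope check on the four-day workweek remark, the sketch below works through the arithmetic. It assumes output scales linearly with days worked, which is a simplification made here and not a claim from the study. The function names are hypothetical.

```python
def weekly_output(days_worked, productivity_gain, baseline_days=5):
    """Weekly output relative to a baseline week with no productivity gain.

    Assumes output is proportional to days worked (a simplifying assumption).
    """
    return (days_worked / baseline_days) * (1 + productivity_gain)


def break_even_gain(days_worked, baseline_days=5):
    """Productivity gain needed to match baseline output in fewer days."""
    return baseline_days / days_worked - 1


# With the 35% gain seen for the newest and lowest-skill agents,
# four days produce slightly more than five days did before.
print(f"Four days at +35%: {weekly_output(4, 0.35):.0%} of baseline output")

# With the 14% average gain, four days fall short of the old five-day output.
print(f"Four days at +14%: {weekly_output(4, 0.14):.0%} of baseline output")

# Gain needed for a four-day week to break even with a five-day week.
print(f"Break-even gain: {break_even_gain(4):.0%}")
```

Under that assumption, a four-day week breaks even at a 25% gain, so the 35% figure clears the bar while the 14% average does not.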

Liam Geraghty: Do you think AI will ever be able to jump in and do these types of studies?

Lindsey Raymond: Probably, yeah. I’m certain there’s generative AI that can write economic papers better than I can write them.

Liam Geraghty: Lindsey, thank you so much for talking with me today.

Lindsey Raymond: Yeah, absolutely. It was a pleasure.