AI in the designer’s toolbox: Shaping the future with ChatGPT

As GPT raises the bar and large language models get more sophisticated, what new horizons will emerge for design thinking? And how will they disrupt the traditional role of product designers?

The launch of ChatGPT has sparked a whirlwind of opinions and discussions, with all kinds of folks engaging in heated exchanges on what it all means for us. Now, we’re the first ones to admit that predictions and speculative forecasting can be a fool’s errand, but one thing’s for sure – these models are already causing a seismic shift in how we think about and build our products.

If you’ve been listening to Inside Intercom in the past week, you know we’ve quickly jumped on the GPT bandwagon, designed new AI-powered features, and shipped them to 160 beta customers (feel free to check out part one and part two of the conversation if you haven’t already). Today, for the latest installment of our GPT chats, we’re joined by some of the people who have actually been doing applied design work with ChatGPT and large language models to build real products that solve real issues for customers.

In this episode, you’ll hear from our very own:

  • Emmet Connolly
  • Molly Mahar, Staff Designer
  • Gustavs Cirulis, Principal Product Designer

They’ll talk about large language models (LLMs) like ChatGPT, and how they will shape the role of a product designer in the years to come. We can’t know what the future will bring, but if you ask us, the best you can do is lean into it.

Here are some of the key takeaways:

  • GPT is really good at summarizing content, understanding language, and editing text. But a major problem is that sometimes its answers sound plausible, but are factually incorrect.
  • As the tech evolves, support orgs will shift from being reactive to proactive by training the AI and ensuring that the support is done in a conversational way that feels natural to humans.
  • New design patterns might emerge for managing uncertainty and expectations, such as building confidence scores into the features.
  • Over time, businesses will be able to use open source models and build layers on top with specialized knowledge using proprietary data from their industry or company.
  • In the future, interacting with AI may involve conversational interfaces, graphical interfaces for workflow augmentation, and even neural interfaces.
  • The role of a designer will be to create an AI interface which functions as an intelligent, non-threatening colleague who can augment your workflow and make your day easier.

Make sure you don’t miss any highlights by following Inside Intercom on Apple Podcasts, Spotify, YouTube, or grabbing the RSS feed in your player of choice. What follows is a lightly edited transcript of the episode.

First encounters

Emmet Connolly: Hi, everybody. Welcome to the Inside Intercom podcast. I’m very excited today to be joined by Molly and Gustavs from the Intercom Product Design Team. Since the launch of ChatGPT a few weeks ago, there have been a lot of heated discussions, a lot of idle random speculation, and a lot of armchair quarterbacking about what it all means. Most of it, I would say, is from people who have not actually worked directly with the technology at all, which is why I’m very excited to talk to Molly and Gustavs today. Because Molly and Gustavs are two of the fairly teeny tiny minority of the entire world who have actually done real applied product design work with ChatGPT and related technologies. I mean, actually using it to integrate with products and solve real product issues for customers with real products operating at scale. So, Molly and Gustavs, welcome to the show. Maybe you’d like to introduce yourselves very briefly. Molly, do you want to go?

Molly Mahar: Sure, sure. I’m Molly Mahar. I’m a Staff Designer here at Intercom. I’m fairly new. I’m embedded with the Machine Learning team, with a team of engineers, and we do a lot of prototyping.

Gustavs Cirulis: Hey, I’m Gustavs. I’m a principal product designer, and I’ve been here for a bit longer than Molly. I’ve been sort of all over the place, but at the moment, I’m working on the growth team.

“It reminded me of behavioral economics in college and the Dunning-Kruger effect, where you’re incompetent but overconfident”

Emmet: Today, we’re going to talk about AI and large language models like ChatGPT. Specifically, about what they mean for design and designers. We will talk a bit about what new opportunities are available to designers, specifically with this new technology, what it’s been like for you or what’s been different about working with AI versus traditional products, and some of the challenges that you’ve encountered as you’ve started to design these AI-powered features. We might even get into some ill-advised prediction-making at some point. But let’s start with the basics. Molly, what was your first reaction when ChatGPT landed on the scene and made quite a big splash just a few weeks ago? You’ve been working with AI and ML systems for quite a while before that.

Molly: Well, first, I got bombarded by a number of screenshots on Slack and started seeing them come in from people all around the company and on Twitter and everything. I tried it out and was like, “This is very cool. This is also very smart.” Large language models (LLMs) have been around for a while, but now they’ve put a UI on their API. And so, more people everywhere are able to use them without having to be a developer or anything, which I think is pretty awesome and shows just how excited people were about them. I started playing with it, and it’s really powerful.

You can ask it a lot of questions, you can follow up. It feels really amazing. It feels like somewhat of a conversation. Then we started, as a team, digging into it to try and stress-test it. And I felt like I was starting to see the hype. It reminded me of behavioral economics in college and the Dunning-Kruger effect, where you’re incompetent but overconfident. And it felt like that sometimes. This ChatGPT is so much better at bullshitting than I am. I’m amazed at it. And so, I went through a wave of feelings about it. I wonder if it would be useful to give a quick overview of LLMs.

“LLMs have been around for a while, getting better and faster and faster. The amazing thing about ChatGPT is that, as a person, I can actually use it”

Emmet: I think so. I think, for a lot of people, there’s this association with ChatGPT as the AI everybody is talking about. So, would you mind explaining in layperson’s terms what ChatGPT is and how that relates to other terms like large language models that folks might have heard about?

Molly: Yeah, I’ll do my best. So, large language models, LLMs, for short, are models trained on a huge corpus of public text from everywhere – books, the internet, multimodal sources, I think, sometimes. Billions and billions and billions of pieces of data inside. And they’re often trained with human feedback along the way. I think that leads to why you can have this conversation with ChatGPT – you can give it feedback, and it’ll actually respond to that and change its responses. LLMs have been around for a while, getting better and faster and faster. The amazing thing about ChatGPT is that, as a person, I can actually use it. And second, it’s actually really, really good. ChatGPT is the front end, basically, and I’m simplifying this a bit, but it’s the front end for a large language model API that OpenAI has in the background. And they have a number of these.

There are a lot of other companies that also have large language models. Google’s working on LaMDA, and there are other companies. And so, we might say ChatGPT here today, but we’re referring to this technology in general. We’re actually working with the APIs behind it, not with ChatGPT, which is only available through the UI right now.

“Before, it was just kind of, ‘Hey, generate me this poem about whatever’. Now, you can have a back-and-forth conversation. That’s how humans interact with each other”

Emmet: Yeah. And I think one of the things that’s interesting about ChatGPT is that, in some ways, it’s not that new from a technical point of view. ChatGPT is an app built using GPT-3.5 built by a company called OpenAI. But GPT-3.5 has been around for a while – several months, right, Molly? So, I’m curious. Gustavs, what was your reaction? Why do you think there’s a different reaction to what we’re seeing with ChatGPT versus the underlying tech, which was available for some time?

Gustavs: I think the big difference is that the presentation is like a conversation where you can ask follow-up questions and go deeper. Before, it was just kind of, “Hey, generate me this poem about whatever.” Now, you can have a back-and-forth conversation. That’s how humans interact with each other. So, it’s way more familiar than giving it a one-off prompt. When I was playing around with ChatGPT when it just came out, it felt like magic. It was really hard to believe this exists. And I just kept playing around with it, talking about different topics, and it felt like having an on-demand personal tutor that knows everything about everything. It talked about all sorts of things about technology, history, psychology, and even comedy. Turns out, it’s really good at coming up with standup comedy if you give it a good prompt. It was really fun to do that as well.

A case of hallucinations

Emmet: You’ve both spent several weeks working with this now. We all had that very impressive initial reaction, but having spent a few weeks trying to apply this to real customer problems, maybe wrestling with directly applying it somehow, does it stand up to the hype, Gustavs?

Gustavs: Yeah. As soon as ChatGPT came out, we were really impressed and realized we had to better understand what it means for our business. It seemed like it could have a really meaningful impact on the whole customer service industry, so we formed a small working group and explored what ChatGPT is good at, what it’s bad at, and what it might mean for our business. After going through that exercise, my own fears and worries and the hype went down a little bit. It seems like the tech is not quite there yet to take our jobs and automate everything.

“The model wants to please you, so it wants to give you an answer that it thinks you want”

Turns out, it’s really good at some things, but not at everything. It’s good at, for example, things like summarizing content or understanding language and editing and creative writing. But it has a major flaw of hallucinations, where it just makes up stuff that sounds very real but is factually incorrect, which is obviously a big problem for a customer service solution. You don’t want to give plausible-sounding but factually incorrect answers. But there are lots of interesting things you can apply it to. And I think the big takeaway is that this technology is evolving really fast. And it’s really only a matter of time before it can give factually correct answers. And once that happens, it’s going to be really disruptive.

Emmet: So, what you’re saying is it will give an answer no matter what. And in some cases, this results in what you called hallucinations. Molly, this seems like one big limitation for anyone trying to use this for real. What are hallucinations and why are they happening in the first place?

Molly: Yeah, it’s a huge problem, as Gustavs said. The model wants to please you, so it wants to give you an answer that it thinks you want. Sometimes, it has a reliable source for that information, and sometimes, it’s just making things up. It feels like a child. “Why did you do that?” “Well, I thought that was what you wanted.” The hallucination might be pulling from a lot of different sources. If you ask it a question about Intercom, it doesn’t necessarily know anything new. And so, it might take pieces of what it knows that are accurate, general knowledge from elsewhere, interpolate that and, in a way, try to use common sense, which, of course, it doesn’t have. It doesn’t really have reasoning capabilities. It uses probabilities like, “Well, this might probably function this way, so I can make up an answer about something about Intercom’s API,” or something like that. And as Gustavs said, it’s super plausible. It sounds very confident.

And as you mentioned, different companies are focusing on different things. There are companies focusing a little bit more on how to minimize hallucinations. Whereas ChatGPT, I think, often focuses a lot on guardrails and ethics and being clear about what it’s refusing to answer.

Emmet: Do you think we’ll see a proliferation of lots and lots of different models and you can choose the one that best suits the kind of trade-off between being absolutely correct and hallucinations that you want, or is this a problem that may just disappear as the models get more mature?

“ChatGPT illustrated something interesting, which is that the UI and the UX of all of this are very important”

Molly: I’m not sure it’ll disappear. But yes, there are already a lot of models. There are open source models and there’s the potential to do what we call fine-tuning on top of a model. GPT stands for generative pre-trained transformer, so it generates things. It’s pre-trained on a large corpus, and it’s built on the transformer architecture. Different companies are going to focus on different things. There are open source models, and Intercom, as a potential user of these models, might be able to fine-tune on top to get more specialized knowledge of our industry or company. The tech will also get better at using and needing less data to have a great model. And so, the models will get smaller and smaller and smaller. And potentially, at that point, it might be a lot more reasonable for a smaller company to create a model on their data and have it be quite specialized, quite knowledgeable, and very reliable.

Emmet: Let’s shift gears and talk more specifically about design. Clearly, GPT and AI, in general, have been primarily a technology story, but I think ChatGPT illustrated something interesting, which is that the UI and the UX of all of this are very important. There seems to be a shift towards conversational UIs, potentially, for example. Do you think that’s true? What’s the role of design in shaping what we do with this tech from here, Molly?

Molly: I mean, Intercom is very well positioned. Our business is about conversation and customer service, and people are getting really excited about having conversations with this tech. But what we’ve found recently is that, at least for the moment, there’s just so much power available in the tech that is actually not directly conversational, but it’s about conversation and language.

As we mentioned, it’s great at summarization, and there are a ton of workflows where summarization can really help customer service agents. We have recently launched a beta to some customers, and summarization is one of the things that people are finding really, really, really valuable. We’ve also added some generative text tools to allow reps to make modifications to their messages if they want to rephrase things, make them friendlier, make them a little more formal, or get help clarifying things. That’s part of the conversation, but it’s not directly having a conversation with ChatGPT. We’re also finding it useful for helping generate things like help center articles, which was also part of this beta release. A lot of the power of this is in some of the more hidden applications that aren’t so obvious to laypeople but are really time-consuming for reps. And we can provide a lot of value with that.

“You’re looking for that intersection of things that the tech is good at and things where there’s a relatively low risk. And we’ll see a lot of those in the months to come”

Gustavs: Yeah. There are many ways you can use this technology, and through that, sidestep some of the problems we’ve seen, specifically with hallucinations, where it’s making up stuff that is not correct. But it’s really good at other things. It’s good at re-wording existing content, and it makes sense to lead with that because it can deliver clear value. The end goal would be to be completely automated and actually give answers. It’s just that the tech is not good enough for that yet. But I think we’ll get there.

Emmet: And I suspect that’s how we’ll see things throughout 2023 because I imagine we’ll start to see this creep its way into lots of different products, probably in relatively simple, foolproof ways to begin with and then increasingly pushing the boat out in terms of the complexity of what it can do. We have all, I think, approached this opportunity with a combination of excitement and maybe a little bit of healthy trepidation as well. Molly, you mentioned we have these features backed by ChatGPT in beta at the moment. And the feedback has been extremely heartening and positive. The earliest signs we’re seeing are real customers getting real utility from features like summarizing a conversation before handing it over to someone else. You’re looking for that intersection of things that the tech is good at and things where there’s a relatively low risk. And we’ll see a lot of those in the months to come. So, that’s going to be exciting.

Conversational AI

Emmet: Gustavs, you’ve been thinking about this more in the long-term view. Could you speak to that a little bit? You mentioned Intercom – one of the reasons we’re here talking about this is we’re probably pretty well positioned, given the nature of our products, which is conversational customer service, to make the most of this. What do you think when you think about long-term product and design opportunities?

Gustavs: In the very early days of the ChatGPT launch, we did this workshop to try and think through the future, specifically about what the world would look like if we had a model that didn’t have this hallucination problem and was able to give good answers or say “I don’t know.” It’s been really promising, and it has really increased our confidence in a lot of things we already believed in but are getting accelerated. We believe that the majority of support queries will be resolved completely automatically without talking to humans. It’s already increasing today with more of the “if this, then that” type of builders, with bots and our own resolution bot, which has some machine learning capabilities but not to the same extent as ChatGPT.

“The majority of support will happen in a way that’s most natural to humans, which is through conversation”

We’re already on that path, but it’s going to get accelerated. And as a result of that, support orgs will start to shift from being reactive and primarily in the inbox to being proactive – setting up and training the AI; writing content that the AI can use to resolve conversations.

I think the majority of support will happen in a way that’s most natural to humans, which is through conversation. Imagine if you had someone you can always talk to that has a personalized answer just for you. That’s the most natural way for humans to interact. This search and browse experience we have today, where you search for something on Google and scan it to try and quickly find answers somewhere in the content, is not that natural for humans. There are still going to be some versions of that with suggestions for content that might be relevant for you before you start a conversation. But when you interact with it, it could still be conversational.

We believe we’ll also need to build a bridge to get there for multiple reasons. I think we’ll start by seeing support rep augmentation with things like summarization or rephrasing. Later, we’ll get into suggestions for replies that support reps can edit and improve upon, and later, we’ll get into full automation. It’s going to take a while, both for the tech and the human aspect of it as well, to get used to using more and more automation.

Emmet: You’re describing something where, across a very broad surface area of the product, there are lots of different places where this can change how we work, both what we call the teammate experience and the end-user experience, in the two sides of the conversation. But you’re also describing this cloudy notion of how we’re going to get to this vague future of “we think the tech will get there.” It strikes me as a very different way of thinking about approaching design today and almost a profound difference in how we think about interacting with computers, going from something very deterministic, very hard-edged – of true and false and ones and zeros – to something way fuzzier.

New design patterns

Emmet: Designers are now looking at working with this material that feels way more unknowable and plastic and less rigid than CRUD apps, “create, read, update, delete,” that we have been used to. What have you found? Is there a substantial difference in how designers need to approach their work? Have you found certain things difficult or challenging? Will designers have to learn new skills? How big of a change is this for the act of designing, the fact that the material we’re designing with almost has this element of unknowability baked into it?

“Over time, we’ll see more and more new design patterns emerge for how to manage this uncertainty and expectations on all sides”

Molly: I think there’s still a lot about our job that’s going to stay the same. We’re finding problems, digging around people’s workflows, finding patterns. One big thing is needing to design for a lot more failure cases because there aren’t necessarily guardrails. When you’re having a conversation, it can go off the rails in so many different ways. And it’s the same with a system like this. Humans, as a species, are not great at probabilities. When we look at the weather report and there’s a 40% chance of rain, we don’t have a great sense of what that means.

Emmet: Yeah, you’re disappointed if it doesn’t rain because you were told there’d be rain.

Molly: Yeah. I’m in the Netherlands – when I see any chance of rain, I’m like, “It will rain. It’s just a question of how long.” That’s what the percentages mean to me. But we’re not that great at interpreting them. I think that’s definitely going to be something as we look at how confident these predictions are because they’re predictions of what words should come next. And we’ll look to get better at that. There’s a lot of dealing with how fast this tech moves and changes, and I don’t think that’s going to change. There’s a lot of prototyping and reacting and thinking about latency. The latency right now can be quite long – designing for that. And there are a lot of unexpected results. Those are some of the things I’ve been noticing.

Gustavs: I think, over time, we’ll see more and more new design patterns emerge for how to manage this uncertainty and expectations on all sides. At the moment, everyone is experimenting and seeing what works. We’re already seeing some patterns emerging with small predefined prompts on how to change text like “expand this,” “summarize this,” “make it friendlier.” It’s a relatively new pattern that is starting to emerge, and I think we’ll see more and more of those types of patterns. Even this interaction where, if you ask ChatGPT to generate content, it has this slowly moving cursor. That’s an interesting design pattern as well. It’s technically required, but it could work really well to set expectations that, “hey, this is AI generating content on the fly.”
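The predefined-prompt pattern Gustavs mentions can be sketched as a small lookup from a button in the UI to a prompt template. This is only an illustration under stated assumptions: the template wording, the `PROMPT_TEMPLATES` table, and the `build_prompt` helper are hypothetical and are not taken from Intercom’s product or from any particular model API.

```python
# Hypothetical mapping from a UI action to a prompt template.
# The action names echo the examples in the conversation; the wording is invented.
PROMPT_TEMPLATES = {
    "expand": "Expand the following message with more detail:\n\n{text}",
    "summarize": "Summarize the following message in one or two sentences:\n\n{text}",
    "friendlier": "Rewrite the following message in a friendlier tone:\n\n{text}",
}


def build_prompt(action: str, text: str) -> str:
    """Return the full prompt for a predefined action.

    Raises ValueError for actions that have no template, so the UI can only
    offer what has been explicitly defined.
    """
    try:
        template = PROMPT_TEMPLATES[action]
    except KeyError:
        raise ValueError(f"Unknown action: {action!r}") from None
    return template.format(text=text)
```

A rep clicking “make it friendlier” would then send `build_prompt("friendlier", draft)` to whichever language model sits behind the feature. Keeping the prompts predefined narrows what the model is asked to do, which is part of why this pattern is a low-risk place to start.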

“In these new systems that might be very automated, are we thinking about adding some friction back in so we retain the skills that feel valuable and that we want to have?”

Emmet: So, you’re saying that this word-by-word, ticker-tape typing effect, which is, to be clear, a function of how the technology makes it up word by word, could become synonymous with AI and a visual calling card. Maybe that’ll happen, maybe not, but it’s the type of thing that tends to emerge when we see these shifts and new technologies emerging. It might be interesting to drill down into the idea of new design patterns, because we do see this when new technologies come along. Molly, are there others that you’ve encountered, either at a very low interaction design level or at the high level of how this gets stitched into products?

Molly: There are a couple of other things that I think will start showing up more. For instance, when we’re trying to develop a feature, engineers are doing backtesting. They’re using past data and making predictions on that and then comparing it to what a teammate actually said, for instance. For things like that, we might need to start launching not on the end user but on the teammate or admin side, where people managing a CS org might want to have what I call a dark launch – they don’t have things live but are able to watch them and get a sense of, “Okay, I now trust this to go.” Varying stages of dark launches, draft suggestions, and different stages of launching some of these tools. I think that’ll be more prominent.
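The backtesting idea, replaying past conversations and comparing the AI’s draft with what a teammate actually said, can be sketched roughly as follows. Everything here is an assumption made for illustration: `generate_reply` stands in for whatever model call a team uses, the 0.6 threshold is arbitrary, and plain string similarity is a crude proxy for the human review a real evaluation would need.

```python
from difflib import SequenceMatcher
from statistics import mean
from typing import Callable, Sequence, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def backtest(
    history: Sequence[Tuple[str, str]],
    generate_reply: Callable[[str], str],
    threshold: float = 0.6,
) -> dict:
    """Compare AI drafts with the replies teammates actually sent.

    history holds (customer_message, actual_teammate_reply) pairs.
    generate_reply is a hypothetical stand-in for the model call.
    """
    if not history:
        raise ValueError("history must contain at least one conversation")
    scores = [similarity(generate_reply(question), actual) for question, actual in history]
    return {
        "mean_similarity": mean(scores),
        "share_above_threshold": sum(s >= threshold for s in scores) / len(scores),
    }
```

In a dark launch, the same comparison would run on live conversations without anything being sent, so an admin can watch the numbers and decide when they trust the feature enough to turn it on.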

I don’t know which direction it’ll go, but I think about points where we might have to add friction back into the system so we don’t get complacency. Pilots still do certain portions of a flight, even though the autopilot system does most of it, because they need to not forget how to fly. So, they’re doing the landings or other things. In these new systems that might be very automated, are we thinking about adding some friction back in so we retain the skills that feel valuable and that we want to have?

Emmet: And clearly, almost everything has an implicit confidence score for the feature built into it that you have to design. Is this something we would expose to the reps and admins or their customers? There’s a higher threshold for us for exposing stuff to their customers or even at a lower level of detail. Take the ability to summarize a long conversation. Do you post that summarization straight into the conversation thread at the click of a button, or do you give someone the opportunity to review and approve it? Let it straight through versus adding an approval gate? I think we’ll probably see lots of workflows emerge, at least initially, and then, do they just start falling off as the tech builds greater and greater confidence?

Molly: Yeah, exactly.

Gustavs: Even just the ability to tell you how confident it is. If the AI could tell you, “Hey, this is my answer, and I’m 40% confident it’s correct,” you might present it for a human to approve before it gets sent. If it’s 90% confident, you can just go ahead and send it straightaway and have a “hey, this is incorrect” button on the end user’s side. It really depends on how the tech evolves. The design will have to evolve alongside it.
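The approval gate described here amounts to a threshold on a confidence score. Below is a minimal sketch, assuming the model could report a usable confidence between 0 and 1, which the conversation notes is not something ChatGPT offers today. The `Draft` type, the `route` function, and the 0.9 cut-off are hypothetical; the number is borrowed from the example above rather than from any tested setting.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    SEND = "send"      # send straight away, with an "this is incorrect" button for the end user
    REVIEW = "review"  # show the draft to a human for approval first


@dataclass
class Draft:
    text: str
    confidence: float  # assumed to be reported by the model, from 0.0 to 1.0


def route(draft: Draft, auto_send_at: float = 0.9) -> Route:
    """Decide whether a draft answer can skip human review."""
    if not 0.0 <= draft.confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if draft.confidence >= auto_send_at:
        return Route.SEND
    return Route.REVIEW
```

With this sketch, a draft at 0.4 goes to a human and a draft at 0.9 goes straight out. The design question raised in the conversation is where that threshold sits and whether it can move as trust in the tech grows.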

Emmet: Yeah. God, grant me the confidence of a large language model because it will absolutely confidently say a total falsehood and the total truth without distinguishing between them. And that’s the trust thing. At the moment, there’s nothing that says, “I’m 100% confident in this statement.” In ChatGPT, at least. In some of the other language models, I believe we’re starting to see sources referenced, which seems a positive step.

Adding layers on top

Emmet: It seems like there are lots of unknown things, lots of nitty-gritty, deep-design decisions like this to get involved in. Let’s zoom out to what these megatrends mean for design and product. People have witnessed or been a part of the arrival of big, new technologies. I’m thinking about things like the cloud or massively shifting to web and mobile as big enabling technologies that led to this whole new world of design patterns and products that were not available before. With the cloud, we saw forms and feeds and likes and all of the visual transformation that the web went through.

You could say a lot of the same for mobile – everything from feeds and hamburger menus to pull to refresh and swipe to delete that we now consider part of a designer’s toolkit. Maybe we’re getting dangerously close to prediction time, but what is your early experience working with this? Does it tell you anything about what types of products are going to win or lose and what new things we might see emerging that weren’t even possible before?

“The businesses that are going to win, I think, are the ones that will have some sort of proprietary data and a flywheel effect continuously gathering and improving that data”

Gustavs: I think, over time, most businesses will be using these publicly available large language models instead of creating their own. But to differentiate from one another, they might build layers on top of them with specialized knowledge. For example, you might have business-specific data – for a support tool, it could be answers to specific questions about your product and your support reps giving specific answers as opposed to generalized knowledge. It could be really deep knowledge of a particular field, such as law.

The businesses that are going to win, I think, are the ones that will have some sort of proprietary data and a flywheel effect continuously gathering and improving that data. The other thing I think is going to be interesting is seeing what the big players like Google, Apple, and Microsoft do with this technology and how they integrate it into the OS level. That could have a huge impact on what kind of niches are available for other businesses.

“OpenAI is losing millions a day to run ChatGPT, and it’s probably worth it from a PR point of view or whatever research data they’re garnering, but it also means it’s not going to be free and leisured”

Emmet: You started off by saying most people are going to integrate these large language models in a certain way. I think that businesses that don’t manage to do what you were saying, and actually find some kind of defensive moat, will find themselves basically a thin wrapper over GPT that doesn’t really do a lot else. So, I fully agree with you there. If you think about something like the App Store or mobile app stores, there were lots of toys and flashlights and things like that in the early days. And then, gradually, it shakes out into big enabling things like Uber, which couldn’t exist if we didn’t have this model, and Instagram and mapping and so on. Molly, anything that you would like to add based on your experience?

Molly: I’m not totally sure that everyone will be using public LLMs. I have a small fear they’ll either be too expensive for a lot of companies to make their business model work or that some of the large companies might keep them private. So, I’m not sure if everyone will be using public ones or if people will move more toward open source and put their fine-tuned layer on top. I agree about the data moats. For instance, at Intercom, we have a lot of conversational data and we’re able to do things that Apple can’t necessarily do on an OS level. And that provides us some value. I think the products that’ll be successful are going to be the ones that, as you said, aren’t just a commodity layer on top, but deeply understand a problem or workflow and can integrate that with their data moat.

Emmet: You also touched on a couple of things that, for the time being, are going to be important around limitations. It is slow. It takes seconds to return a response. There are going to be some products or spaces where it’s just unsuitable. It’s also expensive in terms of computing power and therefore expensive in terms of money. You probably know more than me about this, but every request costs a couple of cents. OpenAI is losing millions a day to run ChatGPT, and it’s probably worth it from a PR point of view or whatever research data they’re garnering, but it also means it’s not going to be free and leisured. And while technology has a very good habit of getting faster and cheaper over time, and this potentially could happen here, for the time being, there are certain limitations that restrict the application. Maybe we’ll see it less in real-time apps. Maybe we’ll see it less in B2C apps, where the scale and cost of running those kinds of queries could be massive. It’s going to be interesting to see how things emerge there as well.

The future of interfacing

Emmet: I’m curious to go deeper in terms of the design conversation and actually think about these generative systems and how we are going to interact with them. We’re alluding to all the new taps and swipes and things you can do when a new platform comes along. This is where we will inevitably have to tiptoe our way into the world of prediction. We can all look back at this in a year or two’s time and laugh at how wrong we are, but there’s an interesting sense that maybe this is shifting towards more of a text-based, almost command-line-based way of interacting. Another kind of micro trend in product has been this command + K palette that you can pop up by hitting a shortcut and typing in the action you want to take. We see that in lots of products, which is contributing to this general sense of a shift towards text and natural language as a direct way of interfacing.

“I don’t think we have to pick one way for interacting with AI. It’s a very broad capability that can be applied in different ways for different use cases”

On the other hand, if you look at previous trends, especially the journey we went through from the command line interface, we ended up building very detailed graphical user interfaces on top. And so, I wonder if you would care to speculate as to where you see this going. Does this augur a shift toward more command-line interfaces for the 21st century? Is this a temporary command-line thing before we figure out what a graphical user interface layer on these things looks like? Is it just too damn early to say?

Gustavs: Well, I think we’ll have all of those. I don’t think we have to pick one way for interacting with AI. It’s a very broad capability that can be applied in different ways for different use cases. So, for example, if you’re looking for an answer, conversation will be the primary way for getting an answer. But if we’re talking about workflow augmentation with AI, I think we’ll see graphical interfaces with predefined actions for AI to take. It’s the same as we’re seeing today with summarize, rephrase, and the whole wave of co-pilot for X.

By workflow augmentation, I mean using AI to improve how you do your work. So, for example, in customer support, it’s when you’re writing replies to customers and using AI to improve those replies. Again, expanding a point or summarizing the conversation up until that point. I think there could be graphical interfaces for those types of workflow augmentation.

Molly: I’m terrible at predictions, but we might have kind of a proliferation of, as you said, command + K interfaces or different options of what you can do. One of the challenges with this tech is the discoverability of what it can do. You can type anything into this prompt. “Write me a Shakespearean poem like a pirate,” or something. We’ll be putting some guardrails, but I think we’ll probably go broad and then see things narrow down a bit as things get more common and useful. And then, eventually, maybe be able to go to more of a text-based or conversational-based or wide-open interface once we have a sense of what this tech can do.

As we get used to talking to our systems, I’m also excited about the potential for neural interfaces. Why talk about it if I can just think it? I know that’s a ways off, but when I was at Berkeley, some of my colleagues were working on that. It would be really cool. There are a lot of situations where you don’t want to talk and type, and this opens things up. Maybe farther in the future, we’ll have integrated systems that can take non-GUI instructions and translate them into actions. We’re already seeing that with some of these systems that can take natural language queries and instructions and turn them into actions on your computer. And the fact is that some of these LLMs are also really good at generating code, like GitHub Copilot. And so, there’s just a lot of potential there.

Emmet: I suspect text manipulation is going to have a great year in software because there’s so much immediate possibility here. It feels very natural to be able to highlight a piece of text and say, “make this friendlier.” It almost feels like that belongs in the tool palette alongside bold and italic. It’s just a way of manipulating the existing text. Then, there are lots of ways of taking that further, like generation or code generation.

I personally found the experience of working with image generators to be quite different. Again, a lot of our experience of these systems is seeing the results scroll by, like screenshots of ChatGPT or something that DALL-E, Midjourney, or Stable Diffusion created. The creation process of the image generators feels clunky to me, and like something that will likely be GUI-fied and have a much more tactile on-screen interface. Having to just stuff the prompt with “short F-stop trending on DeviantArt” to try and get it to create the outputs you want is so clearly a hack. And there are lots of dimensions of different styles that you want to go through that would be way better served by knobs and dials and sliders of some sort. I guess my prediction is we’ll see prompt engineering as it exists today be replaced by something hopefully much better.

“There’s something interesting about the AI being like a super-powered colleague that can use the tools you have and you can give them plain text feedback to help improve them”

And just to finish the thought, video and audio are very different because you have to sit down for a long time and review the results. You can eyeball a hundred images or skim read some text, but I honestly have fewer opinions on that because I’ve been able to sink less time into it. But I guess it gets back to what you were ultimately saying, Gustavs. It’s not a satisfying answer, but it’s going to depend massively. And I think it’ll depend a lot on what’s the thing I’m manipulating. And we might have very different UIs for that depending.

Gustavs: At the same time, I think there are going to be new interesting applications of giving natural language instructions. For example, one thing we found interesting when we did our initial exploration was that the way you could train AI could be very, very similar or practically the same as if the AI was a support agent and you would give them feedback about your policy on how to interact with customers or what tone of voice to use. Even when you’re giving feedback on individual conversations, you could just give those in plain text because it understands natural language and the context. I think we’ll see that as well. And there’s something interesting about the AI being like a super-powered colleague that can use the tools you have and you can give them plain text feedback to help improve them.

Emmet: Molly touched on what happens when these things don’t just spit out text, but can take actions as well, for example. And that’s probably a whole additional level of what they’re capable of.

Where do we go from here?

Molly: Fergal, for those of you who listen to some earlier podcasts, is the Director of Machine Learning. He says that his ideal for an ML system should be like an intelligent colleague sitting next to you to whom you can give instructions and it’s going to actually execute it well. That’s kind of the dream. And so, as Gustavs said, being able to give natural language feedback is just this sea change in how we can manage it.

“How do we make this intelligent, could-potentially-be-threatening colleague, a teammate who is making you better?”

Emmet: I wonder even how much of a range there’ll be. There was an agency called Berg in London a few years back, and they did lots of experiments with earlier iterations of AI. But one of their principles was “be as smart as a puppy” because they didn’t want the AI to feel threatening or overwhelming. And that was their principle on drawing boundaries around us. I don’t like carving out designers as the finger-wagging “you can’t do that” type, but maybe setting those safe boundaries is an important role for designers to play as well.

Molly: I think there’s a role for those boundaries. I do want to work next to a puppy, but do you want to work next to someone with the intelligence of a puppy? I think the role of designers is: how do we make this intelligent, could-potentially-be-threatening colleague, a teammate who is making you better, that can have this really great whiteboarding, brainstorming session where you’re just riffing off each other? How do we get to that? That’s where we can really add this magic – making the workday better, augmenting workflows, and making AI an actual teammate for people.

Emmet: Self-driving cars are probably the most advanced application currently of AI, even though it’s not at a broad adoption level. The tension of these levels of self-driving and the increasing risk as you go through those levels – a version of that probably applies to a lot of these things, if you think about it.

Molly: Yeah, I mean, that’s exactly what we already mentioned. Is it a suggestion? Is there a review? Is there approval? That’s just our version of the five levels of autonomous vehicles.

Gustavs: Another thing that’s interesting is that, over time, as the AI gets better and is able to not just give answers but also perform actions on your behalf, similar to how a colleague might, it’s going to be an interesting design challenge to figure out a way to make it feel like someone sitting next to you and helping you, as opposed to a hacker hijacking your computer and clicking around things. If you can make it work with design, it’s going to feel magical. Or it could be crazy scary. It’s going to be an interesting design challenge.

Emmet: And it’s possible that the conversational route is the best way to do that. The degree to which it’s framed as a person that’s friendly and conversational versus the system that you interact with at a distance will also be interesting to see.

“Will the nature of the production and the ideation work change a lot? Will we have to learn new skills like prompt engineering?”

A couple of years ago, we had what, in retrospect, you could think of as a bot hype cycle. And actually, Intercom was quite actively involved in experimenting and finding out what we could do. Of course, we have products that took advantage of that, as we mentioned already. Things like the Resolution Bot and Custom Bots. But we also found during that hype cycle that there are a whole bunch of applications that are not good for conversational UI at all. There was a weather bot, and you’re like, “Actually, I don’t need a bot to ask what the weather is – I have an app or a webpage that’s fine for that.” We’ll inevitably see a lot of that happen here as well. Probably an over-application of conversational UI, but then the truly useful use cases coming to the fore.

One additional thing I’ll add that makes me quite bullish on the conversational thing is that it’s a problem we’ve been working on for a long time. The Turing test is not new. But aside from that, I worked at Google several years ago. There was a massive amount of work in search and pride in getting it to answer a question like, “How tall is the Eiffel Tower?” Something that just seems super basic in comparison to what we now have available to us. Even voice assistants like Siri suddenly woke up one morning in late November to be almost obsolete.

The speed at which the systems get better will really drive a large part of this, as well. One of the interesting and new things for designers is that we’re along for the ride to a greater degree than when we were working with web technologies or whatever in the past. Where the technology goes from here is going to dictate things as much as our directorial, authorial vision as designers.

“I think it’s going to be really important for designers to lean into this and just play around and tinker with these language models and see how you can apply them to your product”

One last dimension I think about in terms of design, specifically, is the tools that we use and the fact that they have the potential to change dramatically. Will the nature of the production and the ideation work change a lot? Will we have to learn new skills like prompt engineering? Gustavs, any high-level thoughts on what this means for the changing nature of actually doing design?

Gustavs: Yeah. In terms of prompt engineering specifically, I think, over time, we’ll see an emergence of best practices for how to do that in the same way we have for any other technology. And obviously, they’ll evolve and get better over time, but I don’t think it’s going to be a key differentiator that is going to fundamentally shape your business. It’s difficult to tell how the role of the designer will change, and it depends on the timeframe. In the short term, I think it’s going to be really important for designers to lean into this and just play around and tinker with these language models and see how you can apply them to your product, how other businesses are applying it to theirs, and try to find patterns and interesting ways of doing new things.

But over the long term, it’s way more difficult to tell what the impact is going to be on designers in the whole industry. So, as the AI gets better, and not just at augmenting humans, but also at doing full automation of writing and performing tasks, I think that can fundamentally disrupt a lot of products and industries and even the role designers play in shaping those products. I guess we’ll see. Lots of open questions, and it’s going to be interesting to see how it plays out.

Emmet: Yeah. One of the nice things about doing what we do is that occasionally, technology gifts you with a whole new kind of avenue that you can pursue. This definitely feels like it’s a thing that is going to substantially alter the landscape that we work in and create a ton of new challenges and opportunities for designers. For us at Intercom, it’s very exciting to be well along the way and on that path and fully committed to it. It’s going to be an interesting year for AI and designing with AI, no doubt. I’m looking forward to seeing where we get to. Maybe we can leave it at that. Molly, thank you very much. Gustavs, thanks a million. It was great chatting with you and learning from your earlier experience working with this tech. Maybe we’ll do it again when we’re all older and wiser, but for now, thanks very much.
