Two AI chatbots walk into a bar: TV writer Joe Toplyn on teaching chatbots to crack jokes

Audio Content Producer, Intercom

July 6, 2023

We’ve been talking a lot about AI and chatbots: the implications for the workplace and our lives, and what the future holds for humanity. But today, we ask a pretty light question: can chatbots crack a joke?

Humor has long been considered challenging for AI systems to grasp. It requires a deep semantic understanding of text, and it’s reliant on contextual factors and cultural and social nuances. This, however, never deterred researchers from trying to teach AI jokes. There’s, for example, the work of Binsted and Ritchie, who, in 1994, developed a program to generate riddles based on simple puns (e.g., “What kind of tree can you wear? A fir coat.”). Or Matthews and Petrovic, who used unsupervised machine learning methods to generate jokes following the “I like my X like I like my Y, Z” structure, resulting in quips like “I like my coffee like I like my war. Cold.” or “I like my boys like I like my sectors, bad.”

And there’s today’s guest, Joe Toplyn, one of the most qualified people to talk about the subject. Joe not only received a Bachelor of Science degree in Engineering and Applied Physics and an MBA (both from Harvard), but he was also, among other things, co-Head Writer of The Tonight Show with Jay Leno and Head Writer of Late Show with David Letterman. For almost two decades, he wrote and produced thousands of scripts, segments, and jokes – an experience that inspired his book, Comedy Writing for Late-Night TV, where he dissects jokes and comes up with recipes for what he calls punchline-makers.

If anybody was going to teach a computer to have a real sense of humor, it would be Joe. He’s merged his knowledge of jokes with that of engineering to create Witscript, a hybrid AI system powered by GPT-3.5, designed for improvising jokes in conversations.

In today’s episode of the podcast, Joe Toplyn takes us on a journey through his experiences as a comedy writer and teaching chatbots the art of comedy.

Here are some of the key takeaways:

Research suggests people feel more at ease when engaging with tech with human-like qualities. Having a bot with a sense of humor, for example, can create a delightful experience. However, to be used in customer service, such a bot must follow guidelines on timing, tone of voice, content, and alignment with the brand’s persona.
Witscript utilizes a three-part joke structure inspired by late-night talk show monologues by taking any initial topic, executing the joke-writing algorithm, and producing a punchline.
The system operates through a series of prompts that execute the steps of the joke-writing algorithm, generates five joke candidates, and selects the one it believes to be the funniest.
Tools like Witscript can be valuable for all kinds of writers, allowing them to generate a large number of ideas quickly and providing a more efficient approach to the joke-writing process.

If you enjoy our discussion, check out more episodes of our podcast. You can follow on Apple Podcasts, Spotify, YouTube or grab the RSS feed in your player of choice. What follows is a lightly edited transcript of the episode.

Reverse-engineering comedy

Liam Geraghty: Hello, and welcome to Inside Intercom. I’m Liam Geraghty. Over the past few weeks, we’ve been discussing chatbots and AI, but one question I haven’t asked or even thought to ask is, can a chatbot have a sense of humor? Can a chatbot tell a joke, at the very least? Well, my guest today, Joe Toplyn, could not be more qualified to answer that question. Not only did he receive an SB in engineering and applied physics and an MBA (both from Harvard), but he was also co-Head Writer of The Tonight Show with Jay Leno and Head Writer of Late Show with David Letterman. He’s currently the Lead Humor Engineer for Witscript, a hybrid AI system for improvising jokes in a conversation. Joe, you’re very welcome to the show.

Joe Toplyn: Thanks for having me. Hi, Liam.

Liam: So let’s jump right into Witscript. What is it, and how did it come about in the first place?

Joe: Witscript is a hybrid AI system for generating jokes. It’s a neural-symbolic hybrid, which means that it combines a symbolic system, consisting of joke-writing algorithms that I created as a human comedy writer, with a large language model, which is the neural part. So basically, you give it a sentence, which it considers to be the topic of a joke, and it uses GPT-3.5 to execute the steps in a joke-writing algorithm – actually, several joke-writing algorithms that I created based on my experience as a comedy writer.

Liam: I’m presuming that kind of background of those two things smashing together is how you came up with this. It’s so specific between the engineering behind it and the comedy.

Joe: Yeah, I was invited to teach comedy writing, and I decided that people would like to hear me talk about how to write for late-night comedy shows like the David Letterman and the Jay Leno shows. To do that, I had to figure out how I write comedy, how I write jokes and desk pieces and sketches because once I knew how I did it, I could teach other people to do it. So I came up with a course outline, and in the process of doing that, I thought really hard about how humans write jokes because jokes are the building blocks of a lot of the other short-form comedy pieces on a late-night comedy talk show. And nobody had really done that before. I did a lot of research, read a lot of books, and nobody had a system or recipe for writing the kind of joke you’d have in a late-night comedy monologue.

So, I read a lot of jokes and reverse-engineered them. I looked at them and said, “All right, what did the writer do with these words to get a laugh? How did the writer go from the topic sentence, the subject of the joke, to the angle and the punchline?” And I factored in my own joke-writing process. What does my brain do when I’m trying to write a joke? I’m reading the news saying, “All right, I have to come up with jokes ’cause that’s what I’m getting paid to do. How do I approach that task?” So I reduced that process to a bunch of recipes. I call them punchline-makers – there are other techniques involved – taught that to the students, and eventually decided that there might be other people interested in what I had to say.

And so, I wrote a book, Comedy Writing for Late-Night TV. All the algorithms are in there. People were buying it, and I asked myself, who else might be interested in what’s in the book? I did a little research and found out there was an academic field called ‘Computational Humor.’ And I thought, “Oh, this is interesting.” It was a fairly new field. It had only been around for about 20 years. I started contacting researchers in the area, introducing my book, and saying, “Well, you might be interested in a book that explains how humans write jokes because then maybe you could teach a computer to do that.”

I made a little progress, but it wasn’t moving fast enough. Eventually, I decided that if anybody was going to teach a computer to have a sense of humor, it was going to be me. At that time, the tools of AI were starting to get useful. Years ago, IBM’s Watson beat humans on the TV show Jeopardy, and that was a big milestone in artificial intelligence and what a computer can do with language. I read a paper on that and decided that if Watson could beat humans at Jeopardy by doing these tasks, it could write a joke because joke writing uses a lot of those same tasks. That gave me encouragement. Then, text generators started coming along, Word2Vec, word embeddings, vector spaces, and I used whatever tools I had to come up with a very crude way of generating a joke that involved wordplay.

Then, the AI tools got more and more sophisticated. As I got a more talented tool, I would incorporate it into the Witscript software. And then, about a year and a half ago, GPT-3 came out and then 3.5, and I tried that and said, “This is really great. This is a much easier and more efficient way to execute the steps of the joke-writing algorithm than what I’d been using before.” So I plugged in GPT-3.5, and that’s what Witscript is now. It’s a way of writing a joke using the latest, most useful large language model I have access to right now.

Quip it up

Liam: That’s great. Why do chatbots need to be able to generate original relevant jokes when they’re chatting?

Joe: There’s a fair amount of research that says that people are more comfortable interacting with technology like chatbots if they seem more human-like. And one way to make a chatbot more human-like is to give it a sense of humor, to allow it to recognize and improvise a joke. So, in the right situations, a chatbot that can occasionally drop in a joke at an appropriate time based on something the user said will relax the user, make them more comfortable, and the experience more delightful. And so, in that situation, it might be useful for a chatbot to have a sense of humor.

A system like Witscript can also be used by somebody who just wants to write jokes – a comedy writer or somebody who sees something on social media and wants to say something funny about it. That person doesn’t necessarily have the skill to write a joke quickly or doesn’t want to hire a comedy writer to write the joke, so they can use Witscript to come up with a joke and use it for whatever they need it for: to punch up a speech or maybe come up with a tagline for a product or something like that.

Liam: And outside of that, what areas are you talking about that Witscript could be applied to? Could something like this be used for customer service chatbots?

Joe: It definitely could be. Can you get the chatbot to know the appropriate time to tell a joke? If there’s an angry user screaming at the chatbot, a joke probably wouldn’t be a good idea. Another factor is to make sure the joke is appropriate. If it’s generating a joke completely by itself with no human curation, you have to make sure there are certain guidelines and that the joke is going to be acceptable to the audience.

For as long as I’ve been testing Witscript, it has never really come up with a joke that involves loving Hitler or anything like that. GPT-3.5 is trained on the entire internet, books, and Wikipedia, so what Witscript thinks about President Biden or Donald Trump is basically kind of the average of what everybody thinks about Biden and Trump, which makes the jokes that it comes up with fairly safe and generally acceptable.

Another factor to consider is the brand that the chatbot is representing. If the brand has a playful persona, you might say that the chatbot could be improved or made more entertaining by adding a humor module like Witscript.
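Editor's sketch: the three conditions Joe mentions (timing, appropriateness, and brand persona) can be read as a simple gate in front of a humor module. The Python below is only an illustration under that reading. The class, field, and function names are hypothetical, and how each flag would be computed (for example, detecting an angry user) is left open because the interview does not describe it.

```python
from dataclasses import dataclass


@dataclass
class JokeContext:
    # All three flags are hypothetical inputs supplied by upstream checks.
    user_seems_angry: bool        # timing: is this a bad moment for a joke?
    joke_within_guidelines: bool  # content: did the joke pass the business's rules?
    brand_is_playful: bool        # persona: does the brand allow humor at all?


def should_tell_joke(ctx: JokeContext) -> bool:
    """Return True only when timing, content, and brand persona all allow a joke."""
    return (
        not ctx.user_seems_angry
        and ctx.joke_within_guidelines
        and ctx.brand_is_playful
    )


print(should_tell_joke(JokeContext(False, True, True)))  # calm user, safe joke, playful brand
print(should_tell_joke(JokeContext(True, True, True)))   # angry user: hold the joke
```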

Witscript’s recipe for conversational humor

Liam: One of the signature parts of any late-night talk show is the monologue. And monologue jokes are the models for Witscript’s jokes. Why monologue jokes for a conversation?

Joe: Because the structure of a monologue joke is topic, angle, and punchline. The topic is the sentence the joke is based on. In the case of a late-night talk show, it’s the news item. The angle is the direction the joke takes to get to the punchline, and the punchline is the incongruity at the end that the audience resolves suddenly. That’s what produces the laugh.

One of the insights that led to Witscript was that that structure is basically what happens when you’re improvising a joke in a conversation. Your friend says something to you – that’s the topic of a potential joke. All you have to do as a comedy writer or joke improviser is take that topic and execute the steps of the joke-writing algorithm that a late-night comedy show writer would take to create a joke based on that topic. The news topic in a monologue for a comedy show is the same as the setup you’d get when somebody says a sentence to you.
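Editor's sketch: the topic, angle, and punchline structure Joe describes can be written down as a small record. The Python below is an illustration, not Witscript's actual data model. The class name is hypothetical, and the split of the Netflix joke (quoted later in this episode) into angle and punchline is an editorial guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonologueJoke:
    topic: str      # the sentence the joke is based on: a news item or a friend's remark
    angle: str      # the direction the joke takes to get to the punchline
    punchline: str  # the incongruity at the end that the audience resolves

    def reply(self) -> str:
        # In a conversation the other speaker has already supplied the topic,
        # so the joke teller only says the angle and the punchline.
        return f"{self.angle} {self.punchline}".strip()


joke = MonologueJoke(
    topic="Netflix shareholders voted against big compensation packages "
          "for the company's top executives.",
    angle="Well, I guess they'll just have to",  # editorial guess at the split
    punchline="Netflix and chill.",
)
print(joke.reply())
```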

Liam: How does Witscript execute all of those steps you’re talking about in the basic joke-writing algorithm?

Joe: It’s a series of seven or eight prompts. The user gives the input – it could be a news item or a funny observation, something that Witscript then takes as the potential topic of a joke. And then, almost very literally, the program has a prompt for every step in the human joke-writing algorithm I’ve used as a framework for Witscript. The first step is to select two topic handles, for example. Topic handles are the two most important nouns or noun phrases in the topic. The first step in writing a joke would be to identify those – that’s something a large language model can do. You can give GPT-3.5 a prompt, “What are the two most interesting nouns and noun phrases in this topic?” and it will execute that step. Those topic handles will feed into the next step of the joke-writing process.

The general term for that process is prompt-chaining – the output of one prompt, which you get back, becomes the input for the next prompt. It’s a series of steps, which allows the system to be very transparent. When I’m debugging the system, I’m trying to figure out, “Well, why aren’t these jokes funnier?” I can go and say, “All right, it selected the wrong topic handles. I have to tweak that prompt,” or, “The associations that it generated for Tom Cruise were not the ones that, as a comedy writer, I would’ve focused on. How do I get better associations for that prompt?” It’s just a series of prompt design and tweaking and adjusting all these little levers.
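Editor's sketch: prompt-chaining as Joe describes it, where each prompt's output becomes the next prompt's input and every intermediate output is kept for debugging. This is a minimal Python illustration, not Witscript's code. Only the first prompt is quoted from the interview; the other two prompt wordings, the function names, and the canned replies from the stand-in model are assumptions made so the example runs offline.

```python
from typing import Callable, List

# Any function that takes a prompt string and returns the model's text reply.
LLM = Callable[[str], str]


def run_chain(topic: str, templates: List[str], llm: LLM) -> List[str]:
    """Run the prompts in order, feeding each output into the next prompt.

    Returns every intermediate output so a weak step can be inspected.
    """
    outputs: List[str] = []
    current = topic
    for template in templates:
        current = llm(template.replace("{input}", current))
        outputs.append(current)
    return outputs


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; the replies are canned, not model output.
    if prompt.startswith("What are the two most interesting"):
        return "National Donut Day, free donut"
    if prompt.startswith("List a few associations"):
        return "sugar, fat, upset stomach"
    return "[punchline would go here]"


TEMPLATES = [
    # Quoted from the interview:
    "What are the two most interesting nouns and noun phrases in this topic? {input}",
    # Hypothetical wording for later steps:
    "List a few associations for each of these topic handles: {input}",
    "Write a punchline that links two of these associations: {input}",
]

trace = run_chain(
    "It's National Donut Day, and Krispy Kreme is offering a free donut.",
    TEMPLATES,
    fake_llm,
)
for step, output in enumerate(trace, start=1):
    print(f"step {step}: {output}")
```

Keeping the whole trace, rather than only the final punchline, is what makes the kind of debugging Joe mentions possible: if the topic handles in step 1 look wrong, that is the prompt to adjust.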

Liam: That’s interesting. How does the system evaluate itself? How does it determine what a good joke is?

Joe: The system works by generating five joke candidates. I have five separate techniques for coming up with a potential punchline that I, as a human, use when I’m writing jokes and now Witscript uses. And you can see the five joke candidates – A, B, C, D, E. Then, it selects the joke candidate it believes is the funniest. That’s just something I asked GPT-3.5 to do. What does the machine think will be the funniest to the user? And that was a big revelation too. I could rely on the system to not only generate the possible punchlines but come up with the one to deliver as its final choice. If it’s in a conversational system, it can’t rattle off five potential jokes to the user and say, “You pick one,” it has to pick one and then deliver that.

It’s also interesting to look at the five and say, “Oh, interesting. That approach produced that joke.” Many times, it’s not a joke at all – that was the output of that particular algorithm. And that’s helpful because if you’re a comedy writer, you may not like the final choice that Witscript offers. You might like B better than E. Or you could take B and change that word, and that’ll be a great joke. The system could also be a very useful writing assistant for coming up with your own jokes.
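Editor's sketch: the selection step, where five labeled candidates go to the model and it is asked which one the user will find funniest. This Python is an illustration only. The prompt wording, the function name, and the fallback to the first candidate are assumptions, and the candidate strings are placeholders rather than real Witscript output.

```python
from typing import Callable, List

# Any function that takes a prompt string and returns the model's text reply.
LLM = Callable[[str], str]

LABELS = "ABCDE"


def pick_funniest(topic: str, candidates: List[str], llm: LLM) -> str:
    """Ask the model for the funniest labeled candidate and return that candidate."""
    if not candidates:
        raise ValueError("need at least one joke candidate")
    labeled = dict(zip(LABELS, candidates))
    listing = "\n".join(f"{label}. {text}" for label, text in labeled.items())
    prompt = (
        f"Topic: {topic}\n"
        f"Joke candidates:\n{listing}\n"
        "Which candidate will the user find funniest? Answer with one letter."
    )
    answer = llm(prompt).strip().upper()[:1]
    # Assumed fallback: if the reply is not a valid label, use the first candidate.
    return labeled.get(answer, candidates[0])


# Placeholder candidates, one per punchline technique.
candidates = [f"candidate from technique {n}" for n in range(1, 6)]
choice = pick_funniest("some topic sentence", candidates, lambda prompt: "B")
print(choice)
```

A writer using the system as an assistant could skip this step and read all five candidates directly, as Joe suggests.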

Liam: Yeah, I was going to say it’s perfect in that situation where you can tweak it a bit. Could you give us some examples of Witscript’s jokes? And have you compared them to jokes you might have written and asked people to say, blindly, which is which?

Joe: Yeah, I post jokes that Witscript wrote every day on Twitter. Let me read a few of the recent ones. This is one I posted yesterday. The user says, “It’s National Donut Day, and Krispy Kreme is offering a free donut.” And Witscript says, “Get ready for National Diarrhea Day.” Donuts cause diarrhea. I actually looked that up. And yeah, they do ’cause of the fat and the sugar. Another, “Netflix shareholders voted against big compensation packages for the company’s top executives.” And Witscript says, “Well, I guess they’ll just have to Netflix and chill.” The jokes are directly related and contextually relevant to the input.

A comedian’s toolbox

Liam: Comedy writing is so fascinating. There’s a broader discussion about art and AI, and I suppose this is something you would’ve used back in the day when writing? Some people might be resistant to something like this.

Joe: The monologue writers on a late-night talk show have a big job, especially working for Jay Leno on The Tonight Show. He would do a 30-joke monologue, which means the writing staff had to come up with literally hundreds of jokes every day. In that situation, where quantity and quality are both important, I could easily see a writer using Witscript to just input the news of the day and say, “All right, give me some ideas.” Some of the jokes would be word-perfect, you wouldn’t have to change them at all; they could just go on the air. Some would need a little work by the human, and some would be useless, but you can just ignore them. I could see professional writers and certainly non-professional writers using Witscript, but maybe not admitting it. If you’re a professional, you may not admit that you’re getting help from a machine.

Years ago, there was a program called Idea Fisher, and some comedy writers used that. You basically put in a word like Christmas, and it just gave you lots of associations. What do you think of when you think of Christmas? Christmas carols, Santa Claus, North Pole, elves. Part of the process of writing a joke is linking associations. One of the top 10 lists we did on the Letterman Show was “Top 10 Santa Claus pet peeves.” So it’d be useful to have a list of associations. What do you think when you think of Santa Claus? And then it would be a joke about Rudolph the Red-Nosed Reindeer and holiday fruitcake or going down the chimney. That was an early example of how software helped professional joke writers. And this is, I think, just an extension of that. That’s how I was introduced to Idea Fisher. Somebody said, “Oh, here’s something I use.” Because, as a human, you do that anyway. It’s just an easier way to do it.

Liam: I’m guessing you would have a field day writing jokes about AI and chatbots if you were writing a monologue for a late-night talk show now.

Joe: Yeah. Here’s one that Witscript wrote about that. The user says, “Tech experts are warning that artificial intelligence poses a risk of extinction for humans.” And Witscript says, “If only we could use AI to figure out how to get rid of AI.” Here’s another one, “The president of Microsoft says he expects the US government to regulate artificial intelligence this year.” Witscript says, “Don’t worry, the government will regulate AI just as well as it regulates everything else.” So, pretty good jokes. Certainly pitchable, if you’re turning in jokes for a comedy show.

Liam: Absolutely. Where is Witscript at the minute? And where do you see its future?

Joe: I’m still doing some internal testing and tweaking. It’s in a limited beta test mode. The next step is to figure out the best way to allow individuals to have access to it, and that’s going to mean coming up with a way to keep track of users and process payments and things like that. I’m exploring ways to do that efficiently to get it into the hands of people who can use it.

Liam: Where can people go to keep up with it and read more about it?

Joe: You can go to witscript.com. If you want to see the latest output from Witscript, go on Twitter @witscript. You can see Witscript’s take on the news of the day. I also write jokes and post them on Twitter. @joetoplyn is my Twitter handle. Sometimes I’ll write a joke on one topic, give the same topic to Witscript, and Witscript will have its own take.

Liam: That’s brilliant. Straight after this, I’m following you and Witscript. We can all do with a few more jokes in our timelines at the minute. Joe, thank you so much for joining me today.

Joe: Thank you, Liam. It’s been fun.