Updating racist terminology in our codebase

Words matter: Removing exclusionary terminology from our codebase

Main illustration: Florencio Zavala

In the past few months, the world has changed in many ways. The Black Lives Matter movement has sparked a much-needed dialogue about diversity and inclusion in the workplace.

Intercom has responded by hiring a diversity and inclusion consultant, updating our recruitment strategy and hiring practices to reduce bias, scheduling allyship training for all employees, and amplifying Black voices in our podcast and blog. We take so much pride in the work Intercom is doing to take a stand against racial injustice. And while seeing this work come to fruition, Sam, a Product Engineer at Intercom, was inspired to find more ways to get involved.

If there’s one thing we know to be true, it’s that words matter. And an astute awareness of the language we use, and how it impacts those around us, is essential for all of us as we move towards eliminating our own conscious and unconscious biases. As we thought about ways to reduce bias in the workplace and do our part to stand against racism, we realized that the terminology we use in our codebase at Intercom reinforced the very ideas we set out to take a stand against.

So we took a deeper look into the history of those terms and found a way forward to eliminate their use at Intercom, in the hope of creating an inclusive and welcoming environment for all.

The language we use shapes our world

Sam has experienced firsthand the lasting effects words can have. As they tell it:

“When I was fresh out of college, I was working at my first job in the tech industry. One day, I came into a meeting a few minutes late after another meeting had run over. My team’s lead engineer stopped what he was doing and announced to the all-male room, ‘Looks like the token female engineer has arrived.’ (I was presenting as a woman and using she/her pronouns at the time.) It hurt to be called a token; what hurt even more was that nobody spoke up for me. Years later, this memory sticks with me as a testament to the bystander effect and, more importantly, the power of words.”

The Sapir-Whorf Hypothesis states that the language you use shapes your view of the world. It was first proposed after Benjamin Whorf, a fire safety inspector at the time, observed how oil workers treated empty oil barrels as less hazardous than full oil barrels, despite them being equally flammable due to the traces of oil remaining in them. Oil workers would casually smoke around the “empty” barrels because calling them “empty” made the workers perceive them as safe. Language is powerful; it stands to reason that using racist terminology could reinforce racist biases.

Think about the impact that racism has already had in tech. For example, webcams with built-in facial detection sometimes don’t detect people with darker skin, because they were only tested on people with lighter skin. This is a huge problem; it can translate into self-driving cars being more likely to hit darker-skinned people because they don’t “see” them. Many people believe tech to be apolitical and unbiased, but that’s simply not true. As long as humans are programming computers, humans will be transferring their own biases over to computers. We have to actively fight our own biases to correct this.

These words have a major effect on the groups they’re marginalizing. The experience of being called a “token” is living proof that these things add up over time and can make you feel unwelcome.

Paving a way forward

Now, in the midst of a civil rights movement, a Twitter thread about the racist history of the terms “master/slave” and “whitelist/blacklist” moved Sam to take action.

In computing, the terms “master” and “slave” refer to the primary and secondary nodes in a database replication scheme. “Master” and “slave” also refer to the gruesome practice of slavery. The terms “whitelist” and “blacklist” refer to lists of explicitly allowed or explicitly denied resources; to “whitelist” an email address, for example, means to allow that email address to contact you. The terms originated around the 1610s, at the beginning of the Atlantic slave trade. They equate “white” with “good” and “black” with “bad.”

Other companies were already taking steps to address this language within their organizations. The likes of GitHub, Google, Twitter, Apple, and the UK’s NCSC are all making changes to their terminology.

We did a quick search of our codebase and discovered that not only did we use these terms at Intercom, they were ubiquitous. Between the four terms, there were thousands of occurrences in our codebase. So, how could we convince 100+ engineers to drop their normal work and prioritize updating them?

If we didn’t change these terms, everyone at Intercom would be forced to use racist terminology to do their jobs. At the end of the day, using a different word is a small price to pay in exchange for our coworkers’ psychological safety.

So Sam wrote an open letter to our Research & Development department at Intercom, stating what the problem was with these terms and that we needed to address them immediately. They published it in our Slack channel with over 200 members. And the response was overwhelming.

A snippet of the open letter Sam sent to our R&D Slack channel.

The letter got resounding support, including a rousing endorsement from our co-founder and CTO, Ciaran Lee. The question was not, “should we do this?” but, “how fast can we do this?” Ben stepped up to help lead the project, and he and Sam formed a partnership for identifying uses of the terminology and creating a plan to update them.

How we rethought our terminology

We worked with our content designers to get to the heart of what these terms meant. We found that on top of being offensive phrases, they were also bad metaphors for what they were actually trying to convey.

“Master/slave” refers to a relationship between a primary source and its replicas. “Whitelist/blacklist” implies risk management – if something is “whitelisted,” it’s coming from a trusted source. If something is “blacklisted,” we’re blocking it because it’s an untrusted source. With that in mind, we came up with the following terminology replacements:

Previous usage                          Updated usage
Master (database node)                  Primary
Slave (replicated database node)        Replica
Slave (non-replicated database node)    Secondary
Master branch                           Main branch
Whitelist (verb)                        Trust
Blacklist (verb)                        Block
Whitelist (noun)                        Trusted list
Blacklist (noun)                        Blocked list

Taking action against the racist terminology in our codebase

We had terminology replacements drafted and people with an appetite to make changes. But devising a process to smoothly overhaul the terminology was quite a challenge. These are the steps we ended up taking; if you’re embarking on a similar review of your codebase, we recommend taking them into consideration.

Know the scope of the issue

First, we needed to know exactly where this terminology existed in our codebase. Fortunately, we already use a search tool internally: an open source search engine called Hound. When searching for “whitelist,” for example, Hound returns the repository, file, and exact line for all instances of the search term. A few exports later, we had a spreadsheet detailing every instance of “whitelist,” “blacklist,” “master,” and “slave.” A few pivot tables and spreadsheet magic tricks later, we had an actionable terminology tracker.
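For readers without a code search tool, an inventory like this can be approximated with a short script. The sketch below is not Hound and does not use its API; it is a minimal stand-in that walks a directory, records each match with its file and line, and emits CSV that a spreadsheet can open. The function names (scan_text, scan_tree, to_csv) are hypothetical.

```python
import csv
import io
import re
from pathlib import Path

# The four terms named in this post. Matching is case-insensitive and also
# catches longer words such as "whitelisted" or "webmaster", so results
# still need a human review.
TERMS = re.compile(r"whitelist|blacklist|master|slave", re.IGNORECASE)


def scan_text(name, text):
    """Return (file, line_number, term, line_text) for every match in text."""
    rows = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in TERMS.finditer(line):
            rows.append((name, number, match.group(0).lower(), line.strip()))
    return rows


def scan_tree(root):
    """Scan every readable text file under root."""
    rows = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        rows.extend(scan_text(path.relative_to(root).as_posix(), text))
    return rows


def to_csv(rows):
    """Render rows as CSV text with a header, ready for a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["file", "line", "term", "text"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling to_csv(scan_tree("path/to/repo")) produces one row per occurrence, which is enough to build pivot tables by term or by directory.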

This was a sizable amount of work. We focused on just “whitelist” as an initial step, but even then, there were almost 600 occurrences in our codebase. To add to the technical challenge, some instances were found in business-critical code like the billing and spam-filtering systems.

Share the work

Carelessly finding and replacing terms could have serious consequences, like blocking our code release pipeline, causing system outages, or eroding customers’ trust in Intercom. There was no way the two of us had the capacity or context to make all the changes ourselves. At that point, we took a step back to think about our goals.

This project isn’t about changing every single instance as a one-time effort. The ultimate goal is for us as a company to be more intentional with our language. This was going to be a team effort.

We identified the team that owned each file containing “whitelist,” “blacklist,” or “slave.” For some repositories, there is a practice of including a responsible team in each file to show code ownership. In other cases, we made our best guess based on the product area. Future efforts to automate identifying code owners could also have applications outside this project (e.g. finding teams or teammates that can best resolve an incident).
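The ownership lookup described above could be sketched as follows. The post does not say what the in-file ownership marker looks like, so the "team: name" comment format, the path-prefix fallback table, and the function names here are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical convention: a comment such as "# team: billing" or
# "// team: billing" somewhere in the first few lines of a file.
OWNER_COMMENT = re.compile(r"(?:#|//)\s*team:\s*([\w-]+)", re.IGNORECASE)

# Hypothetical best-guess mapping from product area (path prefix) to team.
FALLBACK_BY_PATH = {
    "app/billing/": "billing",
    "app/spam/": "spam-filtering",
}


def owner_of(path, text):
    """Return the owning team for a file, or "unassigned" if unknown."""
    for line in text.splitlines()[:10]:
        match = OWNER_COMMENT.search(line)
        if match:
            return match.group(1).lower()
    for prefix, team in FALLBACK_BY_PATH.items():
        if path.startswith(prefix):
            return team
    return "unassigned"


def group_by_team(files):
    """Group file paths by owning team. files maps path -> file contents."""
    grouped = defaultdict(list)
    for path, text in files.items():
        grouped[owner_of(path, text)].append(path)
    return dict(grouped)
```

Files that land in "unassigned" are the ones that still need a manual guess.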

Next, we created GitHub issues according to responsible teams and assigned the changes to them. There isn’t a one-size-fits-all replacement term. For example, “allow list” could be a better choice than “trusted list” in some situations. But by distributing the work, teams can use their judgment to choose the best replacement in their specific context. This allows for some nuance in Intercom’s use cases as the industry progresses towards alignment on these terms.
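One way to prepare per-team issues is to draft the title and body as text first and file them afterwards by whatever means a team prefers. The sketch below only builds that text; it does not call GitHub. The draft_issue name and the wording of the issue are hypothetical, not the issues actually filed.

```python
def draft_issue(team, term, locations):
    """Build an issue title and checklist body for one team and one term.

    locations is a list of (file, line_number) tuples.
    """
    title = f'Replace "{term}" in code owned by {team}'
    lines = [
        f'The lines below use "{term}". Please choose the replacement '
        "that best fits each context.",
        "",
    ]
    # One checkbox per occurrence so progress is visible in the issue.
    for file, line in sorted(locations):
        lines.append(f"- [ ] {file}:{line}")
    return {"title": title, "body": "\n".join(lines)}
```

Leaving the replacement term out of the issue text keeps the choice with the owning team, in line with the point above about context-specific judgment.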

Keep the conversation going

This project is just the start of an ongoing conversation.

Since Sam’s open letter a bit over a month ago, all occurrences of “whitelist,” “blacklist,” and “slave” have a corresponding GitHub issue filed, and we’re starting to do the same for “master.”

At the time of writing, 48%, 60%, and 99% of all instances in our codebase of “whitelist,” “blacklist,” and “slave,” respectively, have been resolved, and those percentages are growing by the week.

This was purely a volunteer-led effort at first, but we worked to make it a priority at Intercom. Sam made a company-wide presentation, and Sinead O’Rourke, a technical program manager at Intercom, helped us put the project on the engineering roadmap for next quarter.

We plan not only to make the terminology replacements but also to prevent the outdated terms from being reintroduced, by educating teams and introducing linters: tools that analyze code and flag any instances of those terms.
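As an illustration of what such a check might look like, here is a minimal sketch of a terminology linter. It is not a specific linter or plugin; the lint function and message format are hypothetical, and the suggestions are taken from the replacement table earlier in this post.

```python
import re

# Suggestions drawn from the replacement table in this post. The right
# choice depends on context, so the message offers options, not a fix.
REPLACEMENTS = {
    "whitelist": "trust / trusted list",
    "blacklist": "block / blocked list",
    "master": "primary (database node) or main (branch)",
    "slave": "replica or secondary",
}
PATTERN = re.compile("|".join(REPLACEMENTS), re.IGNORECASE)


def lint(path, text):
    """Return one 'path:line:column: message' finding per outdated term."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            term = match.group(0).lower()
            findings.append(
                f'{path}:{number}:{match.start() + 1}: "{term}" is outdated; '
                f"consider {REPLACEMENTS[term]}"
            )
    return findings
```

Run against changed files in a pre-commit hook or CI step, a non-empty result can fail the check. Third-party identifiers that cannot be renamed yet would need an explicit exemption, which this sketch leaves out.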

Since a notable number of instances are due to dependencies on third-party providers, we also plan to reach out directly and encourage them to join the movement. The wider engineering community has the responsibility to take action, and we can all do our part in driving positive change.

Stay intentional with your language

We all have the obligation to be thoughtful. Language evolves, and we should embrace the opportunity to consider our vocabulary rather than passively adopting a pre-established lexicon. The words we choose to use, and not use, signal our priorities, our values, our worldview. The words we use matter, and our actions matter, too.