Is machine learning hard? Not always

There’s a common misconception that you have to be an AI wizard or mathematician to use machine learning in your work. That machine learning requires hard calculus.

After all, you’re teaching machines that work in ones and zeros to reach their own conclusions about the world. You’re teaching them how to think!

But like many frameworks we have for understanding our world – Newton’s Laws of Motion, Jobs to be Done, Supply & Demand – the fundamental ideas at the heart of machine learning are relatively straightforward.

Our own machine learning expert, Fergal Reid, describes it this way: “You have a problem, you’re trying to solve it, and then you have a system where the performance improves when you give it more training data… The more data you get, the better your estimate.”

If you’ve got a well-defined problem that can be solved given enough time and example data, you should give machine learning a try. Here’s an example.

Example problem – without using machine learning

Say we want to include a “You might also like” section at the bottom of this post. How should we go about doing this?

If you’re avoiding machine learning at all costs, you might opt for this approach:

Split the current post title into its individual words.
Get all other posts.
Sort those posts by how many words their body has in common with our title, most first, and keep the top 10.

Or, in Ruby:

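	# Rank every other post by how many of this post's title words its body contains.
	def similar_posts(post)
	  title_keywords = post.title.split(' ')
	  other_posts = Post.all.to_a.reject { |other| other == post }

	  other_posts.sort do |post1, post2|
	    post1_title_intersection = post1.body.split(' ') & title_keywords
	    post2_title_intersection = post2.body.split(' ') & title_keywords

	    # Descending order: the post sharing more title words comes first.
	    post2_title_intersection.length <=> post1_title_intersection.length
	  end.first(10)
	end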


But look at the results you get when you apply this method to our blog post “How the support team improves the product”:

How to launch with a validated idea
Know your customers and how they decide
Designing first run experiences to delight users
How to hire designers
The dribbblisation of design
An interview with Ryan Singer
Why being first doesn’t matter
Proactive customer support with Intercom
An interview with Joshua Porter
Retention, cohorts, and visualisations

We can do better — posts about running an effective support process have little in common with cohort analysis, or debate around the merits of design.

The same example using basic machine learning

Let’s try a machine learning approach. We’re going to break this into two parts:

Represent posts mathematically.
Cluster these mathematical representations with K-Means.

1. Representing posts mathematically

If we can represent our posts mathematically, we can plot the posts, compare distances between posts, and identify clusters of similar posts.

[Figure: representing posts mathematically]

Mapping each post to a mathematical representation is easy. We can do it in two steps:

1. Find all words in all posts.
2. Convert each post into an array in which each element is a 1 or a 0, denoting the presence or absence of a word. The array uses the same word order for every post, because it is built from the word list in step 1.

Or, in Ruby:

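	@posts = Post.all

	# Vocabulary: every distinct word that appears in any post body.
	@words = @posts.map do |p|
	  p.body.split(' ')
	end.flatten.uniq

	# One 0/1 vector per post, in the same word order as @words.
	@vectors = @posts.map do |p|
	  post_words = p.body.split(' ')
	  @words.map do |w|
	    # Compare whole words; a substring check would match "in" inside "inside".
	    post_words.include?(w) ? 1 : 0
	  end
	end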


If @words equaled:

['hello', 'inside', 'intercom', 'readers', 'blog', 'post']

A post with the body “hello blog post readers” would be mapped to:

[1,0,0,1,1,1]
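
To make the mapping concrete, here is a minimal sketch that reproduces the example above without needing a `Post` model. The variable names are illustrative, and it matches whole words only, so "post" would not match "posts".

```ruby
# Illustrative vocabulary and body text, taken from the example above.
words = ['hello', 'inside', 'intercom', 'readers', 'blog', 'post']
body  = 'hello blog post readers'

# Split the body once, then mark each vocabulary word as present (1) or absent (0).
body_words = body.split(' ')
vector = words.map { |w| body_words.include?(w) ? 1 : 0 }

p vector  # prints [1, 0, 0, 1, 1, 1]
```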

We don’t have simple tools for plotting vectors in 6 dimensions, as we do for those in 2 dimensions, but concepts like distance extend naturally to any number of dimensions. (The 2-dimensional picture is still a useful way to think about it.)
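
Distance in six dimensions is computed the same way as in two. Here is a minimal sketch, assuming Euclidean distance (the post doesn’t say which metric it uses). The helper names `distance` and `average` are illustrative, and the second vector is made up for the comparison.

```ruby
# Euclidean distance between two equal-length numeric vectors.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Component-wise mean of a non-empty list of equal-length vectors.
def average(vectors)
  vectors.transpose.map { |column| column.sum.to_f / column.size }
end

a = [1, 0, 0, 1, 1, 1]  # "hello blog post readers" from the example
b = [1, 1, 0, 0, 1, 1]  # a made-up second post

puts distance(a, b)  # the vectors differ in two positions, so this is sqrt(2)
p average([a, b])    # the point halfway between the two posts
```

The average of a group of vectors is the point in the middle of that group, which is what the clustering step needs.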

2. Clustering posts with the K-Means algorithm

Now that we have a mathematical representation of our blog posts, we can find clusters of similar posts. For our problem, we’ll use a simple clustering algorithm called K-Means:

1. Set ‘K’ to the number of clusters you want.
2. Choose ‘K’ random points.
3. Assign each document to its closest point.
4. Choose ‘K’ new points, each the ‘average’ of all documents assigned to the old point.
5. Repeat steps 3-4 until the documents’ assignments stop changing.

Let’s visualize these steps. First, we choose 2 (i.e. k = 2) random points, in the same space as our posts:

[Figure: clustering posts, step 1]

We assign each document to its closest point:

[Figure: clustering posts, step 2]

We move the center of each cluster to the average of all posts in that cluster:

[Figure: clustering posts, step 3]

That’s the end of our first iteration. Now we re-assign each post to its new closest point:

[Figure: clustering posts, step 4]

We’ve found our clusters! We know this because further iterations would not change any of the assignments.

Or, in Ruby:

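	# rand_point, distance and average are helper methods assumed to be defined elsewhere.
	@cluster_centers = [rand_point(), rand_point()]

	15.times do
	  @clusters = [[], []]

	  @posts.each do |post|
	    min_distance, min_point = nil, nil

	    @cluster_centers.each.with_index do |center, i|
	      current_distance = distance(center, post)
	      # The first center always wins; later ones must be strictly closer.
	      if min_distance.nil? || current_distance < min_distance
	        min_distance = current_distance
	        min_point = i
	      end
	    end

	    @clusters[min_point] << post
	  end

	  # Move each center to the average of the posts assigned to it.
	  @cluster_centers = @clusters.map do |posts|
	    average(posts)
	  end
	end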
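
The snippet above relies on helpers (`rand_point`, `distance`, `average`) that are not shown. If you want something you can run as-is, here is a self-contained sketch on made-up 2-dimensional points. It assumes Euclidean distance, and it uses fixed starting centers instead of random ones so the run is repeatable. The `k_means` method and its parameters are illustrative names, not part of the original code.

```ruby
# Illustrative helpers, assuming Euclidean distance and a component-wise mean.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

def average(vectors)
  vectors.transpose.map { |col| col.sum.to_f / col.size }
end

# Returns [centers, assignments], where assignments[j] is the index of the
# center that points[j] ended up closest to.
def k_means(points, centers, max_iterations: 15)
  assignments = nil

  max_iterations.times do
    # Assign each point to the index of its closest center.
    new_assignments = points.map do |point|
      centers.each_index.min_by { |i| distance(centers[i], point) }
    end

    # Stop once the assignments no longer change.
    break if new_assignments == assignments
    assignments = new_assignments

    # Move each center to the average of its points.
    # A center with no points keeps its previous position.
    centers = centers.each_with_index.map do |center, i|
      members = points.each_index.select { |j| assignments[j] == i }.map { |j| points[j] }
      members.empty? ? center : average(members)
    end
  end

  [centers, assignments]
end

# Made-up data: two obvious groups.
points  = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]]
centers = [[0, 0], [10, 10]]

final_centers, assignments = k_means(points, centers)
p assignments  # prints [0, 0, 0, 1, 1, 1]
```

Stopping as soon as the assignments repeat usually takes fewer passes than a fixed 15 iterations, and the empty-cluster guard avoids averaging an empty list.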


Here are the top 10 posts similar to “How the support team improves the product” produced with this method:

Are you being clear, or clever?
3 rules for customer feedback
Asking customers what you want to hear
Shipping is the beginning of a process
What does feature creep look like?
Getting insight into your userbase
Converting customers with the right message at the right time
Conversations with your customers
Does your app have a message schedule?
Have you tried talking to your customers?

The results speak for themselves. Now, with more statistics, we can keep optimizing the algorithm and improve our results, but this isn’t a bad start for roughly 40 lines of code! Other problems may require different clustering algorithms.
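
The post doesn’t show how a cluster turns into a ranked list of 10 posts. One possibility, sketched here purely as an assumption, is to take the other posts in the same cluster and sort them by distance to the post you started from. The titles, vectors, cluster assignments, and the `similar_in_cluster` method are all hypothetical.

```ruby
# Assumes Euclidean distance between word vectors.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Hypothetical data: title => word vector, and title => assigned cluster index.
vectors = {
  'A' => [1, 1, 0, 0],
  'B' => [1, 1, 1, 0],
  'C' => [1, 0, 0, 1],
  'D' => [0, 0, 1, 1]
}
cluster_of = { 'A' => 0, 'B' => 0, 'C' => 0, 'D' => 1 }

# Other posts in the same cluster as `title`, nearest first.
def similar_in_cluster(title, vectors, cluster_of, limit: 10)
  vectors.keys
         .reject { |other| other == title }
         .select { |other| cluster_of[other] == cluster_of[title] }
         .sort_by { |other| distance(vectors[title], vectors[other]) }
         .first(limit)
end

p similar_in_cluster('A', vectors, cluster_of)  # prints ["B", "C"]
```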

Give it a try

Machine learning has led to breakthroughs in highly technical areas like computer vision, audio recognition, and natural language translation. But machine learning isn’t only applicable to large, abstract problems. It’s great at generating suggestions to help users with different workflows. Want to add tags in your project management app? Or assignees in your customer support tool? Or members of a group on a social network? Chances are a simple algorithm can help you out.

So, when faced with a challenge in your product where you believe machine learning can help, don’t feel you have to master the math behind complex algorithms before giving it a try. These resources can help you get started:

Relatively high-level programming libraries like Python’s scikit-learn
Books written for programmers like Programming Collective Intelligence by Toby Segaran
Online courses like Andrew Ng’s Coursera course

Machine learning might be more applicable and doable than you think.

With thanks to Fergal Reid for his input
