A Machine Successfully Predicted the Hit Dance Songs of 2015
Can the essence of popular music be boiled down into a simple, exploitable equation?
Image: Skrillex performing in 2011 via Wikimedia
Hit songs are getting so predictable. No, literally. The recipe for what makes a pop or dance song a hit has apparently become so formulaic that a computer algorithm can predict, with above-average accuracy, the likelihood that a song will top the charts.
Hit prediction science is still in its infancy, and attempts so far have been hit or miss, but I talked to a team of data scientists from the University of Antwerp in Belgium that developed one of the more accurate prediction tools I've seen. Their research singled out dance music (a genre whose songs tend to resemble one another more closely) and successfully forecast whether a song would be a top 10 hit.
Their initial research looked at dance hits from 1985 through 2014, but I asked one of the study authors, Dorien Herremans, to run this year's top Billboard dance singles through their prediction tool. The algorithm predicted a hit probability of at least 62 percent for every song in the top 10, and over 70 percent for six of the 10. Not too shabby.
Billboard 2015 Hot Dance/Electronic Songs
- "Lean On" by Major Lazer & DJ Snake Featuring MØ — 82%
- "Where Are U Now" by Skrillex & Diplo With Justin Bieber — 72%
- "Hey Mama" by David Guetta Featuring Nicki Minaj, Bebe Rexha & Afrojack — 72%
- "You Know You Like It" by DJ Snake & AlunaGeorge — 63%
- "Waves" by Mr. Probz — 68%
- "Outside" by Calvin Harris Featuring Ellie Goulding — 82%
- "Prayer In C" by Lilly Wood & Robin Schulz — 65%
- "Blame" by Calvin Harris Featuring John Newman — 88%
- "How Deep Is Your Love" by Calvin Harris & Disciples — 62%
- "I Want You To Know" by Zedd Featuring Selena Gomez — 89%
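The summary figures are easy to check against the list. A minimal sketch in Python, using only the ten scores printed above (the variable names are illustrative):

```python
# Hit probabilities (percent) for the Billboard top 10, in the order listed above.
scores = [82, 72, 72, 63, 68, 82, 65, 88, 62, 89]

lowest = min(scores)                          # weakest prediction in the list
over_70 = sum(1 for s in scores if s > 70)    # songs scored above 70 percent
mean = sum(scores) / len(scores)              # average predicted probability

print(f"lowest: {lowest}%, over 70%: {over_70} of {len(scores)}, mean: {mean:.1f}%")
```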
We also tested the hit potential of the dance and electropop songs on the UK Official Charts top 10 singles of the year, and got similarly positive results. They all scored between 68 and 90 percent hit probability.
Official Charts Company 2015 Singles
- "Happy" by Pharrell Williams — 83%
- "Rather Be" by Clean Bandit — 74%
- "All Of Me" by John Legend (not dance)
- "Waves" by Mr. Probz — 68%
- "Thinking Out Loud" by Ed Sheeran (not dance)
- "Ghost" by Ella Henderson — 79%
- "Timber" by Pitbull ft. Kesha — 90%
- "Stay With Me" by Sam Smith (not dance)
- "Let It Go" by Idina Menzel (not dance)
- "All About That Bass" by Meghan Trainor — 87%
"The hit predictions seem very good! Even better than the ones we did in the paper," said Herremans, who now works in the Centre for Digital Music at Queen Mary University of London.
The team published a paper last year explaining how the model works. To quantify the building blocks of a hit, you first have to break down a song into its various audio characteristics. They pulled this data from The Echo Nest, the music intelligence software that analyzes and categorizes audio data to power the search and recommendation features on services like Spotify.
They analyzed each song along 139 different musical dimensions. These include basic features like length, tempo (measured in beats per minute, or bpm), time signature, key, and loudness; more subjective features like beat, energy, and danceability; and even more intangible qualities like a song's timbre or tone color, or what we might think of as its general feel.
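To feed features like these to a model, each song has to become a row of numbers on comparable scales. The sketch below is an assumption about how that might look, not the team's code: the field names, the example values, and the min-max ranges are all hypothetical, and only five of the basic features are shown.

```python
from dataclasses import dataclass, astuple

@dataclass
class SongFeatures:
    """A handful of the basic features named above (hypothetical field names)."""
    duration_s: float      # length in seconds
    tempo_bpm: float       # beats per minute
    time_signature: int    # beats per bar
    key: int               # pitch class, 0 = C ... 11 = B
    loudness_db: float     # overall loudness in decibels

# Assumed plausible (min, max) range for each field, used only for rescaling.
RANGES = [(0.0, 600.0), (40.0, 220.0), (1, 7), (0, 11), (-60.0, 0.0)]

def to_vector(song: SongFeatures) -> list[float]:
    """Min-max scale every feature into [0, 1], clamping out-of-range values."""
    vector = []
    for value, (lo, hi) in zip(astuple(song), RANGES):
        scaled = (value - lo) / (hi - lo)
        vector.append(min(1.0, max(0.0, scaled)))
    return vector

# Made-up values for illustration, not measurements of any real track.
example = SongFeatures(duration_s=210.0, tempo_bpm=128.0, time_signature=4,
                       key=7, loudness_db=-6.0)
vector = to_vector(example)
```

Scaling matters because raw values sit on very different ranges (seconds, bpm, decibels), and many models would otherwise let the largest numbers dominate.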
Timbre is measured by 13 different features related to the basic components of the audio spectrum, and how each changes over time. For example, the balance of high and low frequencies in a sound determines its perceived "brightness." "It seems that brightness has a big influence and how the time between beats evolves throughout a song," said Herremans.
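One common stand-in for perceived brightness is the spectral centroid, the magnitude-weighted average frequency of a signal. The Echo Nest's own timbre features are not described here, so the sketch below only illustrates the idea: it runs a naive discrete Fourier transform on two synthetic sine tones, and the function names and sample rate are assumptions.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this toy example)

def sine_tone(freq_hz: float, n_samples: int = 256) -> list[float]:
    """Generate a pure sine tone at the given frequency."""
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
            for t in range(n_samples)]

def spectral_centroid(samples: list[float]) -> float:
    """Magnitude-weighted mean frequency (Hz) of the signal's spectrum."""
    n = len(samples)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    # Naive O(n^2) DFT over the non-negative frequency bins only.
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        magnitude = abs(coeff)
        weighted_sum += (k * SAMPLE_RATE / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum else 0.0

dull = spectral_centroid(sine_tone(250.0))     # low-pitched tone
bright = spectral_centroid(sine_tone(2000.0))  # high-pitched tone
print(f"dull: {dull:.0f} Hz, bright: {bright:.0f} Hz")
```

The higher tone yields the higher centroid. Real music mixes many frequencies at once, so its centroid shifts from moment to moment, which is one way a feature can "change over time."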
The theory, of course, is that popular songs share a certain set of features that make them appealing to the majority of people, and that you can test any new song against those success markers to predict its commercial potential. (It's important to note that hit-predicting algorithms tend to leave out certain crucial factors in a song's commercial success, like marketing budget, the social mood, the music video, or artist name recognition.)
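The article does not say which model the team used, so the following is only a sketch of the general idea with a swapped-in technique: a tiny logistic regression, trained by gradient descent on made-up two-feature data, that outputs a hit probability for a new song. The data, feature choices, and function names are all hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x) -> float:
    """Probability in (0, 1) that a song with features x is a hit."""
    # zip stops at the shorter sequence, so the trailing bias weight is added separately.
    z = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    return sigmoid(z)

def train(rows, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights (bias last) by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grads = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            error = predict(weights, x) - y     # derivative of the log loss w.r.t. z
            for i, xi in enumerate(x):
                grads[i] += error * xi
            grads[-1] += error                  # bias term
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grads)]
    return weights

# Synthetic toy data: [scaled loudness, scaled tempo]; 1 = hit, 0 = not a hit.
songs = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7], [0.2, 0.3], [0.3, 0.1], [0.1, 0.2]]
is_hit = [1, 1, 1, 0, 0, 0]

weights = train(songs, is_hit)
new_song_probability = predict(weights, [0.85, 0.75])
print(f"hit probability: {new_song_probability:.0%}")
```

A model like this knows only what is in its feature columns, which is why factors such as marketing budget or name recognition fall outside its reach unless someone encodes them.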
The team initially tested about 3,500 songs that charted in the top 10 from 1985 to 2014 and found they could predict a hit with above-average accuracy. Because they looked at hit songs across almost 30 years, they adjusted the formula to account for how the anatomy of a hit evolves over time, and found the model predicted more recent songs with greater accuracy. The research also turned up some interesting historical insights. Dance hits have been getting louder and faster over time, but their "danceability" (a proprietary formula of The Echo Nest) has actually decreased; apparently wanting to dance to it is no longer a firm requirement of a successful dance song.
Naturally, there's a whole lot of interest in this burgeoning field of hit song science. A handful of smash hits still makes up the majority of a record company's profits, and labels pour billions into finding talent in hopes of striking that gold. Devising a reliable scientific measure of what makes a song a success, unglamorous as it may be, is a mighty lucrative endeavor.
But the implication that the essence of popular music can be boiled down into a simple, exploitable equation has been met with some controversy, and rightfully so. There's a lot of math behind making music, but even pop music is ultimately an art form. And there have been some failed attempts at computerized hit prediction. The entrepreneur who initially coined the term "Hit Song Science," Mike McCready, later had his claim picked apart by researchers who determined there just wasn't enough science in his science. The commercial arm of Hit Song Science, Uplaya, went on to flop.
"Some other research tried to build models on pop songs, yet with bad results," said Herremans. "Dance charts seemed to be songs in the most similar style, versus pop charts that can contain both rock or dance songs. By choosing one particular style, we can gather more directed features and get better prediction."
Another "Hit Potential Equation," developed a few years ago by researchers at the University of Bristol, also relied on The Echo Nest intelligence to analyze the anatomy of a hit. It used just 23 audio features to analyze hits across all genres going back 50 years, and found it could predict a hit with 60 percent accuracy, which is better than a random coin toss, though not by much.
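Accuracy here simply means the share of songs a model labels correctly, and a fair coin would get about half right on a balanced set. A minimal sketch with hypothetical labels (chosen so that 6 of 10 are correct, mirroring the 60 percent figure; they are not data from the Bristol study):

```python
def accuracy(predicted: list[int], actual: list[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)

# Hypothetical labels for ten songs: 1 = hit, 0 = not a hit.
actual    = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 1, 0, 0, 0, 1]

model_accuracy = accuracy(predicted, actual)
coin_toss_baseline = 0.5   # expected accuracy of random guessing on balanced classes
margin = model_accuracy - coin_toss_baseline
print(f"accuracy: {model_accuracy:.0%}, margin over chance: {margin:.0%}")
```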
If you want to play with some prediction yourself, you can upload mp3 files to the dance hit predictor tool here. The team is working on a follow-up study that also looks at social media behavior of influencers who listen to hits before they hit the charts. "First tests seem to indicate that this makes the predictions even more accurate," Herremans said. The team also suggested they could expand the tool to create an optimization feature to use while composing a song to help generate new hits.
What is clear is that this field of research isn't going anywhere, especially as music AI advances. It's no secret that today's hit songs are increasingly manufactured from a time-tested formula by producers who know how to give the public what the data suggests it wants. Testing that recipe against a mathematical equation for success and, ultimately, using an algorithm to generate hit songs are logical next steps for the hit-making factory.