Why Is AI-Generated Music Still So Bad?
Researchers say it’s a lot more complex than it seems on the surface.
There’s no denying that holiday music is somewhat formulaic. You’d think it would be easy for a computer to generate something indistinguishable from the typical carols piped through department stores this time of year. Turns out, it’s not that easy.
Swedish company Made by AI recently trained an AI system on 100 MIDI files of Christmas tunes, then tasked it with creating new songs. This is what the computer came up with:
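Made by AI hasn't published the details of its system, but the general idea of learning from a pile of MIDI files can be sketched with something as simple as a Markov chain over note sequences. The note data below is hard-coded for illustration; in practice it would be extracted from the carol MIDI files.

```python
import random
from collections import defaultdict

# Hypothetical training data: pitch sequences (MIDI note numbers) standing in
# for melodies extracted from the carol MIDI files.
carols = [
    [60, 62, 64, 65, 67, 65, 64, 62, 60],  # a simple C-major phrase
    [67, 65, 64, 62, 60, 62, 64, 62, 60],
]

# Build a first-order Markov chain: which notes tend to follow which.
transitions = defaultdict(list)
for seq in carols:
    for a, b in zip(seq, seq[1:]):
        transitions[a].append(b)

def generate(start=60, length=8, seed=0):
    """Sample a new melody by walking the transition table."""
    rng = random.Random(seed)
    melody = [start]
    for _ in range(length - 1):
        choices = transitions.get(melody[-1])
        if not choices:  # dead end: no observed continuation
            break
        melody.append(rng.choice(choices))
    return melody

print(generate())
```

A model this crude only ever reproduces local note-to-note statistics from its training set, which hints at why naive data-driven generation tends to wander without any larger musical structure.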
Made by AI is not the first group to create some lackluster AI-generated music. There was this attempt, from researchers at the University of Toronto in 2016. They trained a neural net to generate a new Christmas carol, which is the stuff of nightmares:
In a time when artificial intelligence is advanced enough to generate scary-realistic human faces, why can’t it string together a decent bop?
“Composing good music is actually more complicated than we expected,” Hang Chu, a computer science PhD student at the University of Toronto who created the creepy 2016 AI Christmas carol, told me in a phone interview. “Music is not something where if you throw enough data at it and hope the algorithm can figure it out, it will work.”
Chu explained that one challenge is that with image recognition and generation, it's fairly easy to get a huge dataset of images that are more or less the same, like close-ups of human faces, to train a neural network. With music, however, every song varies widely, and it's expensive to put together a large dataset of songs, he said.
There are also a lot of intricacies in music to consider, from melody and harmony to tempo and timing. One way researchers have tried to work around this is by breaking a song down into its elements and having different AI systems build each one piece by piece. That's how the team at the Luxembourg-based AI composition company AIVA (which stands for Artificial Intelligence Virtual Artist) created an AI-generated score for the video game Pixelfield.
“We asked, ‘What are the building blocks to create an entire song?’” Pierre Barreau, the CEO of AIVA, told me over the phone. “If you consider just melody, you can generate that. Then based on that melody, you can make another model that creates instrumental accompaniment for that melody. If you break it down, it becomes substantially easier.”
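The staged approach Barreau describes can be sketched as a toy two-stage pipeline: one model produces a melody, and a second stage conditions on that melody to add accompaniment. AIVA's actual models are neural networks and are not public; the rule-based stages below are illustrative placeholders for the same decomposition.

```python
import random

C_MAJOR = [60, 62, 64, 65, 67, 69, 71]  # MIDI pitches of a C-major scale

def melody_model(length=8, seed=42):
    """Stage 1: generate a melody as a random walk on the scale."""
    rng = random.Random(seed)
    idx = 0
    melody = []
    for _ in range(length):
        melody.append(C_MAJOR[idx])
        # Step up or down the scale, clamped to its range.
        idx = max(0, min(len(C_MAJOR) - 1, idx + rng.choice([-1, 1])))
    return melody

def accompaniment_model(melody):
    """Stage 2: condition on the melody, placing a triad under each note."""
    chords = []
    for pitch in melody:
        root = pitch - 12                          # drop an octave
        chords.append([root, root + 4, root + 7])  # a simple major triad
    return chords

melody = melody_model()
accompaniment = accompaniment_model(melody)
```

The point of the decomposition is that each stage solves a much narrower problem: the accompaniment model never has to invent a tune, only to respond to one it is given.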
Barreau said that some research projects on music generation had instead tasked one AI system with composing an entirely new piece of music from end-to-end. He compared that to asking one system to draw a detailed portrait with shading, color, and form, as opposed to having three different systems contribute each of those elements based on the existing work.
This approach helps get around a lot of the challenges of creative generation, as does having systems that are still assisted by humans. A couple of years ago, IBM's Watson created a trailer for a film after being trained on other movie trailers. But John Smith, a fellow at IBM's AI research center, told me a human ultimately had to take Watson's ideas over the finish line.
“The AI studied historical trailers and came up with a whole bunch of ideas, but in the end a person still made the trailer out of the raw material that the AI had suggested,” Smith said in a phone interview.
But Smith also argued that our idea of what is “good” or “bad” music or art, when it’s created by AI, should maybe not be so narrow. Part of the joy of AI is that it doesn’t think like humans do, and so it comes up with concepts and ideas we would never think of. That can be helpful for pushing art into new realms, Smith said, but added that at the end of the day, human creativity can’t be replaced.
“The computer can start to do more and more of the groundwork and prep work and even suggest different ideas,” Smith said. “But that leap of creative thought, that spark of imagination, still has to come from a human.”