Photo via Flickr / CC.
Film studios churn out an inexhaustible glut of sequels, reboots, and adaptations during the summertime blockbuster season. This of course is because existing franchises usually gross the biggest numbers at the box office—just look at 2013's cumulative box office take for proof that Hollywood's formula is as fruitful as ever.
This year's numbers nudged beyond even 2012's box office pull, an impressive feat considering that in 2012 not one but four movies crossed the coveted $1 billion mark worldwide. So Hollywood's current formula is clearly still working. Yes, there will be an Iron Man 7, only to be inexplicably rebooted five years later.
Despite the record year, 2013, like every year before it, had its fair share of box office disasters. What if studios could minimize their losses and predict when the next Pluto Nash-level flop was imminent? According to new research published in PLoS One, they may actually be able to.
Using data gleaned from Wikipedia articles, researchers gauged the likelihood of a film's financial success based on four parameters: the total number of page views; the total number of edits made; the number of users editing; and the number of revisions in the article's revision history, or "collaborative rigor."
The data mining used in the study purportedly shows that measuring the activity on a forthcoming film's Wikipedia entry can predict whether or not the movie will be successful at the box office. And because Wikipedia entries for films are created months, if not years, in advance of a release date, those fluctuating parameters could make a course correction possible for a floundering film far in advance of its premiere, according to the study.
First weekend box office revenue in the US against its predicted value by the Wikipedia model
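The article does not spell out the statistical model behind these predictions, so what follows is only a minimal sketch, assuming an ordinary least-squares fit of opening revenue on the four Wikipedia activity measures. The function names (`solve_linear_system`, `fit_linear_model`, `predict`), the coefficients, and every number in the toy data are invented for illustration; none of it comes from the study.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear_model(features: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Least-squares fit with an intercept, via the normal equations.

    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + [float(x) for x in f] for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs: Sequence[float], feature_row: Sequence[float]) -> float:
    """Apply fitted coefficients to one film's activity measures."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], feature_row))


# Hypothetical coefficients used only to generate the toy targets below.
TRUE_COEFFS = [2.0, 0.5, 0.1, 0.3, 1.0]

# Made-up activity measures per film: (page views, edits, editors, revisions),
# in arbitrary scaled units. These are NOT real Wikipedia or box office figures.
TOY_FEATURES = [
    (10, 4, 2, 1),
    (20, 5, 3, 2),
    (15, 9, 4, 1),
    (30, 7, 8, 3),
    (25, 12, 5, 6),
    (40, 10, 9, 4),
    (12, 3, 7, 5),
]
TOY_TARGETS = [predict(TRUE_COEFFS, row) for row in TOY_FEATURES]

if __name__ == "__main__":
    fitted = fit_linear_model(TOY_FEATURES, TOY_TARGETS)
    upcoming_film = (22, 6, 4, 2)  # hypothetical pre-release activity
    print("fitted coefficients:", [round(c, 3) for c in fitted])
    print("predicted opening (toy units):", round(predict(fitted, upcoming_film), 2))
```

Because the toy targets are generated from a known linear rule, the fit simply recovers that rule. With real data the interesting question would be how closely predictions made from activity gathered well before release track the actual opening weekend.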
Similar uses of big data have led to claims that Twitter can predict elections. Wednesday's new study addressed past attempts at using social media for predictions, notably a recent study that claimed Twitter could also be used to predict box office performance.
Researchers included data from the 2010 Twitter-mentions model and found that their Wikipedia formula for prediction not only worked but yielded far more accurate results than previous methods.
Comparison of the results with the Twitter-based prediction
"Our analysis not only outperforms the previous works by the much larger number of movies we have investigated, but also improves on the state of the art by providing reasonable predictions as early as one month prior to the release date of the movie."
For any film studio trying to figure out how to market its upcoming flick, or which month to release it, this sort of hard data looks to be quite a boon for executives, like a film version of the Billy Beane-popularized sabermetric analytics used in sports. Then again, if you're trying to figure out what to do with another Pluto Nash, that's probably a catastrophe that not even big data can help avert.