Is This Algorithm a Better Movie Critic than Roger Ebert?


Carl Franzen

Casablanca. Image: Wikimedia Commons

With crowdsourced reviews, declining print revenue, and the rise of social networks, has there ever been a worse time to be a movie critic? Now there's a new scourge on the way: automated algorithms.

Three scientists at Northwestern University have a new study out today, published in Proceedings of the National Academy of Sciences (PNAS), that says a simple computer program they developed is more accurate at determining a film's lasting influence than any other method, including critical averages or the late, beloved critic Roger Ebert's reviews. How can this be?

Because the work comes with a lot of caveats: how the scientists defined "influence," which films they studied, the time period they looked at, and, most importantly of all, who generated the data they used in the first place.

Specifically, the scientists looked only at movies that are included in the US National Film Registry: movies that a board of scholars deems so culturally or historically significant that copies of them are preserved for future generations in the Library of Congress. Only films that are at least 10 years old are eligible, and the registry contains 625 films so far.

The scientists then tried to see which was better at predicting whether a film is currently in the registry: humans (that is, professional critics and online voters) or several automated programs they cooked up.

To measure the human approach, they looked at the ratings given to these movies by Roger Ebert, as well as their scores on Metacritic (an aggregator of professional critics' reviews), and their ratings and total votes on IMDB.

For the automated approach, they looked at which films were cited most often on IMDB as having inspired, or been paid homage in, later movies (these are listed under IMDB's "connections" section, such as the opaque Casablanca reference that supposedly occurs at the end of When Harry Met Sally). They also checked the Google search score (PageRank) given to each film's IMDB page.

As it turns out, Google and the citations method were more accurate at predicting films in the National Registry than Ebert, Metacritic, or IMDB ratings/votes. Specifically, one program the scientists wrote looked only at the number of IMDB citations that appeared 25 years after a movie was first released. This program predicted films that were in the registry with 61 percent accuracy, compared to 57 percent for Google and 50 percent for Ebert and the other human critics.
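To make the citation-count idea concrete, here is a minimal sketch of how such a predictor could be scored. It is not the researchers' code. The `Film` record, the `threshold` cutoff, the reading of "25 years after" as "cited by a film released at least 25 years later," and the four placeholder films are all assumptions made up for illustration, so the accuracy it prints says nothing about the study's real results.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Film:
    title: str               # placeholder title, not a real film
    year: int                # release year
    citing_years: List[int]  # release years of later films that reference it
    in_registry: bool        # whether the film is in the registry


def long_gap_citations(film: Film, gap: int = 25) -> int:
    """Count citations from films released at least `gap` years later."""
    return sum(1 for y in film.citing_years if y - film.year >= gap)


def predict_registry(film: Film, threshold: int = 2, gap: int = 25) -> bool:
    """Guess registry membership from the long-gap citation count.

    The threshold is a hypothetical cutoff chosen for this toy example.
    """
    return long_gap_citations(film, gap) >= threshold


def accuracy(films: List[Film], threshold: int = 2, gap: int = 25) -> float:
    """Fraction of films whose predicted status matches their actual status."""
    hits = sum(predict_registry(f, threshold, gap) == f.in_registry for f in films)
    return hits / len(films)


# Invented toy data, for illustration only.
films = [
    Film("Film A", 1950, [1960, 1980, 1990, 2005], True),
    Film("Film B", 1970, [1972, 1975], False),
    Film("Film C", 1960, [1990], True),
    Film("Film D", 1980, [2006, 2010], False),
]

if __name__ == "__main__":
    for f in films:
        print(f.title, long_gap_citations(f), predict_registry(f))
    print("accuracy:", accuracy(films))
```

The same `accuracy` function could score any other predictor, such as a critic's star rating above some cutoff, which is roughly the kind of head-to-head comparison the article describes.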

The scientists acknowledged that Metacritic is relatively young and therefore doesn't include reviews for many of the films on the registry. When they adjusted for this missing data, Metacritic fared much better, matching their algorithm at 61 percent.

The idea that an automated method is better than humans at predicting a movie's long-term success is certainly compelling (maybe even the basis for future sci-fi movies).

However, the one thing this work seems to underscore is how important humans are to gathering and supplying the data that automated algorithms rely on. Without the humans selecting movies for the film registry or writing citations on IMDB, the scientists behind this study wouldn't have had much to go on. So movie critics shouldn't start welcoming their robot overlords quite yet.