    How Google Knows if You Were Sick Yesterday

    Written by Brian Merchant, Senior Editor

    Boston is currently experiencing its worst flu outbreak since the H1N1 swine flu panic of 2009 (the mayor has just declared a state of emergency there), and cities around the country are getting hit nearly as hard. Just take a look at Google's Flu Trends map and see for yourself:

    It's all bright red, as you can see: We are a nation come down with a cold. Just about every state is rated as "Intense" or "High" in terms of flu activity. But wait up. Google Flu Trends is based solely on data collected from search activity, which means that everybody who thinks they have the flu and logs onto Google to get more information about remedies or their symptoms gets taken into account. So how does Google know who's sick and who's not? Is this really a good way to predict how many people are truly sick across the country?

    Actually, yeah. It is. Take a look at this chart, provided by Google, of its fluey search analysis compared to official U.S. Centers for Disease Control and Prevention stats.

    That's crazy accurate. As Google notes, "These graphs show historical query-based flu estimates for different countries and regions compared against official influenza surveillance data. As you can see, estimates based on Google search queries about flu are very closely matched to traditional flu activity indicators. Of course, past performance is no guarantee of future results."

    No, but it's going to be pretty close. Google's model has proved so scientifically sound a predictor for who's getting the flu and where that its results were published in the esteemed journal Nature. How does it work? The paper explains:

    Because the relative frequency of certain queries is highly correlated with the percentage of physician visits in which a patient presents with influenza-like symptoms, we can accurately estimate the current level of weekly influenza activity in each region of the United States, with a reporting lag of about one day. This approach may make it possible to use search queries to detect influenza epidemics in areas with a large population of web search users.

    In other words, Google took historical flu records and compared them to the number of search queries in a given area. It then refined the ratio of searched-for flu to actual flu, and it now uses that model to predict how many people are currently ailing. Google's results are now so good that they can paint a rather accurate picture of how many people were sick as recently as the day before. If you search for 'flu symptoms' or 'vaccine Philadelphia' or whatever, Google will have that data processed by tomorrow, so it can do a pretty good job of determining the statistical probability of whether or not you were sick yesterday.
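    For the curious, here is a rough sketch of what "refining the ratio" could look like in code. It is not Google's actual system. It assumes a simple straight-line fit between the log-odds of flu-related search share and the log-odds of flu-related doctor visits, which is one plausible reading of the approach the paper describes. Every number and function name below is made up for illustration.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_flu_model(query_fractions, visit_fractions):
    """Fit the assumed model: logit(visits) = a + b * logit(queries)."""
    xs = [logit(q) for q in query_fractions]
    ys = [logit(p) for p in visit_fractions]
    return fit_line(xs, ys)


def estimate_visits(model, query_fraction):
    """Estimate the share of doctor visits that are flu-like from search share."""
    intercept, slope = model
    return inv_logit(intercept + slope * logit(query_fraction))


# Hypothetical weekly history for one region (invented, not real data):
# share of searches that look flu-related, and the share of physician
# visits involving flu-like symptoms in the same weeks.
history_queries = [0.0010, 0.0015, 0.0022, 0.0030, 0.0041]
history_visits = [0.010, 0.016, 0.024, 0.033, 0.046]

model = fit_flu_model(history_queries, history_visits)

# Yesterday's search share is available almost immediately, so the
# estimate can be produced long before clinic reports arrive.
estimate = estimate_visits(model, 0.0035)
print(f"Estimated flu-like visit share: {estimate:.3%}")
```

    The point of the sketch is the division of labor: the slow, official numbers are only needed once, to fit the line, and after that each day's fresh search data can be turned into an estimate on its own.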

    The CDC's figures, by way of comparison, rely on data collected from hospitals and health clinics, and they lag by a couple of weeks. Slate's Will Oremus explains: "Because they are based on after-the-fact reports from more than 3,000 health care providers around the country, the numbers can tell us only how many people were suffering from the flu a couple weeks ago."

    That's obviously a lot less useful. Yet Oremus notes that few reporters and officials are ready to take Google's analysis as seriously as the CDC's findings, so the Flu Map isn't as carefully watched as it should be. But that might change if its most recent predictions, which show skyrocketing flu rates that analysts say could lead to the worst nationwide flu pandemic in 10 years, come to fruition.

    Taking Google's results more seriously could expedite the vaccination process and allow health care workers to better prepare their clinics.

    Now that is a worthy public service, Google. Sure beats giving out free wifi to rich New Yorkers, anyway.