Last year, Bloomberg signed into law what he called "the most ambitious and comprehensive open data legislation in the country."
New York's next mayor, Bill De Blasio, doesn't look like the tech hypeman that his predecessor was in recent years, during which City Hall coddled startups, praised the digital economy, and brought in a Chief Digital Officer. Rumors have it that De Blasio is considering eliminating that position when he takes office (it's mostly a public relations job, based in the city's entertainment office), but there's little chance that he'll abandon one of Bloomberg's other big but lesser-known technology projects: unleashing all of the city's data to the public.
New York's Open Data Plan, part of Mayor Bloomberg’s vision for creating a more responsive government, has been billed as a new way to run a city. In an ideal situation, open data about commuter patterns could, say, highlight inefficiencies in the transit system; a public record of each school’s math scores might foster competition among schools and teachers; maps of car and bicycle accidents could alert drivers and pedestrians to common dangers. Or, according to one much-touted success story: looking at data about clogged sewers and grease hauling permits can help investigators find the restaurants that are illegally dumping grease into the sewers.
Anyone can now plumb a range of data on New York City, from how much fuel buildings use (and how old their boilers are) to whether a sidewalk café has a license, to the most popular baby names in New York (Jayden and Isabella in the latest census, 2009). In September, the city announced the release of 200 new datasets on its data website, and a plan to make all data from the city's public portals available to the public by 2018. In all, there are 1,100 datasets from 60 city agencies. (Bonus data: both the Upper East Side and East New York feature the city's highest ratio of single females to single males, nearly 2 to 1.)
The open data trend has since spread to city agencies across the US and the United Kingdom, and was codified in May by President Obama, who issued an executive order requiring most government-owned data to become accessible to the public online and in a machine-readable format. But the NYC Open Data Project dates back to at least 2011 (and maybe to 1993, when the city published its first Public Data Directory). There wasn't much to look at in the first launch: 750 datasets were put online in an outdated design that was far from user-friendly. Another reason the project didn't get much attention at first: big data wasn't nearly the buzzword it is now. (A Wikipedia editor deleted the first entry made for the term in 2009, explaining that "big data" was merely a combination of two words and did not merit a separate entry.)
Last year, mayor-elect De Blasio issued a statement of support for Bloomberg's data policy, which fits with De Blasio's own vision for forcing New York's agencies to be more transparent and efficient. (As public advocate, he called for speeding up the city's response to requests for information that fell under the Freedom of Information Law.) It may not be easy, though, to get "all" of the city's data out there, as the policy requires: much of the city's data isn't yet structured for distribution, and it's locked up behind bureaucratic concerns about just how free information should be.
For instance, the city's Dept. of Planning recently released its PLUTO database, which combines tax lot information with financial data, but only after a concerted effort by transparency groups like MuckRock (the agency had claimed that releasing the data freely would violate copyright). Meanwhile, the NYPD continues to resist efforts to release traffic accident data, arguing that the public wouldn't understand it. After getting a minimal response to his freedom of information requests to the NYPD, MuckRock's Shawn Musgrave lambasted the agency, writing in these pages that it is "about as transparent as a bank vault."
Bloomberg's data policy calls for "all" data to be released by 2018, but it doesn't offer much in the way of oversight of the agencies that are meant to be going through the release process. Among the data still to be released: the status of bus routes and bus delays, daily school-wide attendance, updates on school internet access, wastewater treatment plant performance, water quality and reservoir levels, and the number of people in the shelter system. From the NYPD, the data is also set to include suspect descriptions, firearm discharges, weekly city-wide crime statistics, and reports on stop and frisks.
Still, a growing community of data mavens and startups—Made By Friends, Socrata, BetaNYC, Pediacities, the city-run BigApps competition—have been picking through the numbers already out there to build visualizations, maps and apps. What they've helped illustrate is that even if the bold ambitions of a numbers-driven city don't exactly materialize—and even if open data doesn't translate to more open government—the municipal data trail does offer some new and interesting ways of knowing the city. That might give geeks and the rest of us new ways of coping with it and maybe fixing it too. And at least for now, a handy source of trivia to hand out at bars (there are 2,657 of them in New York, by the way, or 3.18 per 10,000 people).
1. Fifty-eight rat sightings still won't land you on the worst landlords list.
Art by Open Data Sites
NYC Open Data includes information about rat sightings in New York City beginning in 2010, based on calls made to 311, the city's complaint hotline. The city's Department of Health and Mental Hygiene collects that information. Open Data Sites, a search engine and analytical hub for data around the world, put the information in a graph. The number of rat sightings in the city appears to rise and fall with the season. Every year, early January has the fewest reported sightings. The peak comes in spring: April, May or June. Brooklyn consistently has the highest number of sightings, followed by Manhattan, the Bronx, Queens and Staten Island, in descending order.
The highest number of rat sightings ever reported to 311 in one borough in a single month? 465 in June 2011 in Brooklyn.
The address with the most reports to date? 2131 Wallace Avenue in the Bronx, with 58 rodent reports. The runner-up was 46-01 67 Street in Queens, with 47 reported rat sightings. Curiously, neither was named on De Blasio's list of worst landlords, compiled in 2011, based on violations such as lack of heat and hot water and failure to repair leaks and exterminate vermin.
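For readers who want to try this kind of tally themselves, here is a minimal sketch of how monthly borough counts and a "most reported address" could be computed. It runs on a handful of made-up rows rather than the real 311 export; the field names (`created`, `borough`, `address`) and the addresses are assumptions for illustration, not the dataset's actual schema.

```python
from collections import Counter

# Hypothetical sample rows. The field names and addresses are invented
# for illustration and do not reflect the real 311 dataset's columns.
SAMPLE_SIGHTINGS = [
    {"created": "2011-06-03", "borough": "BROOKLYN", "address": "1 EXAMPLE ST"},
    {"created": "2011-06-17", "borough": "BROOKLYN", "address": "1 EXAMPLE ST"},
    {"created": "2011-06-20", "borough": "BRONX", "address": "2 SAMPLE AVE"},
    {"created": "2011-01-05", "borough": "BROOKLYN", "address": "3 PLACEHOLDER PL"},
    {"created": "2011-06-28", "borough": "BROOKLYN", "address": "1 EXAMPLE ST"},
]


def count_by_month_and_borough(rows):
    """Count sightings per (YYYY-MM, borough) pair."""
    counts = Counter()
    for row in rows:
        month = row["created"][:7]  # "2011-06-03" -> "2011-06"
        counts[(month, row["borough"])] += 1
    return counts


def busiest_address(rows):
    """Return ((address, borough), count) for the most-reported address."""
    counts = Counter((row["address"], row["borough"]) for row in rows)
    return counts.most_common(1)[0]


if __name__ == "__main__":
    monthly = count_by_month_and_borough(SAMPLE_SIGHTINGS)
    print("Peak month and borough:", monthly.most_common(1)[0])
    print("Most-reported address:", busiest_address(SAMPLE_SIGHTINGS))
```

Run against the full export, the same two counts would correspond to the borough peak and the top address cited above, assuming each row represents one call to 311.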
2. Already, Citibikers could have collectively pedaled around the world 275 times.
Graph by Made By Friends
Since Citibike launched in May 2013, cyclists have traveled enough miles to make 275 trips around the world, says the NYC Open Data Tumblr site. The Citibike website has the data in real time, down to the number of miles traveled in the last 24 hours. Another takeaway from the infographic: heat waves don't deter Citibike riders. Miles traveled on Citibikes dipped on the first day of July and returned to normal levels on the subsequent days. Perhaps that means the riders found that the subway was worse than biking in the heat?
3. Trees and violent crime don't mix, sometimes, maybe.
Map of NYC’s urban tree canopy by Tom Swanson from ESRI, based on the NYC Parks Department’s high resolution land cover data set
A report by NYC Parks on the city's tree census says that it's not just the number of trees that makes a positive impact on residents' quality of life, but how big they are. An older, larger tree is seven times more effective than a young, small tree at purifying air. Therefore, the term canopy is used to compare the boroughs' trees. Staten Island had 34% canopy (34% of its land is covered in tree shade). The Bronx had 24%. Brooklyn had 21%. Queens had 20%. Manhattan had 13%. In sheer numbers, Queens had the most trees, followed by Brooklyn. Manhattan and the Bronx, which happen to be home to the city's largest parks, had the fewest trees.
The volunteer-gathered census in 2005 and 2006 counted 592,130 street trees in all boroughs, an increase of 19%, or 93,000 trees, from the previous census in 1995-1996. In all, 5.2 million trees are growing on public and private property in New York City, and 24% of New York City is under the shade (or canopy) of a tree. The top species are London Planetree (15.3% of all trees) and Norway Maple (14.1%), both extremely resistant to pollution. Trees near Prospect Park tend to be firs and acers. Each tree in the census is given a six-digit ID, though species aren't always listed, as I found with some trees in my neighborhood. As I also found, regrettably, some blocks in Williamsburg have only one recorded tree.
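The census figures can be sanity-checked with a little arithmetic. The sketch below works backward from the rounded numbers quoted above, so the previous count it derives is only an implied estimate, not an official figure; the function names are made up for illustration.

```python
# Figures as quoted in the article; the increase is a rounded number,
# so everything derived from it is approximate.
CURRENT_COUNT = 592_130   # trees counted in the 2005-2006 census
INCREASE = 93_000         # reported gain over the 1995-1996 census
PLANETREE_SHARE = 0.153   # London Planetree share of counted trees
MAPLE_SHARE = 0.141       # Norway Maple share of counted trees


def implied_previous_count(current, increase):
    """Previous census count implied by the current count and the gain."""
    return current - increase


def percent_increase(current, increase):
    """Growth relative to the implied previous count, in percent."""
    return 100 * increase / implied_previous_count(current, increase)


def species_estimate(total, share):
    """Approximate number of trees of a species given its share."""
    return round(total * share)


if __name__ == "__main__":
    print("Implied 1995-1996 count:", implied_previous_count(CURRENT_COUNT, INCREASE))
    print("Percent increase: %.1f" % percent_increase(CURRENT_COUNT, INCREASE))
    print("London Planetrees:", species_estimate(CURRENT_COUNT, PLANETREE_SHARE))
    print("Norway Maples:", species_estimate(CURRENT_COUNT, MAPLE_SHARE))
```

The percentage comes out a little under 19, which is consistent with the rounded figure in the report.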
There is no telling how the city plans to use this data, but studies have shown an inverse relationship between crime rate and tree shade. Does that hold true in this case? According to crime data mapped by Pediacities, the rate of violent crime in Midtown South was high, at more than 12 incidents per 1,000 residents annually. The neighborhood also had low tree shade, 26% to 50%. But most likely, many other factors contributed to the crime rate. East Harlem, for instance, has a relatively high tree shade rate (50% to 75%) but also had high rates of violent crime, around 9 incidents for every thousand residents per year. Numbers, at least those that exist, may not lie, but they don't say everything either.
Play with NYC's data here.