How 'HTTP with Accountability' Will Let You Track Who Uses Your Data

HTTPA would use transparency to give users increased privacy by keeping data usage records on a so-called Provenance Tracking Network.

Jun 23 2014, 7:20pm
Data.Tron by Ryoji Ikeda in Transmediale 10. Image: Wikimedia Commons

People often believe that data is secure and private if it's exchanged between trusted parties, and that malicious hackers are the only threat. But, malicious or not, can anyone actually be trusted?

Researchers at the Decentralized Information Group (DIG) are working from the opposite assumption: that not all parties can be trusted. They believe the best way to secure private data is to make its use transparent, which would let data creators see exactly who is doing what with their information.

DIG, part of MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), is developing its solution in the form of "HTTP with Accountability," or HTTPA. The protocol would automatically log the transmission of private data, allowing its owners to see how their information is being used.

HTTPA's primary architect is Oshani Seneviratne, an MIT graduate student in electrical engineering and computer science. At this July's IEEE Conference on Privacy, Security and Trust, Seneviratne will present a paper on HTTPA, along with a prototype developed for a healthcare records system, using the experimental network PlanetLab.

The idea for HTTPA was inspired by the paper "Information Accountability," authored by Daniel Weitzner, Tim Berners-Lee, and others. "With access control and encryption no longer capable of protecting privacy, laws and systems are needed that hold people accountable for the misuse of information, whether public or secret," reads the paper's introduction. 

As Seneviratne told me, Berners-Lee is her PhD thesis advisor, and the HTTPA work was conducted as part of her thesis. As a result, Berners-Lee oversaw many aspects of the project, giving a lot of valuable advice along the way. Other senior members at DIG, particularly Lalana Kagal, Hal Abelson and Daniel Weitzner, also contributed to many of the ideas implemented in HTTPA.

HTTPA requires that each piece of private data be assigned its own uniform resource identifier (URI), a string of symbols that can be compared to a person's street address. (One type of URI is the URL, the web addresses we're most familiar with.) 

HTTPA uses the URI to look up access and usage logs in the Provenance Tracking Network (PTN), a secure, distributed collection of servers that uses distributed hash tables, the technology behind peer-to-peer networks like BitTorrent. Everything a user needs to request a data usage audit is contained on this network.
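To make the lookup idea concrete, here is a minimal sketch of how usage logs might be keyed by URI in a hash-table-style network. It is an illustration only, not HTTPA's actual design: the class name `ToyProvenanceNetwork`, its methods, and the example URI are all hypothetical, and the "network" is just a list of in-memory dictionaries standing in for servers.

```python
import hashlib


class ToyProvenanceNetwork:
    """Hypothetical in-memory stand-in for a distributed hash table of usage logs."""

    def __init__(self, node_count):
        # Each dict plays the role of one server in the network.
        self.nodes = [dict() for _ in range(node_count)]

    def _key(self, uri):
        # Hash the URI so every participant derives the same lookup key.
        return hashlib.sha256(uri.encode("utf-8")).hexdigest()

    def _node_for(self, key):
        # Map the key to one node; real DHTs use more robust routing schemes.
        return self.nodes[int(key, 16) % len(self.nodes)]

    def record(self, uri, actor, action):
        # Append one usage entry to the log stored under the URI's key.
        key = self._key(uri)
        entry = {"actor": actor, "action": action}
        self._node_for(key).setdefault(key, []).append(entry)

    def audit(self, uri):
        # Return a copy of every logged use of the resource at this URI.
        key = self._key(uri)
        return list(self._node_for(key).get(key, []))


network = ToyProvenanceNetwork(node_count=4)
network.record("https://example.org/records/42", "dr-lee", "read")
network.record("https://example.org/records/42", "billing", "copy")
for entry in network.audit("https://example.org/records/42"):
    print(entry["actor"], entry["action"])
```

Because the key is derived from the URI alone, anyone who knows the URI can find the node that holds its log without consulting a central index, which is the property that makes hash tables attractive for this kind of network.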

Seneviratne said that HTTPA wasn't prompted by any specific security news from the past year. She did, however, note that anecdotal reports of data misuse in the technology press had an influence on HTTPA.

"Some of the examples include denying employment and healthcare insurance payments based on what someone put on Facebook, etc.," she said. "The traditional ways to protect privacy has been through data minimization, hiding, or obscurity. Those things still hold with HTTPA, and the activities on sensitive data items are only transparent to the data subject."

"However, the difference in HTTPA is that the data subject will obtain a complete account of the accesses and usages of her data even when it has left the original system as long as the subsequent systems use HTTPA," she added. "This transparent view of the sensitive data (or any other data of interest) is what sets HTTPA apart from other similar privacy preserving technologies."

In healthcare emergencies, for example, accessing the right data at the right time regardless of authorization could be the difference between life and death. HTTPA would allow users to see exactly how healthcare data was used by medical professionals in such situations, as well as set data usage restrictions beforehand.

Seneviratne said she built the prototype for the Electronic Healthcare Records (EHR) system because of concerns about privacy and patients' right to information. Data transaction logs were stored across 300 servers on PlanetLab, ensuring that if some servers crashed, data would still be accessible. Multi-server storage also provided a way of determining whether data had been tampered with or deleted. If some logs differed from the vast majority of others, it would be a huge red flag.
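The majority comparison described here can be sketched in a few lines. This is an assumption-laden illustration rather than the prototype's code: the function names `digest` and `find_suspect_replicas`, the server labels, and the log format are all hypothetical.

```python
import hashlib
import json
from collections import Counter


def digest(log):
    # Serialize the log deterministically, then fingerprint it.
    payload = json.dumps(log, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def find_suspect_replicas(replicas):
    """Return the servers whose copy of the log differs from the majority copy."""
    digests = {server: digest(log) for server, log in replicas.items()}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(replicas):
        # Without a strict majority there is no trustworthy reference copy.
        raise ValueError("no majority copy of the log")
    return sorted(s for s, d in digests.items() if d != majority_digest)


good_log = [{"actor": "dr-lee", "action": "read"}]
replicas = {
    "server-a": good_log,
    "server-b": good_log,
    "server-c": [],  # hypothetical replica whose entries were deleted
}
print(find_suspect_replicas(replicas))
```

Comparing fingerprints rather than full logs keeps the check cheap even when the same log is replicated across hundreds of servers.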

"By enabling HTTPA to give a transparent view of the usages of their sensitive data, patients will have more trust, while allowing 'break glass' scenarios for the hospital personnel using the EHR," she said. "Since HTTPA has been demonstrated to work in this domain, we believe the healthcare industry will (hopefully) be the first to have a real software implementation."

Healthcare isn't the only industry that could find HTTPA useful. In the world of intellectual property, restricted access to creative content can also be a problem.

"If access to creative content is completely sealed off, a generative culture that results in even more creative things would not thrive," Seneviratne said. "So, at a conceptual level, HTTPA provides usage restriction communication, logging of data transactions and auditing that can detect any misuses after the fact."
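The three pieces Seneviratne lists (usage restrictions, transaction logging, and after-the-fact auditing) can be illustrated with a short sketch. Everything in it is hypothetical, including the names `TrackedResource` and `UsageRecord` and the purpose labels; it shows the general accountability pattern, not HTTPA's real interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class UsageRecord:
    actor: str
    purpose: str


@dataclass
class TrackedResource:
    uri: str
    allowed_purposes: frozenset  # the usage restriction attached to the data
    log: list = field(default_factory=list)

    def use(self, actor, purpose):
        # Accountability rather than access control: every use is recorded
        # and none is blocked, which leaves room for emergency access.
        self.log.append(UsageRecord(actor, purpose))

    def audit(self):
        # After the fact, report every use that fell outside the restriction.
        return [r for r in self.log if r.purpose not in self.allowed_purposes]


record = TrackedResource(
    uri="https://example.org/records/42",
    allowed_purposes=frozenset({"treatment"}),
)
record.use("dr-lee", "treatment")
record.use("insurer", "underwriting")
for violation in record.audit():
    print(violation.actor, violation.purpose)
```

The design choice worth noticing is that `use` never refuses a request. Misuse is caught by the audit, not prevented at the door, which matches the "break glass" scenario described for hospital staff.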

Seneviratne added that HTTPA can supplement access control mechanisms. It can also give data creators guidance, based on historical usage patterns, on which usage restrictions to attach to their data.

HTTPA would be voluntary, though Seneviratne and her fellow researchers hope the protocol will be widely adopted. Ultimately, it's up to software developers to stick to HTTPA's specifications when designing their systems.

"HTTPA is an end-to-end architecture, so changes should happen at both the client and the server level," she said. "We are working on making it easier for websites to adopt this easily."

For this to happen, Seneviratne said that the Provenance Tracking Network will have to grow to handle increasing amounts of data usage traffic.

"We are hoping that this network will grow over time with the support of the community," she said. "Once all these pieces align, any website should be able to use HTTPA without any problem."