Researchers in the Decentralized Information Group (DIG) at MIT are developing a protocol they call "HTTP with Accountability," or HTTPA, which will automatically monitor the transmission of private data and allow the data owner to examine how it's being used.
With HTTPA, each item of private data would be assigned its own uniform resource identifier (URI), a key component of the Semantic Web, a set of technologies championed by the W3C that would convert the Web from, essentially, a collection of searchable text files into a giant database.
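The URI assignment described above can be sketched in a few lines. This is a minimal illustration, not HTTPA's actual scheme: the base address and function name are hypothetical, and a real deployment would mint identifiers within its own namespace.

```python
import uuid

def mint_data_uri(base="https://records.example.org/data/"):
    """Return a fresh, globally unique URI for a new item of private data.

    Audit logs can then refer to the item by this URI alone, without
    ever storing the sensitive content itself.
    """
    return base + uuid.uuid4().hex

record_uri = mint_data_uri()
```

Because the URI, not the data, is what gets logged, the audit trail can be replicated widely without spreading the private information it tracks.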
Remote access to a Web server would be controlled much the way it is now, through passwords and encryption. But every time the server transmitted a piece of sensitive data, it would also send a description of the restrictions on the data's use. And it would log the transaction, using only the URI, somewhere in a network of encrypted, special-purpose servers.
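The server-side behavior just described, attaching usage restrictions to each transmission and logging the access by URI only, might look roughly like the following sketch. The in-memory list stands in for the network of encrypted audit servers, and all names here are illustrative rather than part of the protocol.

```python
import json
import time

# Stand-in for the network of encrypted, special-purpose audit servers.
AUDIT_LOG = []

# Usage restrictions attached to each item of private data, keyed by URI.
RESTRICTIONS = {
    "https://records.example.org/data/123": ["clinical-use-only", "no-redistribution"],
}

def serve_sensitive_data(uri, requester, payload):
    """Transmit a piece of sensitive data along with its usage restrictions,
    and log the transaction using only the URI (never the payload)."""
    response = {
        "body": payload,
        "headers": {"X-Usage-Restrictions": json.dumps(RESTRICTIONS.get(uri, []))},
    }
    AUDIT_LOG.append({"uri": uri, "requester": requester, "time": time.time()})
    return response
```

Note that the log entry contains no sensitive content: it records who touched which URI and when, which is all a later audit needs.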
When the data owner requests an audit, the network of servers works through the chain of derivations, identifying everyone who has accessed the data and what they've done with it.
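Working through the chain of derivations amounts to a graph traversal: starting from the owner's original record, follow "derived-from" links to every descendant, then gather the access-log entries for all of those URIs. A minimal sketch, with hypothetical URIs and data structures:

```python
# child URI -> parent URI it was derived from (illustrative data).
DERIVED_FROM = {
    "uri:specialist-notes": "uri:pcp-record",
}

# Access-log entries, each referring to data only by URI.
ACCESS_LOG = [
    {"uri": "uri:pcp-record", "requester": "specialist"},
    {"uri": "uri:specialist-notes", "requester": "insurer"},
]

def derivations_of(root):
    """All URIs reachable from root via derived-from links, including root."""
    found = {root}
    changed = True
    while changed:
        changed = False
        for child, parent in DERIVED_FROM.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found

def audit(root):
    """Every logged access to the root record or anything derived from it."""
    uris = derivations_of(root)
    return [entry for entry in ACCESS_LOG if entry["uri"] in uris]
```

Here an audit of `uri:pcp-record` would surface the insurer's access to the specialist's derived notes as well, which is the point of tracking derivations.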
An HTTPA-compliant program also incurs certain responsibilities if it reuses data supplied by another HTTPA-compliant source. Suppose, for instance, that a consulting specialist in a network of physicians wishes to access data created by a patient's primary-care physician (PCP), and suppose that she wishes to augment the data with her own notes. Her system would then create its own record, with its own URI. But using standard Semantic Web techniques, it would mark that record as "derived" from the PCP's record and label it with the same usage restrictions.
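The derivation step might be expressed as a handful of triples, in the Semantic Web style of subject-predicate-object statements. The sketch below uses plain Python tuples in place of a real RDF store, and the predicate names echo common vocabulary (e.g. PROV's `wasDerivedFrom`) rather than HTTPA's actual terms:

```python
# A toy triple store: each entry is (subject, predicate, object).
triples = set()

def create_derived_record(new_uri, source_uri, restrictions):
    """Mark new_uri as derived from source_uri and copy its usage restrictions."""
    triples.add((new_uri, "prov:wasDerivedFrom", source_uri))
    for r in restrictions:
        triples.add((new_uri, "httpa:usageRestriction", r))

# The specialist's notes inherit the PCP record's restrictions.
create_derived_record("uri:specialist-notes", "uri:pcp-record",
                      ["clinical-use-only", "no-redistribution"])
```

Because the derivation link and the restrictions are ordinary machine-readable statements, any later audit can follow them automatically.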
Oshani Seneviratne, an MIT graduate student in electrical engineering and computer science, and Lalana Kagal, a principal research scientist at CSAIL, will present a paper at the IEEE's Conference on Privacy, Security and Trust in July giving an overview of HTTPA, with sample applications such as an experimental health-care records system.
Seneviratne uses a technology known as distributed hash tables - the technology at the heart of peer-to-peer networks like BitTorrent - to distribute the transaction logs among the servers. Redundant storage of the same data on multiple servers serves two purposes: First, it ensures that if some servers go down, data will remain accessible. And second, it provides a way to determine whether anyone has tried to tamper with the transaction logs for a particular data item - such as deleting the record of an illicit use. A server whose logs differ from those of its peers would be easy to ferret out.
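The tamper-detection idea, that a server whose logs differ from its peers gives itself away, can be sketched as a simple majority comparison across replicas. The server names and log entries below are made up; a real system would locate the replicas via the distributed hash table:

```python
from collections import Counter

def find_tampered(replicas):
    """Given {server: tuple_of_log_entries}, return servers whose copy of the
    log disagrees with the majority of their peers."""
    majority, _ = Counter(replicas.values()).most_common(1)[0]
    return [server for server, log in replicas.items() if log != majority]

replicas = {
    "server-a": ("access:alice", "access:bob"),
    "server-b": ("access:alice", "access:bob"),
    "server-c": ("access:alice",),  # record of Bob's access has been deleted
}
# find_tampered(replicas) -> ['server-c']
```

Deleting the record of an illicit use would thus require compromising a majority of the servers holding that item's log, not just one.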
"It's not that difficult to transform an existing website into an HTTPA-aware website," Seneviratne says. "On every HTTP request, the server should say, 'OK, here are the usage restrictions for this resource,' and log the transaction in the network of special-purpose servers."
Audit servers could be maintained by a grassroots network, much like the servers that host BitTorrent files or log Bitcoin transactions.