Mind Your Own Business
Ensuring privacy in a world of constant monitoring
While the increasing amount of data captured in the urban environment holds great promise for shaping life in the cities of tomorrow, there is also growing unease over the volume and nature of data that is being built up about individual citizens. Efforts made for the benefit of the many – such as installing sensors, adding GPS to taxis, tracking transit cards – run the risk of being used to focus on the actions of the few.
To address these concerns, and limit the potential for misuse of data, Warwick researchers have been developing technologies for anonymisation of data.
These are intended to make it impossible to isolate the signals of individuals in collected data. There is a tightrope to walk here: we want to mask the pattern of any individual, yet still capture the true patterns of group behaviour.
Anonymisation turns out to be a challenging problem. Simple, well-intentioned approaches usually turn out to be deficient, sometimes in subtle ways. Consider collecting GPS readings from smartphones as people go about their day. It might seem that if we remove the details of who provided each trace, their movements are protected. But humans are creatures of habit, and most daily traces begin and end in a residential area, from which a home address, and hence a name, can be inferred.
This name can then be attached to the rest of the trace, showing where else that individual travelled.
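To make the weakness concrete, here is a minimal sketch of how a home location might be guessed from traces that carry no name. It is an illustration only, not a method described by the researchers: the function name `likely_home_cell`, the `cell_size` parameter and the idea of snapping points to a coarse grid are all assumptions made for this example, as is the premise that the traces come from one unnamed device.

```python
from collections import Counter

def likely_home_cell(daily_traces, cell_size=0.001):
    """Guess a home location from nameless daily GPS traces (hypothetical helper).

    daily_traces: list of traces, each a list of (lat, lon) tuples in time order,
    assumed to come from the same unnamed device.
    cell_size: grid resolution in degrees (an assumption for this sketch).
    """
    endpoints = Counter()
    for trace in daily_traces:
        if not trace:
            continue
        # Only the first and last point of each day are examined.
        for lat, lon in (trace[0], trace[-1]):
            cell = (round(lat / cell_size), round(lon / cell_size))
            endpoints[cell] += 1
    if not endpoints:
        return None
    # The most frequent start/end cell is the best guess for "home".
    (cell_lat, cell_lon), _ = endpoints.most_common(1)[0]
    return (cell_lat * cell_size, cell_lon * cell_size)
```

Given a handful of days of data, the most common start and end cell tends to stand out clearly, which is why simply deleting the name from each trace offers little protection.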
Instead, more complex approaches are required which draw on concepts from mathematics, computer science and statistics to add just enough random fluctuation to data to mask the individual’s information. Models such as “differential privacy” set a standard for data masking, and each new type of data (sensor, GPS, transit) then requires novel methods to reach this standard.
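As a rough illustration of what "just enough random fluctuation" can mean, the sketch below applies the Laplace mechanism, a standard textbook way of meeting the differential privacy standard for a simple counting question. It is not taken from the research described here; the names `laplace_noise` and `private_count` and the privacy parameter `epsilon` are assumptions chosen for the example.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on zero with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate` (hypothetical helper).

    Adding or removing one record changes a count by at most 1, so Laplace
    noise with scale 1/epsilon is the usual choice for this kind of query.
    Smaller epsilon means more noise and stronger masking.
    """
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

A single answer is blurred enough that the presence or absence of any one person is hard to detect, while the noise stays small compared with counts over large groups.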
Professor Graham Cormode (Department of Computer Science, University of Warwick) has been developing techniques that can help anonymise urban data. This work is in conjunction with a team of international privacy experts from universities such as Duke and North Carolina State in the US and Nanyang Technological University in Singapore, and from businesses such as AT&T. These efforts have resulted in state-of-the-art tools for handling data warehouses and data that represents spatial distributions (maps). Ongoing work is addressing the case of mobility data: the trajectories that individuals leave as they move around geographic areas. The ultimate aim of this research is to develop general tools which can be used to anonymise new types of data effectively. The approach is to build effective, lightweight models of data that capture the core characteristics of the data distribution. The anonymisation then gently perturbs the model parameters to hide the contribution of individuals, while remaining true to the original input.
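One very simple way to picture "a lightweight model with gently perturbed parameters" is a coarse grid laid over a map, where the model parameters are the number of points in each cell and a little noise is added to each. The sketch below is an assumption-laden illustration, not the tools developed in this research: the function `noisy_grid_counts`, its arguments, and the premise that each individual contributes exactly one point are all hypothetical.

```python
import random

def noisy_grid_counts(points, rows, cols, bounds, epsilon, rng=None):
    """Summarise (x, y) points as a rows x cols grid of noisy counts (hypothetical helper).

    bounds: (min_x, min_y, max_x, max_y) of the mapped area.
    Assumes each individual contributes one point, so changing one person
    alters a single cell count by at most 1.
    """
    if rng is None:
        rng = random.Random()
    min_x, min_y, max_x, max_y = bounds
    counts = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Map the point to a cell, clamping so edge points stay inside the grid.
        col = int((x - min_x) / (max_x - min_x) * cols)
        row = int((y - min_y) / (max_y - min_y) * rows)
        col = max(0, min(col, cols - 1))
        row = max(0, min(row, rows - 1))
        counts[row][col] += 1
    scale = 1.0 / epsilon
    # Perturb every cell with zero-centred Laplace noise
    # (difference of two exponential draws with mean `scale`).
    return [
        [n + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale) for n in row]
        for row in counts
    ]
```

Busy cells remain recognisably busy and quiet cells remain quiet, so the overall shape of the distribution survives even though no single cell count can be trusted to the last unit.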