Automatically Answering People's Privacy Questions
As novel technologies collect increasingly large and diverse amounts of data about us, people are unable to keep up and retain control over what happens to their data. The current legal approach to privacy centers on the concept of "Notice and Choice", namely the expectation that people are given sufficient information about the collection and use of their data and are offered meaningful choices about these practices (e.g., opt out, opt in). This approach relies primarily on privacy policies to communicate this information to people. In practice, these policies tend to be long, vague and ambiguous. Not surprisingly, few people find the time to read them, and those who do often struggle to understand what they say. This multi-disciplinary project aims to develop novel technology that will help people regain a sense of control by letting them simply ask questions about the privacy issues that matter to them, rather than requiring them to read long, one-size-fits-all privacy policies. In addition to producing new knowledge and technologies and contributing to improving the state of privacy in the United States, this project will create education and research opportunities for both undergraduate and graduate students at participating universities, including activities to broaden participation of women and under-represented minorities in this important area of computer science. It will also contribute to the development of technologies with the potential to help the visually impaired take advantage of information found in the text of privacy policies.
This multi-disciplinary project builds on recent advances in natural language processing, machine learning, code analysis and user modeling to re-invent notice and choice, moving from long and hard-to-understand notices to interactive privacy dialogues with users. An important part of this research involves the development of question answering functionality that enables users to ask questions about the issues that truly matter to them, rather than presenting them with one-size-fits-all privacy notices. Another part involves supplementing disclosures found in privacy policies with additional sources of information, such as background knowledge (e.g., knowledge about common data practices and relevant laws) and code analysis, to disambiguate statements and provide additional details to users when it matters (e.g., with whom their data is actually shared). This research will be guided by user-centered design methodologies in which the design of novel technologies is informed by findings from human subject studies, and in which technologies are deployed and evaluated in increasingly rich and realistic scenarios. Products of this research will include prototype privacy question answering functionality, as well as technology to automatically extract information about data collection and use practices from both the text of privacy policies and from code.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.