Surfacing Privacy Settings Using Semantic Matching

Published in Proceedings of the Second Workshop on Privacy in NLP, 2020

Recommended citation: Rishabh Khandelwal, Asmit Nayak*, Yao Yao*, Kassem Fawaz. (2020). "Surfacing Privacy Settings Using Semantic Matching" Proceedings of the Second Workshop on Privacy in NLP. https://www.aclweb.org/anthology/2020.privatenlp-1.4.pdf

Online services use privacy settings to give users control over their data. However, these settings are often hard to locate, causing users to rely on provider-chosen default values. In this work, we train privacy-settings-centric encoders and leverage them to create an interface that allows users to search for privacy settings using free-form queries. To achieve this, we create a custom semantic similarity dataset, which consists of real user queries covering various privacy settings. We then use this dataset to fine-tune state-of-the-art encoders. Using these fine-tuned encoders, we perform semantic matching between user queries and privacy settings to retrieve the most relevant setting. Finally, we use these encoders to generate embeddings of privacy settings from the top 100 websites and perform unsupervised clustering to learn about the types of online privacy settings.
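
The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `embed` function below is a toy bag-of-words stand-in for the fine-tuned encoders, and the names `embed`, `cosine`, `rank_settings`, and the example `SETTINGS` list are all hypothetical. It only shows the general shape of semantic matching: embed the query and each setting, score each pair by cosine similarity, and return the settings ranked by score.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in encoder (assumption): lowercase bag-of-words counts.

    A real system would call a fine-tuned sentence encoder here instead.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_settings(query, settings):
    """Return (score, setting) pairs sorted from most to least similar."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(s)), s) for s in settings]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical privacy settings, invented for illustration only.
SETTINGS = [
    "Turn off personalized ads",
    "Delete your search history",
    "Control who can see your profile",
]

best_score, best_setting = rank_settings(
    "how do I delete my search history", SETTINGS
)[0]
print(best_setting)
```

Because the toy encoder only counts shared words, it misses paraphrases such as "erase" versus "delete"; closing that gap is the role the abstract assigns to the fine-tuned encoders.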

Download paper here