This article discusses a study aimed at protecting user-side privacy in online self-disclosure. The researchers developed a taxonomy of 19 self-disclosure categories and fine-tuned a language model to identify disclosure spans, achieving a token-level F1 score above 75%. In a user study, 82% of participants viewed the model positively, indicating its potential for real-world applicability. The study also introduced the task of self-disclosure abstraction and tested different fine-tuning strategies. According to human evaluation, the best model generated diverse abstractions that moderately reduced privacy risk while maintaining high utility.
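Token-level F1, the detection metric cited above, scores the overlap between predicted and gold disclosure spans at the granularity of individual tokens. A minimal sketch of how such a metric can be computed, assuming a simple binary per-token labeling scheme (1 = token inside a disclosure span, 0 = outside); the function name and example labels are illustrative, not from the paper:

```python
def token_f1(gold, pred):
    """Precision, recall, and F1 over binary per-token labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Hypothetical example: 8 tokens, gold spans at positions 3 and 6-7;
# the model recovers positions 3 and 6 but misses 7.
gold = [0, 0, 0, 1, 0, 0, 1, 1]
pred = [0, 0, 0, 1, 0, 0, 1, 0]
p, r, f1 = token_f1(gold, pred)  # precision 1.0, recall 2/3, F1 0.8
```

A token-level metric gives partial credit for incomplete span matches, which is gentler than exact span matching when disclosure boundaries are ambiguous.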


Publication date: 17 Nov 2023
Project Page: not provided
Paper: https://arxiv.org/pdf/2311.09538