Proactive Detection of Voice Cloning with Localized Watermarking

The article introduces AudioSeal, a novel technique designed specifically for localized detection of AI-generated speech, in response to the growing security concerns raised by advancements in speech generative models. AudioSeal employs a generator/detector architecture and presents a novel perceptual loss inspired by auditory masking. It offers state-of-the-art performance in terms of robustness to real-life audio manipulations and imperceptibility, based on automatic and human evaluation metrics. Moreover, AudioSeal’s fast, single-pass detector surpasses existing models in detection speed, making it suitable for large-scale and real-time applications.

Publication date: 31 Jan 2024
Project Page: github.com/facebookresearch/audioseal
Paper: https://arxiv.org/pdf/2401.17264

Post Views: 249

Proactive Detection of Voice Cloning with Localized Watermarking

root

Leave a Reply Cancel reply

Press ESC to close

Share Article:

root

Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

Leave a Reply Cancel reply

Please allow ads on our site