Seattle, WA, USA  /  June 18, 2024

Workshop on Media Forensics @ CVPR 2024

The Workshop on Media Forensics at CVPR 2024 takes place on June 18, 2024, in Seattle, WA, USA. Fraunhofer IDMT will present its current research in media forensics there, with contributions on synthetic speech detection and audio provenance analysis.

Audio Provenance Analysis in Heterogeneous Media Sets

Milica Gerhardt, Luca Cuccovillo, Patrick Aichroth

This paper introduces a framework for Audio Provenance Analysis, addressing the complex challenge of analyzing heterogeneous sets of audio items without requiring any prior knowledge of their content. Our framework applies a novel approach that combines partial audio matching and phylogeny techniques. It constructs directed acyclic graphs to capture the origins and the evolution of content within near-duplicate audio clusters, identifying the least altered versions and tracing the reuse of content within these clusters. The approach is evaluated for two selected application scenarios, demonstrating that it can accurately determine the direction of content reuse and identify parent-child relationships, while also offering a dedicated dataset for benchmarking future research in this area.

Audio Transformer for Synthetic Speech Detection via Multi-Formant Analysis

Luca Cuccovillo, Milica Gerhardt, Patrick Aichroth

This paper introduces a novel multi-task transformer for detecting synthetic speech. The network encodes magnitude and phase of the input speech with a feature bottleneck, used to autoencode the input magnitude, to predict the trajectory of the first phonetic formants (F0, F1, F2), and to distinguish whether the input speech is synthetic or natural. The approach achieves state-of-the-art performance on the ASVspoof 2019 LA dataset with an AUC score of 0.932, while ensuring interpretability at the same time.