Depersonalization of Speech Using Speaker-Specific Transform Based on Long-Term Spectrum


Authors

Rujzl, M.
Sigmund, M.

Publisher

Společnost pro radioelektronické inženýrství

Abstract

This paper introduces a novel approach for hiding personal information in speech signals. The proposed approach applies a warping function that is obtained individually for each speaker from a long-term linear prediction spectrum. The depersonalized speech was compared with speech processed by the commonly used technique based on vocal tract length normalization. The proposed approach performs a wider manipulation of the fundamental frequency and provides higher intelligibility, by 5% in clean speech and by 8% at a signal-to-noise ratio of 5 dB. It also significantly alters the derived glottal pulses, making them difficult to use for personality analysis. The speech intelligibility index and glottal pulse distortion are new aspects in the field of voice depersonalization.
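
The abstract does not specify how the long-term linear prediction spectrum or the warping function is computed. As a rough illustration of the first ingredient only, the sketch below estimates a long-term LP spectral envelope by averaging frame autocorrelations over a whole recording and solving the normal equations. The function name `long_term_lp_spectrum`, the prediction order, the frame sizes, and the averaging strategy are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def long_term_lp_spectrum(signal, order=12, frame_len=400, hop=200, n_fft=512):
    """Hypothetical helper: LP power envelope from frame-averaged autocorrelation.

    Assumption: "long-term" is modeled here by averaging the autocorrelation
    of all Hamming-windowed frames; the paper may define it differently.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(frame_len)
    acc = np.zeros(order + 1)
    n_frames = 0
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        full = np.correlate(frame, frame, mode="full")
        # Index frame_len - 1 of the full correlation is lag 0.
        acc += full[frame_len - 1:frame_len + order]
        n_frames += 1
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    r = acc / n_frames

    # Toeplitz normal equations R a = r[1:], lightly regularized for stability.
    idx = np.arange(order)
    R = r[np.abs(idx[:, None] - idx[None, :])]
    R = R + 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:order + 1])

    # Inverse filter A(z) = 1 - sum_k a_k z^-k, evaluated on n_fft // 2 + 1 bins.
    A = np.fft.rfft(np.concatenate(([1.0], -a)), n_fft)
    gain = r[0] - np.dot(a, r[1:order + 1])  # prediction error power
    return gain / np.abs(A) ** 2
```

A speaker-specific warping function could then be derived from features of such an envelope, but that mapping is described only in the full text and is not reproduced here.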

Citation

Radioengineering. 2023, vol. 32, no. 4, pp. 523-530. ISSN 1210-2512
https://www.radioeng.cz/fulltexts/2023/23_04_0523_0530.pdf

Document type

Peer-reviewed

Document version

Published version

Language of document

en

Creative Commons license

Except where otherwise noted, this item's license is described as Creative Commons Attribution 4.0 International license