Speech production under stress for machine learning: multimodal dataset of 79 cases and 8 signals

Early identification of cognitive or physical overload is critical in fields where human decision making matters when preventing threats to safety and property. Pilots, drivers, surgeons, and operators of nuclear plants are among those affected by this challenge, as acute stress can impair their cognition. In this context, the significance of paralinguistic automatic speech processing increases for early stress detection. The intensity, intonation, and cadence of an utterance are examples of paralinguistic traits that determine the meaning of a sentence and are often lost in the verbatim transcript. To address this issue, tools are being developed to recognize paralinguistic traits effectively. However, a data bottleneck still exists in the training of paralinguistic speech traits, and the lack of high-quality reference data for the training of artificial systems persists. Regarding this, we present an original empirical dataset collected using the BESST experimental protocol for capturing speech signals under induced stress. With this data, our aim is to promote the development of pre-emptive intervention systems based on stress estimation from speech.
Early identification of cognitive or physical overload is critical in fields where human decision making matters when preventing threats to safety and property. Pilots, drivers, surgeons, and operators of nuclear plants are among those affected by this challenge, as acute stress can impair their cognition. In this context, the significance of paralinguistic automatic speech processing increases for early stress detection. The intensity, intonation, and cadence of an utterance are examples of paralinguistic traits that determine the meaning of a sentence and are often lost in the verbatim transcript. To address this issue, tools are being developed to recognize paralinguistic traits effectively. However, a data bottleneck still exists in the training of paralinguistic speech traits, and the lack of high-quality reference data for the training of artificial systems persists. Regarding this, we present an original empirical dataset collected using the BESST experimental protocol for capturing speech signals under induced stress. With this data, our aim is to promote the development of pre-emptive intervention systems based on stress estimation from speech.

Keywords

speech , stress , machine learning<br> , speech , stress , machine learning<br>

Citation

Scientific Data. 2024, vol. 11, issue 1, p. 1-9.
https://www.nature.com/articles/s41597-024-03991-w

Document type

Peer-reviewed

Document version

Published version

Language of document

en

DOI

10.1038/s41597-024-03991-w

URI

http://hdl.handle.net/11012/250852

Collections

Ústav automatizace inženýrských úloh a informatiky
Ústav biomedicínského inženýrství
Ústav počítačové grafiky a multimédií

Creative Commons license

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

Except where otherwised noted, this item's license is described as Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

Citace PRO

Full item page

Speech production under stress for machine learning: multimodal dataset of 79 cases and 8 signals

Files

Date

Authors

Advisor

Referee

Mark

Journal Title

Journal ISSN

Volume Title

Publisher

ORCID

Altmetrics

Abstract

Description

Keywords

Citation

Document type

Document version

Date of access to the full text

Language of document

Study field

Comittee

Date of acceptance

Defence

Result of defence

DOI

URI

Collections

Endorsement

Review

Supplemented By

Referenced By

Creative Commons license

Citace PRO