Speech, Speaker and Speaker's Gender Identification in Automatically Processed Broadcast Stream

Silovsky, Jan; Nouza, Jan

dc.contributor.author	Silovsky, Jan
dc.contributor.author	Nouza, Jan
dc.coverage.issue	3	cs
dc.coverage.volume	15	cs
dc.date.accessioned	2016-04-22T06:15:03Z
dc.date.available	2016-04-22T06:15:03Z
dc.date.issued	2006-09	cs
dc.description.abstract	This paper presents a set of techniques for classifying audio segments in a system for automatic transcription of broadcast programs. The task consists of deciding a) whether a segment should be labeled as speech or non-speech, and in the former case, b) whether the speaker is one of the speakers in the database, and if not, c) the speaker's gender. The classification result is used to extend the information provided by the transcription system and also to enhance the performance of the speech recognition module. Like most state-of-the-art speaker recognition systems, the proposed one is based on Gaussian Mixture Models (GMM). As the number of database speakers can be large, we introduce a technique that significantly speeds up the identification process. Furthermore, we compare several approaches to the estimation of GMM parameters. Finally, we present the results achieved in classifying 230 minutes of real broadcast data.	en
dc.format	text	cs
dc.format.extent	42-48	cs
dc.format.mimetype	application/pdf	en
dc.identifier.citation	Radioengineering. 2006, vol. 15, no. 3, pp. 42-48. ISSN 1210-2512	cs
dc.identifier.issn	1210-2512
dc.identifier.uri	http://hdl.handle.net/11012/57963
dc.language.iso	en	cs
dc.publisher	Radioengineering Society	cs
dc.relation.ispartof	Radioengineering	cs
dc.relation.uri	http://www.radioeng.cz/fulltexts/2006/06_03_42_48.pdf	cs
dc.rights	Creative Commons Attribution 3.0 Unported License	en
dc.rights.access	openAccess	en
dc.rights.uri	http://creativecommons.org/licenses/by/3.0/	en
dc.subject	Speaker recognition	en
dc.subject	Gaussian mixture models	en
dc.subject	broadcast speech transcription	en
dc.title	Speech, Speaker and Speaker's Gender Identification in Automatically Processed Broadcast Stream	en
dc.type.driver	article	en
dc.type.status	Peer-reviewed	en
dc.type.version	publishedVersion	en
eprints.affiliatedInstitution.faculty	Fakulta elektrotechniky a komunikačních technologií	cs

Files

Original bundle

Name: 06_03_42_48.pdf
Size: 264.99 KB
Format: Adobe Portable Document Format

Collections

2006/3