Introduction to Audio Segmentation and Classification

868 views, July 2, 2013

This introductory talk gives students a background on audio segmentation: a description of the task, its applications, the main issues to deal with, and the evaluation framework as defined in the Albayzin evaluations. It then introduces the most common audio features (MFCCs and their derivatives, LPCs, etc.) and the main state-of-the-art approaches to audio segmentation and classification (the BIC algorithm, HMM segmentation, SVMs). Next, fusion techniques for combining different audio segmentation systems are presented. Finally, the Albayzin Audio Segmentation Evaluations carried out in 2010 and 2012 are described: the task, datasets, evaluation metrics, and the performance achieved by the proposed systems. The talk closes with a brief summary of the main issues found by the participants.

Laura Docio Fernández
Multimedia Technologies Group (GTM), AtlantTIC Research Center, University of Vigo
Paula López Otero
Multimedia Technologies Group (GTM), AtlantTIC Research Center, University of Vigo

Welcome

Edita de Lorenzo - Director of the Escola de Enxeñaría de Telecomunicación

Emotional Speech Systems: What is an emotional system?

Juan Manuel Montero Martínez - Speech Technology Group (GTH), Department of Electronic Engineering (IEL)

Introduction to Audio Segmentation and Classification

Laura Docio Fernández - Multimedia Technologies Group (GTM), AtlantTIC Research Center

Implementation of Segmentation and Classification at the Same Time

Laura Docio Fernández - Multimedia Technologies Group (GTM), AtlantTIC Research Center

An Overview of the NIST Series of Speaker Recognition Evaluations and Technologies

Joaquin González-Rodríguez - Biometric Recognition Group – ATVS, Escuela Politécnica Superior

Session Variability Compensation in Speaker Recognition

Javier Gonzalez-Dominguez - Biometric Recognition Group – ATVS, Escuela Politécnica Superior

Speech technologies: research opportunities at Vicomtech-IK4

Arantza del Pozo - Head of the Human Speech and Language Technology Group

Evaluation of Spoken Language Recognition Systems: Tasks, applications, general issues and acoustic approaches

Luis Javier Rodríguez Fuentes - Software Technologies Working Group (GTTS), Department of Electricity and Electronics (ZTF-FCT)

Evaluation of Spoken Language Recognition Systems: Phonotactic approaches, backend and fusion

Luis Javier Rodríguez Fuentes - Software Technologies Working Group (GTTS), Department of Electricity and Electronics (ZTF-FCT)