AudioLabs - Course: Speech Coding, Summer Term 2016

Past course
Please visit the education page for information on current courses.

Lecturer

Prof. Dr. Tom Bäckström

Guest Lectures

tbd

Time

Summer Term 2016, most likely on Mondays, 14-16 (alternatively on Thursdays, 16-18)

Place

Am Wolfsmantel 33, Erlangen-Tennenlohe, Room 3R4.04

Registration

Please come to the first lecture on Monday, 11.04.2016, 14:15 - 16:00, Room 3R4.04 (Am Wolfsmantel 33). If you are unable to attend, please contact Prof. Dr. Tom Bäckström.

News and announcements

Up-to-date information and announcements will be provided via StudOn.

Content

Mobile phones – everyone has one. With 7 billion mobile phones in use, digital speech transmission is a truly global technology. Your grandma has one, Prince Charles has one and the poorest village in Africa has one. The technology clearly works already, but in a market of this size even the smallest improvement, multiplied by 7 billion, has a huge impact worldwide.

Speech coding refers to the digital compression and transmission of speech. This course provides an in-depth perspective on ACELP (algebraic code-excited linear prediction), the most commonly used speech coding algorithm. We will study the speech production models on which it is based and the perceptual models used for its optimization and, most importantly, go through the theory and practice of its most important concepts: linear prediction (LP), long-term prediction (LTP), algebraic codebooks, line spectral frequencies (LSFs) and windowing. In addition, we will look at the big picture: the additional challenges that emerge when building a commercial speech coding product.

The goal of this course is to provide a strong foundation for researchers, engineers, and graduate students who are interested in the problem of speech coding.

Tentative Schedule

11.4 -- Introduction
18.4 -- Speech Production and Perception
25.4 -- Envelopes
2.5 -- Windowing
9.5 -- Residual Modelling & Fundamental Frequency
16.5 -- bank holiday
23.5 -- Quality Evaluation
30.5 -- Relaxed Modelling (RCELP)
6.6 -- Systems Design, Constraints and Implementation
13.6 -- Voice Activity Detection
20.6 -- Packet Loss
27.6 -- Advanced tools
4.7 -- Speech Coding Standards
11.7 -- On reserve
Oral exams 11-15.7.

Course requirements

This course is the most advanced course offered by the university on this topic and serves as an excellent basis from which to commence research in the area. Various aspects of the course bring students up to date with the latest developments in the field, as seen in recent international standards, conferences and journals. This course builds on Sprach- und Audiosignalverarbeitung (by Prof. Kellermann) and is well complemented by Mensch-Maschine-Schnittstelle (by Prof. Rabenstein), Praxis der Audiodatenkompression (Dr. Grill), Speech Enhancement (Prof. Habets) and Selected Topics in Perceptual Audio Coding (Prof. Herre), which deal with many other signal processing methods and give an understanding of human auditory perception (also a key part of speech coding) and audio compression techniques.

Prerequisites: students must be familiar with signals and systems as well as basic linear algebra and statistics. Taking Prof. Kellermann's course on basic speech and audio processing beforehand is highly advisable.

Course material

If you are missing handouts or chapter printouts, please contact the course assistant (yet to be chosen) or the lecturer (tom.backstrom@audiolabs-erlangen.de).

International Audio Laboratories Erlangen