Unit information: Speech and Audio Processing in 2020/21

Unit name Speech and Audio Processing
Unit code EENGM1411
Credit points 10
Level of study M/7
Teaching block(s) Teaching Block 2 (weeks 13 - 24)
Unit director Dr. Hill
Open unit status Not open
Pre-requisites None

Co-requisites None

School/department School of Civil, Aerospace and Design Engineering
Faculty Faculty of Engineering

Description including Unit Aims

This unit will cover speech and audio processing techniques widely used in multimedia engineering. The first part of the course will provide a brief description of the human auditory system and speech production mechanism. The second part will deal with compression techniques for speech including LPC analysis and CELP coders. Wideband audio compression schemes will also be covered, as exemplified by MP3 compression. A description of the compression algorithms featured in some of the international multimedia coding standards will be provided. The final part of the course will examine specific audio applications such as 3D audio, time stretching (using the phase vocoder) and some specific music synthesis techniques such as subtractive and FM synthesis.
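
As a concrete illustration of the LPC analysis mentioned above, the minimal Python sketch below fits an all-pole model to a single windowed frame using the autocorrelation method. It is not taken from the unit materials; the sample rate, frame length, model order and test signal are hypothetical values chosen for the example.

    import numpy as np

    def lpc(frame, order=10):
        # Autocorrelation of the windowed frame for lags 0..order
        r = np.array([np.dot(frame[:len(frame) - k], frame[k:]) for k in range(order + 1)])
        # Solve the Yule-Walker normal equations R a = r for the predictor coefficients
        R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
        return np.linalg.solve(R, r[1:])

    # Hypothetical example: a 25 ms frame of a noisy 200 Hz tone sampled at 8 kHz
    fs = 8000
    t = np.arange(int(0.025 * fs)) / fs
    frame = np.hamming(len(t)) * (np.sin(2 * np.pi * 200 * t) + 0.01 * np.random.randn(len(t)))
    a = lpc(frame, order=10)  # 1 / (1 - sum_k a[k] z^-k) is the all-pole vocal-tract model
    print(a)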

Elements:

Speech and its Characteristics

  • Audio system fundamentals: Phase vocoder, spectrograms, DSP review.
  • Historical review: Music Synthesis, Music Analysis, Speech Synthesis.
  • Acoustics: The wave equation, acoustic tubes, reflections & resonance, oscillations & musical acoustics, spherical waves & room acoustics.
  • Auditory System: Psychophysics, auditory scene analysis.
  • Speech models / speech analysis and synthesis: LPC and cepstrum analysis. The use of HMMs for speech recognition.
  • Compression / Coding: CELP coders, multi-rate and wideband compression. MUSICAM and MPEG audio coding schemes.
  • Music analysis and recognition: Transcription, summarization, and similarity.
  • 3D Audio: Head Related Transfer Functions (HRTFs), using OpenAL.
  • Synthesis: Subtractive, additive, FM, wavetable and granular synthesis (a brief FM example is sketched after this list).
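
As a brief illustration of the FM synthesis item above, the Python sketch below implements classic two-operator (Chowning-style) FM. It is not part of the unit materials; the sample rate, carrier and modulator frequencies and the index envelope are hypothetical values chosen for the example.

    import numpy as np

    fs = 44100                         # sample rate in Hz (assumed)
    t = np.arange(int(fs * 1.0)) / fs  # one second of samples

    fc = 440.0                         # carrier frequency (A4)
    fm = 220.0                         # modulator frequency (2:1 carrier:modulator ratio)
    index = 3.0 * np.exp(-4.0 * t)     # decaying modulation index: bright attack, mellow decay

    # Chowning-style FM: the modulator deviates the instantaneous phase of the carrier
    y = np.sin(2.0 * np.pi * fc * t + index * np.sin(2.0 * np.pi * fm * t))

Raising the modulation index widens the sideband spectrum, so shaping the index over time is what gives FM instruments their characteristic evolving timbres.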

Intended Learning Outcomes

On successful completion of the unit a student will be able to:

  1. explain the different aspects of speech production, coding and recognition;
  2. explain the fundamentals of speech and audio coding systems;
  3. explain the algorithmic details of various international standards for speech and audio coding;
  4. design musical effects processing and synthesis systems.

Teaching Information

Teaching will be delivered through a combination of synchronous and asynchronous sessions, including lectures, practical activities supported by drop-in sessions, problem sheets and self-directed exercises.

Assessment Information

The intended learning outcomes (ILOs) will be assessed via an exam.

Reading and References

L. Rabiner and B. Juang, Fundamentals of Speech Recognition, Prentice-Hall Signal Processing Series, 1993.

B. Gold and N. Morgan, Speech and Audio Signal Processing, John Wiley & Sons, 1999.