
Unit information: Speech and Audio Processing in 2013/14

Please note: you are viewing unit and programme information for a past academic year. Please see the current academic year for up-to-date information.

Unit name Speech and Audio Processing
Unit code EENGM1411
Credit points 10
Level of study M/7
Teaching block(s) Teaching Block 2 (weeks 13 - 24)
Unit director Dr. Hill
Open unit status Not open
Pre-requisites

EENG31400 Digital Filters and Spectral Analysis 3

Co-requisites

None

School/department School of Civil, Aerospace and Design Engineering
Faculty Faculty of Engineering

Description including Unit Aims

This unit will cover speech and audio processing techniques widely used in multimedia engineering. The first part of the course will provide a brief description of the human auditory system and the speech production mechanism. The second part will deal with compression techniques for speech, including LPC analysis, CELP coders, wavelet coders, and multi-rate and wideband compression. Sub-band, MUSICAM and MPEG audio coding schemes will also be covered, and a description of the compression algorithms featured in some of the international multimedia coding standards will be provided. The final part of the course will deal with methods of computer music synthesis, such as granular synthesis, sample synthesis and physical modelling techniques.

Elements:

Speech and its Characteristics

  • Audio system fundamentals: Phase vocoder, spectrographs, DSP review.
  • Historical review: Music Synthesis, Music Analysis, Speech Synthesis.
  • Acoustics: The wave equation, acoustic tubes, reflections & resonance, oscillations & musical acoustics, spherical waves & room acoustics.
  • Auditory System: Psychophysics, auditory scene analysis.
  • Speech models / speech analysis and synthesis: LPC and cepstrum analysis. The use of HMMs for speech recognition.
  • Compression / Coding: CELP coders, multi-rate and wideband compression. MUSICAM and MPEG audio coding schemes.
  • Sound mixtures and Separation: BSS, CASA, ICA, and model-based separation
  • Music analysis and recognition: Transcription, summarization, and similarity.
  • 3D Audio: Head-Related Transfer Functions (HRTFs), using OpenAL.
  • Music effect processing: Flanging, phasing, Doppler effect, chorus effect, compression, etc.
  • Synthesis: Subtractive, additive, FM, wavetable and granular synthesis.
  • Room acoustics: Digital reverberation.
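LPC analysis, which is central to the speech-modelling and coding topics above, fits well with a small worked example. The following is a minimal sketch of the standard autocorrelation method with Levinson-Durbin recursion; it is illustrative only (function and parameter names are not from the unit materials) and assumes numpy.

```python
import numpy as np

def lpc(x, order):
    """Estimate LPC coefficients a[0..order] (a[0] = 1) for frame x
    using the autocorrelation method and Levinson-Durbin recursion.
    Returns the coefficient vector and the final prediction error."""
    n = len(x)
    # Autocorrelation for lags 0..order
    r = np.array([x[:n - k] @ x[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient from the current residual correlation
        acc = r[i] + a[1:i] @ r[1:i][::-1]
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err
```

The predictor models each sample as a linear combination of the previous `order` samples; in a speech coder the residual (or an excitation model such as CELP's codebook) is transmitted instead of the raw waveform.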

Intended Learning Outcomes

On successful completion of the unit a student will be able to:

  • explain the different aspects of speech production, coding and recognition;
  • explain the fundamentals of speech and audio coding systems;
  • describe the algorithmic details of various international standards for speech and audio coding;
  • design musical effects processing and synthesis systems.
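The last outcome, designing musical effects processors, can be illustrated with a flanger, one of the effects listed in the unit elements. This sketch (illustrative only; names and defaults are not from the unit materials) mixes the input with a copy whose delay is swept by a low-frequency oscillator, assuming numpy.

```python
import numpy as np

def flanger(x, sr, depth_ms=2.0, rate_hz=0.5, mix=0.5):
    """Minimal flanging effect: sum the dry signal with a copy delayed
    by an LFO-modulated amount (0..depth_ms milliseconds)."""
    n = np.arange(len(x))
    # Sinusoidal LFO sweeps the delay in samples
    delay = 0.5 * depth_ms * 1e-3 * sr * (1 + np.sin(2 * np.pi * rate_hz * n / sr))
    idx = n - delay
    i0 = np.floor(idx).astype(int)
    frac = idx - i0                      # fractional part for interpolation
    i0 = np.clip(i0, 0, len(x) - 1)
    i1 = np.clip(i0 + 1, 0, len(x) - 1)
    delayed = (1 - frac) * x[i0] + frac * x[i1]  # linear interpolation
    return (1 - mix) * x + mix * delayed
```

The moving delay creates a comb filter whose notches sweep through the spectrum, which produces the characteristic "jet" sound of flanging.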

Teaching Information

A combination of lectures and seminars

Assessment Information

2 hour terminal exam - 100%

Reading and References

L. Rabiner and B. Juang. Fundamentals of Speech Recognition. Prentice-Hall Signal Processing Series. 1993.

B. Gold and N. Morgan. Speech and Audio Signal Processing. John Wiley & Sons. 1999.
