
Unit information: Fault Tolerant Computing and VLSI Testing in 2014/15

Please note: you are viewing unit and programme information for a past academic year. Please see the current academic year for up-to-date information.

Unit name Fault Tolerant Computing and VLSI Testing
Unit code COMSM0125
Credit points 10
Level of study M/7
Teaching block(s) Teaching Block 1 (weeks 1 - 12)
Unit director Professor Pradhan
Open unit status Not open
Pre-requisites

COMS11300 and COMSM1201

Co-requisites

None

School/department Department of Computer Science
Faculty Faculty of Engineering

Description including Unit Aims

This course is broadly divided into two parts. Part one discusses the factors that cause system failures, such as hardware defects, faults, noise, design errors and software bugs. A wide range of techniques is then presented for discovering defects, design errors and faults. Also discussed are design methods that enhance reliability, availability and serviceability in microchips, computer systems and networks. This part includes models for evaluating the effectiveness of design techniques, weighing improvements in reliability and availability against costs in chip area, system complexity and power dissipation. Part two introduces the concepts of error-correcting codes in memory and communication. Microchip test techniques, including on-line testing and built-in self-test, are also reviewed.

Aims:

This unit seeks to acquaint you with various aspects of designing reliable and testable computer systems. Topics covered span issues at the microchip level as well as at the board and system levels.

Intended Learning Outcomes

Successful completion of this unit will enable you to: understand why microchips fail; use test algorithms for a wide range of faults, including delay faults; understand reliability models of microchips; design for testability; use fault-tolerant computing techniques.

Teaching Information

Lectures (20). A further 80 hours are set aside for coursework and private study.

Assessment Information

Coursework will consist of two parts: a set of take-home assignments worth 70% and two lab assignments worth 30%.

Reading and References

  • D.K. Pradhan, Fault-Tolerant Computer System Design, Prentice-Hall, 1996.
  • N. K. Jha and S. Gupta, Testing of Digital Systems, Cambridge University Press, 2003, ISBN-13: 9780521773560 | ISBN-10: 0521773563.
  • I. Koren and C. M. Krishna, Fault-Tolerant Systems, Morgan Kaufmann, San Francisco, CA, 2007, ISBN: 0120885255.
  • D. Siewiorek and R. Swarz, Reliable Computer Systems: Design and Evaluation, 2nd ed., Digital Press - Butterworth, 1992.
  • B. W. Johnson, Design and Analysis of Fault-Tolerant Digital Systems, Addison-Wesley, 1989.
