Duplex: a reusable fault tolerance extension framework for network access devices

Abstract
A growing variety of edge network access devices appear on the marketplace that perform various functions which are meant to complement generic routers' capabilities, such as firewalling, intrusion detection, virus scanning, network address translation, traffic shaping and route optimization. Because these edge network access devices are deployed on the critical path between a user site and its Internet service provider, high availability is crucial to their design. This paper describes the design, construction and evaluation of a general implementation framework for supporting fault tolerance on edge network devices. This implementation framework, called Duplex, is designed to be independent of the functionality of the hosting edge network access device, such that only a minimal amount of programming is required to tailor this framework to a specific edge network access device implementation. Duplex can tolerate power failure, hardware failure, and software failure by supporting device mirroring and watchdog timer-based link bypassing. Empirical performance measurements of an instance of Duplex that is embedded in a commercial bandwidth management device show that the run-time overhead of its fault tolerance mechanisms is less than 1 msec 90% of the time, and the failure detection and recovery period is less than 1.3 sec when running at 100 Mbps.
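
The abstract names watchdog timer-based link bypassing without giving details. The Python sketch below illustrates only the general idea: healthy software periodically re-arms a timer, and if the timer expires the device is bypassed so traffic keeps flowing. It is not taken from Duplex. The class name `WatchdogBypass`, its fields, the 1-second timeout, and the choice to leave bypass mode on the next heartbeat are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class WatchdogBypass:
    """Toy model of a watchdog timer guarding a bypass relay (hypothetical)."""

    timeout_s: float           # assumed watchdog expiry interval, in seconds
    last_kick_s: float = 0.0   # timestamp of the most recent heartbeat
    bypassed: bool = False     # True while traffic is routed around the device

    def kick(self, now_s: float) -> None:
        """Heartbeat from healthy device software; re-arms the timer.

        Clearing the bypass here is an assumption, not documented behavior.
        """
        self.last_kick_s = now_s
        self.bypassed = False

    def poll(self, now_s: float) -> bool:
        """Engage the bypass if no heartbeat arrived within timeout_s."""
        if now_s - self.last_kick_s > self.timeout_s:
            self.bypassed = True
        return self.bypassed


# Usage: timestamps are passed in explicitly so the example is deterministic.
wd = WatchdogBypass(timeout_s=1.0)
wd.kick(0.0)
in_service = wd.poll(0.5)     # heartbeat is recent, device stays in the path
after_hang = wd.poll(1.6)     # no heartbeat for 1.6 s, bypass engages
wd.kick(2.0)
after_restart = wd.poll(2.1)  # heartbeat resumed, device back in the path
```

In a real device the timer and relay would normally be hardware, so that the bypass still engages when the software or the power supply fails; the software model above only shows the timing logic.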
