End-to-end Optimal Algorithms For Traffic Engineering, Failure Detection And Recovery In Connectionless Networks
Abstract
In this thesis we propose a novel scheme to achieve intra-domain Traffic Engineering
(TE), Failure Detection and Recovery (FR) in connectionless networks. This scheme addresses
rate adaptation, load balancing and stability issues of the OSPF protocol namely,
network convergence times and route flapping.
With the current default settings of the OSPF parameters, the network takes several
tens of seconds to recover from a failure. The main component in this delay is the time
required to detect the failure using the Hello protocol. Route flap is another undesirable
phenomenon and needs to be eliminated to achieve greater stability and robustness in
computer networks. Also, performing rate adaptation and load balancing at the edge
routers without any feedback from the network core is a challenging task.
In this thesis, we address the above issues with a scheme that does not require any
modifications to the underlying routing protocols. We focus on the application of a set
iv
of distributed control laws on the edge routers in a connectionless network. The control
laws provide optimal data rate adaptation and load balancing where multiple disjoint
paths are available between an ingress-egress pair. The control laws require information
of whether a forwarding path is congested or not for their traffic engineering operations.
We have implemented a source inferred congestion detection scheme, to find congested
paths and node/link failures along a forwarding path. The information inferred from this
mechanism serves as input to the control laws to provide optimal rate adaptation and
load balancing.
The proposed approach endows the network with the important property of stability
and robustness with respect to node/link failures. The congestion detection scheme
also serves as a failure detection mechanism and thus helps to overcome the IGP convergence
problem in OSPF. On the occurrence of a link or router breakdown, the congestion
detection mechanism is capable of detecting the failure within a few milliseconds. This
information is provided to the control laws, which reroute traffic away from the inoperative
node/link to converge to the optimal allocation for the "reduced" network. In
comparison, OSPF takes several tens of seconds for the failure detection and convergence
process. Hence, this approach helps to get around the IGP convergence problem.
Highly scalable TE and FR features are implemented on edge routers, based on
these control laws and the congestion detection mechanism, without the involvement of
the routers in the network core. Simulation results are presented to demonstrate the
advantages of the proposed approach under a variety of network scenarios. Results show
that the convergence time reduces from few tens of seconds to the order of milliseconds.
The average throughput is improved considerably due to dynamic rate adaptation and
load balancing provided by the control laws.