TL;DR: This paper presents a scheduling algorithm called iSLIP, an iterative, round-robin algorithm that can achieve 100% throughput for uniform traffic, yet is simple to implement in hardware, and describes the implementation complexity of the algorithm.
Abstract: An increasing number of high performance internetworking protocol routers, LAN and asynchronous transfer mode (ATM) switches use a switched backplane based on a crossbar switch. Most often, these systems use input queues to hold packets waiting to traverse the switching fabric. It is well known that if simple first in first out (FIFO) input queues are used to hold packets then, even under benign conditions, head-of-line (HOL) blocking limits the achievable bandwidth to approximately 58.6% of the maximum. HOL blocking can be overcome by the use of virtual output queueing, which is described in this paper. A scheduling algorithm is used to configure the crossbar switch, deciding the order in which packets will be served. Previous results have shown that with a suitable scheduling algorithm, 100% throughput can be achieved. In this paper, we present a scheduling algorithm called iSLIP. An iterative, round-robin algorithm, iSLIP can achieve 100% throughput for uniform traffic, yet is simple to implement in hardware. Iterative and noniterative versions of the algorithms are presented, along with modified versions for prioritized traffic. Simulation results are presented to indicate the performance of iSLIP under benign and bursty traffic conditions. Prototype and commercial implementations of iSLIP exist in systems with aggregate bandwidths ranging from 50 to 500 Gb/s. When the traffic is nonuniform, iSLIP quickly adapts to a fair scheduling policy that is guaranteed never to starve an input queue. Finally, we describe the implementation complexity of iSLIP. Based on a two-dimensional (2-D) array of priority encoders, single-chip schedulers have been built supporting up to 32 ports, and making approximately 100 million scheduling decisions per second.
TL;DR: This paper introduces two maximum weight matching algorithms: longest queue first (LQF) and oldest cell first (OCF), which achieve 100% throughput for all independent arrival processes.
Abstract: It is well known that head-of-line blocking limits the throughput of an input-queued switch with first-in-first-out (FIFO) queues. Under certain conditions, the throughput can be shown to be limited to approximately 58.6%. It is also known that if non-FIFO queueing policies are used, the throughput can be increased. However, it has not been previously shown that if a suitable queueing policy and scheduling algorithm are used, then it is possible to achieve 100% throughput for all independent arrival processes. In this paper we prove this to be the case using a simple linear programming argument and quadratic Lyapunov function. In particular, we assume that each input maintains a separate FIFO queue for each output and that the switch is scheduled using a maximum weight bipartite matching algorithm. We introduce two maximum weight matching algorithms: longest queue first (LQF) and oldest cell first (OCF). Both algorithms achieve 100% throughput for all independent arrival processes. LQF favors queues with larger occupancy, ensuring that larger queues will eventually be served. However, we find that LQF can lead to the permanent starvation of short queues. OCF overcomes this limitation by favoring cells with large waiting times.
TL;DR: A switch architecture with two-stage switching fabrics and one- stage switching fabrics that scales up with the speed of fiber optics is proposed.
TL;DR: This paper considers how optics can be used to scale capacity and reduce power in a router, and describes two different implementations based on technology available within the next three years.
Abstract: Routers built around a single-stage crossbar and a centralized scheduler do not scale, and (in practice) do not provide the throughput guarantees that network operators need to make efficient use of their expensive long-haul links. In this paper we consider how optics can be used to scale capacity and reduce power in a router. We start with the promising load-balanced switch architecture proposed by C-S. Chang. This approach eliminates the scheduler, is scalable, and guarantees 100% throughput for a broad class of traffic. But several problems need to be solved to make this architecture practical: (1) Packets can be mis-sequenced, (2) Pathological periodic traffic patterns can make throughput arbitrarily small, (3) The architecture requires a rapidly configuring switch fabric, and (4) It does not work when linecards are missing or have failed. In this paper we solve each problem in turn, and describe new architectures that include our solutions. We motivate our work by designing a 100Tb/s packet-switched router arranged as 640 linecards, each operating at 160Gb/s. We describe two different implementations based on technology available within the next three years.
TL;DR: This paper proposes a scalable solution, called the mailbox switch, that solves the out-of-sequence problem in the two-stage switch architecture and proposes a recursive way to construct the switch fabrics for the set of symmetric connection patterns.
Abstract: Traditionally, conflict resolution in an input-buffered switch is solved by finding a matching between inputs and outputs per time slot. To do this, a switch not only needs to gather the information of the virtual output queues at the inputs, hut also uses the gathered information to compute a matching. As such, both the communication overhead and the computation overhead make it difficult to scale. Recent works on the two-stage switch architecture in (6|, [7], [12], (8| showed that conflict resolution can be easily solved over time and space without communication and computation overhead. However, the main problem of such a two-stage switch architecture is that packets might be out of sequence. The main objective of this paper is to propose a scalable solution, called the mailbox switch, that solves the out-of-sequence problem in the two-stage switch architecture. The key idea of the mailbox switch is to use a set of symmetric connection patterns to create a feedback path for packet departure times. With the information of packet departure times, the mailbox switch can schedule packets so that they depart in the order of their arrivals. Despite the simplicity of the mailbox switch, we show via both the theoretical models and simulations that the throughput of the mailbox switch can be as high as 75%. With limited resequencing delay, a modified version of the mailbox switch achieves 95% throughput. We also propose a recursive way to construct the switch fabrics for the set of symmetric connection patterns. If the number of inputs, N, is a power of 2, we show that the switch fabric for the mailbox switch can be built with N/2 log2 N 2times2 switches