Summary of Improvement of interconnection networks for clusters: direct-indirect hybrid topology and HoL-blocking reduction routing

Roberto Peñaranda Cebrián

  • Nowadays, clusters of computers are used to solve computation-intensive problems.

    These clusters take advantage of a large number of computing nodes to provide a high degree of parallelization.

    Interconnection networks are used to connect all these computing nodes.

    The interconnection network should be able to efficiently handle the traffic generated by this large number of nodes.

    Interconnection networks have different design parameters that define the behavior of the network.

    Two of them are the topology and the routing algorithm.

    The topology of an interconnection network defines how the different network elements are connected, while the routing algorithm determines the path that a packet must take from the source to the destination node.

    The most commonly used topologies typically follow a regular structure and can be classified into direct and indirect topologies, depending on how the different network elements are interconnected.

    On the other hand, routing algorithms can also be classified into two categories: deterministic and adaptive algorithms.

    To evaluate interconnection networks, metrics such as latency and throughput are often used.

    Throughput refers to the traffic that the network is capable of accepting per time unit.

    On the other hand, latency is the time that a packet requires to reach its destination.

    This time can be divided into two parts.

    The first part is the time taken by the packet to reach its destination in the absence of network traffic.

    The second part is due to network congestion created by existing traffic.

    One effect of congestion is so-called Head-of-Line (HoL) blocking: the packet at the head of a queue is blocked and prevents the packets queued behind it from advancing, even though they could advance if they were at the head of the queue.

    Nowadays, there are other important factors to consider when interconnection networks are designed, such as cost and fault tolerance.

    On the one hand, high performance is desirable, but without a disproportionate increase in cost.

    On the other hand, increasing the size of the network increases the number of network components, and therefore the probability that a failure occurs.

    For this reason, having some fault tolerance mechanism is vital in current interconnection networks of large machines.

    In short, the network must offer a good performance-cost ratio together with a high level of fault tolerance.

    This thesis focuses on two main objectives. The first objective is to combine the advantages of the direct and indirect topologies to create a new family of topologies with the best of both worlds.

    The main goal is to design a new family of topologies capable of interconnecting a large number of nodes and achieving very good performance with low-cost hardware.

    The proposed family of topologies, referred to as k-ary n-direct s-indirect, has an n-dimensional structure in which the k nodes of a given dimension are interconnected by a small indirect topology of s stages.
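
    The structure described above can be illustrated with a minimal sketch. It assumes that every line of k nodes along a dimension gets its own indirect subnetwork, and it does not model the s stages inside each subnetwork. All function names are hypothetical, and the dimension-order route is a generic illustration, not necessarily the deterministic algorithm developed in the thesis.

```python
from itertools import product


def nodes(k, n):
    """All node coordinates of an n-dimensional structure with k positions per dimension."""
    return list(product(range(k), repeat=n))


def subnetwork_members(node, dim, k):
    """Nodes sharing the indirect subnetwork of `node` along dimension `dim`.

    Assumption: nodes that differ only in coordinate `dim` hang off the same
    small indirect network.
    """
    return [node[:dim] + (i,) + node[dim + 1:] for i in range(k)]


def count_subnetworks(k, n):
    """One subnetwork per line of k nodes: n dimensions, k**(n-1) lines each."""
    return n * k ** (n - 1)


def dimension_order_hops(src, dst):
    """Hypothetical deterministic route: correct one differing dimension at a time.

    Each hop crosses the indirect subnetwork of the dimension being corrected.
    Returns a list of (dimension, node reached) pairs.
    """
    hops, cur = [], src
    for dim, (a, b) in enumerate(zip(src, dst)):
        if a != b:
            cur = cur[:dim] + (b,) + cur[dim + 1:]
            hops.append((dim, cur))
    return hops
```

    Under these assumptions, k = 4 and n = 2 give 16 nodes and 8 indirect subnetworks, and a packet from (0, 1) to (2, 3) crosses at most one subnetwork per dimension.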

    We will also focus on designing a deterministic and an adaptive routing algorithm for the family of topologies proposed.

    Finally, we will focus on analyzing fault tolerance in the proposed family of topologies.

    For this, the existing fault-tolerance mechanisms for similar topologies will be studied, and a mechanism able to exploit the features of this new family will be designed.

    The second objective is to develop routing algorithms specially designed to reduce the pernicious effect of Head-of-Line blocking, which may grow sharply in systems with a high number of computing nodes.

    To avoid this effect, routing algorithms capable of efficiently classifying packets into the different available virtual channels are designed, thus preventing a hot node (hot spot) from saturating the network and affecting the remaining network traffic.
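
    The effect of classifying packets into virtual channels can be sketched with a toy model. It assumes a single input port whose packets are identified only by their destination, one destination that accepts no traffic (the hot spot), and a destination-modulo mapping to virtual channels. That mapping and the function names are hypothetical illustrations, not the algorithms proposed in the thesis.

```python
from collections import deque


def vc_for(dest, num_vcs):
    """Hypothetical mapping: choose the virtual channel from the destination id."""
    return dest % num_vcs


def deliverable(packets, num_vcs, blocked_dest):
    """Count packets that can leave while `blocked_dest` accepts nothing.

    Each virtual channel is a FIFO queue, so a packet can only leave from the
    head of its queue. `packets` lists destination ids in arrival order.
    """
    vcs = [deque() for _ in range(num_vcs)]
    for dest in packets:
        vcs[vc_for(dest, num_vcs)].append(dest)

    sent = 0
    for queue in vcs:
        # Drain until the queue is empty or its head targets the blocked node.
        while queue and queue[0] != blocked_dest:
            queue.popleft()
            sent += 1
    return sent
```

    For the arrival order [3, 0, 1, 2, 3, 5] with node 3 as the hot spot, a single queue delivers nothing because the packet at its head is blocked, whereas four virtual channels confine the hot-spot packets to one channel and let the other four packets advance.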

