Load Balancing Technology

Introduction

As demand for network access booms, service providers are becoming overloaded. Meeting this demand with a single, very powerful server entails enormous investment costs. A more effective solution is to use a group of servers that perform the same function under the control of a load-distribution tool. Many companies offer load balancing solutions, such as Cisco, Coyote Point and Sun Microsystems, each with plenty of features; fundamentally, however, they are all built on fairly similar technical principles.

A typical early load balancing technique is Round Robin DNS (RRDNS). Its weakness is that if a server in the cluster fails, RRDNS keeps sending load to that server until the network administrator detects the problem and removes the server from the DNS address list, causing a disruption of service. Load distribution algorithms have since evolved from static algorithms such as Round Robin and Weighted Round Robin to dynamic algorithms such as Least Connection, Weighted Least Connection, Optimized Weighted Round Robin and Optimized Weighted Least Connection. Combinations of these algorithms keep improving, but dispatcher-based designs retain inherent disadvantages: the centralized dispatcher is both a single point of failure and a potential bottleneck.

This article introduces the solution Microsoft uses for the Web servers behind Microsoft.com: Network Load Balancing (NLB). Although commonly applied to Web servers, the technique also works for other application server systems. NLB not only distributes load across the servers, it also provides a mechanism for keeping the server system available to clients. NLB has no special hardware requirements; any compatible computer can be used as a server, so deployment costs drop dramatically. NLB's fully distributed software architecture allows the technology to be deployed and exploited to the fullest.

How does NLB work?

NLB scales the performance of application servers, such as Web servers, by distributing client requests among the servers in the cluster. The servers (or hosts) all receive the incoming IP packets, but each packet is processed by only one particular host. The hosts in the cluster respond concurrently to different client requests, even to multiple requests from the same client. For example, a Web browser may fetch the many images within a single Web page from several different hosts of the cluster. With load balancing, processing and client response times are much shorter. Each host in the cluster can be assigned the share of load it will handle, or the load can be distributed evenly among the hosts; with either distribution, each server selects and processes its own portion of the total load.
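As background, the classic dispatch algorithms named in the introduction can be sketched in a few lines. The sketch below is illustrative Python, not taken from any vendor's product; the server names, weights and connection counts are made up for the example.

# Minimal sketch of the classic dispatch algorithms mentioned above.
# Server names, weights and connection counts are hypothetical.
import itertools

servers = ["srv-a", "srv-b", "srv-c"]

# Round Robin: hand out servers in a fixed cyclic order.
rr = itertools.cycle(servers)

def round_robin():
    return next(rr)

# Weighted Round Robin: servers with a larger weight appear more often in the cycle.
weights = {"srv-a": 3, "srv-b": 2, "srv-c": 1}
wrr = itertools.cycle([s for s in servers for _ in range(weights[s])])

def weighted_round_robin():
    return next(wrr)

# Least Connection: pick the server currently holding the fewest open connections.
open_connections = {"srv-a": 0, "srv-b": 0, "srv-c": 0}

def least_connection():
    target = min(open_connections, key=open_connections.get)
    open_connections[target] += 1          # a connection has been opened
    return target

def connection_closed(server):
    open_connections[server] -= 1          # a connection has finished

if __name__ == "__main__":
    print([round_robin() for _ in range(6)])
    print([weighted_round_robin() for _ in range(6)])
    print([least_connection() for _ in range(4)])

Static algorithms such as Round Robin ignore the servers' current state, while dynamic algorithms such as Least Connection consult it on every decision, which is exactly the distinction drawn above.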
The incoming client load is distributed so that each server receives the number of requests corresponding to its intended share of the load, and the balancer can adjust dynamically as hosts join or leave the cluster. For applications such as Web servers, which have many clients whose requests are relatively short-lived, this ability to spread load through statistical mapping balances the load efficiently and reacts quickly when the cluster membership changes.

Servers in the load-balanced cluster periodically send a special status message, called the heartbeat message, to the other hosts in the cluster, while also listening for heartbeats from those hosts. If one server in the cluster goes down, the remaining hosts adjust and redistribute the load so that service to clients is maintained. In most cases the client software reconnects automatically, and the user perceives only a delay of a few seconds before the response arrives.

Load balancing system architecture

To maximize throughput and availability, NLB uses a fully distributed software architecture: the load balancing driver is installed and runs in parallel on every host in the cluster. The drivers arrange all hosts of the cluster on a single subnet so that they simultaneously detect the network traffic arriving for the cluster's primary IP address (and for the additional addresses of multihomed hosts). On each host, the driver acts as a filter between the network card driver and the TCP/IP stack, allowing only a portion of the incoming traffic to be accepted by that host. In this way client requests are partitioned and load-balanced among the hosts in the cluster.

NLB runs as a network driver logically below application-layer protocols such as HTTP and FTP, deployed as an intermediate driver in the Windows 2000 network stack on each host in the cluster. This architecture maximizes throughput by using the broadcast subnet to deliver incoming traffic to every host in the cluster, which eliminates the need to route packets to individual hosts. Because filtering out unwanted packets is faster than routing them (routing involves receiving, examining, rewriting and resending each packet), this architecture provides higher throughput than dispatcher-based solutions. As network and server speeds grow, throughput grows with them, with no dependency on any particular routing hardware; in practice NLB can reach 250 Mbit/s of throughput on Gigabit networks.

Another fundamental advantage of the fully distributed architecture is higher availability, with (N-1)-way failover in a cluster of N hosts. Dispatcher-based solutions, by contrast, introduce a single point of failure that can only be removed by adding a redundant dispatcher, and therefore offer only 1-way failover.

The NLB architecture also takes advantage of switched and/or hub-based subnets to deliver network traffic to all hosts at the same time. However, this approach increases the load on the switches, because every port must carry the additional throughput. This is not a problem in most applications, such as Web services or streaming media, because incoming traffic accounts for only a tiny fraction of total network traffic.
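The accept-or-drop behaviour of the filter driver described above can be illustrated with a short conceptual sketch. This is illustrative Python, not NLB's driver code; the hashing rule is a stand-in for the real mapping function, which is discussed later in the article.

# Conceptual sketch of fully distributed filtering: every host receives every
# incoming frame, applies the same deterministic rule, and exactly one host
# passes the packet up to TCP/IP while the others drop it.
# The crc32-based rule below is a placeholder, not NLB's actual mapping.
import zlib

CLUSTER_SIZE = 4

def owns_packet(host_index, src_ip, src_port, cluster_size=CLUSTER_SIZE):
    """Return True if this host should accept the packet."""
    key = f"{src_ip}:{src_port}".encode()
    return zlib.crc32(key) % cluster_size == host_index

def on_frame_received(host_index, src_ip, src_port):
    if owns_packet(host_index, src_ip, src_port):
        return "pass up to TCP/IP"     # this host serves the request
    return "drop"                      # filtered out, nothing rewritten or forwarded

if __name__ == "__main__":
    # Every host evaluates the same packet; exactly one accepts it.
    decisions = [on_frame_received(h, "10.0.0.7", 51514) for h in range(CLUSTER_SIZE)]
    print(decisions)

The point of the sketch is the shape of the decision: no packet is rewritten or forwarded, each host simply keeps or discards what it has already received, which is why filtering costs less than dispatching.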
However, if the client-side connections to the switch are much faster than the server-side connections, incoming traffic can occupy a significant share of the throughput available on the server-side ports. The same problem occurs when several clusters are attached to the same switch and no virtual LAN has been set up for each cluster.

While receiving packets, the NLB implementation overlaps the delivery of packets to the TCP/IP layer with the reception of further packets through the network card driver. This raises overall processing speed and reduces latency, because TCP/IP can process one packet while the Network Driver Interface Specification (NDIS) driver receives the next. When sending packets, NLB likewise increases throughput and reduces latency and overhead by increasing the number of packets TCP/IP can send on a connection. To achieve these improvements, NLB allocates and manages a pool of packet buffers and descriptors that coordinate the work of TCP/IP and the NDIS driver.

Distributing traffic within the cluster

NLB uses layer-two broadcast or multicast to distribute incoming network traffic to all hosts in the cluster simultaneously. In its default unicast mode of operation, NLB assigns a single station (MAC) address to the network adapter it operates on (called the cluster adapter), and every host in the cluster is given that same MAC address. Incoming packets are therefore received by all hosts in the cluster and handed to the NLB driver for filtering. To ensure uniqueness, the MAC address is derived from the cluster's primary IP address: for a primary IP address of 1.2.3.4, for example, the unicast MAC address is 02-BF-1-2-3-4. NLB modifies the cluster adapter's MAC address automatically by setting a registry entry and reloading the adapter driver; the operating system does not need to be restarted.

If the hosts in the cluster are attached to a switch rather than a hub, sharing the same MAC address would cause a conflict, because layer-two switches learn addresses dynamically and expect the source MAC address seen on each port to be unique. To avoid this, NLB masks the source MAC address of outgoing packets so that it is unique: the cluster MAC address 02-BF-1-2-3-4 is rewritten as 02-h-1-2-3-4, where h is the host's priority within the cluster. This technique prevents the switch from ever learning the cluster's real MAC address, and as a result packets addressed to the cluster are flooded to all switch ports. If the hosts are connected directly to a hub, NLB's source MAC address masking in unicast mode can be disabled to avoid flooding upstream switches; this is done by setting the NLB registry parameter MaskSourceMAC to 0. Using a layer-three switch also limits the flooding.

NLB's unicast mode has the side effect that hosts in the cluster cannot communicate with one another through their cluster adapters: because packets sent by a host carry a destination MAC address identical to their source MAC address, they are looped back inside the sender's protocol stack and never reach the wire. This limitation can be avoided by adding a second network card to each host.
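The address manipulation just described follows a simple pattern that can be sketched directly. The code below only illustrates the 02-BF-... and 02-h-... convention quoted above, not Microsoft's implementation; the zero-padded hexadecimal formatting is an assumption of the sketch.

# Sketch of the unicast address derivation described above: the cluster MAC is
# derived from the cluster's primary IP address, and in unicast mode the source
# MAC of outgoing frames is masked with the host's priority so the switch never
# learns the shared cluster address. (The article writes 1.2.3.4 -> 02-BF-1-2-3-4;
# the zero-padded formatting below is illustrative.)

def cluster_unicast_mac(primary_ip: str) -> str:
    octets = [int(o) for o in primary_ip.split(".")]
    return "-".join(["02", "BF"] + [f"{o:02X}" for o in octets])

def masked_source_mac(primary_ip: str, host_priority: int) -> str:
    # 02-BF-w-x-y-z becomes 02-h-w-x-y-z, where h is the host's priority.
    octets = [int(o) for o in primary_ip.split(".")]
    return "-".join(["02", f"{host_priority:02X}"] + [f"{o:02X}" for o in octets])

if __name__ == "__main__":
    print(cluster_unicast_mac("1.2.3.4"))      # 02-BF-01-02-03-04
    print(masked_source_mac("1.2.3.4", 1))     # 02-01-01-02-03-04 for host priority 1

In multicast mode, described later in this section, a multicast address of the form 03-BF-1-2-3-4 is assigned instead, and each adapter keeps its own station address.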
In this two-adapter configuration, NLB uses one network adapter on the subnet that receives client requests, while the second adapter is typically placed on a separate local subnet for communication among the cluster hosts and with back-end servers such as database and file servers. NLB uses the cluster adapter only to carry heartbeat messages and remote-control traffic.

Note that communication between cluster hosts and hosts outside the cluster is never affected by NLB's unicast mode. Network traffic addressed to a host's dedicated IP address (on the cluster adapter) is received by all hosts in the cluster, since they share the same MAC address. Because NLB never load-balances traffic for dedicated IP addresses, it immediately passes this traffic up to TCP/IP on the intended host; the other hosts treat it as load-balanced traffic and discard it. Note that if the volume of incoming traffic to dedicated IP addresses is large, performance may suffer when NLB operates in unicast mode, because of the work TCP/IP must do to discard the unwanted packets.

NLB offers a second mode for distributing network traffic to the cluster hosts: multicast mode. This mode assigns a layer-two multicast address to the cluster adapter instead of changing the adapter's station address. For example, the multicast MAC address 03-BF-1-2-3-4 is assigned for a primary cluster IP address of 1.2.3.4. Because each host keeps a unique station address, multicast mode does not require a second network adapter for communication among cluster hosts, and it avoids the performance penalty that unicast mode incurs for traffic to dedicated IP addresses.

NLB's unicast mode causes switch flooding, because traffic for the cluster is delivered simultaneously to all switch ports. Multicast mode, however, gives the system administrator the opportunity to limit flooding by configuring a virtual LAN on the switch for the ports corresponding to the cluster hosts. This can be done by programming the switch manually or by using the IGMP protocol or the GARP/GMRP protocols. NLB implements the necessary ARP functionality to ensure that the cluster's primary IP address and other virtual IP addresses resolve to the cluster's multicast MAC address. (Dedicated IP addresses continue to resolve to the station addresses of the cluster adapters.)

The load balancing algorithm

NLB uses a fully distributed filtering algorithm to map clients to the hosts in the cluster. This algorithm lets the hosts make load-balancing decisions independently and quickly for every incoming packet. It is optimized to deliver statistical load balancing over a large population of small requests issued by countless clients, which is typical of Web servers. If the number of clients is small and/or the client connections impose widely differing loads on the servers, the algorithm is less effective. However, its simplicity and speed allow very high performance, both high throughput and short response times, across a wide range of common client/server applications.

NLB load-balances client requests by directing a selected percentage of new requests to each host in the cluster. The algorithm does not react to changes in each host's load (such as CPU load or memory usage). The mapping is, however, recomputed when the cluster membership changes, and the proportions of distributed load are then rebalanced.
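One simple way to direct a configured percentage of new requests to each host, in the spirit of the description above and of the statistical mapping detailed below, is to divide a fixed set of buckets among the hosts in proportion to their load shares and have every host hash each new connection to a bucket. The sketch below is illustrative Python; the bucket count, the hash function and the host names are assumptions, not NLB's actual parameters.

# Illustrative percentage-based ownership of new connections: a fixed bucket
# space is split among the hosts according to their load weights, and every
# host hashes each new connection to a bucket and accepts it only if it owns
# that bucket. Bucket count and hash are placeholders.
import zlib

NUM_BUCKETS = 60   # illustrative; any fixed bucket count works the same way

def assign_buckets(load_weights):
    """Split the bucket space among hosts roughly in proportion to their weights."""
    hosts = list(load_weights)
    total = sum(load_weights.values())
    owner = {}
    start = 0
    for host in hosts:
        share = round(NUM_BUCKETS * load_weights[host] / total)
        for b in range(start, min(start + share, NUM_BUCKETS)):
            owner[b] = host
        start += share
    for b in range(start, NUM_BUCKETS):   # hand any rounding remainder to the last host
        owner[b] = hosts[-1]
    return owner

def host_for_connection(owner, client_ip, client_port):
    bucket = zlib.crc32(f"{client_ip}:{client_port}".encode()) % NUM_BUCKETS
    return owner[bucket]

if __name__ == "__main__":
    owners = assign_buckets({"host1": 50, "host2": 30, "host3": 20})  # load percentages
    print(host_for_connection(owners, "192.0.2.10", 40000))

Because every host computes the same mapping from the same membership information, they all agree on the owner of each connection without exchanging per-packet messages; when membership changes, only the buckets that are reassigned cause clients to be remapped.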
When inspecting an arriving packet, all hosts simultaneously perform a statistical mapping to determine quickly which host should handle it. The mapping uses a randomization function that computes a host priority from the client's IP address, the client's port and other state information in order to optimize the load balance. The chosen host passes the packet up from the driver layer to TCP/IP, and the other hosts discard it. The mapping does not change unless the membership of the cluster changes, which guarantees that a given client IP address and port are always mapped to the same cluster host. However, the particular host to which a given client IP address and port will map cannot be predetermined, because the randomization function takes the current and past membership of the cluster into account in order to minimize remapping.

In general, the quality of the load balance is determined statistically by the number of clients making requests; as the client population shrinks, the quality of the balance degrades somewhat. Obtaining a finer-grained balance on each cluster host would require spending system resources to measure and react to load changes, and that cost must be weighed against the benefit of maximizing the use of cluster resources (essentially CPU and memory). In any case, spare server capacity must be maintained so that the remaining hosts can absorb additional client load in the event of a failure.

When a new host joins the cluster, it triggers the convergence process and a new cluster membership is computed. When convergence completes, a minimal share of clients is remapped to the new host. NLB tracks the TCP connections on each host, and after a client's current TCP connection completes, that client's next connection is handled by the new host. Hosts should therefore be added to the cluster at times when disrupting client sessions matters least. To avoid the problem altogether, session state must be managed by the server application so that it can be reconstructed or retrieved from any host in the cluster; for example, session state can be pushed to a back-end database server or stored in client cookies.

The convergence process

Hosts in the cluster periodically exchange multicast or broadcast heartbeat messages, which allows each host to monitor the status of the cluster. When the cluster's state changes (for example, when a host fails, leaves or joins the cluster), NLB starts a process called convergence, in which the hosts exchange heartbeat messages to determine a new, consistent state of the cluster. Once all hosts agree on the new membership, the change is recorded in the event log. During convergence the hosts continue to handle network traffic as usual, except that traffic destined for a failed host receives no service. Convergence ends when all hosts report a consistent view of the cluster membership for several heartbeat periods. When convergence completes, traffic for the failed host is redistributed among the remaining hosts; if a host has been added, convergence allows that host to receive its share of the load-balanced traffic. Expanding the cluster does not affect ongoing cluster operations and is completely transparent both to Internet clients and to the server software.
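The heartbeat exchange that drives convergence can be illustrated with a small sketch. This is illustrative Python, not NLB's implementation; the one-second period and the threshold of five missed heartbeats are the defaults quoted in the next section, and the data structures are assumptions of the sketch.

# Sketch of heartbeat-based failure detection: every host periodically
# announces itself, and a peer that has not been heard from for several
# consecutive periods is treated as failed, which triggers convergence.
import time

HEARTBEAT_PERIOD = 1.0       # seconds between heartbeat messages (default)
MISSED_LIMIT = 5             # missed heartbeats before convergence starts (default)

last_seen = {}               # host id -> timestamp of last heartbeat received

def record_heartbeat(host_id, now=None):
    last_seen[host_id] = time.monotonic() if now is None else now

def failed_hosts(now=None):
    """Hosts unheard from for more than MISSED_LIMIT heartbeat periods."""
    now = time.monotonic() if now is None else now
    timeout = MISSED_LIMIT * HEARTBEAT_PERIOD
    return [h for h, seen in last_seen.items() if now - seen > timeout]

if __name__ == "__main__":
    record_heartbeat("host1", now=0.0)
    record_heartbeat("host2", now=0.0)
    record_heartbeat("host1", now=6.0)       # host2 has been silent for 6 s > 5 x 1 s
    print(failed_hosts(now=6.0))             # ['host2'] -> start convergence

With these defaults a failed host is noticed roughly five seconds after its last heartbeat, after which convergence redistributes its load.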
However, it can affect client sessions, because some clients may be remapped to different cluster hosts between connections.

In unicast mode each host periodically broadcasts its heartbeat messages; in multicast mode it multicasts them. Each heartbeat message fits in one Ethernet frame and is tagged with the cluster's primary IP address, so that several clusters can coexist on the same subnet. Microsoft's heartbeat messages are assigned the Ethernet type value 0x886F. The default interval between heartbeats is one second, and this value can be changed. During convergence the interval is halved in order to speed up completion of the process. Even for large clusters, the bandwidth required for heartbeat messages is very low (24 KB/s for a 16-host cluster). By default, five missed heartbeat messages cause convergence to be initiated; this value can also be changed.

Remote control

NLB's remote control mechanism uses the UDP protocol and is assigned port 2504. Remote control packets are sent to the cluster's primary IP address. Because they are handled by the NLB driver on each cluster host, they must be routed to the cluster's subnet (rather than to another subnet the cluster is attached to). When remote control commands are issued from within the cluster, they are broadcast on the local subnet; this ensures that all cluster hosts receive them even when the cluster is running in unicast mode.

Effect of load balancing

NLB's role in the performance of a system can be assessed using the following main criteria.

CPU overhead on the cluster hosts: the share of CPU needed to analyze and filter network packets (the lower, the better). Every load balancing solution must spend some system resources examining incoming packets and making load-balancing decisions, and therefore inevitably affects network performance. Dispatcher-based solutions must examine, modify and retransmit packets to particular cluster hosts (typically rewriting the IP address in order to re-route packets from the virtual IP address to a specific host's IP address). NLB instead delivers packets to all hosts simultaneously and applies a filtering algorithm that discards unwanted packets. Filtering is cheaper than rewriting and retransmitting, and the result is faster response times and higher overall system throughput.

Throughput and response time: NLB improves system performance by increasing throughput and minimizing response time to client requests. Once the capacity of the cluster hosts is fully used, the cluster cannot deliver additional throughput, and response time grows as client requests queue up. Adding a host allows throughput to rise and response times to fall again. If client demand continues to grow, hosts can be added until the subnet saturates; if the load grows further still, multiple NLB clusters should be used, with traffic distributed among them by the Round Robin DNS technique. This method is in fact used for the Microsoft Web site www.microsoft.com, which typically runs five NLB clusters of six hosts each, with the hosts operating at about 60% of maximum capacity.

Switch occupancy: the share of switch bandwidth consumed by flooding client requests. NLB's packet filtering architecture relies on the broadcast subnet to deliver client requests to all hosts at the same time.
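To see what switch occupancy means in practice, here is a rough back-of-the-envelope sketch in Python. All of the traffic figures are hypothetical and chosen only to illustrate the relationship described in this section: flooded request traffic appears in full on every server-side port, while response traffic leaves only on the sending host's own port.

# Rough illustration of switch occupancy under flooding, with made-up numbers.
incoming_requests_mbps = 20      # hypothetical total client request traffic
outgoing_responses_mbps = 300    # hypothetical total response traffic
hosts = 8

flooded_per_port = incoming_requests_mbps            # not divided by the host count
responses_per_port = outgoing_responses_mbps / hosts # responses are unicast per host

print(f"flooded inbound per port : {flooded_per_port} Mbit/s")
print(f"responses per port       : {responses_per_port:.1f} Mbit/s")
# As long as request traffic is a small fraction of total traffic, the flooding
# overhead on each port stays small.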
In small clusters a hub can be used to connect the hosts; for larger clusters a switch is the natural choice, and by default NLB floods the switch so that client requests are delivered to all hosts at the same time. Care must be taken that this switch flooding does not exceed the switch's capacity, especially when the switch is shared between the cluster and other, unrelated computers. Normally the bandwidth used by client request traffic is only a small proportion of the total bandwidth needed for communication between servers and clients. However, switch flooding becomes a problem in applications where a significant share of the network traffic is directed to the cluster (such as file uploads in FTP applications), or when multiple clusters share one switch. In these cases, running NLB in multicast mode and setting up a virtual LAN to contain the flooding is a very effective remedy. Beyond this, NLB's scalability determines how much the system's capacity improves as hosts are added to the cluster.

Conclusion

Servers are the platform on which important, widely used applications such as the Web, streaming media and VPNs are delivered. As an integral part of Windows 2000 Advanced Server and Datacenter Server, NLB provides an ideal, economical solution for improving the scalability and availability of applications in both Internet and intranet environments. Beyond the services built into Windows 2000, NLB can also be integrated efficiently with other network operating systems and server-based applications.

Galaxy