Why virtual server?

With the explosive growth of the Internet and its increasingly important role in our lives, Internet traffic is increasing dramatically, growing at an annual rate of over 100%, and the load on servers is increasing rapidly too. Servers can easily become overloaded for short periods, especially popular web servers. There are two solutions to the overloading problem. One is the single-server solution, i.e. upgrading the server to a higher-performance one; but it will soon be overloaded again as requests increase, so we have to upgrade it again, and the upgrading process is complex and costly. The other is the multi-server solution, i.e. building a scalable server on a cluster of servers. When the load increases, we can simply add one or more new servers to the cluster to meet the increasing requests. However, there are several methods of constructing the cluster of servers.

The most widely used method today is Round-Robin DNS (RR-DNS), which maps a single name to different IP addresses in a round-robin manner; in the ideal situation, different clients are thus mapped to different servers in the cluster, and the load is distributed among the servers. However, because of caching by clients and by the hierarchical DNS system, it easily leads to dynamic load imbalance among the servers, which makes it hard for a server to handle its peak load. The TTL (Time To Live) value of a name mapping cannot be chosen well with RR-DNS: with small values RR-DNS itself becomes a bottleneck, and with large values the dynamic load imbalance gets even worse. Even if the TTL value is set to zero, the scheduling granularity is per host, and different users' access patterns may still lead to dynamic load imbalance, because some people may pull lots of pages from the site while others may just surf a few pages and go away. Moreover, it is not very reliable: when a server node fails, the clients that have mapped the name to that node's IP address will find the server down, and the problem persists even if they press the "reload" or "refresh" button in their browsers.
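
The per-host granularity problem described above can be illustrated with a small sketch. It is not taken from any DNS implementation: the function name rr_dns_load, the server addresses, and the client request counts are all hypothetical, and it assumes each client resolves the name exactly once and then reuses the cached address for every request.

```python
from itertools import cycle


def rr_dns_load(servers, clients):
    """Return the number of requests each server receives.

    servers: list of server addresses handed out in round-robin order.
    clients: list of (client_name, request_count) pairs (hypothetical data).
    Assumption: each client performs one lookup and caches the answer.
    """
    rotation = cycle(servers)              # round-robin sequence of DNS answers
    load = {server: 0 for server in servers}
    for _name, requests in clients:
        server = next(rotation)            # one lookup per client
        load[server] += requests           # all later requests hit the cached address
    return load


servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# One client pulls many pages; the others look at a few and leave.
clients = [("heavy", 90), ("light-a", 3), ("light-b", 2), ("light-c", 5)]
load = rr_dns_load(servers, clients)
for server, count in load.items():
    print(server, count)
```

Although the names are handed out evenly, the server that happens to receive the heavy client carries almost all of the requests, which is the imbalance the paragraph describes.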

A better way is to use a load balancer to distribute the load among the servers in a cluster. The parallel services of the servers can be made to appear as a virtual service on a single IP address, so that end users see a virtual server, not a cluster of servers. The scheduling granularity is per connection, which allows sound load balancing among the servers. Failures can be masked when one or more servers fail. Server management becomes easy: the administrator can take one or more servers in and out of service at any time without interrupting service to users.
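
As a rough illustration of per-connection scheduling and failure masking, here is a minimal sketch. It is not the Linux Virtual Server code: the class VirtualService, its method names, and the server labels are hypothetical, and plain round-robin is assumed as the scheduling policy only because it is the simplest to show.

```python
class VirtualService:
    """Toy load balancer: one virtual service in front of several real servers."""

    def __init__(self, servers):
        self.servers = list(servers)   # all configured real servers
        self.up = set(self.servers)    # servers currently in service
        self._next = 0                 # round-robin position

    def mark_down(self, server):
        """Take a server out of service (failure or maintenance)."""
        self.up.discard(server)

    def mark_up(self, server):
        """Put a configured server back into service."""
        if server in self.servers:
            self.up.add(server)

    def schedule(self):
        """Pick a real server for ONE new connection, skipping servers that are down."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.up:
                return server
        raise RuntimeError("no real server available")


vs = VirtualService(["rs1", "rs2", "rs3"])
before = [vs.schedule() for _ in range(3)]   # every connection is scheduled separately
vs.mark_down("rs2")                          # rs2 fails or is taken out of service
after = [vs.schedule() for _ in range(4)]    # new connections avoid rs2
print(before, after)
```

Because the decision is made for each connection rather than for each host, a client that opens many connections is spread across the servers, and clients never need to learn that a particular server has gone away.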

Load balancing can be done at two levels: the application level and the IP level. For example, Reverse-proxy and pWEB are application-level load balancing methods for building a scalable web server. They forward HTTP requests to the different web servers in the cluster, get back the results, and then return them to the clients. Since the overhead of handling HTTP requests and replies at the application level is high, I believe the application-level load balancer will become a new bottleneck when the number of server nodes increases to 4 or more, depending on the throughput of each server.

I prefer IP-level load balancing, because its overhead is small and the maximum number of server nodes can reach 25 or even up to 100. That is what the Linux Virtual Server code is designed for. How it works is explained in detail in the next section.


Last updated: 1999/6/27

Created on: 1998/5/28