What is an ADC and how does it work?
An Application Delivery Controller (ADC), most commonly called a load balancer, is a device that distributes traffic across a farm of servers. The main benefits of load balancing are improved responsiveness and availability of applications, but there are many other interesting features, some of which are explained below.
An ADC acts like a man in the middle between client and server. When there is no ADC in the path, the client talks directly to the server, so a single session is created. When there is an ADC in the path, it breaks that one session in two: one session between the client and the ADC, and another session between the ADC and the backend server.
Summarizing, a load balancer sits between the client and the server farm, accepting incoming network and application traffic and distributing it across multiple backend servers.
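To make the two-session model concrete, here is a minimal sketch of a TCP proxy in Python. The addresses are hypothetical and a real ADC does far more, but the split into a client-side session and a server-side session is the essence:

```python
import socket
import threading

LISTEN_ADDR = ("0.0.0.0", 8080)     # client-facing side: the "virtual server"
BACKEND_ADDR = ("10.0.0.10", 80)    # hypothetical real server

def pipe(src, dst):
    """Copy bytes from one session to the other until EOF."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    dst.shutdown(socket.SHUT_WR)    # propagate EOF to the other session

def handle(client_sock):
    # Session 1 already exists: client <-> ADC (client_sock).
    # Session 2 is created here: ADC <-> backend server.
    backend_sock = socket.create_connection(BACKEND_ADDR)
    threading.Thread(target=pipe, args=(client_sock, backend_sock), daemon=True).start()
    threading.Thread(target=pipe, args=(backend_sock, client_sock), daemon=True).start()

with socket.create_server(LISTEN_ADDR) as listener:
    while True:
        client, _ = listener.accept()
        handle(client)
```

Each accepted client connection triggers a new connection to the backend; the proxy simply copies bytes between the two sessions.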
The basic configuration is composed of the following elements:
Virtual Server: This is the combination of a virtual IP address and a port on which the ADC listens for incoming traffic. For example, when you browse to www.google.com, DNS resolves the name to an IP address, which may well be a virtual server on an ADC.
Server Farm or Pool: This is a group of real servers listening on a given port (bound to a virtual server) to which the ADC forwards the traffic received on the virtual server.
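As a mental model (the class and field names below are illustrative, not any vendor's API), the configuration boils down to a mapping from a virtual IP and port to a pool of real servers:

```python
from dataclasses import dataclass, field

@dataclass
class RealServer:
    ip: str
    port: int
    active_connections: int = 0

@dataclass
class VirtualServer:
    vip: str                              # virtual IP the ADC listens on
    port: int                             # virtual port
    pool: list[RealServer] = field(default_factory=list)

# Hypothetical example: DNS resolves a site name to 203.0.113.10,
# which is the VIP of this virtual server.
vs = VirtualServer("203.0.113.10", 443, [
    RealServer("10.0.0.11", 8443),
    RealServer("10.0.0.12", 8443),
])
```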
Now, how does the ADC decide which real server should receive the traffic arriving at a virtual server? To make this decision it uses algorithms, also known as load balancing methods. Each load balancing method uses different criteria; the most common ones are the following:
The Round Robin Method: This method continuously rotates through the list of servers attached to the virtual server. When the virtual server receives a request, it assigns the connection to the first server in the list and then moves that server to the bottom of the list. The next request is redirected to the second server (now first after the rotation), and so on.
The Least Connection Method: When a virtual server is configured to use the least connection method, it selects the server with the fewest active connections.
The Least Response Time Method: This method selects the server with the lowest average response time.
Of course, there are many other methods, which vary depending on the vendor, and you have to evaluate which one fits your environment best. A sketch of the three methods above follows.
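As a rough sketch (simplified on purpose: a real ADC tracks this state per virtual server and updates it from live traffic), the three methods could look like this:

```python
import itertools

servers = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]   # hypothetical pool

# Round Robin: rotate through the list, one server per request.
rr = itertools.cycle(servers)
def round_robin():
    return next(rr)

# Least Connection: pick the server with the fewest active connections.
active_connections = {s: 0 for s in servers}
def least_connection():
    return min(servers, key=lambda s: active_connections[s])

# Least Response Time: pick the server with the lowest average response time.
avg_response_ms = {s: 0.0 for s in servers}
def least_response_time():
    return min(servers, key=lambda s: avg_response_ms[s])
```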
Another component to keep in mind is the health check, or monitor. This is a feature bound to the real servers or server farms that performs a periodic check to determine whether a backend server is working. There are simple health checks that only test layer 3 (e.g. ping), others that test layer 4 (protocol and port, e.g. TCP 80), and more complex ones that test layer 7 (e.g. an HTTP GET expecting a 200 OK response). If a server does not respond properly to the monitor bound to it, the load balancing method no longer considers that server available in the server farm.
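A layer 4 check and a layer 7 check might look like the sketch below (addresses and URLs are hypothetical; real monitors also handle check intervals, retries, and flap damping):

```python
import socket
import urllib.request

def l4_check(ip, port, timeout=2.0):
    """Layer 4 check: can we open a TCP connection to the port?"""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

def l7_check(url, timeout=2.0):
    """Layer 7 check: does an HTTP GET return 200 OK?"""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

# Servers failing their monitor are excluded from the available pool.
pool = [("10.0.0.11", 80), ("10.0.0.12", 80)]
available = [s for s in pool if l4_check(*s)]
```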
Other interesting features are persistence and SSL offload.
Persistence is a concept through which you can force a client's connections to remain on the same backend server, regardless of the load balancing method; in other words, persistence settings override load balancing decisions.
A use case makes this easy to understand. Imagine you are browsing Amazon to buy something. Behind the scenes, www.amazon.com resolves via DNS to a virtual server, which has a server farm with many real servers providing the same web page. While navigating the site, any click may be redirected to any of the real servers. But when you select an item and click “add to cart”, the information about your purchase is stored only on the one real server you were redirected to. This is where persistence takes effect: on subsequent connections, the load balancing method must be overridden so that you are redirected to the same real server and the item remains in your cart.
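In its simplest form, a persistence table is consulted before the load balancing method. This sketch uses source-IP persistence (real ADCs also offer cookie-based persistence, SSL session ID persistence, and more) and reuses a selection function like round_robin from the earlier sketch:

```python
persistence_table = {}   # client IP -> chosen backend server

def pick_server(client_ip, lb_method):
    """Persistence overrides the load balancing decision:
    a known client is always sent back to the same backend."""
    if client_ip in persistence_table:
        return persistence_table[client_ip]
    server = lb_method()                    # e.g. round_robin from above
    persistence_table[client_ip] = server   # remember the choice
    return server

# Both calls land on the same backend, whatever the method would pick:
first = pick_server("198.51.100.7", round_robin)
second = pick_server("198.51.100.7", round_robin)
assert first == second
```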
SSL offload, in the case of an HTTPS service, moves the encryption layer up to the ADC to reduce the load on the real servers. If each real server must encrypt and decrypt the traffic itself, it consumes significant resources (CPU, RAM, etc.). If you move this task to the ADC (which usually has dedicated hardware for exactly this purpose), you reduce the load on the real servers.
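Conceptually, it looks like the bare-bones sketch below: the ADC terminates TLS and talks plain HTTP to the backend. Certificate file names and addresses are hypothetical, and production termination involves far more (session reuse, cipher policy, many concurrent connections):

```python
import socket
import ssl

# The ADC terminates TLS here, using its own certificate and key
# (hypothetical file names). Dedicated hardware often accelerates this step.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain("adc_cert.pem", "adc_key.pem")

BACKEND_ADDR = ("10.0.0.11", 80)   # the backend speaks plain HTTP

with socket.create_server(("0.0.0.0", 443)) as listener:
    client, _ = listener.accept()
    tls_client = ctx.wrap_socket(client, server_side=True)   # decrypt here
    request = tls_client.recv(4096)                          # plaintext now
    with socket.create_connection(BACKEND_ADDR) as backend:
        backend.sendall(request)     # forwarded unencrypted to the backend
        tls_client.sendall(backend.recv(4096))
    tls_client.close()
```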
Another important thing is knowing how to distinguish between SLB (Server Load Balancing) and GSLB (Global Server Load Balancing).
SLB:
Layer 4 (L4) load balancing - the ability to direct traffic based on data from network and transport layer protocols, such as IP address and TCP port.
Layer 7 (L7) load balancing and content switching - the ability to make routing decisions based on application layer data and attributes, such as HTTP headers, the uniform resource identifier (URI), or the SSL session ID (see the sketch after the GSLB definition below).
GSLB:
Global server load balancing (GSLB) - extends the core L4 and L7 capabilities so that they are applicable across geographically distributed server farms (datacenter load balancing).
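To illustrate the L7 content switching mentioned above, here is a sketch in which the ADC inspects an application-layer attribute (the HTTP request path) and routes to different pools. The pool names and paths are invented for the example:

```python
# Hypothetical pools behind the same virtual server.
static_pool  = [("10.0.1.11", 80), ("10.0.1.12", 80)]
api_pool     = [("10.0.2.11", 80)]
default_pool = [("10.0.3.11", 80)]

def content_switch(http_request_line):
    """Route on the URI, an L7 attribute (e.g. 'GET /api/users HTTP/1.1')."""
    _, path, _ = http_request_line.split(" ", 2)
    if path.startswith("/api/"):
        return api_pool
    if path.startswith("/static/"):
        return static_pool
    return default_pool

print(content_switch("GET /api/users HTTP/1.1"))   # -> api_pool
```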
Finally, we can say that load balancing is the most straightforward method of scaling out an application server infrastructure. As application demand increases, new servers can be easily added to the server farm, and the ADC will immediately begin sending traffic to the new server.