What is HAProxy, and how to install and configure it on Linux

Mar 31, 2022

What is HAProxy?

HAProxy (High Availability Proxy) is a TCP/HTTP load balancer and proxy server that spreads incoming requests across multiple endpoints. This is useful when too many concurrent connections would saturate a single server. Instead of connecting to a single server that processes every request, the client connects to an HAProxy instance, which acts as a reverse proxy and forwards the request to one of the available endpoints, chosen by a load-balancing algorithm.

HAProxy can also act as an API gateway in front of your application servers, handling cross-cutting concerns such as security in one place.

What is a web server?

The term web server can refer to the computer that hosts websites or to the program running on it that delivers web pages as they are requested. The basic objective of a web server is to store, process, and deliver web pages to users. This communication uses the Hypertext Transfer Protocol (HTTP). The pages are mostly static content such as HTML documents, images, style sheets, and text. Apart from HTTP, the same machine often also runs SMTP (Simple Mail Transfer Protocol) and FTP (File Transfer Protocol) services for email and for file transfer and storage, though these are typically provided by separate server programs rather than by the web server itself.

Types of Load Balancing

Load Balancing

In a previous article we saw the basic mechanisms used when configuring load balancing, so let's get into the types of load balancing.

No Load Balancing

The name says it all: without load balancing, this is just a simple web application environment. The following diagram illustrates it:

In the diagram above, the user connects straight to the web server, i.e. domain.com, and no load balancer is present. This means that if the web server goes offline for any reason, the end user will not be able to access it. Likewise, if many users try to access the web server simultaneously and it cannot handle the load, they will generally experience slower responses or might not be able to connect at all.

Layer 4 Load Balancing

Layer 4 load balancing (also called transport layer load balancing) is a simple way to balance network traffic across multiple servers. It forwards traffic based on IP range and port only, without looking at the content of the request: if a user request comes in for domain.com/blog, the traffic is sent to the backend that handles all requests for domain.com on port 80.

The diagram below shows a simple example of layer 4 load balancing:

In this architecture, the user first connects to the load balancer, which forwards the request to one of the web servers. The selected web server then responds to the user's request. Usually, all the web servers serve the same content, which avoids sending inconsistent content back to the user. Note that all the web servers connect to the same database server.
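As a rough sketch of what this looks like in HAProxy, the fragment below forwards plain TCP connections on port 80 to two web servers. The section names (l4_front, l4_back), the server names (web1, web2) and the 10.0.0.x addresses are placeholders chosen for illustration, not values from this article's lab.

```haproxy
# Layer 4 sketch: connections are forwarded as TCP streams,
# so HAProxy never looks at the URL or HTTP headers.
# All names and addresses here are illustrative placeholders.
frontend l4_front
    bind *:80
    mode tcp
    default_backend l4_back

backend l4_back
    mode tcp
    balance roundrobin
    # "check" enables periodic health checks on each server
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check
```

Because the mode is tcp, a request for domain.com/blog and a request for domain.com/ are treated the same way and can land on either server.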

Layer 7 Load Balancing

Layer 7 load balancing, also known as application layer load balancing, is a more refined and sophisticated way of balancing network traffic than Layer 4. In this mode the load balancer forwards each request to a web server according to the content of the request. This is advantageous because you can run multiple web application servers under the same domain and port.

The diagram below shows a simple example of layer 7 load balancing:
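A minimal sketch of content-based routing in HAProxy might look like the fragment below, which sends requests whose path starts with /blog to one backend and everything else to another. The ACL name (url_blog), the backend and server names, and the 10.0.0.x addresses are assumptions made up for this example.

```haproxy
# Layer 7 sketch: HAProxy parses the HTTP request and routes on the path.
# All names and addresses here are illustrative placeholders.
frontend http_front
    bind *:80
    mode http
    # true when the request path begins with /blog
    acl url_blog path_beg /blog
    use_backend blog_back if url_blog
    default_backend web_back

backend web_back
    mode http
    server web1 10.0.0.11:80 check

backend blog_back
    mode http
    server blog1 10.0.0.21:80 check
```

Both backends are reached through the same domain and port; only the request path decides which group of servers answers.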

The balance algorithm is the algorithm HAProxy uses to select a server when load balancing. The following algorithms are commonly used:

Roundrobin

This is the simplest balance algorithm. Each new connection is handled by the next backend server in the list. When the last backend server is reached, selection starts again from the top of the list.

Leastconn

Each new connection is handled by the backend server with the fewest connections. This is useful when the duration and load of requests vary a lot.

Source

This is used for sticky sessions. The client IP is hashed to select the backend server, so requests from the same IP keep going to the same server as long as the set of available servers does not change. IP A will always be handled by backend1 and IP B will always be handled by backend2, so sessions are not interrupted.
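In the configuration file, the algorithm is chosen per backend with the balance directive. The fragment below is only a sketch showing the three algorithms described above side by side; the backend names (rr_back, lc_back, src_back), server names and 10.0.0.x addresses are hypothetical, and a real setup would normally use just one of these backends.

```haproxy
# One backend per algorithm, for comparison only.
# All names and addresses here are illustrative placeholders.

# Round robin: servers take turns in list order
backend rr_back
    balance roundrobin
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check

# Least connections: the server with the fewest connections is chosen
backend lc_back
    balance leastconn
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check

# Source: a hash of the client IP selects the server
backend src_back
    balance source
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check
```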

Now we will discuss how to install and configure HAProxy on CentOS 8/RHEL 8 in front of Apache HTTP web servers. The details of my HAProxy lab setup are as follows:

HAProxy Server - 192.168.43.4 (haproxy-Redhat8)
Apache HTTP Server 1 - 192.168.43.199 (httpd-node01)
Apache HTTP Server 2 - 192.168.43.32 (httpd-node01)

Server 1 - 192.168.43.199 (httpd-node01)

This is web server 1, on which WordPress is running.

Server 2 - 192.168.43.32 (httpd-node01)

This is web server 2, on which a normal web page is running.

Step 1: Install and Configure HAProxy on RHEL 8

HAProxy Server - 192.168.43.4 (haproxy-Redhat8)

This is the main VM, on which we install HAProxy. The command to install HAProxy is:

# yum install haproxy

Now go to the configuration file.

Once HAProxy is installed, you can find the default configuration at /etc/haproxy/haproxy.cfg, with some default settings.

In the configuration, replace <server name> with whatever you want to call your servers on the statistics page, and <private IP> with the private IPs of the servers you wish to direct web traffic to. You can also alter or delete any lines that are not required. For more options, please refer to the official HAProxy documentation.
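For reference, here is a minimal sketch of what the frontend and backend sections could look like with the placeholders filled in from the lab setup above. It assumes the stock global and defaults sections of the packaged file are kept as they are, that both Apache servers listen on port 80, and that round robin is the desired algorithm. The section names (http_front, http_back), the server names (web1, web2) and the statistics URI are illustrative choices, not values taken from the original configuration.

```haproxy
# Sketch of the frontend/backend part of /etc/haproxy/haproxy.cfg.
# Assumes the packaged "global" and "defaults" sections stay above this.
# Section names, server names and the stats URI are illustrative.

frontend http_front
    bind *:80
    mode http
    # statistics page, served by HAProxy itself at this URI
    stats enable
    stats uri /haproxy?stats
    default_backend http_back

backend http_back
    mode http
    balance roundrobin
    # web1/web2 stand in for <server name>; the IPs are the lab's web servers
    server web1 192.168.43.199:80 check
    server web2 192.168.43.32:80 check

# Before restarting the service, the file can be syntax-checked with:
#   haproxy -c -f /etc/haproxy/haproxy.cfg
```

With a configuration along these lines, requests to 192.168.43.4 on port 80 would alternate between the two web servers, and the names web1 and web2 are what would appear on the statistics page.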