Soft-removing a balanced node from HAProxy

HAProxy is great! It is a simple tool that lets you balance the load across multiple servers, and it's pretty easy to set up and configure.

But I ran into a bit of a challenge when upgrading one of the server nodes. You see, in the most common configuration, HAProxy just checks for an open port on the server, for example the WildFly port 8080. So, to stop receiving requests from clients, you either stop the application server or disable it in HAProxy. Neither of those methods worked for the application I was working on, for various reasons that are out of scope for this post.

The solution I found is pretty simple: it relies on a URL on the application server itself, which HAProxy uses to check whether the node is available. For example:
http://localhost:8080/console/health
Here is an example HAProxy configuration that uses this URL for health checking:
global
    daemon
    maxconn 4096
    user haproxy
    group haproxy

defaults
    mode http
    log  global
    timeout connect  5000
    timeout client   50000
    timeout server   50000

frontend http-in
    bind *:80
    default_backend app-servers

backend app-servers
    balance roundrobin
    option httpchk GET /console/health
    server server1 10.0.0.1:8080 check
    server server2 10.0.0.2:8080 check
    server server3 10.0.0.3:8080 check
    server server4 10.0.0.4:8080 check

The output of the URL can be managed by some sort of switch. I have chosen to control it with POST requests:
curl -X POST http://localhost:8080/console/activate
curl -X POST http://localhost:8080/console/deactivate

If the switch is deactivated, the URL should return a 503 (Service Unavailable) status code. When it is activated, the URL can return any useful information, as long as the status code is 200.
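To make the switch concrete, here is a minimal sketch of the three endpoints. It is written in Python with only the standard library, as a stand-in for the real WildFly application, so the class names (`HealthSwitch`, `ConsoleHandler`) and the response bodies are assumptions for illustration. Only the URLs and the 200/503 status codes come from the setup described above.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthSwitch:
    """Thread-safe on/off flag that backs the health URL (hypothetical helper)."""

    def __init__(self, active=True):
        self._active = active
        self._lock = threading.Lock()

    def set(self, active):
        with self._lock:
            self._active = active

    def status(self):
        # 200 while activated, 503 (Service Unavailable) while deactivated.
        with self._lock:
            return (200, "UP") if self._active else (503, "DOWN")


SWITCH = HealthSwitch()


class ConsoleHandler(BaseHTTPRequestHandler):
    def _reply(self, code, text):
        body = text.encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # This is the URL that "option httpchk GET /console/health" polls.
        if self.path == "/console/health":
            self._reply(*SWITCH.status())
        else:
            self._reply(404, "not found")

    def do_POST(self):
        if self.path == "/console/activate":
            SWITCH.set(True)
            self._reply(200, "activated")
        elif self.path == "/console/deactivate":
            SWITCH.set(False)
            self._reply(200, "deactivated")
        else:
            self._reply(404, "not found")

    def log_message(self, format, *args):
        # Keep the sketch quiet; a real application would log these requests.
        pass


if __name__ == "__main__":
    ThreadingHTTPServer(("127.0.0.1", 8080), ConsoleHandler).serve_forever()
```

With this running locally, the two curl commands above flip the flag and `curl -i http://localhost:8080/console/health` shows the resulting status code. In a real deployment the activate and deactivate URLs would presumably need some access restriction, which this sketch leaves out.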

The main advantage of this solution is that, once HAProxy marks the server as down, the server keeps processing its active connections while new requests are routed to the other nodes.

Now a server update process should look something like this:
  1. Mark the service as down
  2. Wait until all active connections are closed and all the load is routed to the other servers
  3. Stop the application and upgrade to the new version
  4. Start the application
  5. Perform all needed tests and validations
  6. If all passed, mark the service as up
  7. Let the users enjoy your new version
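
The steps above could be scripted roughly as follows. This is only a sketch: `upgrade_node` and `http_status` are hypothetical helpers, and `drained`, `upgrade` and `validate` are callables you would supply yourself (for example, a check of the node's active connection count, your deployment commands, and your smoke tests). It also assumes the switch stays deactivated across the application restart; if your switch defaults to activated on startup, HAProxy would start sending traffic before validation finishes.

```python
import time
import urllib.error
import urllib.request


def http_status(url, method="GET", timeout=5):
    """Return the HTTP status code for a request, including 4xx/5xx codes.

    Connection failures (urllib.error.URLError) are not caught here.
    """
    req = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def upgrade_node(base_url, drained, upgrade, validate,
                 poll=2.0, sleep=time.sleep, request=http_status):
    """Soft-remove one node, upgrade it, and put it back if validation passes.

    drained, upgrade and validate are caller-supplied callables (assumptions
    of this sketch). Returns True only if the node reports healthy at the end.
    """
    # Step 1: mark the service as down so HAProxy's health check starts failing.
    request(base_url + "/console/deactivate", "POST")

    # Step 2: wait until active connections have finished.
    while not drained():
        sleep(poll)

    # Steps 3-4: stop the application, upgrade it, start it again.
    upgrade()

    # Step 5: run tests while the node is still out of rotation.
    if not validate():
        return False

    # Step 6: mark the service as up and confirm the health URL agrees.
    request(base_url + "/console/activate", "POST")
    return request(base_url + "/console/health") == 200
```

Running it once per node, one node at a time, keeps the remaining servers in rotation for the whole upgrade.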

P.S.: I know the same can be achieved with service discovery tools like Consul, but there are still companies that are not yet using the latest trendy software.
