SSL offloading impact on web applications

SSL Offloading

Nowadays, it is common (and convenient) to use the Load-Balancer's SSL capabilities to encrypt/decrypt the traffic between clients and the web application platform.
Performing SSL at the Load-Balancer Layer is called “SSL offloading“, because you offload this process from your application servers.

Note that SSL offloading is also marketed as "SSL acceleration". The only "acceleration" performed is moving SSL processing off the application servers, which have other heavy tasks to perform, onto other devices in the network…
So there is no acceleration at all: despite some vendors claiming they accelerate SSL in a dedicated ASIC, they just offload it.

The diagram below shows what happens in an architecture with SSL offloading. We can clearly see that the traffic is encrypted up to the Load-Balancer, then travels in clear text between the Load-Balancer and the application server:
ssl offloading diagram

Benefits of SSL offloading


Offloading SSL from the application servers to the Load-Balancers has many benefits:
  * it offloads a heavy task from your application servers, letting them focus on the application itself
  * it saves resources on your application servers
  * the Load-Balancers have access to clear HTTP traffic and can perform advanced features such as reverse-proxying, cookie persistence, traffic regulation, etc…
  * when using an ALOHA Load-Balancer (or HAProxy), many more features are available on the SSL stack than on any web application server

At first sight, it seems SSL offloading can only have a positive impact.

Drawbacks of SSL offloading


As we saw on the diagram above, the load-balancer hides three important pieces of information about the client connection:
  1. protocol scheme: the client was using HTTPS while the connection to the application server is made over HTTP
  2. connection type: was it ciphered or not? (actually, this is linked to the protocol scheme)
  3. destination port: on the load-balancer the port is usually 443, while on the application server it is usually 80 (this may vary)

Most web applications use 301/302 responses with a Location header to redirect the client to a page (most of the time after a login or a POST), and also require an application cookie.

So basically, the worst impact is that your application server may not know the client connection information and may not be able to build the right responses: this can break the application completely, preventing it from working at all.
Fortunately, the ALOHA Load-Balancer and HAProxy can help in such situations!

Basically, the application could return such responses to the client:

Location: http://www.domain.com/login.jsp

And the client will leave the secured connection…
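whereas, to keep the client on the secured connection, the application should have answered:

```
Location: https://www.domain.com/login.jsp
```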

Tracking issues with the load-balancer


In order to know whether your application handles SSL offloading well, you can configure your ALOHA Load-Balancer to log some server response headers.
Add the configuration below in your application frontend section:

capture response header Location   len 32
capture response header Set-Cookie len 32

Now, in the traffic log, you'll clearly see whether the application builds its responses for HTTP or HTTPS.

NOTE: some other headers may be needed, depending on your application.

Web Applications and SSL offloading


Here we are :)

There are 3 possible answers to this situation, detailed below.

Client connection information provided in HTTP header by the Load-Balancer


First of all, the Load-Balancer can provide the client-side connection information to the application server through an HTTP header.
The configuration below shows how to insert a header called X-Forwarded-Proto containing the scheme used by the client.
It is to be added in your backend section.

http-request set-header X-Forwarded-Proto https if  { ssl_fc }
http-request set-header X-Forwarded-Proto http  if !{ ssl_fc }

Now, the ALOHA Load-Balancer will insert the following header when the connection is made over SSL:

X-Forwarded-Proto: https

and when performed over clear HTTP:

X-Forwarded-Proto: http

It's up to your application to use this information to build the right responses: this may require some code changes.
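As an illustration, here is a minimal sketch (the helper name and signature are hypothetical, not part of any framework) of how an application could use this header to build redirects with the scheme the client actually used:

```python
# Hypothetical helper: build a redirect Location using the scheme forwarded
# by the load-balancer instead of the server's local (clear HTTP) scheme.
def redirect_location(headers, host, path):
    # Only trust X-Forwarded-Proto when a trusted load-balancer sets it.
    scheme = headers.get("X-Forwarded-Proto", "http")
    return "%s://%s%s" % (scheme, host, path)

print(redirect_location({"X-Forwarded-Proto": "https"}, "www.domain.com", "/login.jsp"))
# → https://www.domain.com/login.jsp
```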

HTTP header rewriting on the fly


Another way is to rewrite, on the fly, the responses set up by the server.
The configuration line below matches the Location header and translates it from http to https when needed, rewriting e.g. Location: http://www.domain.com:80/url into its https equivalent:

rspirep ^Location:\ http://(.*):80(.*)  Location:\ https://\1:443\2   if  { ssl_fc }

NOTE: the example above applies to HTTP redirections (Location header), but the same can be done for Set-Cookie (or any other header your application may use).
NOTE2: you can only rewrite the response HTTP headers, not the body. So this is not compatible with applications that write hard-coded http:// links into the HTML content.
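For the Set-Cookie case, a similar rule (shown here as an untested sketch, to be adapted to your application) could mark cookies as Secure when the client connection is ciphered:

```
rspirep ^Set-Cookie:\ (.*)  Set-Cookie:\ \1;\ Secure  if { ssl_fc }
```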

SSL Bridging


In some cases, the application is not compatible with SSL offloading at all (even with the tricks above), and we must use a ciphered connection to the server while still performing cookie-based persistence, content switching, etc…
This is called SSL bridging; it can also be described as a man in the middle.

In the ALOHA, there is nothing special to do: set up SSL offloading as usual, then add the keyword ssl to your server directive.
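For example, an SSL bridging backend could look like the sketch below (names and IPs are placeholders):

```
backend bk_application
  balance roundrobin
  # the "ssl" keyword makes the LB re-cipher the traffic to the server
  server app1 10.0.0.13:443 ssl check
  server app2 10.0.0.14:443 ssl check
```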

Using this method, you can choose a lighter cipher and key between the Load-Balancer and the server, while still using the Load-Balancer's advanced SSL features with a stronger key and cipher on the client side.
The application server won’t notice anything and the Load-Balancer can still perform Layer 7 processing.


Mitigating the SSL Beast attack using the ALOHA Load-Balancer / HAProxy

The BEAST attack on SSL isn't new, but we had not yet published an article explaining how to mitigate it with the ALOHA or HAProxy.
First of all, to mitigate this attack, you must use the Load-Balancer as the SSL endpoint; then just append the following ciphers parameter to the bind line of your HAProxy SSL frontend:
  * For the ALOHA Load-Balancer:

bind 10.0.0.9:443 name https ssl crt domain ciphers RC4:HIGH:!aNULL:!MD5

  * For HAProxy OpenSource:

bind 10.0.0.9:443 name https ssl crt /path/to/domain.pem ciphers RC4:HIGH:!aNULL:!MD5

As you may have understood, the most important part is the ciphers RC4:HIGH:!aNULL:!MD5 directive, which forces the ciphers that can be used during the connection and ensures they are strong enough to resist the attack.
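If you want to see which ciphers such a string actually selects, you can expand it locally, for example with Python's ssl module. Note that modern OpenSSL builds have removed RC4 entirely, so on a recent system only the HIGH part of the string will match:

```python
import ssl

# Expand the cipher string locally to list the ciphers it would select.
# On current OpenSSL builds, RC4 no longer exists, so only HIGH ciphers
# (minus aNULL and MD5) will show up.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.set_ciphers("RC4:HIGH:!aNULL:!MD5")
for cipher in ctx.get_ciphers():
    print(cipher["name"])
```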


Microsoft Exchange 2013 architectures

Introduction to Microsoft Exchange 2013

There are 2 types of server in Exchange 2013:
  * Mailbox server
  * Client Access server

Definitions from Microsoft Technet website:
  * The Client Access server provides authentication, limited redirection, and proxy services, and offers all the usual client access protocols: HTTP, POP and IMAP, and SMTP. The Client Access server, a thin and stateless server, doesn’t do any data rendering. There’s never anything queued or stored on the Client Access server.
  * The Mailbox server includes all the traditional server components found in Exchange 2010: the Client Access protocols, the Transport service, the Mailbox databases, and Unified Messaging (the Client Access server redirects SIP traffic generated from incoming calls to the Mailbox server). The Mailbox server handles all activity for the active mailboxes on that server.

High availability in Exchange 2013


Mailbox server high availability is achieved by creating a DAG (Database Availability Group). All the redundancy is handled within the DAG, and the Client Access server automatically retrieves the user session through the DAG when a failover occurs.

Client Access servers must be Load-Balanced to achieve high-availability. There are many ways to load-balance them:
  * DNS round-robin (come on, we're in 2013, stop mentioning this option!)
  * NLB (Microsoft doesn't recommend it)
  * Load-Balancers (also known as Hardware Load-Balancers or HLB): this is the recommended solution since it brings smart load-balancing and advanced health checking.

Performance in Exchange 2013


The key performance component in Exchange 2013 is obviously the Mailbox server. If it slows down, all users are affected. So be careful when designing your Exchange platform and try to create multiple DAGs with one single master per DAG. If one Mailbox server fails or is under maintenance, you'll run in degraded mode (which does not necessarily mean degraded performance), where one Mailbox server is master of 2 DAGs. Disk IO may also be important here ;)

Concerning the Client Access servers, it's easy to temporarily disable one of them if it slows down and route connections to the other Client Access server(s).

Exchange 2013 Architecture


Your architecture should meet your requirements:
  * if you don't need high-availability or performance, then a single server with both the Client Access and Mailbox roles is sufficient
  * if you need high-availability and have a moderate number of users, a pair of servers, each hosting both the Client Access and Mailbox roles, is enough: a DAG ensures high-availability of the Mailbox servers and a Load-Balancer ensures high-availability of the Client Access servers
  * if you need high-availability and performance for a huge number of connections, then use multiple Mailbox servers, multiple DAGs, multiple Client Access servers and a pair of Load-Balancers

Note that moving from the first architecture to the third requires almost no effort, thanks to Exchange 2013.

Client Access server Load-Balancing in Exchange 2013


Using a Load-Balancer to balance user connections to the Client Access servers allows multiple types of architectures. The current article mainly focuses on these architectures: we do provide load-balancers ;)
If you need help from Exchange architects, we have partners who can help you with the installation, setup and tuning of the whole architecture.

Load-Balancing Client Access servers in Exchange 2013

First of all, bear in mind that Client Access servers in Exchange 2013 are stateless. This is very important from a load-balancing point of view because it makes the configuration much simpler and more scalable.
Basically, I can see 3 main types of architectures for Client Access server load-balancing in Exchange 2013:
  1. Reverse-proxy mode (also known as source NAT)
  2. Layer 4 NAT mode (destination NAT)
  3. Layer 4 DSR mode (Direct Server Return, also known as Gateway)

Each type of architecture has pros and cons and may have an impact on your global infrastructure. In the next chapters, I'll provide details for each of them.

To illustrate the article, I’ll use a simple Exchange 2013 configuration:
  * 2 Exchange 2013 servers, each hosting both the Client Access and Mailbox roles
  * 1 ALOHA Load-Balancer
  * 1 client (laptop)
  * all the Exchange services hosted on a single hostname (i.e. mail.domain.com) (this would work as well with one hostname per service)

Reverse-proxy mode (also known as source NAT)


Diagram
The diagram below shows how things work with a Load-Balancer in Reverse-proxy mode:
  1. the client establishes a connection to the Load-Balancer
  2. the Load-Balancer chooses a Client Access server and establishes a connection to it
  3. client and CAS server talk to each other through the Load-Balancer
Basically, the Load-Balancer breaks the TCP connection between the client and the server: 2 TCP connections are established.

exchange_2013_reverse_proxy

Advantages of Reverse-proxy mode
  * not intrusive: no changes required on the infrastructure or the servers
  * ability to perform SSL offloading and layer 7 persistence
  * client, Client Access servers and load-balancers can be located anywhere in the infrastructure (same subnet or different subnet)
  * can be used in a DMZ
  * A single interface is required on the Load-Balancer

Limitations of Reverse-proxy mode
  * the servers don’t see the client IPs (only the LB one)
  * maximum of ~65K concurrent connections to the Client Access farm (without tweaking the configuration)

Configuration
To perform this type of configuration, you can use the LB Admin tab and do it through the GUI or just copy/paste the following lines into the LB Layer 7 tab:

######## Default values for all entries till next defaults section
defaults
  option  dontlognull             # Do not log connections with no requests
  option  redispatch              # Try another server in case of connection failure
  option  contstats               # Enable continuous traffic statistics updates
  retries 3                       # Try to connect up to 3 times in case of failure 
  backlog 10000                   # Size of SYN backlog queue
  mode tcp                        #alctl: protocol analyser
  log global                      #alctl: log activation
  option tcplog                   #alctl: log format
  timeout client  300s            #alctl: client inactivity timeout
  timeout server  300s            #alctl: server inactivity timeout
  timeout connect   5s            # 5 seconds max to connect or to stay in queue
  default-server inter 3s rise 2 fall 3   #alctl: default check parameters

frontend ft_exchange
  bind 10.0.0.9:443 name https    #alctl: listener https configuration.
  maxconn 10000                   #alctl: max connections (dep. on ALOHA capacity)
  default_backend bk_exchange     #alctl: default farm to use

backend bk_exchange
  balance roundrobin              #alctl: load balancing algorithm
  server exchange1 10.0.0.13:443 check
  server exchange2 10.0.0.14:443 check

(update IPs for your infrastructure and update the DNS name with the IP configured on the frontend bind line.)

Layer 4 NAT mode (destination NAT)


Diagram
The diagram below shows how things work with a Load-balancer in Layer 4 NAT mode:
  1. the client establishes a connection to the Client Access server through the Load-Balancer
  2. client and CAS server discuss through the Load-Balancer
Basically, the Load-Balancer acts as a router between the client and the server: a single TCP connection is established directly between the client and the Client Access server, and the LB just forwards packets between them.
exchange_2013_layer4_nat

Advantages of Layer 4 NAT mode
  * no connection limit
  * Client Access servers can see the client IP address

Limitations of Layer 4 NAT mode
  * infrastructure intrusive: the server default gateway must be the load-balancer
  * clients and Client Access servers can’t be in the same subnet
  * no SSL acceleration, no advanced persistence
  * Must use 2 interfaces on the Load-Balancer (or VLAN interface)

Configuration
To perform this type of configuration, you can use the LB Admin tab and do it through the GUI or just copy/paste the following lines into the LB Layer 4 tab:

director exchange 10.0.1.9:443 TCP
  balance roundrobin                               #alctl: load balancing algorithm
  mode nat                                         #alctl: forwarding mode
  check interval 10 timeout 2                      #alctl: check parameters
  option tcpcheck                                  #alctl: adv check parameters
  server exchange1 10.0.0.13:443 weight 10 check   #alctl: server exchange1
  server exchange2 10.0.0.14:443 weight 10 check   #alctl: server exchange2

(update IPs for your infrastructure and update the DNS name with the IP configured on the director line.)

Layer 4 DSR mode (Direct Server Return, also known as Gateway)


Diagram
The diagram below shows how things work with a Load-balancer in Layer 4 DSR mode:
  1. the client establishes a connection to the Client Access server through the Load-Balancer
  2. the Client Access server acknowledges the connection directly to the client, bypassing the Load-Balancer on the way back. The Client Access server must have the Load-Balancer Virtual IP configured on a loopback interface.
  3. client and CAS server keep talking the same way: requests go through the Load-Balancer and responses go directly from the server to the client.
Basically, the Load-Balancer acts as a router between the client and the server: a single TCP connection is established directly between the client and the Client Access server.
The client and the server can be in the same subnet, as in the diagram, or in two different subnets. In the latter case, the server would use its default gateway (no need to route the traffic back through the Load-Balancer).
exchange_2013_layer4_dsr

Advantages of Layer 4 DSR mode
  * no connection limit
  * Client Access servers can see the client IP address
  * clients and Client Access servers can be in the same subnet
  * A single interface is required on the Load-Balancer

Limitations of Layer 4 DSR mode
  * infrastructure intrusive: the Load-Balancer Virtual IP must be configured on the Client Access server (Loopback).
  * no SSL acceleration, no advanced persistence

Configuration
To perform this type of configuration, you can use the LB Admin tab and do it through the web form or just copy/paste the following lines into the LB Layer 4 tab:

director exchange 10.0.0.9:443 TCP
  balance roundrobin                               #alctl: load balancing algorithm
  mode gateway                                     #alctl: forwarding mode
  check interval 10 timeout 2                      #alctl: check parameters
  option tcpcheck                                  #alctl: adv check parameters
  server exchange1 10.0.0.13:443 weight 10 check   #alctl: server exchange1
  server exchange2 10.0.0.14:443 weight 10 check   #alctl: server exchange2


Microsoft Exchange 2013 load-balancing with HAProxy

Introduction to Microsoft Exchange server 2013

Note: I'll introduce Exchange from a load-balancing point of view. For detailed information about Exchange's history and new features, please read the pages linked in the Related links section at the bottom of this article.

Exchange is the name of the Microsoft software which provides a business-class mail / calendar / contact platform. It's an old product, started with version 4.0 back in 1996…
Each new version of Exchange Server brings new features, both expanding Exchange's perimeter and making it easier to deploy and administrate.

Exchange 2007


Introduction of Outlook Anywhere, AKA RPC over HTTP, which allows remote users to connect to an Exchange 2007 platform using the HTTPS protocol.

Exchange 2010


For example, Exchange 2010 introduced CAS arrays, making client-side services highly available and scalable. DAGs also brought mail database high-availability. All the client access services required persistence: a user had to be stuck to a single CAS server.
Exchange 2010 also introduced a "layer" between the MAPI RPC clients and the mailbox servers (through the CAS servers), making the failover of a database transparent.

Exchange 2013


Exchange 2013 improved on the changes brought by Exchange 2010: the CAS servers are now stateless and independent from each other (no arrays anymore), so no persistence is required anymore.
In Exchange 2013, raw TCP MAPI RPC services have disappeared and have definitively been replaced by Outlook Anywhere (RPC over HTTP).
Last but not least, SSL offloading does not seem to be allowed for now.

Load-Balancing Microsoft Exchange 2013

First of all, I’m pleased to announce that HAProxy and the ALOHA Load-Balancer are both able to load-balance Exchange 2013 (as well as 2010).

Exchange 2013 Services


As explained in the introduction, the table below summarizes the TCP ports and services involved in an Exchange 2013 platform:




TCP Port      Protocol         CAS Service name (abbreviation)
-----------   --------------   -------------------------------
443           HTTPS            - Autodiscover (AS)
                               - Exchange ActiveSync (EAS)
                               - Exchange Control Panel (ECP)
                               - Offline Address Book (OAB)
                               - Outlook Anywhere (OA)
                               - Outlook Web App (OWA)
110 and 995   POP3 / POP3s     POP3
143 and 993   IMAP4 / IMAP4s   IMAP4

Diagram

There are two main types of architecture possible:
1. All the services are hosted on a single host name
2. Each service owns its own host name

Exchange 2013 and the single host name diagram


exchange_2013_single_hostname

Exchange 2013 and the multiple host names diagram


exchange_2013_multiple_hostnames

Configuration

There are two types of configuration with the ALOHA:
  * Layer 4 mode: the LB acts as a router; intrusive for the infrastructure, but able to manage millions of connections
  * Layer 7 mode: the LB acts as a reverse-proxy; non-intrusive implementation (source NAT), able to manage thousands of connections and to perform SSL offloading, DDOS protection, advanced persistence, etc…

The present article describes the layer 7 configuration, even though we're going to use it at layer 4 (mode tcp).

Note that it’s up to you to update your DNS configuration to make the hostname point to your Load-Balancer service Virtual IP.

Template:
Use the configurations below as templates and just change the IP addresses:
  * bind lines: your client-facing service IPs
  * server lines: the IPs of your CAS servers (add as many lines as you need)
Once updated, just copy/paste the whole configuration, including the defaults section, at the bottom of your ALOHA Layer 7 configuration.

Load-Balancing Exchange 2013 services hosted on a single host name

######## Default values for all entries till next defaults section
defaults
  option  dontlognull             # Do not log connections with no requests
  option  redispatch              # Try another server in case of connection failure
  option  contstats               # Enable continuous traffic statistics updates
  retries 3                       # Try to connect up to 3 times in case of failure 
  timeout connect 5s              # 5 seconds max to connect or to stay in queue
  timeout http-keep-alive 1s      # 1 second max for the client to post next request
  timeout http-request 15s        # 15 seconds max for the client to send a request
  timeout queue 30s               # 30 seconds max queued on load balancer
  timeout tarpit 1m               # tarpit hold time
  backlog 10000                   # Size of SYN backlog queue

  balance roundrobin                      #alctl: load balancing algorithm
  mode tcp                                #alctl: protocol analyser
  option tcplog                           #alctl: log format
  log global                              #alctl: log activation
  timeout client 300s                     #alctl: client inactivity timeout
  timeout server 300s                     #alctl: server inactivity timeout
  default-server inter 3s rise 2 fall 3   #alctl: default check parameters

frontend ft_exchange_tcp
  bind 10.0.0.9:443 name https          #alctl: listener https configuration.
  maxconn 10000                         #alctl: connection max (depends on capacity)
  default_backend bk_exchange_tcp       #alctl: default farm to use

backend bk_exchange_tcp
  server cas1 10.0.0.15:443 maxconn 10000 check    #alctl: server cas1 configuration.
  server cas2 10.0.0.16:443 maxconn 10000 check    #alctl: server cas2 configuration.

And the result (LB Admin tab):
– Virtual Service:
aloha_exchange2013_single_domain_virtual_services
– Server Farm:
aloha_exchange2013_single_domain_server_farm

Load-Balancing Exchange 2013 services hosted on multiple host names

######## Default values for all entries till next defaults section
defaults
  option  dontlognull             # Do not log connections with no requests
  option  redispatch              # Try another server in case of connection failure
  option  contstats               # Enable continuous traffic statistics updates
  retries 3                       # Try to connect up to 3 times in case of failure 
  timeout connect 5s              # 5 seconds max to connect or to stay in queue
  timeout http-keep-alive 1s      # 1 second max for the client to post next request
  timeout http-request 15s        # 15 seconds max for the client to send a request
  timeout queue 30s               # 30 seconds max queued on load balancer
  timeout tarpit 1m               # tarpit hold time
  backlog 10000                   # Size of SYN backlog queue

  balance roundrobin                      #alctl: load balancing algorithm
  mode tcp                                #alctl: protocol analyser
  option tcplog                           #alctl: log format
  log global                              #alctl: log activation
  timeout client 300s                     #alctl: client inactivity timeout
  timeout server 300s                     #alctl: server inactivity timeout
  default-server inter 3s rise 2 fall 3   #alctl: default check parameters

frontend ft_exchange_tcp
  bind 10.0.0.5:443  name as        #alctl: listener: autodiscover service
  bind 10.0.0.6:443  name eas       #alctl: listener: Exchange ActiveSync service
  bind 10.0.0.7:443  name ecp       #alctl: listener: Exchange Control Panel service
  bind 10.0.0.8:443  name ews       #alctl: listener: Exchange Web Service service
  bind 10.0.0.8:443  name oa        #alctl: listener: Outlook Anywhere service
  maxconn 10000                     #alctl: connection max (depends on capacity)
  default_backend bk_exchange_tcp   #alctl: default farm to use

backend bk_exchange_tcp
  server cas1 10.0.0.15:443 maxconn 10000 check   #alctl: server cas1 configuration.
  server cas2 10.0.0.16:443 maxconn 10000 check   #alctl: server cas2 configuration.

And the result (LB Admin tab):
– Virtual Service:
aloha_exchange2013_multiple_domain_virtual_services
– Server Farm:
aloha_exchange2013_multiple_domain_server_farm

Conclusion


This is a very basic and straightforward configuration. It could be made much more complete: per-service timeouts, better health checking, DDOS protection, etc…
I may write further articles about Exchange 2013 load-balancing with our products later.

Related links

Exchange 2013 installation steps
Exchange 2013 first configuration
Microsoft Exchange Server (Wikipedia)
Microsoft Exchange official webpage


HAProxy, high mysql request rate and TCP source port exhaustion

Synopsis


At Exceliance, we provide professional services around HAProxy: this includes HAProxy itself, of course, but also tuning of the underlying OS, and advice and recommendations about the architecture; sometimes we also help customers troubleshoot application-layer issues.
We don't fix issues for the customer, but using the information provided by HAProxy, we are able to narrow the investigation area, saving the customer's time and money.
The story I'm relating today comes from one of these engagements.

One of our customers is a hosting company which hosts some very busy PHP / MySQL websites. They have been using HAProxy successfully in front of their application servers.
They used to have a single MySQL server, which was a SPOF and had to handle several thousand requests per second.
Sometimes they had issues with this DB: the clients (hence the web servers) would hang when using it.

So they decided to use MySQL replication and build an active/passive cluster. They also decided to split reads (SELECT queries) and writes (DELETE, INSERT, UPDATE queries) at the application level.
They were then able to move the MySQL servers behind HAProxy.

Enough for the introduction :) Today's article discusses HAProxy and MySQL at high request rates, and an error some of you may already have encountered: TCP source port exhaustion (the famous high number of sockets in TIME_WAIT).

Diagram


So basically, we have here a standard web platform which involves HAProxy to load-balance MySQL:
haproxy_mysql_replication

The MySQL master server receives the WRITE requests, while the READ requests are load-balanced by weight (the slaves have a higher weight than the master) across all the MySQL servers.

MySQL scalability

One way of scaling MySQL is to use the replication method: one MySQL server is designated as the master and must manage all the write operations (DELETE, INSERT, UPDATE, etc…); for each operation, it notifies all the MySQL slave servers. The slaves can then be used for reads only, offloading this type of request from the master.
IMPORTANT NOTE: the replication method allows scaling of the read part only, so if your application requires many more writes, then this is not the method for you.

Of course, a MySQL slave server can be promoted to master when the master fails! This also ensures MySQL high-availability.

So, where is the problem ???

This type of platform works very well for the majority of websites. The problem occurs when you start having a high request rate. By high, I mean several thousand per second.

TCP source port exhaustion

HAProxy works as a reverse-proxy and so uses its own IP address to connect to the servers.
Any system has around 64K TCP source ports available to connect to a remote IP:port. Once a combination of "source IP:port => dst IP:port" is in use, it can't be reused.
First lesson: you can't have more than 64K open connections from an HAProxy box to a single remote IP:port couple. I think only people load-balancing MS Exchange RPC services or SharePoint with NTLM may one day reach this limit…
(well, it is possible to work around this limit using some tricks we'll explain later in this article)

Why does TCP port exhaustion occur with MySQL clients???


As I said, the MySQL request rate was a few thousand per second, so we never came close to the limit of 64K simultaneous open connections to the remote service…
What was happening, then?
Well, there is an issue with the MySQL client library: when a client sends its "QUIT" sequence, it performs a few internal operations and then immediately shuts down the TCP connection, without waiting for the server to do it. A simple tcpdump will show this easily.
Note that you won't be able to reproduce this issue on a loopback interface, because the server answers fast enough… You must use a LAN connection and 2 different servers.

Basically, here is the sequence currently performed by a MySQL client:

Mysql Client ==> "QUIT" sequence ==> Mysql Server
Mysql Client ==>       FIN       ==> MySQL Server
Mysql Client <==     FIN ACK     <== MySQL Server
Mysql Client ==>       ACK       ==> MySQL Server

This leaves the client's connection (hence its source port) unavailable for twice the MSL (Maximum Segment Lifetime), which means 2 minutes.
Note: this type of close has no negative impact when the connection is made over a UNIX socket.

Explanation of the issue (much better than I could explain it myself):
“There is no way for the person who sent the first FIN to get an ACK back for that last ACK. You might want to reread that now. The person that initially closed the connection enters the TIME_WAIT state; in case the other person didn’t really get the ACK and thinks the connection is still open. Typically, this lasts one to two minutes.” (Source)

Since each source port is unavailable to the system for 2 minutes, above 533 MySQL requests per second you're in danger of TCP source port exhaustion: 64000 (available ports) / 120 (number of seconds in 2 minutes) = 533.3.
This TCP port exhaustion appears on the MySQL client server itself, but also on the HAProxy box, because it forwards the client traffic to the servers… And since there are many web servers, it happens much faster on the HAProxy box!
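The arithmetic above can be checked quickly:

```python
# Each closed connection keeps its source port in TIME_WAIT for 2 * MSL,
# assumed here to be 120 seconds, so the sustainable rate towards a single
# destination IP:port is:
available_ports = 64000
time_wait_seconds = 120

max_rate = available_ports / time_wait_seconds
print(max_rate)  # about 533 new connections per second
```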

Remember: at traffic spikes, my customer had a few thousand requests/s…

How to avoid TCP source port exhaustion?


Here comes THE question!
First, a "clean" close sequence would be:

MySQL Client ==> "QUIT" sequence ==> MySQL Server
MySQL Client <==       FIN       <== MySQL Server
MySQL Client ==>     FIN ACK     ==> MySQL Server
MySQL Client <==       ACK       <== MySQL Server

Actually, this sequence happens when both the MySQL client and server are hosted on the same box and use the loopback interface, which is why I said earlier that if you want to reproduce the issue you must add “latency” between the client and the server, hence use 2 boxes over the LAN.
So, until MySQL rewrites the client code to follow the sequence above, there won’t be any improvement here!

Increasing source port range


By default, on a Linux box, you have around 28K source ports available (for a single destination IP:port):

$ sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768    61000

In order to get 64K source ports, just run:

$ sudo sysctl -w net.ipv4.ip_local_port_range="1025 65000"

And don’t forget to update your /etc/sysctl.conf file!!!

Note: this should definitely be applied on the web servers as well…

Allow usage of source port in TIME_WAIT


A few sysctls can be used to tell the kernel to reuse connections in the TIME_WAIT state more quickly:

net.ipv4.tcp_tw_reuse
net.ipv4.tcp_tw_recycle

tw_reuse can be used safely, but be careful with tw_recycle: it can have side effects (in particular, it breaks clients located behind NAT).

Anyway, these sysctls were already properly set up (value = 1) on both the HAProxy box and the web servers.

Note: tw_reuse should definitely be enabled on the web servers as well…

Using multiple IPs to get connected to a single server


In the HAProxy configuration, you can specify on the server line the source IP address to use to get connected to a server, so you can simply add more server lines with different IPs.
In the example below, the IPs 10.0.0.100 and 10.0.0.101 are configured on the HAProxy box:

[...]
  server mysql1     10.0.0.1:3306 check source 10.0.0.100
  server mysql1_bis 10.0.0.1:3306 check source 10.0.0.101
[...]

This allows us to open up to 128K source TCP ports…
The kernel remains responsible for assigning a new TCP source port whenever HAProxy requests one. Despite improving things a bit, we still reached source port exhaustion… We could not get over 80K connections in TIME_WAIT with 4 source IPs…

Let HAProxy manage TCP source ports


You can let HAProxy decide which source port to use when opening a new TCP connection, instead of the kernel. HAProxy has purpose-built functions that make it more efficient than the kernel at this task.
Let’s update the configuration above:

[...]
  server mysql1     10.0.0.1:3306 check source 10.0.0.100:1025-65000
  server mysql1_bis 10.0.0.1:3306 check source 10.0.0.101:1025-65000
[...]

We managed to get over 170K connections in TIME_WAIT with 4 source IPs… and no more source port exhaustion!

Use a memcache


Fortunately, this customer’s developers are skilled and write flexible code :)
So they managed to move some requests from the MySQL DB to a memcache, opening far fewer connections.

Use MySQL persistent connections


This could prevent fine-grained load-balancing on the read-only farm, but it would be very efficient on the MySQL master server.

Conclusion

  • If you see some SOCKERR information messages in the HAProxy logs (mainly on health checks), you may be running out of TCP source ports.
  • Have skilled developers who write flexible code, where moving from one DB to another is made easy.
  • This kind of issue can happen only with protocols or applications where the client closes the connection first.
  • This issue can’t happen with HAProxy in HTTP mode, since it lets the server close the connection before sending a TCP RST.


Posted in HAProxy, optimization, performance

Exchange Outlook Web Access (OWA) Cross-Site Request Forgery (CSRF) protection

Outlook Web Access

Outlook Web Access is the webmail embedded in the Exchange mail server. It is used by users outside the office to get access to their emails.
Unfortunately, some versions of OWA are affected by a CSRF vulnerability.
This vulnerability affects supported editions of Microsoft Exchange Server 2003 and Microsoft Exchange Server 2007 (except Microsoft Exchange Server 2007 Service Pack 3).
Exchange 2010 is not affected by this attack.

CSRF attack explained

The attacker hosts, on his own web server, a page with code pointing to the targeted OWA webmail domain. When the user browses this page, the code hijacks his session to change the target's email parameters.
The most complicated part for the attacker is managing to make the target browse the web page: usually he puts the link in an email.

CSRF prevention for OWA


Fortunately, it is easy to block this type of attack, since it requires a third-party website. Well, easy only if you use a Load-Balancer or a reverse-proxy with real layer 7 capabilities.
The ALOHA Load-balancer can be used to load-balance exchange services as well as protect your OWA users against CSRF attacks.

As explained above, a page hosted on a third-party server makes the user's browser send a request to his webmail. When doing this, the browser sets the Referer HTTP header to the attacker's website URL (including the hostname). Even if it is easy to fake this header in a normal situation, it is impossible for the attacker to change this behavior of the browser.
This means we can easily monitor the Referer header and deny any request coming from an unknown domain.
In some cases, a Referer from another domain could be allowed, but only when pointing to a few URLs (OWA's entry points).

It is important to notice that the ALOHA Load-Balancer must be used as the SSL offloader in order to be able to access all the HTTP headers.

The configuration below will explain all of this.

# valid Referer detection
  acl valid_owa_referer hdr_beg(Referer) http://webmail.company.com/ https://webmail.company.com/

# OWA entry points may have a Referer pointing to another domain
  acl owa_welcome_url url / /owa /owa/

# don't check the Referer on welcome urls
  http-request allow if owa_welcome_url

# deny any OWA requests if the Referer does not point to Company's webmail hostname
  http-request deny if !valid_owa_referer

# allow valid requests (this one is implicit, but written for better understanding)
  http-request allow

The code above won’t run any Referer check for the webmail’s entry-point URLs and will check it for all other URLs. If a request points to a page with a Referer outside the company’s domain name, then the request is denied and your users’ safety is preserved.
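
To make the decision logic explicit, here is a minimal sketch of the same rules in Python (the hostnames and entry-point URLs are the example values from the configuration above; `is_allowed` is a hypothetical helper written for illustration, not part of HAProxy or OWA):

```python
# Hypothetical model of the Referer-based CSRF check performed by the
# load-balancer configuration above, using the config's example values.
VALID_REFERER_PREFIXES = (
    "http://webmail.company.com/",
    "https://webmail.company.com/",
)
WELCOME_URLS = {"/", "/owa", "/owa/"}

def is_allowed(url: str, referer: str) -> bool:
    """Allow entry-point URLs unconditionally; otherwise require a
    Referer pointing at the company's webmail hostname."""
    if url in WELCOME_URLS:        # http-request allow if owa_welcome_url
        return True
    # http-request deny if !valid_owa_referer
    return referer.startswith(VALID_REFERER_PREFIXES)

print(is_allowed("/owa/", "http://evil.example.com/"))                   # True
print(is_allowed("/owa/ev.owa", "http://evil.example.com/attack.html"))  # False
print(is_allowed("/owa/ev.owa", "https://webmail.company.com/owa/"))     # True
```

Note that an entry-point request is allowed even with a foreign Referer, matching the first rule in the configuration.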


Posted in Aloha, exchange, layer7, security

Websockets load-balancing with HAProxy

Why Websocket ???


The HTTP protocol is connection-less and only the client can request information from a server. A server can never contact a client on its own… HTTP is purely half-duplex. Furthermore, a server can answer only once to a client request.
Some websites or web applications require the server to update the client from time to time. There were a few ways to do so:

  • the client requests the server at regular intervals to check if new information is available (polling)
  • the client sends a request to the server and the server answers as soon as it has information to provide to the client (also known as long polling)

But those methods have many drawbacks due to HTTP limitations.
So a new protocol has been designed: websockets, which allows 2-way (full-duplex) communication between a client and a server over a single TCP connection. Furthermore, a websocket re-uses the HTTP connection it was initialized on, which means it uses the standard TCP port.

How does websocket work ???

Basically, a websocket starts with an HTTP request like the one below:

GET / HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: avkFOZvLE0gZTtEyrZPolA==
Host: localhost:8080
Sec-WebSocket-Protocol: echo-protocol

The most important part is the “Connection: Upgrade” header, which lets the server know the client wants to switch to another protocol, whose name is provided by the “Upgrade: websocket” header.

When a server with websocket capability receives the request above, it answers with a response like the one below:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: tD0l5WXr+s0lqKRayF9ABifcpzY=
Sec-WebSocket-Protocol: echo-protocol

The most important part is the status code 101, which acknowledges the protocol switch (from HTTP to websocket), as well as the “Connection: Upgrade” and “Upgrade: websocket” headers.

From now on, the TCP connection used for the HTTP request/response exchange is used for the websocket: whenever a peer wants to interact with the other peer, it can use it.

The websocket ends as soon as one peer decides to close it, or when the TCP connection is closed.
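
The only cryptographic step in the handshake above is the Sec-WebSocket-Accept value: the server appends a fixed GUID to the client's Sec-WebSocket-Key, hashes the result with SHA-1, and base64-encodes the digest. A minimal sketch in Python, checked against the sample key/accept pair from RFC 6455:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the websocket opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value for a given
    Sec-WebSocket-Key, as described in RFC 6455."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# Sample handshake pair from RFC 6455, section 1.3:
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

This is how the client verifies that the server actually understood the websocket upgrade rather than echoing headers blindly.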

HAProxy and Websockets


As seen above, there are 2 protocols embedded in websockets:

  1. HTTP: for the websocket setup
  2. TCP: websocket data exchange

HAProxy must be able to support websockets on these two protocols without breaking the TCP connection at any time.
There are 2 things to take care of:

  1. being able to switch a connection from HTTP to TCP without breaking it
  2. smartly manage timeouts for both protocols at the same time

Fortunately, HAProxy embeds all you need to load-balance properly websockets and can meet the 2 requirements above.
It can even route regular HTTP traffic and websocket traffic to different backends and perform websocket-aware health checks (setup phase only).

The diagram below shows how things happen and the HAProxy timeouts involved in each phase:
diagram websocket

During the setup phase, HAProxy can work in HTTP mode, processing layer 7 information. It automatically detects the Connection: Upgrade exchange and is ready to switch to tunnel mode if the upgrade negotiation succeeds. During this phase, there are 3 timeouts involved:

  1. timeout client: client inactivity
  2. timeout connect: allowed TCP connection establishment time
  3. timeout server: allowed time to the server to process the request

If everything goes well, the websocket is established, then HAProxy switches to tunnel mode and no data is analyzed anymore (and anyway, websockets do not speak HTTP). There is a single timeout involved:

  1. timeout tunnel: takes precedence over the client and server timeouts

timeout connect is not used since the TCP connection is already established :)

Testing websocket with node.js

node.js is a platform which can host applications. It provides a websocket module we’ll use in the test below.

Here is the procedure to install node.js and the websocket module on Debian Squeeze.
Example code is issued from https://github.com/Worlize/WebSocket-Node, at the bottom of the page.

So basically, I’ll have 2 servers, each one hosting web pages on Apache and a websocket echo application hosted by node.js. High-availability and routing are managed by HAProxy.

Configuration


Simple configuration


In this configuration, the websocket and the web server are hosted by the same application.
HAProxy automatically switches from HTTP to tunnel mode when the client requests a websocket.

defaults
  mode http
  log global
  option httplog
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  backlog 10000
  timeout client          25s
  timeout connect          5s
  timeout server          25s
# timeout tunnel available in ALOHA 5.5 or HAProxy 1.5-dev10 and higher
  timeout tunnel        3600s
  timeout http-keep-alive  1s
  timeout http-request    15s
  timeout queue           30s
  timeout tarpit          60s
  default-server inter 3s rise 2 fall 3
  option forwardfor

frontend ft_web
  bind 192.168.10.3:80 name http
  maxconn 10000
  default_backend bk_web

backend bk_web                      
  balance roundrobin
  server websrv1 192.168.10.11:8080 maxconn 10000 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:8080 maxconn 10000 weight 10 cookie websrv2 check

Advanced Configuration


The configuration below allows routing requests based on either the Host header (if you have a dedicated host for your websocket calls) or the Connection and Upgrade headers (required to switch to websocket).
In the backend dedicated to websockets, HAProxy validates the setup phase and also ensures the user is requesting a valid application name.
HAProxy also performs a websocket health check, sending a Connection upgrade request and expecting a 101 response status code. We can’t go further on the health check for now.
Optional: the web server is hosted on Apache, but it could be hosted by node.js as well.

defaults
  mode http
  log global
  option httplog
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  backlog 10000
  timeout client          25s
  timeout connect          5s
  timeout server          25s
# timeout tunnel available in ALOHA 5.5 or HAProxy 1.5-dev10 and higher
  timeout tunnel        3600s
  timeout http-keep-alive  1s
  timeout http-request    15s
  timeout queue           30s
  timeout tarpit          60s
  default-server inter 3s rise 2 fall 3
  option forwardfor



frontend ft_web
  bind 192.168.10.3:80 name http
  maxconn 60000

## routing based on Host header
  acl host_ws hdr_beg(Host) -i ws.
  use_backend bk_ws if host_ws

## routing based on websocket protocol header
  acl hdr_connection_upgrade hdr(Connection)  -i upgrade
  acl hdr_upgrade_websocket  hdr(Upgrade)     -i websocket

  use_backend bk_ws if hdr_connection_upgrade hdr_upgrade_websocket
  default_backend bk_web



backend bk_web                                                   
  balance roundrobin                                             
  option httpchk HEAD /                                          
  server websrv1 192.168.10.11:80 maxconn 100 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:80 maxconn 100 weight 10 cookie websrv2 check



backend bk_ws                                                    
  balance roundrobin

## websocket protocol validation
  acl hdr_connection_upgrade hdr(Connection)                 -i upgrade
  acl hdr_upgrade_websocket  hdr(Upgrade)                    -i websocket
  acl hdr_websocket_key      hdr_cnt(Sec-WebSocket-Key)      eq 1
  acl hdr_websocket_version  hdr_cnt(Sec-WebSocket-Version)  eq 1
  acl hdr_host               hdr_cnt(Host)              eq 1
  http-request deny if !hdr_connection_upgrade or !hdr_upgrade_websocket or !hdr_websocket_key or !hdr_websocket_version or !hdr_host

## ensure our application protocol name is valid 
## (don't forget to update the list each time you publish new applications)
  acl ws_valid_protocol hdr(Sec-WebSocket-Protocol) echo-protocol
  http-request deny if ! ws_valid_protocol

## websocket health checking
  option httpchk GET / HTTP/1.1\r\nHost:\ ws.domain.com\r\nConnection:\ Upgrade\r\nUpgrade:\ websocket\r\nSec-WebSocket-Key:\ haproxy\r\nSec-WebSocket-Version:\ 13\r\nSec-WebSocket-Protocol:\ echo-protocol
  http-check expect status 101

  server websrv1 192.168.10.11:8080 maxconn 30000 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:8080 maxconn 30000 weight 10 cookie websrv2 check

Note that HAProxy could also be used to select different websocket applications based on the Sec-WebSocket-Protocol header of the setup phase (I’ll write an article about that later).


Posted in Aloha, HAProxy, optimization