Global DNS Load Balancing, for FREE!

리눅스 엔지니어였던 (formerly a Linux engineer), 2009. 12. 8. 09:48

Published on Sat 30 Jun 2007 08:06 (2 years, 5 months ago)

Last year (yes, more than 1.5 years ago), the company I worked for faced a serious requirement: splitting web traffic across servers in different data centers. When we approached network solution providers and CDN (content delivery network) service providers, the quotations we got were unbelievably and unacceptably high. So I spent some time researching the issue, and we finally arrived at a good-enough solution built entirely on open source software and data (Linux + PowerDNS + geo backend + geo IP data we collected). It does exactly the same job as the solutions that cost many thousands of dollars per month.

What's Global DNS Load Balancing?


Global load balancing, aka geographic load balancing, or GSLB for short, is a DNS technique that answers a query with different records depending on the geographic location of the requesting IP address. For example, when a user in the US visits www.google.com, he/she reaches a web server in a data center in the United States; at the same time, a user in China who types exactly the same URL, "www.google.com", is served from a server in Google's China data center.

[Diagram: GSLB]

Diagram credit: http://www.oes.co.th

It's a very important and valuable technology for big web sites: Google, Yahoo, MSN... almost all multinational web sites use it.

How?


A special DNS server, or a module attached to a DNS server, returns different answers to different requests based on the geo-location of the requester's IP address (the requester is generally another DNS server, namely the DNS server of your ISP):

www.yourdomain.com --- [CNAME record] ---> geo.yourdomain.com --- [GSLB handling, CNAME] ---> us.geo.yourdomain.com --- [A record] ---> 68.178.110.21

Of course, you can simplify the chain to: www.yourdomain.com -- [GSLB, returns A record] ---> IP address. However, most sites use a few CNAME records to make the configuration more flexible and easier to manage:


  • First CNAME

    • You may have many different top-level names, e.g. photo.yoursite.com, blog.yoursite.com, etc. Pointing them all at one name lets you handle them together in geo.yoursite.com.

  • Next CNAME

    • GSLB generally returns another CNAME record. This is much more convenient to configure, because you don't want the GSLB to know too much about a bunch of IP addresses; it's better to use names such as us.yoursite.com, jp.yoursite.com, etc.

  • IP address

    • Of course, you can configure multiple IPs for the A records to enable DNS round robin, which is a simple form of load balancing across servers.
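The chain above can be sketched in a few lines of Python. This is only a toy model of the idea, not how PowerDNS implements it: the networks in GEO_MAP, the default region, the second US address, and all function names are made up for illustration.

```python
import ipaddress

# Hypothetical resolver networks and the region label each one maps to.
GEO_MAP = {
    ipaddress.ip_network("68.178.0.0/16"): "us",
    ipaddress.ip_network("202.102.0.0/16"): "cn",
}
DEFAULT_REGION = "us"  # assumption: unknown resolvers are sent to the US
GEO_NAME = "geo.yourdomain.com"

# Ordinary (non-geo) records. The second US address is invented to show round robin.
RECORDS = {
    "www.yourdomain.com": ("CNAME", GEO_NAME),
    "us.geo.yourdomain.com": ("A", ["68.178.110.21", "68.178.110.22"]),
    "cn.geo.yourdomain.com": ("A", ["202.102.24.100"]),
}


def region_for(resolver_ip):
    """Pick the region of the most specific network containing the resolver."""
    addr = ipaddress.ip_address(resolver_ip)
    matches = [net for net in GEO_MAP if addr in net]
    if not matches:
        return DEFAULT_REGION
    return GEO_MAP[max(matches, key=lambda net: net.prefixlen)]


def resolve(name, resolver_ip):
    """Follow the CNAME chain and return (chain, list of A record addresses)."""
    chain = [name]
    while True:
        if name == GEO_NAME:
            # The GSLB step: answer with a CNAME that depends on who is asking.
            name = f"{region_for(resolver_ip)}.{GEO_NAME}"
            chain.append(name)
            continue
        rtype, value = RECORDS[name]
        if rtype == "CNAME":
            name = value
            chain.append(name)
        else:
            return chain, value


def rotate(addresses, turn):
    """DNS round robin: hand out the same A records in a rotated order."""
    k = turn % len(addresses)
    return addresses[k:] + addresses[:k]
```

Calling resolve("www.yourdomain.com", "202.102.1.1") walks www -> geo -> cn.geo and ends at the China address, while a resolver that matches no network falls back to the default region.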




In short one important thing is

Here is what Google's setup looks like; you can see it in your browser via this link:


DNS Lookup: www.google.com A record


Generated by www.DNSstuff.com at 07:23:25 GMT on 30 Jun 2007.

How I am searching:

Searching for www.google.com A record at d.root-servers.net [128.8.10.90]: Got referral to G.GTLD-SERVERS.NET. (zone: com.) [took 46 ms]
Searching for www.google.com A record at G.GTLD-SERVERS.NET. [192.42.93.30]: Got referral to ns1.google.com. (zone: google.com.) [took 52 ms]
Searching for www.google.com A record at ns1.google.com. [216.239.32.10]: Got CNAME of www.l.google.com. and referral to g.l.google.com. [took 73 ms]
Searching for www.l.google.com A record at h.root-servers.net [128.63.2.53]: Got referral to b.gtld-servers.net. (zone: com.) [took 44 ms]
Searching for www.l.google.com A record at b.gtld-servers.net. [192.33.14.30]: Got referral to ns1.google.com. (zone: google.com.) [took 162 ms]
Searching for www.l.google.com A record at ns1.google.com. [216.239.32.10]: Got referral to a.l.google.com. (zone: l.google.com.) [took 72 ms]
Searching for www.l.google.com A record at a.l.google.com. [209.85.139.9]: Reports www.l.google.com. [took 71 ms] Response:

Domain             Type  Class  TTL  Answer
www.l.google.com.  A     IN     300  209.85.135.147
www.l.google.com.  A     IN     300  209.85.135.99
www.l.google.com.  A     IN     300  209.85.135.104
www.l.google.com.  A     IN     300  209.85.135.103

NOTE: One or more CNAMEs were encountered. www.google.com is really www.l.google.com.

How much?


It really depends on who you ask! Most network equipment vendors, such as Cisco and F5 Networks, can give you a nearly perfect solution and sell you a bunch of boxes that cost over $10,000, or even over $100,000! If you ask CDN service providers, they may sell you a "Global DNS" service that costs over $1,000 per month.

But there are open source solutions that are almost FREE! Though there are several free and open source solutions (but not too many of them), I think PowerDNS is the best choice. I used PowerDNS and its geo backend to fulfill our requirements.

Poor (smart) man's Global DNS Load Balancing solution, FREE!


There are very specific wiki pages that explain how to implement GSLB with open source software and how to configure it.

PowerDNS is free and comes with full source code. It's really powerful, with many advanced features, and it supports plugins (called 'backends') to extend it. The geo backend is one of the free backends that come with PowerDNS. Unfortunately, very little documentation can be found about the geo backend; fortunately, there is a set of setup notes that is simple but explains almost everything step by step.

You need to set the TTL of the CNAME records that the geo backend returns to a reasonable (generally short) time; in my case I use 5 minutes.

Another important point is how to build the IP -> geo location map. You can use rsync to grab country geo data, or you may need to build your own.

To grab the country data:


rsync -va rsync://countries-ns.mdc.dk/zone .


(updated: check here http://countries.nerd.dk/more.html for the zonefile rsync)
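To give a feel for what that zone file is used for, here is a minimal Python sketch of an IP -> country code lookup. It is not the geo backend's code. It assumes each data line looks like `prefix :127.0.X.Y:cc`, with the ISO numeric country code packed into the last two octets (X * 256 + Y); please check that against the file you actually download. The sample lines and function names are made up for illustration.

```python
import ipaddress

# Invented sample lines in the assumed format; not taken from the real zone file.
SAMPLE_ZONE = """\
# comment lines and non-data lines are skipped
:127.0.0.1:example default line
68.178.0.0/16 :127.0.3.72:us
202.102.0.0/16 :127.0.0.156:cn
"""


def parse_zone(lines):
    """Return a list of (network, ISO numeric country code) pairs."""
    table = []
    for raw in lines:
        line = raw.strip()
        # Data lines are assumed to start with a digit (the IP prefix).
        if not line or not line[0].isdigit():
            continue
        prefix, _, rest = line.partition(" ")
        answer = rest.strip().lstrip(":").split(":")[0]
        # Assumption: the low 16 bits of the 127.0.X.Y answer hold the country code.
        code = int(ipaddress.ip_address(answer)) & 0xFFFF
        table.append((ipaddress.ip_network(prefix), code))
    return table


def country_code(table, ip):
    """Longest-prefix match; returns None when no network contains the address."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, code in table:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, code)
    return best[1] if best else None
```

A linear scan is fine for a sketch like this; with a zone file of real size you would want a prefix tree or a sorted structure instead.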

The PowerDNS config:


# This is the real guts of the data that drives this backend.  This is a DNS
# zone file for RBLDNSD, a nameserver specialised for running large DNS zones
# typical of DNSBLs and such. We choose it for our data because it is easier
# to parse than the BIND-format one.
#
# Anyway, it comes from http://countries.nerd.dk/more.html - there are details
# there for how to rsync your own copy. You'll want to do that regularly,
# every couple of days maybe. We believe the nerd.dk guys take the netblock
# info from Regional Internet Registries (RIRs) like RIPE, ARIN, APNIC. From
# that they build a big zonefile of IP/prefixlen -> ISO-country-code mappings.
geo-ip-map-zonefile=/usr/local/etc/zz.countries.nerd.dk.rbldnsd


Map the ISO numeric country codes to your own location names:
# Andorra
20 eu
# United Arab Emirates
784 uk
# Afghanistan
4 uk
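As a rough illustration of how such a map is consumed, the Python sketch below reads lines in the shape shown above ("#" comments, then "code name") and looks up a location name for a country code. The function names and the fallback value are my own assumptions, not part of PowerDNS.

```python
SAMPLE_MAP = """\
# Andorra
20 eu
# United Arab Emirates
784 uk
# Afghanistan
4 uk
"""


def load_region_map(text):
    """Parse 'numeric-code name' lines, ignoring blank lines and # comments."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        code, name = line.split()
        mapping[int(code)] = name
    return mapping


def name_for(mapping, code, default="us"):
    """Return the location name for a code; 'us' as the fallback is an assumption."""
    return mapping.get(code, default)
```

The name that comes out ('eu', 'uk', ...) is what gets combined with the geo zone to form the CNAME answer, so every name used in the map needs a matching record in your DNS.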

You don't have to replace your original DNS server with PowerDNS. You can let PowerDNS handle just the geo-location part and keep all other DNS records in your favorite DNS server.

Here is a sample configuration that implements: www.yoursite.com --> geo.yoursite.com --> us.yoursite.com --> IP address.

Inside your original DNS configuration:

 www CNAME geo

pdns A 192.168.1.1 ; your server installed PowerDNS
geo NS pdns ; use PowerDNS to handle geo


Also inside your original DNS server configuration, add A records for the servers located in different places:
 us A 68.178.100.12 ; server IP address in the United States, 'us' defined in your country code to name mapping.

cn A 202.102.24.100 ;server IP address in China...

...


It seems that not many people talk about this or related matters. (I guess that's because most web sites don't need GSLB; those who are experienced with GSLB, such as Google and Yahoo, find it so simple that they don't want to waste time explaining such an "easy thing"; and some treat GSLB as technical know-how to make money from, so they are not willing to talk much about it either.)

Wikipedia uses PowerDNS with the geo backend to do global load balancing. Here is what Wikipedia says about PowerDNS:

As of early 2005, PowerDNS in combination with the bind and geo backends is used by Wikimedia to handle all DNS traffic. By using the geobackend, incoming clients can be redirected to the nearest Wikipedia server (based on their geographic location). This facility provides an effective way of load balancing and it reduces response times for the clients

Global DNS Load Balancing Limitations (on HA)


Global load balancing is not a good solution for HA (high availability). There are many reasons: DNS refresh time, web browser DNS caching, the time lag in detecting that a server is down, etc. Here is an article that explains the limitations clearly and in an easy-to-understand way: "Why DNS Based Global Server Load Balancing (GSLB) Doesn't Work". Don't let its title fool you; I think "doesn't work" means that HA doesn't work. GSLB is still very useful for running a huge site across the world.
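To put a rough number on those delays, here is a back-of-the-envelope calculation in Python. Only the 5-minute TTL comes from my setup; the detection lag and the browser cache time are hypothetical figures, and simply adding the three delays is a pessimistic estimate, not a measurement.

```python
def worst_case_failover_seconds(detect_lag, record_ttl, client_cache):
    """Rough upper bound on how long some users may still hit a dead server.

    The failure is noticed after detect_lag; a resolver that cached the old
    answer just before the DNS change keeps it for up to record_ttl; a browser
    that looked it up just before that expiry keeps it for up to client_cache.
    """
    return detect_lag + record_ttl + client_cache


DETECT_LAG = 60      # hypothetical: health check notices the failure within 1 minute
RECORD_TTL = 300     # the 5-minute TTL mentioned earlier
CLIENT_CACHE = 600   # hypothetical: the browser caches DNS answers for 10 minutes

total = worst_case_failover_seconds(DETECT_LAG, RECORD_TTL, CLIENT_CACHE)
print(f"worst case: about {total / 60:.0f} minutes")
```

Even with a short TTL, some users can keep reaching the dead server for many minutes, which is why DNS alone is a poor HA mechanism.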

Useful links:

[1] Thoughts on Global Server Load Balancing: Dave Walker from Sun talks about GSLB in general.

[2] DNS Balancing: a very specific explanation and configuration of free GSLB.

[3] GeoDNS: a simple but clear setup of PowerDNS + geobackend.

