Generic TCP Balancer for Erlang
mdaguete/tcpbalance
TCPBalance, a load-balancing TCP proxy for distcc
=================================================

There are dozens of Open Source TCP proxies available, written in
close to a dozen languages, many of them capable of load balancing.
Many of them would work with "distcc". Why write yet another TCP
proxy? Why do it in Erlang?

Of all the TCP proxies I found, none appeared to have the following
combination of features:

    1. Not be so HTTP-centric that it cannot work with "distcc".

    2. Be aware that some back-end hosts may be faster than other
       hosts. For each client connection, the proxy should choose
       the fastest back-end host that is currently idle.

    3. Be aware of back-end hosts with multiple CPUs.

    4. When all back-end hosts are busy, make the client wait for
       the next back-end host to become available, rather than
       giving a back-end host more work than it is configured to
       handle.

    5. Detect when a back-end host is down and do something sane
       (like avoid giving future jobs to the dead machine).

    6. Permit an administrator to put back-end hosts back in
       service, take them out of service, and add or remove hosts
       from the pool, all without adversely affecting clients using
       the proxy.

    7. Keep basic statistics about back-end hosts and make them
       available via HTTP or Telnet.

Features 1-5 were mandatory. Features 6-7 would be nice. In a
couple of hours of Web surfing, I didn't find a TCP proxy that was
capable of doing 1-5, so I decided to write my own. I knew it would
be fairly easy to implement features 6-7 as well as 1-5 in Erlang
(see http://www.erlang.org/), so that's what I used.

This proxy has been in use at Caspian Networks for over two months.
It's pretty solid.

This README file is quite long. Sorry about that. However, much of
it is a tutorial for Erlang newbies ... and perhaps a bit of evangelism.
:-) I'll try to keep things straightforward, but I will also
demonstrate some of the nifty communication, fault-tolerance, and
hot code upgrade features of Erlang.

License
-------

See the file "LICENSE", at the top of the tcpbalance source
distribution, for full licensing terms.

Obtaining, Compiling, and Installing Erlang
-------------------------------------------

Oh boy, yet another programming language development environment to
compile and install. Well, if you're still reading this, you're
interested enough in tcpbalance's feature set to try it out.

The current release of Erlang (as of 15 January 2003) is R9B-0. All
of the source and pre-compiled packages mentioned below can be
obtained from http://www.erlang.org/download/.

* UNIX and Linux users: Erlang runs quite well under: Linux, BSD
  flavors, Solaris, and others. Although it's slow, I recommend
  obtaining the source and compiling from scratch. The distribution
  is pretty big (8MB compressed) and contains a kajillion files ...
  but the Erlang programming & runtime environment contains many
  useful tools (including a nearly-complete CORBA implementation)
  that are themselves written in Erlang. Tcpbalance doesn't use
  most of them, but you get to compile all of them. :-)

  Follow the included directions, but the simple instructions are:

    1. Extract the source package and change directory to its root.

    2. Run "./configure" to use the default installation root
       ("/usr/local" on most platforms) or
       "./configure --prefix=/path/to/install/root" if you wish to
       use another installation root path.

    3. Run "make".

    4. Go do something else for a while. The Erlang VM is
       implemented in C, which compiles in a few minutes, but most
       of the run-time environment & misc tools are implemented in
       Erlang itself, and there's a *lot* of it.

    5. Run "make install".

    6. Add /path/to/install/root/bin, whatever it is, to your
       shell's program path.

* MacOS X users: I haven't used it, but there's a pre-compiled,
  disk image-style installation thingie available. Give it a try if
  you wish.

* Microsoft Windows users: Your only option is installing a
  pre-compiled package. Whoo hoo! When I installed Erlang R9B-0 on
  a Windows NT 4.0 machine, I discovered that
  "C:\Program Files\erl5.2\bin" was already added to my program
  path. What a deal.

Testing Erlang Inter-Node Communication
---------------------------------------

One of the really nice things about Erlang is that communication
(via message passing) between threads inside an Erlang virtual
machine is exactly the same as message passing between threads on
different Erlang virtual machines.

An instance of the Erlang virtual machine is called a "node".
Erlang threads are called "processes", which can really confuse UNIX
geeks if you're not careful. In this document, if I want to refer
to a UNIX or NT process/task, I'll call it an "operating system
process" or "OS process" to avoid confusion.

Erlang's inter-node message passing relies on a simple shared secret
mechanism similar to X11's "MIT magic cookie" authentication scheme.
Erlang stores a "cookie" in $HOME/.erlang.cookie (where $HOME is
your account's home directory) on UNIX boxes and in
C:\.erlang.cookie on NT boxes. To communicate with each other, all
Erlang nodes must share the exact same cookie. (There are other
ways to configure cookies, but I'm only going to describe one.)

For nodes running on the same machine, there's no problem: everybody
is sharing the same file system(s). For nodes running on different
UNIX machines, you'll need to do one of the following:

    1. Use NFS or another shared file system for your $HOME
       directory.

    2. Copy the .erlang.cookie file by hand to the $HOME directory
       of each machine on which you wish to run an Erlang node.

For nodes running on different NT machines, you'll need to copy the
.erlang.cookie file to each machine.
If you're mixing NT and UNIX machines (which is certainly possible),
make certain the exact same .erlang.cookie file is used on all of
them.

To make life easier on yourself, make certain that all of the
machines involved are present in DNS. E.g. if you're playing with
"davinci" and "munch", make certain that "ping davinci" and
"ping munch" work on both machines.

If a machine has a hyphen, "-", in its DNS name ... choose another
machine. Erlang's syntax requires treating hyphens specially. This
example is complicated enough as it is. (See footnote [1] below.)

An Erlang node name looks like:

    foo@hostname

or:

    foo@hostname.fully.qualified.domain

For simplicity, we'll use the former. Each node must have a unique
node name. For nodes running on the same machine, the lefthand side
of the "@" must, therefore, be unique. If you attempt to start two
nodes on the same machine with the same name, the second VM will
spit out a very long and cryptic error message (including the string
"Kernel pid terminated").

For my example, I'm going to use the nodes 'foo@davinci' and
'bar@rover'.

NOTE: Erlang commands are case-sensitive!

On each machine, start the Erlang VM and interactive shell:

    1. UNIX: run "erl -sname NODE_LHS". E.g. "erl -sname foo"

    2. NT: run "werl -sname NODE_LHS". E.g. "werl -sname foo"

    3. Mac OS X: I've never used it, so you're on your own.

You will see something like:

    Erlang (BEAM) emulator version 5.2 [source] [hipe] [threads:0]

    Eshell V5.2 (abort with ^G)
    (foo@davinci)1>

Note that the node name is included in the prompt string. Type the
command "erlang:get_cookie()." and press Enter. You should see:

    (foo@davinci)1> erlang:get_cookie().
    'JQHZIQLDNQGUSZRAJXHB'

You should see the same cookie on each node. If not, go back and
fix it.

Now, we'll test the inter-node message passing capability. On one
of your nodes, type the following:

    (foo@davinci)2> register(test, self()).
    true
    (foo@davinci)3> receive Msg -> Msg end.

You won't see a prompt right away: you're blocked waiting for a
message to arrive. On the other machine, type the following,
substituting the other machine's node name:

    (bar@rover)1> {test, 'foo@davinci'} ! hello_world.
    hello_world

On the first machine, you should see:

    (foo@davinci)3> receive Msg -> Msg end.
    hello_world
    (foo@davinci)4>

If you run the function "nodes()", you should see the names of all
other nodes that your node is aware of.

    (bar@rover)3> nodes().
    [foo@davinci]
    (bar@rover)4>

Congratulations! It's now time to play with the tcpbalance
application. Use the command "q()." to exit the shell, or press
Control-c.

Modifying the application and Web server config to fit your environment
-----------------------------------------------------------------------

I haven't spent any time trying to auto-magically edit the files
that will need editing before you can use tcpbalance. I've made
life "easier" by using relative, not absolute paths. Therefore, no
extra configuration or file editing should be necessary.

This means that you *must* change the current working directory
exactly as described below (or else things won't work). This is not
how a "real" Erlang application would be installed & run, but I
haven't taken the time to do that. Sorry.

If you want the built-in HTTP server to use a port other than port
8080, edit the file "priv/inets.conf" and modify the "Port"
directive.

Compiling the application
-------------------------

The file "src/Makefile" requires GNU Make. To use it:

    % cd src
    % make

If you do not have GNU Make (for example, on NT), execute the
commands found in the file "Make.all.out".

The file src/balance.rel contains version numbers that are specific
to a particular Erlang/OTP release version. If you see error
messages like this:

    stdlib: No valid version ("1.11.0") of .app file found.
    Found file "/usr/local/lib/erlang/lib/stdlib-1.11.4.1/ebin/stdlib.app"
    with version "1.11.4.1"
    kernel: No valid version ("2.8.0") of .app file found.
    Found file "/usr/local/lib/erlang/lib/kernel-2.8.1.1/ebin/kernel.app"
    with version "2.8.1.1"
    gmake: *** [balance.boot] Error 1

... then you have a different version of Erlang/OTP than I
originally used for tcpbalance. Edit the file src/balance.rel to
replace the invalid version numbers with the version numbers
mentioned at the end of each error message.

Running the application with the example configuration file
-----------------------------------------------------------

Run the following:

    % cd src            ... if you haven't already
    % erl -sname bal -pz ../ebin -boot balance -config ../priv/be-list -noshell

You will see a whole bunch of diagnostic messages, labelled
"PROGRESS REPORT". The last one will say something like:

    =PROGRESS REPORT==== 16-Jan-2003::13:59:45 ===
             application: balance
              started_at: bal@davinci

Congratulations. This means that the application is running
successfully. If this isn't what you see, and you're certain that
you've followed all the directions, cut-and-paste the output and
email it to me. (See footnote [2] below.)

The example configuration file, "../priv/be-list.config", configures
a proxy for two SMTP servers, mx1.mail.yahoo.com and
mx1.hotmail.com. The proxy is listening to local port 2525.

    % telnet davinci 2525
    Trying 10.10.10.10...
    Connected to localhost.localdomain.
    Escape character is '^]'.
    220 YSmtp mta614.mail.yahoo.com ESMTP service ready
    quit
    221 mta614.mail.yahoo.com
    Connection closed by foreign host.

See the "be-list.config" file for full details of the configuration.
To summarize:

    1. The proxy's local TCP port is 2525.

    2. The back-end connection timeout is 10 seconds.

    3. The back-end connection activity timeout is 2 minutes.

    4. The back-end host list:

       a. mx1.mail.yahoo.com, TCP port 25, 2 simultaneous sessions.
       b. bogus-demo, TCP port 25, 1 simultaneous session.
       c. mx1.hotmail.com, TCP port 25, 1 simultaneous session.

Use a Web browser to connect to the Web server running on TCP port
8080 on the machine running the balancer, e.g. http://davinci:8080/
and follow the link there. You'll see something like (edited to fit
in 80 columns):

    Proxy start time: 2003/1/16 13:59:44
    Current time: 2003/1/16 15:39:21
    Local TCP port number: 2525
    Connection timeout (seconds): 10.0000
    Activity timeout (seconds): 120.000
    Length of wait list: 0

    Name                Port  Status  MaxConn  ActConn  ActiveCount  ActiveTime
    mx1.mail.yahoo.com    25  up            2        0            1     2.94391
    bogus-demo            25  up            2        0            0           0
    mx1.hotmail.com       25  up            1        0            0           0

Now, do the following:

    1. Open four windows: xterm, terminal, Telnet application, or
       whatever.

    2. Use those windows to create four simultaneous TCP connections
       to the proxy. E.g. "telnet davinci 2525".

    3. The first two clients should see greetings from a Yahoo mail
       exchanger.

    4. The third client should see a greeting from a HotMail mail
       exchanger.

    5. The fourth client should connect but otherwise see nothing.

    6. Type "QUIT" in the second client to terminate the session.

    7. The fourth client should then be connected to an available
       back-end host, namely a Yahoo server.

    8. In the second window, connect to the proxy again.

    9. Retrieve (or reload) the balancer's stats via its HTTP
       server. You should see that all three back-end sessions are
       busy, that there's one client in the "wait list", and that
       the status of the "bogus-demo" server has been changed to
       "down".

Changing the application's configuration on-the-fly
---------------------------------------------------

I haven't extended the balancer's HTTP server to be able to change
the balancer's config on-the-fly, but it's easy enough to do using
Erlang's native message passing mechanism. It's clunkier than
"click here to change BE's status to 'down'", but hey, this was an
afternoon's hack!
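
To make "native message passing" a little more concrete, here is a
minimal, self-contained sketch of a registered process that answers
requests addressed as {Name, Node}. It is not tcpbalance code: the
module name cfg_demo, its two made-up hosts, and its message format
are all assumptions for illustration, and bal_proxy's real
implementation may look quite different.

```erlang
-module(cfg_demo).
-export([start/0, get_state/1, reset_host/3]).

%% Start a process holding a hypothetical host list and register it
%% under the name cfg_demo. Returns the {Name, Node} address that
%% clients (local or on another node) can send requests to.
start() ->
    Pid = spawn(fun() -> loop([{"host-a", up}, {"host-b", up}]) end),
    register(cfg_demo, Pid),
    {cfg_demo, node()}.

get_state(Server) ->
    call(Server, get_state).

reset_host(Server, Host, Status) ->
    call(Server, {reset_host, Host, Status}).

%% Send a request tagged with a unique reference, then wait for the
%% reply carrying the same reference.
call(Server, Request) ->
    Ref = make_ref(),
    Server ! {self(), Ref, Request},
    receive
        {Ref, Reply} -> Reply
    after 5000 ->
        {error, timeout}
    end.

%% The server loop: the host list is the only state.
loop(Hosts) ->
    receive
        {From, Ref, get_state} ->
            From ! {Ref, Hosts},
            loop(Hosts);
        {From, Ref, {reset_host, Host, Status}} ->
            NewHosts = lists:keyreplace(Host, 1, Hosts, {Host, Status}),
            From ! {Ref, ok},
            loop(NewHosts)
    end.
```

From a shell, "S = cfg_demo:start()." followed by
"cfg_demo:reset_host(S, "host-a", down)." and "cfg_demo:get_state(S)."
exercises it. Because the address is a {Name, Node} pair, the same
calls work from a different node as long as the cookies match.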

Run "cd src" (if you aren't already there) and
"erl -sname foo -pz ../ebin" on any machine where you've verified
that the Erlang cookies are correct & message passing works, then
run the following Erlang shell commands (changing the node name as
appropriate):

    1. bal_proxy:get_state({balance, 'bal@davinci'}).

    2. bal_proxy:reset_host({balance, 'bal@davinci'}, "mx1.mail.yahoo.com", down).

    3. bal_proxy:get_state({balance, 'bal@davinci'}).

The first and third commands return raw state data maintained by the
balancing process: the HTTP server simply pretty-prints this data.
The second command sets the state of the "mx1.mail.yahoo.com"
back-end host to 'down'. ('up' is the other valid state.)

Other commands to experiment with are (you don't have to type them
on a single line, but you should, unless you're familiar with Erlang
syntax):

    bal_proxy:get_host({balance, 'bal@davinci'}, "mx1.hotmail.com").

    bal_proxy:reset_all({balance, 'bal@davinci'}).

    bal_proxy:del_be({balance, 'bal@davinci'}, "bogus-demo").

    bal_proxy:add_be({balance, 'bal@davinci'}, {be,"mx-ca-1.pobox.com",25,up,1,0,0,no_error,0,0,0,[]}, "").

    bal_proxy:add_be({balance, 'bal@davinci'}, {be,"smtp.TheWorld.com",25,up,1,0,0,no_error,0,0,0,[]}, "mx1.hotmail.com").

Use the bal_proxy:get_state() function to see how these functions
affect the state of the balancer.

NOTE: You probably shouldn't delete a back-end host unless you've
      marked its status as 'down' first ... and then waited for all
      active sessions to finish. :-)

NOTE: The proxy has a bug (one of several, see comments in
      src/bal_proxy.erl) that happens if:

        1. All back-end hosts are status 'down'.
        2. A proxy client connects.
        3. A back-end host is marked status 'up'.

      Work-around: Don't allow this to happen.

Fault tolerance demonstrated by fault injection
-----------------------------------------------

One of the many applications distributed with Erlang is called
"appmon", the application monitor.
To start it, run "erl -sname something" on any machine where you've
verified that the Erlang cookies are correct & message passing
works, then run:

    appmon:start().

A GUI box should pop up that displays a tree of the applications
running on the local node. In your case, "kernel" is probably the
only application running.

If you haven't already done so, run this command in the Erlang shell
(using the balancer's node name, of course):

    bal_proxy:get_state({balance, 'bal@davinci'}).

Then pull down the "Nodes" pull-down menu. Both the local node and
'bal@davinci' should be listed. Select the balancer's node. You
should then see a tree of three applications running on the
balancer: kernel, sasl, and balance.

Click on the "balance" box. Another window should appear. This
window displays the tree of processes (Erlang threads, remember!),
including "supervisor" processes, used by the balancer application.

* A top-level supervisor, named 'balance_sup'.

* The 5 processes used by the "inets" HTTP server.

* A variable number of processes used by the TCP balancer portion of
  the application: the socket listener process will always appear
  under the 'balance' process, as will the transient per-TCP-session
  processes.

Now, create a TCP connection to the balancer's port. The tree will
be updated to show a second process underneath 'balance'.

A basic programming philosophy behind Erlang is "code only for the
common case". If there's an error, e.g. divide by zero or an
uncaught exception, the default action is to kill the process. You
rely on supervisor processes to restart abnormally-terminated
processes. Very nice.

You can use "appmon" to send kill signals to any process it
displays, thus simulating a bug/software failure. Do the following:

    1. Make a connection to the proxy's local port.

    2. Note which process appears under 'balance' when the appmon
       window is updated. This is the process used to copy data
       back and forth between your client and the back-end host.

    3. Click on the "Kill" button.

    4. Click on the process box representing the process found in
       step number 2.

This will terminate your client's TCP connection, as well as the
proxy's connection to the back-end host.

You can have more fun with this fault injection by:

* Killing the socket accept()ing process underneath 'balance'.

* Killing 'balance' or any of the HTTP server's processes.

If you do any of those things, the 'balance_sup' supervisor will
kill all remaining application processes and restart them. It may
happen so quickly that the appmon window doesn't appear to change.
However, look carefully at the process ID numbers of the socket
listener underneath 'balance' and the process underneath
'httpd_acc_sup_8080': both will change, indicating that they aren't
the same processes that used to be there. That's the supervisor in
action.

If you kill the 'balance_sup' process or any of its parents, the
proxy will crash and exit. That's because 'balance_sup' doesn't
have a parent supervisor to restart it. However, the *only* things
that supervisors do are:

    1. Start child processes.

    2. Monitor those children.

    3. Restart any of those children, if they should be restarted.

    4. Exit if there are too many child failures within a configured
       amount of time.

The supervisor's code is assumed to be bug-free. It probably is.
:-)

Typical Erlang application design uses a tree of supervisors. If
there's a low level problem severe enough that the immediate
supervisor cannot deal with it, that supervisor will exit ... in
effect, passing the problem up the supervisor chain. In the worst
case, the top-level supervisor can kill everything and restart from
scratch. This kind of deterministic application startup and fault
handling is quite nice.

The 'balance_sup' supervisor is configured to tolerate up to 5 child
deaths within a 30 second time period. That probably isn't
realistic for real-world use, but it's fun for demonstration
purposes.
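
The restart behavior described above maps onto a few lines of OTP
supervisor configuration. The following is only a sketch, not the
source of tcpbalance's own supervisor: the module name demo_sup, the
child names demo_listener and demo_http, and the do-nothing worker
are all hypothetical. It assumes a "one_for_all" restart strategy
(one child dies, every child is restarted) with a limit of 5
restarts in 30 seconds.

```erlang
-module(demo_sup).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/1]).

start_link() ->
    supervisor:start_link({local, demo_sup}, ?MODULE, []).

init([]) ->
    %% one_for_all: if any child dies, terminate the remaining
    %% children and restart all of them.
    %% 5, 30: if more than 5 restarts happen within 30 seconds, the
    %% supervisor gives up and exits itself.
    SupFlags = {one_for_all, 5, 30},
    Children = [child(demo_listener), child(demo_http)],
    {ok, {SupFlags, Children}}.

%% Child spec: {Id, StartMFA, Restart, Shutdown, Type, Modules}.
child(Name) ->
    {Name, {?MODULE, start_worker, [Name]},
     permanent, 2000, worker, [?MODULE]}.

%% Hypothetical worker: a registered process that just waits.
%% It runs inside the supervisor, so spawn_link links it there.
start_worker(Name) ->
    Pid = spawn_link(fun loop/0),
    register(Name, Pid),
    {ok, Pid}.

loop() ->
    receive
        stop -> ok
    end.
```

To watch it work, run "demo_sup:start_link()." in a shell, note the
result of "whereis(demo_http).", run
"exit(whereis(demo_listener), kill).", and then look at
"whereis(demo_http)." again: the process ID changes even though that
child was not the one killed.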

For more detail on the process supervisor scheme of OTP, the Open
Telecom Platform, see http://www.erlang.org/doc/r9b/doc/system.html,
in particular the "Design Principles" document.

Hot code update
---------------

Like several other functional programming languages, Erlang permits
on-the-fly, hot code update. The Erlang VM supports the notion of
two simultaneously loaded versions of any particular module, old and
current.

It's way beyond the scope of this README to describe how this works
or how the Erlang OTP release handler can upgrade (or downgrade!)
the running code (and associated data structures) of one or more
applications. It's a nifty, if complex, feature.

This example will be much more basic: we'll add a line of output to
the HTTP server's status overview. We would like the summary at the
top to include the balancer's Erlang node name. That's easy enough
to do. Follow these steps:

    1. Edit the file "src/bal_proxy.erl" with your favorite text
       editor.

    2. Locate the word "README", about 90% of the way from the top.
       It appears in a comment; you will insert code on the line
       immediately after this comment. (The "%" character denotes
       the start of an Erlang comment; comments continue to the end
       of the line.)

    3. Add the following line after the comment you found in step
       #2:

           io_lib:format("Proxy's Erlang node name: ~w\n", [node()]),

    4. Change working directory to "src", if you haven't already.

    5. Run "make". If you don't have GNU Make, run
       "erlc -bbeam +debug_info -o../ebin bal_proxy.erl" instead.

Now that you've made the code change and recompiled it, we just need
to tell the balancer to load the new code. First, just to make it
really obvious what's going on, use your Web browser to retrieve the
balancer's current stats.

The balancer node was started using the "-noshell" command line
flag, so there is no Erlang shell available for us to modify the
balancer's internals. So, we'll start a second node, then use that
node's shell to create a shell session on the balancer's node and
communicate with the balancer through it.

Run "cd src" (if you aren't already there) followed by
"erl -sname foo" on any machine where you've verified that the
Erlang cookies are correct & message passing works, then type the
following things (changing the node name as appropriate):

    1. Control-g
    2. r bal@davinci ENTER
    3. j ENTER
    4. c 3 ENTER

You should see something like this:

    % erl -sname foo
    Erlang (BEAM) emulator version 5.2 [source] [hipe] [threads:0]

    Eshell V5.2 (abort with ^G)
    (foo@rover)1>
    User switch command
     --> r bal@davinci
     --> j
       1 {}
       2 {shell,start,[]}
       3* {bal@davinci,shell,start,[]}
     --> c 3
    Eshell V5.2 (abort with ^G)
    (bal@davinci)1>

Any command you type in this command shell will be executed on the
'bal@davinci' node, *not* the local one. Pretty slick, huh?

Type the command "l(bal_proxy)." into this shell, and you'll see:

    (bal@davinci)1> l(bal_proxy).
    {module,bal_proxy}
    (bal@davinci)2>

Now, tell your Web browser to reload the stats page. Notice that
your new code has indeed been executed!

Using tcpbalance with distcc
----------------------------

See the file "priv/sample-distcc.config" for an example config for 6
back-end machines with different numbers of CPUs and different CPU
speeds.

Questions, bugs, etc.
---------------------

If you have questions, bug reports, etc., please email them to me.
My email address is in footnote [2] below. Tcpbalance isn't meant
to be a 100% bulletproof, full-featured distcc application proxy ...
but that doesn't mean that I'm not willing to help out or perhaps
fix bugs.

Martin Pool, distcc's maintainer, suggested that this Erlang proxy
could be a model for a "real" bulletproof, full-featured distcc
application proxy that someone might write someday. I think that is
a *great* idea! In the meantime, I'll continue using this proxy....

-Scott Lystig Fritchie

Footnotes
---------

[1] If you really want to use a machine with a hyphen in its DNS
    hostname, you can do it. You just need to put single quotes
    around the node name whenever you use it.
    For example, if you run "erl -sname foo" on a machine called
    "nt-regal", then whenever this document asks you to type a node
    name into the Erlang shell, you must type:

        'foo@nt-regal'

    ... instead of:

        foo@nt-regal

    The latter is incorrect Erlang syntax.

    Technically, a node name is treated as an Erlang atom, a
    primitive data type, much like an atom in Lisp or Scheme. An
    atom typically starts with a lowercase letter and may contain
    alphanumeric characters, underscores ("_"), and "@". However,
    if you want an atom to contain other characters or to start
    with an upper-case letter, it can be surrounded by single
    quotes. For example, 'This_is_an_atom' is valid syntax for an
    atom.

[2] My Internet email address: the lefthand side of the "@" is
    "slf". The righthand side is "caspiannetworks.com".
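
The quoting rule from footnote [1] can be checked with a few lines
of Erlang. This is only an illustrative sketch; the module name
atom_demo is made up, and the node names are the ones used as
examples in this document.

```erlang
-module(atom_demo).
-export([run/0]).

run() ->
    %% No quotes needed: lowercase start, then letters and "@".
    Plain = foo@davinci,
    %% The hyphen is not allowed in an unquoted atom, so quote it.
    Quoted = 'foo@nt-regal',
    true = is_atom(Plain),
    true = is_atom(Quoted),
    %% The quotes are only syntax; they are not part of the name.
    "foo@nt-regal" = atom_to_list(Quoted),
    %% Quoting an atom that doesn't need quotes changes nothing.
    true = ('foo@davinci' =:= Plain),
    ok.
```

Without the quotes, the shell reads foo@nt-regal as the atom foo@nt,
a minus sign, and the atom regal, which is why it is rejected.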