TCPBalance, a load-balancing TCP proxy for distcc
=================================================
There are dozens of Open Source TCP proxies available, written in
close to a dozen languages, many of them capable of load balancing.
Many of them would work with "distcc". Why write yet another TCP
proxy? Why do it in Erlang?
Of all the TCP proxies I found, none appeared to have the following
combination of features:
1. Not be so HTTP-centric that it cannot work with "distcc".
2. Be aware that some back-end hosts may be faster than other
hosts. For each client connection, the proxy should choose the
fastest back-end host that is currently idle.
3. Be aware of back-end hosts with multiple CPUs.
4. When all back-end hosts are busy, make the client wait for the
next available back-end host, rather than giving a back-end host
more work than it is configured to handle.
5. Detect when a back-end host is down and do something sane (like
avoid giving future jobs to the dead machine).
6. Permit an administrator to put back-end hosts back in service,
take them out of service, as well as add and remove hosts from
the pool without adversely affecting clients using the proxy.
7. Keep basic statistics about back-end hosts and make them
available via HTTP or Telnet.
Features 1-5 were mandatory. Features 6-7 would be nice. In a couple
of hours of Web surfing, I didn't find a TCP proxy that was capable of
doing 1-5, so I decided to write my own.
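To make features 2 through 4 concrete, here is a minimal sketch of the selection rule: walk a list of back-end hosts that is assumed to be ordered fastest-first, and pick the first one that is up and has a free session slot. The module name, the record, and its field names are hypothetical and are not taken from the tcpbalance source; the code also assumes a reasonably recent Erlang/OTP release.

```erlang
-module(pick_sketch).
-export([pick/1, demo/0]).

%% Hypothetical back-end record. The field names are illustrative only
%% and are NOT taken from tcpbalance's source.
-record(host, {name, status = up, max_conn = 1, act_conn = 0}).

%% Hosts are assumed to be listed fastest-first.
%% Returns {ok, Name} for the first host that is up and not full,
%% or 'wait' when every host is down or busy.
pick([]) ->
    wait;
pick([#host{status = up, max_conn = Max, act_conn = Act} = H | _])
  when Act < Max ->
    {ok, H#host.name};
pick([_ | Rest]) ->
    pick(Rest).

demo() ->
    Hosts = [#host{name = "fast", max_conn = 2, act_conn = 2},
             #host{name = "dead", status = down},
             #host{name = "slow", max_conn = 1, act_conn = 0}],
    %% "fast" is full and "dead" is down, so "slow" gets the job.
    {ok, "slow"} = pick(Hosts),
    %% Every host busy: the client has to wait.
    wait = pick([#host{name = "fast", max_conn = 1, act_conn = 1}]),
    ok.
```

Returning 'wait' instead of overloading a host is the point of feature 4: the caller is expected to park the client until a slot frees up.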
I knew it would be fairly easy to implement features 6-7 as well as
1-5 in Erlang (see http://www.erlang.org/), so that's what I used.
This proxy has been in use at Caspian Networks for over two months.
It's pretty solid.
This README file is quite long. Sorry about that. However, much of
it is a tutorial for Erlang newbies ... and perhaps a bit of
evangelism. :-) I'll try to keep things straightforward, but I will
also demonstrate some of the nifty communication, fault-tolerance, and
hot code upgrade features of Erlang.
License
-------
See the file "LICENSE", at the top of the tcpbalance source
distribution, for full licensing terms.
Obtaining, Compiling, and Installing Erlang
-------------------------------------------
Oh boy, yet another programming language development environment to
compile and install. Well, if you're still reading this, you're
interested enough in tcpbalance's feature set to try it out.
The current release of Erlang (as of 15 January 2003) is R9B-0. All
of the source and pre-compiled packages mentioned below can be
obtained from http://www.erlang.org/download/.
* UNIX and Linux users:
Erlang runs quite well under: Linux, BSD flavors, Solaris, and
others.
Although it's slow, I recommend obtaining the source and compiling
from scratch. The distribution is pretty big (8MB compressed) and
contains a kajillion files ... but the Erlang programming & runtime
environment contains many useful tools (including a nearly-complete
CORBA implementation) that are themselves written in Erlang.
Tcpbalance doesn't use most of them, but you get to compile all of
them. :-)
Follow the included directions, but the simple instructions are:
1. Extract the source package and change directory to its root.
2. Run "./configure" to use the default installation root
("/usr/local" on most platforms) or
"./configure --prefix=/path/to/install/root" if you wish to use
another installation root path.
3. Run "make".
4. Go do something else for a while. The Erlang VM is implemented
in C, which compiles in a few minutes, but most of the run-time
environment & misc tools are implemented in Erlang itself, and
there's a *lot* of it.
5. Run "make install".
6. Add /path/to/install/root/bin, whatever it is, to your shell's
program path.
* MacOS X users:
I haven't used it, but there's a pre-compiled, disk image-style
installation thingie available. Give it a try if you wish.
* Microsoft Windows users:
Your only option is installing a pre-compiled package. Whoo hoo!
When I installed Erlang R9B-0 on a Windows NT 4.0 machine, I
discovered that "C:\Program Files\erl5.2\bin" was already added to my
program path. What a deal.
Testing Erlang Inter-Node Communication
---------------------------------------
One of the really nice things about Erlang is that communication (via
message passing) between threads inside an Erlang virtual machine is
exactly the same as message passing between threads on different
Erlang virtual machines.
An instance of the Erlang virtual machine is called a "node". Erlang
threads are called "processes", which can really confuse UNIX geeks if
you're not careful. In this document, if I want to refer to a UNIX or
NT process/task, I'll call it an "operating system process" or "OS
process" to avoid confusion.
Erlang's inter-node message passing relies on a simple shared secret
mechanism similar to X11's "MIT magic cookie" authentication
scheme. Erlang stores a "cookie" in $HOME/.erlang.cookie (where $HOME
is your account's home directory) on UNIX boxes and in
C:\.erlang.cookie on NT boxes. To communicate with each other, all
Erlang nodes must share the exact same cookie.
(There are other ways to configure cookies, but I'm only going to
describe one.)
For nodes running on the same machine, there's no problem: everybody
is sharing the same file system(s).
For nodes running on different UNIX machines, you'll need to do one of
the following:
1. Use NFS or another shared file system for your $HOME directory.
2. Copy the .erlang.cookie file by hand to the $HOME directory of
each machine on which you wish to run an Erlang node.
For nodes running on different NT machines, you'll need to copy the
.erlang.cookie to each machine.
If you're mixing NT and UNIX machines (which is certainly possible),
make certain the exact same .erlang.cookie file is used on all of
them.
To make life easier on yourself, make certain that all of the machines
involved are present in DNS. E.g. if you're playing with "davinci"
and "munch", make certain that "ping davinci" and "ping munch" work
on both machines.
If a machine has a hyphen, "-", in its DNS name ... choose another
machine. Erlang's syntax requires treating hyphens specially. This
example is complicated enough as it is. (See footnote [1] below.)
An Erlang node name looks like:
foo@hostname
or:
foo@hostname.fully.qualified.domain
For simplicity, we'll use the former.
Each node must have a unique node name. For nodes running on the same
machine, the lefthand side of the "@" must, therefore, be unique. If
you attempt to start two nodes on the same machine with the same name,
the second VM will spit out a very long and cryptic error message
(including the string "Kernel pid terminated").
For my example, I'm going to use the nodes 'foo@davinci' and
'bar@rover'.
NOTE: Erlang commands are case-sensitive!
On each machine, start the Erlang VM and interactive shell:
1. UNIX: run "erl -sname NODE_LHS". E.g. "erl -sname foo"
2. NT: run "werl -sname NODE_LHS" E.g. "werl -sname foo"
3. Mac OS X: I've never used it, so you're on your own.
You will see something like:
Erlang (BEAM) emulator version 5.2 [source] [hipe] [threads:0]
Eshell V5.2 (abort with ^G)
(foo@davinci)1>
Note that the node name is included in the prompt string.
Type the command "erlang:get_cookie()." and press Enter. You should
see:
(foo@davinci)1> erlang:get_cookie().
'JQHZIQLDNQGUSZRAJXHB'
You should see the same cookie on each node. If not, go back and fix
it.
Now, we'll test the inter-node message passing capability. On one of
your nodes, type the following:
(foo@davinci)2> register(test, self()).
true
(foo@davinci)3> receive Msg -> Msg end.
You won't see a prompt right away: you're blocked waiting for a
message to arrive.
On the other machine, type the following, substituting the other
machine's node name:
(bar@rover)1> {test, 'foo@davinci'} ! hello_world.
hello_world
On the first machine, you should see:
(foo@davinci)3> receive Msg -> Msg end.
hello_world
(foo@davinci)4>
If you run the function "nodes()", you should see the names of all
other nodes that your node is aware of.
(bar@rover)3> nodes().
[foo@davinci]
(bar@rover)4>
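The same exchange can be wrapped in a small module. This is only a sketch: the module name "echo_sketch" and its functions are made up for illustration and are not part of tcpbalance. It registers a process under a name, and a caller on any node reaches it with the same "{Name, Node} ! Msg" syntax used above. It assumes a reasonably recent Erlang/OTP release.

```erlang
-module(echo_sketch).
-export([start/0, ping/1, loop/0]).

%% Start a process on the local node and register it as 'echo_sketch'.
start() ->
    Pid = spawn(?MODULE, loop, []),
    true = register(echo_sketch, Pid),
    Pid.

%% Answer every {From, Msg} with {echo, Msg}; exit on 'stop'.
loop() ->
    receive
        {From, Msg} when is_pid(From) ->
            From ! {echo, Msg},
            loop();
        stop ->
            ok
    end.

%% Send hello_world to the echo process on Node and wait up to five
%% seconds for the reply. Node may be the local node or a remote one.
ping(Node) ->
    {echo_sketch, Node} ! {self(), hello_world},
    receive
        {echo, Reply} -> Reply
    after 5000 ->
        timeout
    end.
```

After compiling the module and running "echo_sketch:start()." on foo@davinci, calling "echo_sketch:ping('foo@davinci')." on bar@rover should return hello_world, provided the cookies match. Calling "echo_sketch:ping(node())." on foo@davinci itself uses exactly the same code path from the programmer's point of view.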
Congratulations! It's now time to play with the tcpbalance
application. Use the command "q()." to exit the shell, or press
Control-c.
Modifying the application and Web server config to fit your environment
-----------------------------------------------------------------------
I haven't spent any time trying to auto-magically edit the files that
will need editing before you can use tcpbalance. I've made life
"easier" by using relative, not absolute paths. Therefore, no extra
configuration or file editing should be necessary.
This means that you *must* change the current working directory
exactly as described below (or else things won't work). This is not
how a "real" Erlang application would be installed & run, but I
haven't taken the time to do that. Sorry.
If you want the built-in HTTP server to use a port other than port
8080, edit the file "priv/inets.conf" and modify the "Port" directive.
Compiling the application
-------------------------
The file "src/Makefile" requires GNU Make. To use it:
% cd src
% make
If you do not have GNU Make, or if you're using NT and don't have GNU
Make available, execute the commands found in the file "Make.all.out".
The file src/balance.rel contains version numbers that are specific to
a particular Erlang/OTP release version. If you see error messages
like this:
    stdlib: No valid version ("1.11.0") of .app file found. Found file
    "/usr/local/lib/erlang/lib/stdlib-1.11.4.1/ebin/stdlib.app" with
    version "1.11.4.1"
    kernel: No valid version ("2.8.0") of .app file found. Found file
    "/usr/local/lib/erlang/lib/kernel-2.8.1.1/ebin/kernel.app" with
    version "2.8.1.1"
    gmake: *** [balance.boot] Error 1
... then you have a different version of Erlang/OTP than I originally
used for tcpbalance. Edit the file src/balance.rel to replace the
invalid version numbers with the version numbers mentioned at the end
of each error message.
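If you would rather look up the installed versions before compiling, the following sketch asks the running Erlang system for them. The module name "vsn_sketch" is made up for illustration, and the choice of applications (kernel, stdlib, sasl) is an assumption about what balance.rel lists; adjust it to match the file.

```erlang
-module(vsn_sketch).
-export([versions/0]).

%% Return the ERTS version plus the version of each listed application
%% as reported by the local installation.
versions() ->
    [{erts, erlang:system_info(version)}
     | [{App, vsn(App)} || App <- [kernel, stdlib, sasl]]].

%% Load the application's .app file (harmless if it is already loaded)
%% and read its 'vsn' key.
vsn(App) ->
    _ = application:load(App),
    case application:get_key(App, vsn) of
        {ok, Vsn} -> Vsn;
        undefined -> not_installed
    end.
```

Run "vsn_sketch:versions()." in an Erlang shell and copy the reported strings into src/balance.rel.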
Running the application with the example configuration file
-----------------------------------------------------------
Run the following:
% cd src ... if you haven't already
% erl -sname bal -pz ../ebin -boot balance -config ../priv/be-list -noshell
You will see a whole bunch of diagnostic messages, labelled "PROGRESS
REPORT". The last one will say something like:
=PROGRESS REPORT==== 16-Jan-2003::13:59:45 ===
application: balance
started_at: bal@davinci
Congratulations. This means that the application is running
successfully. If this isn't what you see, and you're certain that
you've followed all the directions, cut-and-paste the output and email
it to me. (See footnote [2] below.)
The example configuration file, "../priv/be-list.config", sets up a
proxy in front of two SMTP servers, mx1.mail.yahoo.com and
mx1.hotmail.com. (The back-end list also contains a third entry,
"bogus-demo".) The proxy listens on local port 2525.
% telnet davinci 2525
Trying 10.10.10.10...
Connected to localhost.localdomain.
Escape character is '^]'.
220 YSmtp mta614.mail.yahoo.com ESMTP service ready
quit
221 mta614.mail.yahoo.com
Connection closed by foreign host.
See the "be-list.config" file for full details of the configuration.
To summarize:
1. The proxy's local TCP port is 2525.
2. The back-end connection timeout is 10 seconds.
3. The back-end connection activity timeout is 2 minutes.
4. The back-end host list:
a. mx1.mail.yahoo.com, TCP port 25, 2 simultaneous sessions.
b. bogus-demo, TCP port 25, 1 simultaneous session.
c. mx1.hotmail.com, TCP port 25, 1 simultaneous session.
Use a Web browser to connect to the Web server running on TCP port
8080 on the machine running the balancer, e.g. http://davinci:8080/
and follow the link there. You'll see something like (edited to fit
in 80 columns):
    Proxy start time: 2003/1/16 13:59:44
    Current time: 2003/1/16 15:39:21
    Local TCP port number: 2525
    Connection timeout (seconds): 10.0000
    Activity timeout (seconds): 120.000
    Length of wait list: 0

    Name                Port  Status  MaxConn  ActConn  ActiveCount  ActiveTime
    mx1.mail.yahoo.com  25    up      2        0        1            2.94391
    bogus-demo          25    up      2        0        0            0
    mx1.hotmail.com     25    up      1        0        0            0
Now, do the following:
1. Open four windows: xterm, terminal, Telnet application, or
whatever.
2. Use those windows to create four simultaneous TCP connections
to the proxy. E.g. "telnet davinci 2525".
3. The first two clients should see greetings from a Yahoo mail
exchanger.
4. The third client should see a greeting from a HotMail mail
exchanger.
5. The fourth client should connect but otherwise see nothing.
6. Type "QUIT" in the second client to terminate the session.
7. The fourth client should then be connected to an available
back-end host, namely a Yahoo server.
8. In the second window, connect to the proxy again.
9. Retrieve (or reload) the balancer's stats via its HTTP server.
You should see that all three back-end sessions are busy, that
there's one client in the "wait list", and that the status of
the "bogus-demo" server has been changed to "down".
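The waiting behaviour in steps 5 through 7 can be modelled in a few lines. This is a sketch of the idea only, not the data structure that bal_proxy.erl actually uses: the module name, the state layout, and the return values are all assumptions made for illustration.

```erlang
-module(waitlist_sketch).
-export([new/1, connect/2, release/2, demo/0]).

%% State is {FreeSlots, WaitQueue}. FreeSlots holds one host name per
%% free session slot, fastest host first.
new(FreeSlots) ->
    {FreeSlots, queue:new()}.

%% A client arrives: take the first free slot, or join the wait list.
connect(_Client, {[Host | Free], Q}) ->
    {{assigned, Host}, {Free, Q}};
connect(Client, {[], Q}) ->
    {waiting, {[], queue:in(Client, Q)}}.

%% A session on Host ends: hand the slot to the longest-waiting client,
%% or put it back on the free list if nobody is waiting. (For brevity
%% this sketch does not restore the fastest-first ordering.)
release(Host, {Free, Q}) ->
    case queue:out(Q) of
        {{value, Client}, Q1} -> {{handoff, Client, Host}, {Free, Q1}};
        {empty, _}            -> {idle, {[Host | Free], Q}}
    end.

demo() ->
    S0 = new(["yahoo", "yahoo", "hotmail"]),
    {{assigned, "yahoo"}, S1}   = connect(c1, S0),
    {{assigned, "yahoo"}, S2}   = connect(c2, S1),
    {{assigned, "hotmail"}, S3} = connect(c3, S2),
    %% No slot left: the fourth client waits.
    {waiting, S4}               = connect(c4, S3),
    %% A Yahoo session ends and the waiting client gets its slot.
    {{handoff, c4, "yahoo"}, _} = release("yahoo", S4),
    ok.
```

The demo mirrors the four-window experiment: three clients are served immediately, the fourth waits, and it is connected as soon as a session ends.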
Changing the application's configuration on-the-fly
---------------------------------------------------
I haven't extended the balancer's HTTP server to be able to change the
balancer's config on-the-fly, but it's easy enough to do using
Erlang's native message passing mechanism. It's clunkier than "click
here to change BE's status to 'down'", but hey, this was an
afternoon's hack!
Run "cd src" (if you aren't already there) and "erl -sname foo -pz
../ebin" on any machine where you've verified that the Erlang cookies
are correct and message passing works, then run the following Erlang
shell commands (changing the node name as appropriate):
1. bal_proxy:get_state({balance, 'bal@davinci'}).
2. bal_proxy:reset_host({balance, 'bal@davinci'}, "mx1.mail.yahoo.com", down).
3. bal_proxy:get_state({balance, 'bal@davinci'}).
The first and third commands return raw state data maintained by the
balancing process: the HTTP server simply pretty-prints this data.
The second command sets the state of the "mx1.mail.yahoo.com" back-end
host to 'down'. ('up' is the other valid state.)
Other commands to experiment with are (you don't have to type them on
a single line, but you should, unless you're familiar with Erlang
syntax):
bal_proxy:get_host({balance, 'bal@davinci'}, "mx1.hotmail.com").
bal_proxy:reset_all({balance, 'bal@davinci'}).
bal_proxy:del_be({balance, 'bal@davinci'}, "bogus-demo").
bal_proxy:add_be({balance, 'bal@davinci'}, {be,"mx-ca-1.pobox.com",25,up,1,0,0,no_error,0,0,0,[]}, "").
bal_proxy:add_be({balance, 'bal@davinci'}, {be,"smtp.TheWorld.com",25,up,1,0,0,no_error,0,0,0,[]}, "mx1.hotmail.com").
Use the bal_proxy:get_state() function to see how these functions
affect the state of the balancer.
NOTE: You probably shouldn't delete a back-end host unless you've
marked its status as 'down' first ... and then waited for all
active sessions to finish. :-)
NOTE: The proxy has a bug (one of several, see comments in
src/bal_proxy.erl) that happens if:
1. All back-end hosts are status 'down'
2. A proxy client connects.
3. A back-end host is marked status 'up'
Work-around: Don't allow this to happen.
Fault tolerance demonstrated by fault injection
-----------------------------------------------
One of the many applications distributed with Erlang is called
"appmon", the application monitor. To start it, run "erl -sname
something" on any machine where you've verified that the Erlang
cookies are correct and message passing works, then run:
appmon:start().
A GUI box should pop up that displays a tree of the applications
running on the local node. In your case, "kernel" is probably the
only application running.
If you haven't already done so, run this command in the Erlang shell
(using the balancer's node name, of course):
bal_proxy:get_state({balance, 'bal@davinci'}).
Then pull down the "Nodes" pull-down menu. Both the local node and
'bal@davinci' should be listed. Select the balancer's node. You
should then see a tree of three applications running on the balancer:
kernel, sasl, and balance. Click on the "balance" box.
Another window should appear. This window displays the tree of
processes (Erlang threads, remember!), including "supervisor"
processes, used by the balancer application.
* A top-level supervisor, named 'balance_sup'.
* The 5 processes used by the "inets" HTTP server.
* A variable number of processes used by the TCP balancer portion
of the application: the socket listener process will always
appear under the 'balance' process, as will the transient
per-TCP-session processes.
Now, create a TCP connection to the balancer's port. The tree will be
updated to show a second process underneath 'balance'.
A basic programming philosophy behind Erlang is "code only for the
common case". If there's an error, e.g. divide by zero or an uncaught
exception, the default action is to kill the process. You rely on
supervisor processes to restart abnormally-terminated processes. Very
nice.
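Here is a minimal sketch of what "the default action is to kill the process" looks like from the outside. The module name "crash_sketch" is made up for illustration. A process is spawned with no error handling at all, it divides by zero, and the process watching it is told why it died.

```erlang
-module(crash_sketch).
-export([run/0]).

%% Spawn a process that fails with an arithmetic error and report what
%% the monitoring process observes.
run() ->
    {Pid, Ref} = spawn_monitor(fun() -> divide(1, 0) end),
    receive
        %% The exit reason of an uncaught error is {Reason, Stacktrace}.
        {'DOWN', Ref, process, Pid, {badarith, _Stack}} ->
            crashed
    after 1000 ->
        timeout
    end.

%% No guard against B =:= 0: this code handles the common case only.
divide(A, B) ->
    A div B.
```

Calling "crash_sketch:run()." returns the atom 'crashed' (the runtime may also print a crash report). A supervisor does essentially the same watching, and then restarts the child.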
You can use "appmon" to send kill signals to any process it displays,
thus simulating a bug/software failure. Do the following:
1. Make a connection to the proxy's local port.
2. Note which process appears under 'balance' when the appmon
window is updated. This is the process used to copy data back
and forth between your client and the back-end host.
3. Click on the "Kill" button.
4. Click on the process box representing the process found in step
number 2.
This will terminate your client's TCP connection, as well as the
proxy's connection to the back-end host.
You can have more fun with this fault injection by:
* Killing the socket accept()ing process underneath 'balance'.
* Killing 'balance' or any of the HTTP server's processes.
If you do any of those things, the 'balance_sup' supervisor will kill
all remaining application processes and restart them. It may happen
so quickly that the appmon window doesn't appear to change. However,
look carefully at the process ID numbers of the socket listener
underneath 'balance' and the process underneath 'httpd_acc_sup_8080':
both will change, indicating that they aren't the same process that
used to be there. That's the supervisor in action.
If you kill the 'balance_sup' process or any of its parents, the proxy
will crash and exit. That's because 'balance_sup' doesn't have a
parent supervisor to restart it. However, the *only* things that
supervisors do are:
1. Start child processes.
2. Monitor those children.
3. Restart any of those children, if they should be restarted.
4. Exit if there are too many child failures within a configured
amount of time.
The supervisor's code is assumed to be bug-free. It probably is. :-)
Typical Erlang application design uses a tree of supervisors. If
there's a low level problem severe enough that the immediate
supervisor cannot deal with it, that supervisor will exit ... in
effect, passing the problem up the supervisor chain. In the worst
case, the top-level supervisor can kill everything and restart from
scratch. This kind of deterministic application startup and fault
handling is quite nice.
The 'balance_sup' supervisor is configured to tolerate up to 5 child
deaths within a 30 second time period. That probably isn't realistic
for real-world use, but it's fun for demonstration purposes.
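A supervisor with that restart limit can be sketched as follows. This is not the code in the tcpbalance source: the module name "sup_sketch" and its trivial worker are made up, and the 'one_for_all' strategy is an assumption inferred from the behaviour described above (one child dies, all children are restarted). Only the numbers 5 and 30 come from this README.

```erlang
-module(sup_sketch).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0, worker_loop/0, worker_pid/0]).

start_link() ->
    supervisor:start_link({local, sup_sketch}, ?MODULE, []).

%% Supervisor callback.
%% one_for_all: if any child dies, stop and restart all children.
%% 5, 30: exit if more than 5 restarts are needed within 30 seconds.
init([]) ->
    Worker = {worker, {?MODULE, start_worker, []},
              permanent, 2000, worker, [?MODULE]},
    {ok, {{one_for_all, 5, 30}, [Worker]}}.

%% The child must be linked to the supervisor and returned as {ok, Pid}.
start_worker() ->
    {ok, spawn_link(?MODULE, worker_loop, [])}.

%% A stand-in worker that just waits to be told to stop.
worker_loop() ->
    receive stop -> ok end.

%% Convenience: the current pid of the single child.
worker_pid() ->
    [{worker, Pid, worker, _}] = supervisor:which_children(sup_sketch),
    Pid.
```

To try it, run "sup_sketch:start_link().", note the result of "sup_sketch:worker_pid().", kill that process with "exit(Pid, kill).", and ask for the worker pid again: it should be a different process.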
For more detail on the process supervisor scheme of OTP, the Open
Telecom Platform, see http://www.erlang.org/doc/r9b/doc/system.html,
in particular the "Design Principles" document.
Hot code update
---------------
Like several other functional programming languages, Erlang permits
on-the-fly, hot code update. The Erlang VM supports the notion of two
simultaneous loads of any particular module, current and new. It's
way beyond the scope of this README to describe how this works or how
the Erlang OTP release handler can upgrade (or downgrade!) the running
code (and associated data structures) of one or more applications.
It's a nifty, if complex, feature.
This example will be much more basic: we'll add a line of output to
the HTTP server's status overview. We would like the summary at the
top to include the balancer's Erlang node name. That's easy enough to
do. Follow these steps:
1. Edit the file "src/bal_proxy.erl" with your favorite text
editor.
2. Locate the word "README", about 90% of the way from the top.
You will insert code on the line immediately after this
comment. (The "%" character denotes the start of an Erlang
comment; they continue to the end of the line.)
3. Add the following line after the comment you found in step #2:
io_lib:format("Proxy's Erlang node name: ~w\n", [node()]),
4. Change working directory to "src", if you haven't already.
5. Run "make" or, if you do not have GNU Make, run
"erlc -bbeam +debug_info -o../ebin bal_proxy.erl"
Now that you've made the code change and recompiled it, we just need
to tell the balancer to load the new code.
First, just to make it really obvious what's going on, use your Web
browser to retrieve the balancer's current stats.
The balancer node was started using the "-noshell" command line flag,
so there is no Erlang shell available for us to modify the balancer's
internals. So, we'll start a second node, then create a shell session
on the balancer's node, then use the second node's shell to
communicate with the balancer's node.
Run "cd src" (if you aren't already there) followed by "erl -sname
foo" on any machine where you've verified that the Erlang cookies are
correct and message passing works, then type the following things
(changing the node name as appropriate):
1. Control-g
2. r bal@davinci ENTER
3. j ENTER
4. c 3 ENTER
You should see something like this:
    % erl -sname foo
    Erlang (BEAM) emulator version 5.2 [source] [hipe] [threads:0]

    Eshell V5.2 (abort with ^G)
    (foo@rover)1>
    User switch command
     --> r bal@davinci
     --> j
       1 {}
       2 {shell,start,[]}
       3* {bal@davinci,shell,start,[]}
     --> c 3
    Eshell V5.2 (abort with ^G)
    (bal@davinci)1>
Any command you type in this command shell will be executed on the
'bal@davinci' node, *not* the local one. Pretty slick, huh?
Type the command "l(bal_proxy)." into this shell, and you'll see:
(bal@davinci)1> l(bal_proxy).
{module,bal_proxy}
(bal@davinci)2>
Now, tell your Web browser to reload the stats page. Notice that your
new code has indeed been executed!
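The shell's "l(Module)" command is a shortcut for purging any old copy of the module and then loading the new object code. As a sketch of an alternative to the remote shell session, the same two steps can be sent to another node with the standard "rpc" module. The module name "reload_sketch" is made up for illustration and is not part of tcpbalance.

```erlang
-module(reload_sketch).
-export([reload/2]).

%% Purge the old version of Mod on Node (if any), then load the newest
%% object code that Node finds in its own code path.
%% Returns {module, Mod}, {error, Reason}, or {badrpc, Reason}.
reload(Node, Mod) ->
    _ = rpc:call(Node, code, purge, [Mod]),
    rpc:call(Node, code, load_file, [Mod]).
```

From any node that shares the balancer's cookie, "reload_sketch:reload('bal@davinci', bal_proxy)." should return {module,bal_proxy}, assuming the recompiled file is in the balancer's code path (the "-pz ../ebin" flag used earlier takes care of that).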
Using tcpbalance with distcc
----------------------------
See the file "priv/sample-distcc.config" for an example config for 6
back-end machines with different numbers of CPUs and different CPU
speeds.
Questions, bugs, etc.
---------------------
If you have questions, bug reports, etc., please email them to me. My
email address is in footnote [2] below. Tcpbalance isn't meant to be
a 100% bulletproof, full-featured distcc application proxy ... but
that doesn't mean that I'm not willing to help out or perhaps fix
bugs.
Martin Pool, distcc's maintainer, suggested that this Erlang proxy
could be a model for a "real" bulletproof, full-featured distcc
application proxy that someone might write someday. I think that is a
*great* idea! In the meantime, I'll continue using this proxy....
-Scott Lystig Fritchie
Footnotes
---------
[1] If you really want to use a machine with a hyphen in its DNS
hostname, you can do it. You just need to put single quotes around
the node name whenever you use it. For example, if you run "erl
-sname foo" on a machine called "nt-regal", then whenever this
document asks you to type a node name into the Erlang shell, you must
type:
'foo@nt-regal'
... instead of:
foo@nt-regal
The latter is incorrect Erlang syntax.
Technically, a node name is treated as an Erlang atom, a primitive
data type, much like an atom in Lisp or Scheme. An unquoted atom
starts with a lowercase letter and may contain only alphanumeric
characters, underscores ("_"), and "@". However, if you want an atom
to contain other characters (such as a hyphen) or to start with an
upper-case letter, it must be surrounded by single quotes. For
example,
'This_is_an_atom'
is valid syntax for an atom.
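A few checks make the quoting rule concrete. The module name "atom_sketch" is made up for illustration; the host names are the ones used in this README.

```erlang
-module(atom_sketch).
-export([demo/0]).

demo() ->
    %% "@" is allowed in an unquoted atom, so this needs no quotes.
    true = is_atom(foo@davinci),
    %% Quoting an atom that does not need it yields the same atom.
    true = ('foo@davinci' =:= foo@davinci),
    %% A hyphen is not allowed unquoted, so the quotes are required.
    true = is_atom('foo@nt-regal'),
    "foo@nt-regal" = atom_to_list('foo@nt-regal'),
    %% A leading upper-case letter also requires quotes.
    true = is_atom('This_is_an_atom'),
    ok.
```

Without the quotes, foo@nt-regal would be read as the expression "foo@nt minus regal", which is why the shell rejects it.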
[2] My Internet email address: the lefthand side of the "@" is "slf".
The righthand side is "caspiannetworks.com".