


A Big Geeky Deploy: ClusterF is Live
May 23, 2008, 8:55 PM

The servers drive me nuts. Always complaining about something. So I set out late last year to make my life easier. A few minutes ago I launched the project that I call ClusterF. dNeero.com is now running on it and, knock on wood, all seems to be ok.

Things get hairy when your site outgrows its first web server. To scale, you start running it on two servers. This is called clustering. It sounds simple, but it isn't. When a visitor changes data on one server, that change needs to be visible to visitors on the other server. So there's a system that synchronizes data between the machines.

Then two servers become three, then four, and so on. Before you know it things are complex and tedious. And, believe me, I design for dead simplicity. I'd rather write 10,000 lines of code than do something manually each time I want to deploy a new build, for example. But complexity creeps in.

My solution was to build my own little Tomcat provisioning system. My design goal was to be able to add a new server to the cluster in two minutes or less. Put a single file onto the server, double click to install, provision the instance and you're up and running. In tests I was able to do it in two minutes, but in production it makes sense to check everything you do, so it's realistically more like five minutes to set up a server. Not bad considering that it used to take around an hour... and at that I considered my processes fairly lightweight.

Here's how it works. I install ClusterF onto each server in the cluster. ClusterF includes clean copies of Tomcat and a Java JDK. Using a GUI (set of screens with buttons... a desktop application) I can control any of the servers in the cluster. I can, for example, create a new Tomcat instance.

To create a new Tomcat instance ClusterF copies the clean Tomcat into a deploy directory. It then uses a Java wrapper to make it executable as a service on Windows or Linux... this is done by copying four or five files into the instance's Tomcat directory and editing some config files... all automatically. Next, ClusterF configures Tomcat's server.xml file so that it'll cluster sessions with other instances... or create itself as a separate cluster if so chosen in the GUI.
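The first step, copying the clean Tomcat into a deploy directory, could look something like the sketch below. This is not ClusterF's actual code: the class name `InstanceCreator`, the method `createInstance` and the directory layout are all made up for illustration, and the service wrapper and server.xml editing are left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical helper, not part of ClusterF: copies a pristine Tomcat tree
// into a new, named instance directory under the deploy directory.
final class InstanceCreator {

    public static Path createInstance(Path cleanTomcat, Path deployDir, String instanceName)
            throws IOException {
        Path target = deployDir.resolve(instanceName);
        if (Files.exists(target)) {
            // Refuse to overwrite an instance that is already provisioned.
            throw new IOException("Instance already exists: " + target);
        }
        // Files.walk visits a directory before its children, so parent
        // directories are created before the files inside them are copied.
        try (Stream<Path> paths = Files.walk(cleanTomcat)) {
            paths.forEach(source -> {
                Path dest = target.resolve(cleanTomcat.relativize(source).toString());
                try {
                    if (Files.isDirectory(source)) {
                        Files.createDirectories(dest);
                    } else {
                        Files.copy(source, dest, StandardCopyOption.COPY_ATTRIBUTES);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (UncheckedIOException e) {
            throw e.getCause();
        }
        return target;
    }
}
```

Calling `InstanceCreator.createInstance(cleanDir, deployDir, "app1")` would leave a full copy of the clean tree in `deployDir/app1`, ready for the config edits described above.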

Once the instances are up and running I can define an App. An App has a set of config properties like database connection strings, etc. I can configure these properties centrally and then ClusterF will make sure they're available as a properties bundle each time an instance is run. I can run multiple Apps per instance by choosing different root directories in the GUI.
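To make the properties bundle idea concrete, here's a minimal sketch using plain `java.util.Properties`. The class name `AppConfigBundle`, the file location and the property keys are assumptions for the example; how ClusterF actually stores and ships the bundle isn't shown here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: writes centrally managed App settings to a
// .properties file that an instance can load when it starts.
final class AppConfigBundle {

    public static void write(Map<String, String> settings, Path file) throws IOException {
        Properties props = new Properties();
        props.putAll(settings);
        Files.createDirectories(file.toAbsolutePath().getParent());
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            props.store(out, "Generated centrally; do not edit per instance");
        }
    }

    public static Properties read(Path file) throws IOException {
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            props.load(in);
        }
        return props;
    }
}
```

The point of the design is that the settings live in one place and get regenerated for each instance, rather than being hand-edited on every server.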

Using the GUI I associate Apps with instances, saying essentially that I want App X to run on Servers 1, 2 and 5. I can then deploy a WAR file. ClusterF allows me to choose the WAR file centrally and then it distributes it to the cluster. This was actually quite tricky. I'm clustering using JGroups. I had to break the file into little chunks and send each one as a separate part. JGroups doesn't (yet) handle big payloads well. On the other side I re-assemble the file and process it. Of course, I have to have error checking (MD5 checksum), receipt management, versioning, etc. to make sure that the WAR deploys properly.
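The chunk-and-checksum part can be sketched in a few lines. This leaves JGroups out entirely and only shows the split, the MD5 checksum and the re-assembly with verification. The class `WarChunker` and its method names are invented for the example, and receipt management and versioning are not covered.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: split a WAR's bytes into small parts for sending,
// then rebuild them on the receiving side and verify an MD5 checksum.
final class WarChunker {

    public static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int end = Math.min(data.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(data, offset, end));
        }
        return chunks;
    }

    public static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Joins the chunks in order and rejects the result if the checksum differs.
    public static byte[] reassemble(List<byte[]> chunks, String expectedMd5) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            out.write(chunk, 0, chunk.length);
        }
        byte[] result = out.toByteArray();
        if (!md5Hex(result).equals(expectedMd5)) {
            throw new IllegalStateException("Checksum mismatch; refusing to deploy");
        }
        return result;
    }
}
```

The sender computes `md5Hex` once over the whole file and sends it along with the chunks, so the receiver can tell a corrupted or incomplete transfer from a good one before touching the running app.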

ClusterF allows me to start/stop all instances running an app with a single button click. And it's cluster smart... meaning that it'll start one instance and give it a 60 second head start so that it'll establish itself as the primary controller in the cluster... then it'll start other instances.
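The head start logic is simple enough to sketch. Below, the first instance in the list is started, the code waits, and then the rest are started. The class `StaggeredStarter` and the `starter` callback are stand-ins for however the instances really get launched.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: start one instance first so it can establish itself
// as the primary, wait for the head start, then start the others.
final class StaggeredStarter {

    public static void startAll(List<String> instances, Consumer<String> starter, Duration headStart)
            throws InterruptedException {
        if (instances.isEmpty()) {
            return;
        }
        starter.accept(instances.get(0));
        if (instances.size() > 1) {
            Thread.sleep(headStart.toMillis());
            for (String name : instances.subList(1, instances.size())) {
                starter.accept(name);
            }
        }
    }
}
```

With the 60 second head start described above, the call would pass `Duration.ofSeconds(60)` as the last argument.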

So far I've saved myself time setting up Tomcat instances, managing config files and deploying builds. The last thing I wanted to save time with was restarting app servers.

I built a monitoring system into ClusterF. Each instance pings the others in the cluster periodically to make sure they're up. If they're not, ClusterF can restart them automatically, after set periods of time, etc. I'm essentially trying to make the system self-healing when things like out of memory errors happen.
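One way to decide when a restart is due is to count consecutive missed pings per instance. That's an assumption for this sketch (the text above only says restarts happen after set periods of time), and the class `HealthTracker` is made up. The actual pinging and restarting are left to the caller.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: tracks consecutive failed pings per instance and
// signals when an instance has missed enough of them to warrant a restart.
final class HealthTracker {

    private final int maxMisses;
    private final Map<String, Integer> misses = new HashMap<>();

    public HealthTracker(int maxMisses) {
        if (maxMisses <= 0) {
            throw new IllegalArgumentException("maxMisses must be positive");
        }
        this.maxMisses = maxMisses;
    }

    // Records one ping result; returns true when the instance should be restarted.
    public boolean record(String instance, boolean pingSucceeded) {
        if (pingSucceeded) {
            misses.remove(instance); // any success resets the count
            return false;
        }
        int count = misses.merge(instance, 1, Integer::sum);
        if (count >= maxMisses) {
            misses.remove(instance); // start counting fresh after the restart
            return true;
        }
        return false;
    }
}
```

Requiring several misses in a row keeps a single slow response from triggering a restart of an otherwise healthy instance.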

When things go down I get cell phone pings and emails. I want to be informed, of course, but my hope is that I'll be able to watch ClusterF work through the issue itself from afar. Every now and then I'm sure I'll have to intervene.

ClusterF is a remote control for Tomcat instances. I didn't want Tomcat instances to ever run inside of ClusterF's runtime. Doing so would have introduced another level of possible failure on top of Tomcat. I didn't want that. The Tomcats and ClusterF run independently of each other. If ClusterF goes down I lose some monitoring and self-recovery capability but the Apps stay up. They're decoupled.

Work on ClusterF was always secondary. I got a few hours here and a few hours there for a couple months in Dec 07/Jan 08. Then I got pulled away with other things. A couple weeks ago, with the servers being a pain in the ass, I decided to revive ClusterF. Tonight I finally got it up and running.

Of course, we all know that when I release something it's virtually guaranteed that I'll spend the next 24-48 hours cussing up a storm as it hits the fan. We'll see how this one goes.

Plans for the future of ClusterF? I'll add stuff as I need it. More robust page checking would be nice... the ability to specify a URL and define a string that must come back or else it's defined as a fail. The GUI sucks, but works... it could use some cleaning up. I'd also like to build a web interface. To do so I'll use NanoHTTPD and write simple pages that have basic status reporting and controls (start, stop, disable, etc.).
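The page check idea might look roughly like this. It's only a sketch of a feature that doesn't exist yet: the class `PageCheck`, the choice to require a 200 status and the timeout parameter are all assumptions. It uses `InputStream.readAllBytes`, so it needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: fetch a URL and pass only if the response is a 200
// and the body contains a required string.
final class PageCheck {

    public static boolean bodyPasses(int status, String body, String mustContain) {
        return status == 200 && body != null && body.contains(mustContain);
    }

    public static boolean check(String url, String mustContain, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            int status = conn.getResponseCode();
            if (status != 200) {
                return false;
            }
            try (InputStream in = conn.getInputStream()) {
                String body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                return bodyPasses(status, body, mustContain);
            }
        } catch (IOException e) {
            // Timeouts, refused connections and bad URLs all count as a fail.
            return false;
        }
    }
}
```

Checking for a known string catches the case where Tomcat is up and answering but the app itself is serving an error page.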

The name ClusterF? It's software that makes clustering possible. I can't remember what the F stands for... oh, wait... yes I can. At startup: "Now we're F'd."
Timezone: US/Eastern
Author: Joe Reger, Jr.

Keyword Tags:
clusterf

Reader Messages:
 Date: 2008-05-23 22:58:17
Name: Freddy
Nice right up on ClusterF...Hahah! Your explanations are very geeky.. Gotta love the geek in you...