This tutorial will walk you through the basics of setting up and using Capistrano. It will not introduce you to the deployment system that is bundled with Capistrano, but will instead focus on the more general areas of executing Capistrano and writing your own recipes. It will be primarily of interest to those wanting to use Capistrano in non-deployment domains, and to those who just wish to become more familiar with Capistrano itself.
Capistrano is distributed as a single Ruby gem. Before installing it, check which version of RubyGems you have:
$ gem -v
Your RubyGems should be at least 1.3.x. If it is not, please follow the RubyGems upgrade instructions and then come back here. With RubyGems up to date, you can install Capistrano and its dependencies with the following:
$ gem install capistrano
Capistrano makes a few assumptions about your servers. In order to use Capistrano, you will need to comply with these assumptions:
- You are using SSH to access your remote machines. Telnet and FTP are not supported.
- Your remote servers have a POSIX-compatible shell installed. The shell must be called “sh” and must reside in the default system path.
- If you are using passwords to access your servers, they must all have the same password. Because this is not generally a good idea, the preferred way of accessing your servers is with a public key. Make sure you’ve got a good passphrase on your key.
Capistrano also makes a few assumptions about your own familiarity with computers. You should be comfortable working from a command-line. You do not need to know advanced shell scripting techniques or anything like that (though it helps, if you want to start writing complex Capistrano recipes), but you should be able to navigate directories and execute commands from the command-line. Capistrano has no GUI; it is entirely command-line driven.
To take full advantage of Capistrano, you should be comfortable (or at least, minimally familiar) with the Ruby programming language. When you write your own Capistrano tasks, you do so in Ruby.
Capistrano reads its instructions from a capfile. (For those of you familiar with the “make” or “rake” utilities, the concept is the same as a “makefile” or “rakefile”.) If you create a file called capfile (or Capfile, if you prefer), Capistrano will read that file and process the instructions in it.
The Capfile is where you will tell Capistrano about the servers you want to connect to and the tasks you want to perform on those servers. It is essentially just a Ruby script, but augmented with a large set of “helper” syntax, to make it easy to define server roles and tasks. (Using the lingo of those in the know, the Capfile is written using a custom DSL on top of Ruby.)
You can use any editor you want to write your Capfiles; they are just simple text files. I recommend something designed for programmers, like vim, emacs, TextMate, Eclipse, and so forth. Choose whatever you’re comfortable with, but make sure whatever you choose can save files as plain text, and will not automatically append an extension like “.txt” to the filename. The Capfile should be called “capfile” or “Capfile”, without any extension.
Note: Capistrano tries to be helpful by searching up your directory tree until it finds a capfile. This ensures that if you run cap from anywhere within your application, it will find the correct capfile. It has been known to catch people out, though, because your home directory is also searched.
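The upward search can be pictured with a small plain-Ruby sketch. This is only an illustration of the idea, not Capistrano's actual lookup code: the helper name find_capfile and the constant CAPFILE_NAMES are made up for this example, and the sketch covers only the walk up the directory tree, not the home-directory lookup.

```ruby
require "pathname"

# File names to look for (hypothetical constant for this sketch).
CAPFILE_NAMES = %w[Capfile capfile].freeze

# Hypothetical helper, not part of Capistrano's API: walk upward from
# start_dir and return the path of the first capfile found, or nil.
def find_capfile(start_dir = Dir.pwd)
  Pathname.new(start_dir).expand_path.ascend do |dir|
    CAPFILE_NAMES.each do |name|
      candidate = dir + name
      return candidate.to_s if candidate.file?
    end
  end
  nil
end

puts(find_capfile || "no capfile found")
```

Because the walk starts in the current directory and moves outward, the nearest capfile wins. That is also why a stray capfile in a parent directory can be picked up unexpectedly.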
So, enough chit-chat. Let’s look at a very simple capfile, to see what it’s like:
task :search_libs, :hosts => "www.capify.org" do
  run "ls -x1 /usr/lib | grep -i xml"
end
This defines a single task, called “search_libs”, and says that it should be executed only on the “www.capify.org” host. When executed, it will display all files and subdirectories in /usr/lib that include the text “xml” in their name. By default, “run” will display all output to the console.
Assuming your capfile is in the current directory, you would execute that task like this (from the command-line):
$ cap search_libs
You can define as many tasks as you like:
task :search_libs, :hosts => "www.capify.org" do
  run "ls -x1 /usr/lib | grep -i xml"
end
task :count_libs, :hosts => "www.capify.org" do
  run "ls -x1 /usr/lib | wc -l"
end
Here we’ve added a second task, “count_libs”, which will display the number of entries in /usr/lib. We could add more, but even with just two tasks, having to specify the host over and over is getting a little unwieldy. Here is where “roles” come into play:
role :libs, "www.capify.org"
task :search_libs do
  run "ls -x1 /usr/lib | grep -i xml"
end
task :count_libs do
  run "ls -x1 /usr/lib | wc -l"
end
We’ve created one new role, called “libs”, and associated “www.capify.org” with that role. By default, a task will be executed on all servers in all roles, so we were able to drop the :hosts declaration from the tasks. Much simpler!
(Note: Capistrano’s login defaults to whatever user you are currently logged into your local machine as. If you need to log in as a different user, you can encode that username in the server definition, in the form “user@hostname”. Alternatively, you can set the :user variable to the username you want to use. We’ll get to variables in a minute.)
In the “real world”, we have to worry about bad guys trying to sneak into our servers, so many server clusters are hidden behind NATs and firewalls, to prevent direct access. Instead, you have to log into some “gateway” server, and then log into the servers you want, from there.
Capistrano supports this scenario by allowing you to define a gateway server. All subsequent connections will be tunneled through the gateway (using SSH forwarded ports). To tell Capistrano about your gateway server, you simply do:
set :gateway, "www.capify.org"
role :libs, "private.capify.org"
Here, the assumption is that private.capify.org is behind a NAT, and cannot be directly accessed. By setting the :gateway value to “www.capify.org”, we tell Capistrano that in order to access any of the other servers, it must first establish a connection to “www.capify.org”, and tunnel subsequent connections through that.
So things are going great on your one server. But load climbs, and you decide you need to add another one. You’d still like to be able to query both servers from a single task, though…
Easy enough:
role :libs, "private.capify.org", "mail.capify.org"
Now, when you execute “cap search_libs” or “cap count_libs”, the command will be executed in parallel on both servers, with the output aggregated into a single stream and displayed on your console.
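To make "executed in parallel, with the output aggregated into a single stream" concrete, here is a minimal plain-Ruby sketch. It is an illustration under stated assumptions, not a description of Capistrano's internals: it uses threads, and fake_remote_run is a made-up stand-in that returns a string instead of opening an SSH session.

```ruby
# Stand-in for running a command over SSH (hypothetical helper). A real
# tool would open an SSH session to the host here.
def fake_remote_run(host, command)
  "output of '#{command}' on #{host}"
end

# Run the command for every host concurrently and merge the results into
# one list of host-prefixed lines.
def run_on_all(hosts, command)
  lines = []
  lock = Mutex.new
  threads = hosts.map do |host|
    Thread.new do
      output = fake_remote_run(host, command)
      # Only one thread at a time may append to the shared list.
      lock.synchronize { lines << "[#{host}] #{output}" }
    end
  end
  threads.each(&:join)
  lines
end

puts run_on_all(["private.capify.org", "mail.capify.org"], "ls -x1 /usr/lib | wc -l")
```

The order of the merged lines depends on which host answers first, so output from different servers can be interleaved. Prefixing each line with its host keeps the combined stream readable.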
Down the road, let’s say you add a file server to the mix (to host all those pirated mp3’s you’re serving, you naughty person, you). Because it isn’t really running anything, you don’t care so much about the libraries installed there, so you don’t want the search_libs or count_libs tasks to run there. At the same time, you have a task that shows you the free disk space that you only want run on the file server. What to do?
role :libs, "crimson.capify.org", "magenta.capify.org"
role :files, "fuchsia.capify.org"
task :search_libs, :roles => :libs do
  run "ls -x1 /usr/lib | grep -i xml"
end
task :count_libs, :roles => :libs do
  run "ls -x1 /usr/lib | wc -l"
end
task :show_free_space, :roles => :files do
  run "df -h /"
end
That was easy enough. We just added another role (“files”), and then added :roles constraints to each task, specifying which role each is associated with. When we run “search_libs”, it will only be run against the servers defined in the “libs” role. Similarly, “show_free_space” will only be run against the servers defined in the “files” role.
(Note: you can specify that a task should run on servers in multiple roles by passing an array of role names to the :roles option.)
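The selection rules described so far (all servers by default, a single role, or an array of roles) can be sketched in plain Ruby. This is a toy model for illustration only: servers_for and ROLES are hypothetical names, and the sketch assumes that an explicit :hosts option takes precedence over :roles.

```ruby
# Role table matching the example above.
ROLES = {
  :libs  => ["crimson.capify.org", "magenta.capify.org"],
  :files => ["fuchsia.capify.org"]
}.freeze

# Hypothetical helper: work out which servers a task would run on,
# given the options passed to its definition.
def servers_for(options = {}, roles = ROLES)
  # Assumption of this sketch: explicit hosts win over roles.
  return Array(options[:hosts]) if options[:hosts]
  # No :roles option means "every server in every role".
  selected = options[:roles] ? Array(options[:roles]) : roles.keys
  selected.flat_map { |role| roles.fetch(role) }.uniq
end

p servers_for(:roles => :libs)            # one role
p servers_for(:roles => [:libs, :files])  # an array of roles
p servers_for                             # no constraint: all servers
```

Array() lets the same code path handle both a single role name and an array of names, which mirrors how the :roles option accepts either form.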
Capistrano tasks can be considered self-documenting. There are a couple of Rake-inspired ways to get a list of all the tasks Capistrano can run for a given installation:
$ cap -T
This will list all tasks that have a description. The output may look something like this:
cap deploy # Deploys your project.
cap deploy:check # Test deployment dependencies.
cap deploy:cleanup # Clean up old releases.
cap deploy:cold # Deploys and starts a `cold' application.
cap deploy:migrate # Run the migrate rake task.
cap deploy:migrations # Deploy and run pending migrations.
cap deploy:pending # Displays the commits since your last deploy.
If you were to add a task without a description, it would not be shown in this list. It would appear only in the verbose version:
$ cap -vT
For a default installation the output is the same. Tasks of your own that lack a description, however, will appear only in the verbose list.
To add a description to a task, see the following short example:
desc "Echo the server's hostname"
task :echo_hostname do
  run "echo `hostname`"
end
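The relationship between desc, task, and the two listing modes can be modelled with a few lines of plain Ruby. This is a toy registry written for this tutorial, not Capistrano's implementation: the class TaskList and its methods are hypothetical, and task bodies are ignored because only the listing behaviour is of interest here.

```ruby
# Toy task registry (hypothetical, for illustration only).
class TaskList
  Task = Struct.new(:name, :description)

  def initialize
    @tasks = []
    @pending_description = nil
  end

  # Remember a description; it is attached to the next task defined.
  def desc(text)
    @pending_description = text
  end

  # Register a task and consume the pending description, if any.
  def task(name)
    @tasks << Task.new(name.to_s, @pending_description)
    @pending_description = nil
  end

  # Plain listing shows described tasks only; verbose shows everything.
  def listing(verbose = false)
    visible = verbose ? @tasks : @tasks.select(&:description)
    visible.map do |t|
      t.description ? "cap #{t.name} # #{t.description}" : "cap #{t.name}"
    end
  end
end

def example_task_list
  list = TaskList.new
  list.desc "Echo the server's hostname"
  list.task :echo_hostname
  list.task :count_libs # defined without a description
  list
end

puts example_task_list.listing        # described tasks only
puts example_task_list.listing(true)  # verbose: every task
```

The point of the sketch is that a description belongs to the task defined immediately after it, which is why desc must come directly before the task it documents.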
“cap invoke” is a simple command. When called from the command line, it lets you send one command to all servers simultaneously:
$ cap invoke COMMAND="echo 'Hello World'"
If for some reason you want to run a command using sudo, you may find the following syntax useful:
$ cap invoke COMMAND="echo 'Hello World'" SUDO=1
As nifty as “invoke” is, it has at least one significant drawback: every time you invoke a command, it has to reestablish connections to the servers. This isn’t a big deal at all if you’re only executing a single command, but if you have two or three that you want to run, it can quickly get annoying, having to wait for the connections to be made.
Enter “cap shell”. This gives you an interactive prompt from which you can enter ad hoc commands and even execute tasks, and all connections that are established during the shell session are cached and reused. That means that if you execute a command and it has to connect to three different servers, the next time you execute a command that needs to connect to those same servers, the connections are reused. Really slick! It’s a really handy tool for all kinds of system administration tasks.
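The benefit of caching connections can be shown with a small plain-Ruby sketch. It is a simplified model, not the shell's real code: ConnectionCache and FakeConnection are hypothetical classes, and "opening a connection" is simulated by creating an object and bumping a counter.

```ruby
# Stand-in for an SSH session (hypothetical).
class FakeConnection
  attr_reader :host

  def initialize(host)
    @host = host
  end
end

# Hypothetical cache: opens a connection the first time a host is
# requested and hands back the same object on every later request.
class ConnectionCache
  attr_reader :opened

  def initialize
    @opened = 0
    @connections = Hash.new do |cache, host|
      @opened += 1
      cache[host] = FakeConnection.new(host)
    end
  end

  def [](host)
    @connections[host]
  end
end

def demo_connection_cache
  cache = ConnectionCache.new
  hosts = ["crimson.capify.org", "magenta.capify.org"]
  hosts.each { |host| cache[host] } # first command: connections are opened
  hosts.each { |host| cache[host] } # second command: connections are reused
  cache.opened
end

puts demo_connection_cache
```

Two commands against two hosts open only two connections in this model. With “invoke”, by contrast, each command starts from scratch and pays the connection cost again.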
Just as with the “invoke” task, the “shell” task lets you scope commands and tasks by host or role. You can type “help” from within the shell at any time, to get more information.
Since Capistrano 2.x, tasks are namespaced. This simply means that tasks are separated out into logical modules. Most of the tasks that ship with the default recipes are broken into either a deploy or a web namespace. The deploy namespace contains all the tasks used in the Default Deploy Path, whilst web contains a couple of useful helpers for showing and hiding an "Upgrade in Progress" page to your users during a deploy.
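Namespaced tasks are referred to by colon-separated names, as in the “cap deploy:check” entry of the task listing earlier. The plain-Ruby sketch below shows how such names can be built from nested scopes. It is a toy model rather than Capistrano's code: the class NamespacedTasks is hypothetical, and the libs namespace is invented for this example.

```ruby
# Toy registry that records fully qualified task names (hypothetical).
class NamespacedTasks
  attr_reader :names

  def initialize
    @names = []
    @scope = []
  end

  # Everything defined inside the block belongs to this namespace.
  def namespace(name)
    @scope.push(name)
    yield
  ensure
    @scope.pop
  end

  # Record the task under its scope, joined with colons.
  def task(name)
    @names << (@scope + [name]).join(":")
  end
end

def example_namespaced_names
  tasks = NamespacedTasks.new
  tasks.namespace(:deploy) do
    tasks.task :check
    tasks.task :cleanup
  end
  # Invented namespace, for illustration only.
  tasks.namespace(:libs) do
    tasks.task :search
  end
  tasks.names
end

puts example_namespaced_names
```

Grouping by namespace lets two tasks share a short name such as check without colliding, because each is addressed through its own prefix.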
Now you should go and read From The Beginning.