Bash Shell Scripting Tutorial

Basics

This tutorial assumes that you already know how to log in to your UNIX machine, bring up the bash shell, and run basic commands such as ls and cat. Getting to this point is fairly easy, but unfortunately this is the level that most users stay at indefinitely. This tutorial is intended to help you start to learn the more advanced features of using a shell, and specifically bash, one of the most powerful shells available.

Please note that most of the commands themselves are not explained; you can learn what they do by reading their man pages or by experimenting with them.

Redirection

Normally programs take input from our keyboard, and display the output to our screens. However, these are just the defaults - UNIX has the ability to redirect the input (commonly referred to as stdin, short for standard input) and output (commonly referred to as stdout, short for standard output).

Here is a simple example: the cat command displays the contents of a file to the screen. But we can redirect those contents to a file using the redirection operator '>', like so:

$ cat myfile.txt
This is the contents of the file myfile.txt
$ cat myfile.txt > newfile.txt
$ cat newfile.txt
This is the contents of the file myfile.txt
$

In effect, we've made cat do the same thing as cp, by redirecting the output from the screen to a file. This isn't terribly useful, but consider another, similar, scenario: cat can take multiple arguments, and it will display the files one after the other. This can be used to append one file onto another and create a new, combined file.

$ cat file1.txt
The quick brown fox...
$ cat file2.txt
...jumped over the lazy dog.
$ cat file1.txt file2.txt > combined.txt
$ cat combined.txt
The quick brown fox...
...jumped over the lazy dog.

Neat, eh?

We can also redirect the input, so that a program reads the contents of a file as if they had been typed at your keyboard. cat can read standard input too, but the effect is easier to see with bc, a command line calculator. Normally, you run the program, and it lets you type in calculations such as "2+2", and then displays the result. But if you have a file which already contains the calculations, you can send it straight to bc, which is faster and less error-prone than cutting and pasting the text in with your mouse.

$ cat calc.txt
2+2
$ bc < calc.txt
4
$

A final note: you can use two output redirect symbols together (>>) to indicate that you want to append to the file rather than overwrite it. Hence:

$ cat file1.txt
Contents of file1.
$ cat file2.txt
Contents of file2.txt.
$ cat file2.txt >> file1.txt
$ cat file1.txt
Contents of file1.
Contents of file2.txt.
$
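
The three operators can be seen side by side in a short script. This is only a sketch: the file name log.txt and the variable names are made up for illustration, and everything happens in a temporary directory so that none of your real files are touched. The $(...) notation captures a command's output in a variable.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory (hypothetical file names throughout).
workdir=$(mktemp -d)

echo "first"  > "$workdir/log.txt"    # > creates the file
echo "second" > "$workdir/log.txt"    # > again: the file is overwritten, "first" is gone
echo "third"  >> "$workdir/log.txt"   # >> appends after "second"

# < feeds the file to wc as its standard input.
line_count=$(wc -l < "$workdir/log.txt")
contents=$(cat "$workdir/log.txt")

echo "$contents"
echo "lines: $line_count"

rm -r "$workdir"
```

Only "second" and "third" survive, because the second > replaced the file before >> added to it.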

Pipes

Pipes are similar to redirection, but they are even more powerful, because they connect the output of one program directly to the input of another. For example, let's say that you want to send the calculation "2+2" to bc in a single command, without creating a file for redirection, or firing up bc and then entering the calculation. You can use the command echo, which normally sends output to the screen:

$ echo 2+2
2+2
$

But instead we can send it through a pipe using the pipe character '|', which on most US keyboard layouts you type with shift-backslash.

$ echo 2+2 | bc
4
$

One thing that may throw you off about this example is that it seems intuitively out of order; bc is the "important" program being run, and normally we would expect that to come first. But it doesn't, and for good reason: the pipeline works by sending the output of each program to the one immediately to its right. In this case, echo comes first because it generates the text "2+2" and writes it into the pipe. bc then reads the "2+2" text from the pipe and evaluates the calculation.
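
A pipeline is not limited to two programs. The sketch below chains three, with the data flowing left to right through each pipe. The numbers and the variable name largest are arbitrary; $(...) simply captures the pipeline's final output so that it can be printed with a label.

```shell
#!/usr/bin/env bash
# Three programs, two pipes:
#   printf     emits three numbers, one per line
#   sort -n    puts them in numerical order
#   tail -n 1  keeps only the last line, which is the largest value
largest=$(printf '%s\n' 7 42 19 | sort -n | tail -n 1)
echo "largest: $largest"
```

Reading the pipeline from left to right tells you exactly what happens to the data at each step.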

One immediately useful application of pipes is firing off an email straight from the command line with the mail command (this assumes that mail delivery is set up on your machine).

$ echo "Hey, I sent this mail from the bash command line!" | mail -s "bash rocks" user@host.com

Combinations

The true power of UNIX shells, and especially bash, is reflected in your ability to mix and match redirection, pipes, and other operators almost limitlessly. Here's a combination example: we send 2+2 through the pipe, which bc calculates, and then sends the output to a file.

$ echo 2+2 | bc > result.txt
$ cat result.txt
4
$
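
Input redirection can join the mix as well. In this sketch, which uses made-up file names in a temporary directory, the data comes from one file, passes through two programs, and lands in a second file.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf '%s\n' pear apple fig > "$workdir/fruit.txt"

# sort reads fruit.txt as its input, tr converts the sorted text
# to upper case, and the final output is redirected to sorted.txt.
sort < "$workdir/fruit.txt" | tr a-z A-Z > "$workdir/sorted.txt"

result=$(cat "$workdir/sorted.txt")
echo "$result"

rm -r "$workdir"
```

Note that each redirection applies only to the program it sits next to: < belongs to sort, and > belongs to tr.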

Variables

Variables are values that you give a name. They are significant because you can refer to them by that name throughout your bash "code", yet they can contain different values at different times. A simpler way to think about them is as storage containers for keeping data around that you will later use within bash.

Here is an example of assigning a value, and then displaying its contents:

$ MYVAR=3
$ echo $MYVAR
3
$

When you assign a value to a variable, the $ symbol is not used, and there must be no spaces around the = sign. Whenever you want to read the variable's value, however, you must prefix its name with the $ symbol, or bash will treat the name as ordinary text.

Variables can contain text, numbers, lists - just about anything that you can think of. By default bash stores every value as text and does not ask you to declare a type, so you need not keep track of what sort of data is stored where.
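
Here is a small sketch of these rules in action. The variable names and values are invented for illustration.

```shell
#!/usr/bin/env bash
NAME="bash user"    # quotes keep the space inside the value
COUNT=3             # a number is assigned exactly like text

# Variables are substituted inside double quotes.
GREETING="Hello, $NAME. You have $COUNT new messages."
echo "$GREETING"

# Braces show bash where the variable name ends.
FILE=report
FULLNAME="${FILE}_final.txt"
echo "$FULLNAME"
```

Without the braces, bash would look for a variable called FILE_final, which does not exist, and substitute nothing.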

It's possible to do basic math with variables, like so:

$ A=2
$ B=3
$ C=$((A+B))
$ echo $C
5
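
The $((...)) notation understands the usual operators, but it works on whole numbers only. A quick sketch, with arbitrary values:

```shell
#!/usr/bin/env bash
A=7
B=2
SUM=$((A + B))         # 9
PRODUCT=$((A * B))     # 14
QUOTIENT=$((A / B))    # 3, because integer division drops the remainder
REMAINDER=$((A % B))   # 1
echo "$SUM $PRODUCT $QUOTIENT $REMAINDER"
```

If you need fractions, hand the calculation to a program such as bc instead.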

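A simple use for variables is to save some data temporarily that you're going to use in a later command, possibly several commands. In this example, we'll store a wildcard pattern that matches a set of files, print each file, append them to a master file, and then delete them. The master file is deliberately given a name that does not match *.txt; otherwise cat would be asked to read the very file it is writing to, and rm would delete it along with the others.

$ FILES=*.txt
$ lpr $FILES
$ cat $FILES >> master.out
$ rm $FILES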
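
It helps to know what the variable actually holds here. Bash does not expand wildcards in an assignment, so FILES contains the pattern itself, and the pattern is turned into filenames each time $FILES is used without quotes. The sketch below demonstrates this in a temporary directory; the file names are made up.

```shell
#!/usr/bin/env bash
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch a.txt b.txt
FILES=*.txt

quoted=$(echo "$FILES")     # quotes suppress expansion: *.txt
unquoted=$(echo $FILES)     # expanded at the point of use: a.txt b.txt

touch c.txt
later=$(echo $FILES)        # the new file is picked up: a.txt b.txt c.txt

echo "$quoted"
echo "$unquoted"
echo "$later"

cd "$start_dir" && rm -r "$workdir"
```

This means the list of files can change between one command and the next if files are created or deleted in the meantime.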

So far variables might not seem terribly useful. In fact, their power doesn't become obvious until we start using constructs such as loops.

Loops

Computers are excellent at repetitious tasks. Yet all too often, because of limited user interfaces (especially the graphical interfaces that are the most popular), computer users find themselves doing mindless tasks, such as renaming a long list of files. This is a task that is ideal for an iterative loop. It is "iterative" because we step through each item in a list, doing a similar operation on each one. The two main types of loops are for loops and while loops, but let's focus on for loops as they are generally more useful at the command line.

The format of the for loop command is as follows:

for counter in list; do command; done

The words counter, list, and command are placeholders that you need to replace with your own values. Counter is the name of the variable you want to use to count with; it can be any single word. List is a list of items, one after the other, that you wish to step through, one by one. And command is what you actually want to do to each one. Since an example is worth a thousand words, take a look:

$ for i in 1 2 3; do echo Loop iteration: $i; done
Loop iteration: 1
Loop iteration: 2
Loop iteration: 3
$

The counter variable is called "i", and you may note that when referring to it in places other than right after the for we place a $ character in front of its name, so it becomes $i.

The list is 1, 2, and 3. List items should be separated by spaces.

The command is "echo Loop iteration: $i". This command gets run once for each member of the list, and each time through, the counter variable ($i) is replaced with the current list member. Thus, the first time it prints "Loop iteration: 1" because the value of $i is 1.
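
The same kind of loop can be spread over several lines, which is how it usually appears in a script file, and the body may contain more than one command. In this sketch the names n and TOTAL and the list of numbers are purely illustrative.

```shell
#!/usr/bin/env bash
TOTAL=0
for n in 10 20 30
do
    # Each pass adds the current list member to the running total.
    TOTAL=$((TOTAL + n))
    echo "Added $n, running total is $TOTAL"
done
echo "Final total: $TOTAL"
```

The last line prints "Final total: 60". Newlines take the place of the semicolons used in the one-line form.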

One neat trick is to use wildcard characters (* and ?) to loop through files in the current directory. As a simple example, we could print each filename, duplicating the effect of the command ls:

$ ls
file1.txt file2.txt
$ for i in *; do echo $i; done
file1.txt
file2.txt
$

That's not so useful, but let's say that we want to rename each file so that its name starts with the prefix "my-".

$ ls
file1.txt file2.txt
$ for i in *; do mv "$i" "my-$i"; done
$ ls
my-file1.txt my-file2.txt
$
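
Filenames that contain spaces are a common stumbling block for loops like this one, because an unquoted $i is split into several words. Wrapping the variable in double quotes avoids the problem. The sketch below uses invented file names in a temporary directory.

```shell
#!/usr/bin/env bash
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch "file1.txt" "my notes.txt"

# The wildcard is expanded once, before the loop starts, so the
# renamed files are not picked up and renamed a second time.
for i in *.txt
do
    mv "$i" "my-$i"
done

renamed=$(ls)
echo "$renamed"

cd "$start_dir" && rm -r "$workdir"
```

With the quotes removed, mv would receive "my" and "notes.txt" as two separate arguments and the rename would fail.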

Backticks

The magical backticks (the character `) are one of the best kept secrets of shell scripting, but at the same time one of the most useful. They allow you to run a secondary command in a subshell, and the output from that command is placed onto the command line on the spot! As usual, an example shows best.

$ cat masterfile.txt
file1.txt
file2.txt
file3.txt
$ cat `grep 2 masterfile.txt`
This is the contents of file 2.
$

In this example, the result of the command "grep 2 masterfile.txt" is going to be "file2.txt", the only line of masterfile.txt that contains a 2. Normally that would just be displayed to the screen, and at that point you could manually type "cat file2.txt" to see its contents. But you can do it in a single command with backticks; the output of the grep command ("file2.txt") is substituted on the command line, producing the command "cat file2.txt" for you!
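
Bash also accepts the notation $(command), which does the same job as backticks and is easier to nest. The sketch below recreates the master file in a temporary directory and compares the two forms. It uses plain grep, which prints the matching line; grep -l would print the name of the file being searched instead. The variable names are illustrative.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf '%s\n' file1.txt file2.txt file3.txt > "$workdir/masterfile.txt"

# Both forms capture the output of grep.
with_backticks=`grep 2 "$workdir/masterfile.txt"`
with_dollar=$(grep 2 "$workdir/masterfile.txt")

# Nesting: the inner substitution runs first, and its output
# is passed through tr by the outer one.
upper=$(echo "$(grep 2 "$workdir/masterfile.txt")" | tr a-z A-Z)

echo "$with_backticks"
echo "$with_dollar"
echo "$upper"

rm -r "$workdir"
```

The first two lines of output are both file2.txt, and the third is FILE2.TXT.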

Bringing It All Together

At this point, you need to explore these tools yourself in order to get a better understanding of their great flexibility. Provided below are a number of examples to aid in your exploration of these topics.

Converting a list of filenames into lower case

$ for i in *; do mv $i `echo $i | tr A-Z a-z`; done
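
The one-liner above asks mv to rename every file, including those whose names are already lower case, and mv will complain about those. A slightly more careful sketch is shown below; it quotes the variables so that names with spaces survive, and it uses an if test (not covered in this tutorial) to skip names that need no change. The file names are invented and live in a temporary directory.

```shell
#!/usr/bin/env bash
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch README.TXT Notes.txt already.txt

for i in *
do
    lower=$(echo "$i" | tr A-Z a-z)
    # Only rename when the lower-case name actually differs.
    if [ "$i" != "$lower" ]; then
        mv "$i" "$lower"
    fi
done

result=$(ls)
echo "$result"

cd "$start_dir" && rm -r "$workdir"
```

Be aware that two files whose names differ only in case would collide after conversion, and this sketch makes no attempt to detect that.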

Printing out only files which contain the text "PrintMe"

$ lpr `grep -l PrintMe *`

Determining the difference in length (number of lines) between two files

$ echo `cat file1.txt | wc -l` - `cat file2.txt | wc -l` | bc
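
The same calculation can be done without bc by combining input redirection with the $((...)) arithmetic shown earlier. This is just one possible alternative; the sample files are created in a temporary directory for the sake of the demonstration.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf '%s\n' one two three four five > "$workdir/file1.txt"
printf '%s\n' one two > "$workdir/file2.txt"

# Redirecting each file into wc makes it print the bare count
# with no filename attached, ready to be used in arithmetic.
DIFF=$(( $(wc -l < "$workdir/file1.txt") - $(wc -l < "$workdir/file2.txt") ))
echo "$DIFF"

rm -r "$workdir"
```

Here the first file has five lines and the second has two, so the script prints 3.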

Sending mail to a list of email addresses in a file

$ for addr in `cat email-addresses.txt`; do cat message.txt | mail -s "Hi there!" $addr; done

Archiving all logfiles to a file named after the current date

$ tar czvf logs-`date +%m%d%y`.tar.gz *.log
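
If you keep many such archives, you may prefer a date format that starts with the year, since the files then sort chronologically in a directory listing. The sketch below is one way to do that; the log file names are made up, and the variable names STAMP and ARCHIVE are only illustrative.

```shell
#!/usr/bin/env bash
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "started" > app.log
echo "failed" > error.log

STAMP=$(date +%Y-%m-%d)          # year-month-day
ARCHIVE="logs-$STAMP.tar.gz"

tar czf "$ARCHIVE" *.log         # create the compressed archive
listing=$(tar tzf "$ARCHIVE")    # list what went into it
echo "$ARCHIVE"
echo "$listing"

cd "$start_dir" && rm -r "$workdir"
```

Storing the name in a variable first also lets you reuse it, for example to list or copy the archive afterwards.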