Comparing Files

There are many times when you wish to know what has happened to a
file. You may be interested in comparing one version of a file to an
earlier one. Or you may need to check one file against a reference
file. Linux provides several tools for doing this, depending on how
deep a comparison you need to make.

The most common task involves comparing two text files. The tool of
choice for this task is diff. With diff, you can compare two files,
line by line. By default, diff will notice any differences between the
two text files, no matter how small. This could be as simple as a
space character being changed into a tab character from one file to
the next. The file will look the same to a user, but diff will find
that difference. The real power of diff comes from the options
available to ignore certain kinds of differences between files. In the
above example, you could ignore that change from a space character to
a tab character by using the option "-b" or
"--ignore-space-change". This option tells diff to ignore any
differences in the amount of whitespace from one file to the next. But
what about blank lines? The option "-B" or "--ignore-blank-lines"
tells diff to ignore any changes in the numbers of blank lines from
one file to the next. In this way, diff will effectively be only
looking at the actual characters and comparing them from one file to
the next. You have essentially narrowed the focus of diff to the
actual content.
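
As a small sketch of the two options above, the following script builds three throwaway files in a temporary directory (the file names are made up for illustration) and compares them with and without "-b" and "-B". It relies on diff returning a successful exit status only when it finds no differences.

```shell
# Throwaway files; the names are illustrative only.
workdir=$(mktemp -d)
printf 'hello world\nsecond line\n'   > "$workdir/spaces.txt"
printf 'hello\tworld\nsecond line\n'  > "$workdir/tabs.txt"
printf 'hello world\n\nsecond line\n' > "$workdir/blank.txt"

# Plain diff notices the space that became a tab.
if diff "$workdir/spaces.txt" "$workdir/tabs.txt" > /dev/null; then
    plain_result=same
else
    plain_result=differ
fi

# -b treats any run of whitespace as equivalent.
if diff -b "$workdir/spaces.txt" "$workdir/tabs.txt" > /dev/null; then
    b_result=same
else
    b_result=differ
fi

# -B ignores the extra blank line.
if diff -B "$workdir/spaces.txt" "$workdir/blank.txt" > /dev/null; then
    blank_result=same
else
    blank_result=differ
fi

rm -rf "$workdir"
echo "plain: $plain_result, with -b: $b_result, with -B: $blank_result"
```

On a system with GNU diff, the first comparison should report a difference and the other two should not.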

What if that is not good enough for your situation? You may be
comparing files where one was entered with all capitals on, for some
reason. Maybe the terminal being used was misconfigured. In any case,
you may not want diff to report simple differences in case as "real"
differences. In this case, you can use the option "-i" or "--ignore-case".
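
A quick way to convince yourself of this is to compare two small files that differ only in case. The sketch below uses made-up file names in a temporary directory.

```shell
workdir=$(mktemp -d)
printf 'First Line\nSecond Line\n' > "$workdir/mixed.txt"
printf 'FIRST LINE\nSECOND LINE\n' > "$workdir/capitals.txt"

# Without -i every line is reported as changed.
if diff "$workdir/mixed.txt" "$workdir/capitals.txt" > /dev/null; then
    default_result=same
else
    default_result=differ
fi

# With -i the two files compare as equal.
if diff -i "$workdir/mixed.txt" "$workdir/capitals.txt" > /dev/null; then
    ignore_case_result=same
else
    ignore_case_result=differ
fi

rm -rf "$workdir"
echo "default: $default_result, with -i: $ignore_case_result"
```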

What if you have files from a Windows box that you are working with?
Everyone who works on both Linux and Windows has run into the issue
with line endings on text files. Linux expects only a single newline
character while Windows uses a carriage return and a newline
character. diff can be told to ignore this with the option
"--strip-trailing-cr".

The output from diff can take a few different formats. The default
output contains only the lines which differ, along with the line
numbers where each change occurs. You can also ask for a number of
unchanged lines just before and just after each change. These extra
lines are called context, and are requested with the option "-c", or
with "-C" or "--context=" and a number of lines to use for context.
The option "-u" or "--unified" gives the same information in a more
compact form. This output can be used by the program patch to change
one file into the other. In this way, you can create source code
patches to upgrade code from one version to the next. diff will also
output differences between files that can be used by ed as a script by
using the option "-e" or "--ed". diff will also output an RCS format
diff by using the option "-n" or "--rcs". The other option is to print
out the differences in two columns, side by side. The option "-y" or
"--side-by-side" will let you see each file side by side with the
differences between them marked.
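
To see how the formats differ in practice, the sketch below runs the same comparison four ways on two made-up files. It also uses "-U", the unified-format counterpart of "-C". The "|| true" is only there because diff returns a non-zero status when the files differ.

```shell
workdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > "$workdir/old.txt"
printf 'alpha\nbeta\nGAMMA\ndelta\nepsilon\n' > "$workdir/new.txt"

# Default output: just the changed line and where it is.
normal_out=$(diff "$workdir/old.txt" "$workdir/new.txt" || true)

# One line of context on either side of the change.
context_out=$(diff -C 1 "$workdir/old.txt" "$workdir/new.txt" || true)

# The same information in unified format.
unified_out=$(diff -U 1 "$workdir/old.txt" "$workdir/new.txt" || true)

# Two columns, with changed lines marked between them.
side_out=$(diff -y "$workdir/old.txt" "$workdir/new.txt" || true)

rm -rf "$workdir"
printf '%s\n\n%s\n\n%s\n\n%s\n' "$normal_out" "$context_out" "$unified_out" "$side_out"
```

Redirecting the context or unified output to a file gives you something you can hand to patch later.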

The utility diff only compares two files. What if you need to compare
three files and see what changes exist moving from one to the others?
The utility diff3 comes to the rescue. This utility compares three
files and prints out the diff statements. Again, you can use the "-e"
option to print out a script suitable for the editor ed.
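
Here is a minimal sketch of diff3 at work, assuming the GNU version is installed. The three file names are made up for illustration. Only the third file has been changed, and diff3 marks that by putting a 3 after the "====" separator.

```shell
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/mine.txt"
printf 'alpha\nbeta\n' > "$workdir/older.txt"
printf 'alpha\nBETA\n' > "$workdir/yours.txt"

# Compare all three files in one pass.
diff3_out=$(diff3 "$workdir/mine.txt" "$workdir/older.txt" "$workdir/yours.txt" || true)

rm -rf "$workdir"
printf '%s\n' "$diff3_out"
```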

But what if you simply want to see two files and how they differ?
Another utility might be just what you are looking for: comm. With no
other options, comm takes two files and prints out three columns. The
first column contains lines unique to the first file, the second
column contains lines unique to the second file and the third column
contains lines common to both files. Keep in mind that comm expects
both files to be sorted already. You can selectively suppress each of
these columns with the options "-1", "-2" and "-3". They suppress
columns 1, 2 or 3, respectively.
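
Combining the suppression options is the usual way to pull out a single column. The sketch below uses two small, already sorted files with made-up names.

```shell
workdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$workdir/first.txt"
printf 'banana\ncherry\ndate\n'  > "$workdir/second.txt"

# Suppress columns 2 and 3: lines only in the first file.
only_first=$(comm -23 "$workdir/first.txt" "$workdir/second.txt")

# Suppress columns 1 and 3: lines only in the second file.
only_second=$(comm -13 "$workdir/first.txt" "$workdir/second.txt")

# Suppress columns 1 and 2: lines found in both files.
in_both=$(comm -12 "$workdir/first.txt" "$workdir/second.txt")

rm -rf "$workdir"
echo "only in first:  $only_first"
echo "only in second: $only_second"
echo "in both:"
printf '%s\n' "$in_both"
```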

While this works great for text files, what if you need to compare two
binary files? You need some way to compare each and every byte in each
file. The utility that can be used for this is called cmp. It does a
byte by byte comparison of two files. The default output reports the
byte offset and the line number of the first difference. If you are
interested in seeing what the byte values are, you can use the option
"-b". The "-l" option gives even more detail, printing out the byte
offset and the two byte values for every byte that differs.

Using these utilities, you can start to get a better handle on how
your files are changing. Here's hoping you keep control of your files.

SVN over SSH

There are many times when the only access to a server is over SSH. There are many reasons for this, including security. If you want to host an SVN repository on this machine, how do you access it? The first step is to create the repository. If you are creating a new one, then this is easy. Just log in to the server and do the following

svnadmin create /path/to/repo

If you already have a repository that you want to move to the server, you should use svnsync to do so. First, you will want to allow a user to have svnsync access with

#!/bin/sh
# Subversion passes the name of the user making the change as the third argument
USER="$3"
if [ "$USER" = "myusername" ]; then exit 0; fi
echo "Permission denied" >&2
exit 1

This goes in the file /path/to/repo/hooks/pre-revprop-change. Make sure that it is set as executable

chmod +x /path/to/repo/hooks/pre-revprop-change
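
Before relying on the hook, you can try it out by hand. Subversion calls pre-revprop-change with five arguments: the repository path, the revision, the user name, the property name and an action code. The sketch below writes a copy of the hook to a temporary file and calls it with sample values. The user name someoneelse and the property svn:sync-lock are only placeholders for the test.

```shell
workdir=$(mktemp -d)
hook="$workdir/pre-revprop-change"

# A copy of the hook, reading the user name from the third argument.
cat > "$hook" <<'EOF'
#!/bin/sh
USER="$3"
if [ "$USER" = "myusername" ]; then exit 0; fi
echo "Permission denied" >&2
exit 1
EOF

# Call it the way Subversion would: repository, revision, user, property, action.
if sh "$hook" /path/to/repo 0 myusername svn:sync-lock A 2>/dev/null; then
    sync_user=allowed
else
    sync_user=denied
fi

if sh "$hook" /path/to/repo 0 someoneelse svn:sync-lock A 2>/dev/null; then
    other_user=allowed
else
    other_user=denied
fi

rm -rf "$workdir"
echo "myusername: $sync_user, someoneelse: $other_user"
```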

You can then tell svnsync to initialize the mirror.

svnsync init file:///path/to/repo svn+ssh://username@svnmaster.domain.ext/newpath/to/repo

Once this initialization is done, you can sync the full data.

svnsync --non-interactive sync file:///path/to/repo

You can then interact with this new repository over SSH by using

svn+ssh://username@svnmaster.domain.ext/newpath/to/repo

as the connection details.

One problem that may come up when you move your SVN repository from one server to another is that your local working copy still points at the old location. There is a switch command that will allow you to fix this.

svn switch --relocate old-url new-url

This way, you don't need to toss away your current working copy and check out a fresh one. On Subversion 1.7 and later, the same operation is also available as svn relocate.

Listening on the command line

On Linux, there are tons of programs you can use to listen to audio, such as totem, rhythmbox, mplayer and vlc. One thing you may have noticed with each of these examples is that they all run in an X11 environment. What is a person to do if they want to listen to their favorite music on the command line? Well, let's take a look. Before we do, though, I want to stress that everything here will assume that you have a working audio subsystem on your machine. There is far too much variation in audio hardware to even begin to talk about how to set up audio on your machine.

Most distributions now use pulseaudio as the audio server. This provides a standard wrapper around the actual audio hardware. This way, software simply needs to talk to the pulseaudio server and won't need to worry about the messy details of how to talk to each sound card out there. The pulseaudio system also includes utilities that you can use on the command line. The most basic thing you will want to do is to play an audio file. To do this, you would use a command like

paplay --volume=32768 example.mp3

This will play the audio file example.mp3 at 50% volume (volume is set between 0, or silent, and 65536, or 100%). The utility paplay can play any audio format supported by the library libsndfile. So, you should be able to play most audio files you come across. If you don't have any audio files of your own, you can use the included utility parec. This program grabs raw audio data from your audio card's input, and dumps it to standard output. You can pipe this off to a file to save it for later. In order to play this raw data back, you can use the included utility pacat. It takes raw audio data and dumps it out to the audio card's output speaker.
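
Since the volume scale runs from 0 to 65536, turning a percentage into the value paplay wants is simple arithmetic. The two function names below are made up for this sketch. The second one is only a convenience wrapper and needs a working pulseaudio setup to do anything.

```shell
# Convert a percentage into the value paplay expects (65536 is 100%).
volume_for_percent() {
    echo $(( 65536 * $1 / 100 ))
}

# Hypothetical wrapper: play a file at the given percentage.
play_at_percent() {
    paplay --volume="$(volume_for_percent "$1")" "$2"
}

volume_for_percent 50    # prints 32768
```

With that in place, something like play_at_percent 25 example.wav would play the file at a quarter of full volume.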

What if you have a whole list of audio files you would like to listen to? There are several choices available to handle entire playlists of music for you from the command line. Two examples are cplay and moc. Both programs give you a file list on first starting up. From here, you can play individual files, or you can start constructing playlists to organize what audio files you are listening to. Both programs use shortcut keys to create, edit and otherwise manage your playlists. You can see the screenshots provided to get a feel for how they might look on your system.

Another audio task you might want to handle from the command line is to have your computer talk to you. Of course, once it can talk, it won't be long before it talks back. But anyway, there are several utilities available to give your computer a voice. These include recite, festival and espeak. The simplest is recite. It takes text in from standard input and outputs audio to the speakers. There aren't many options available. To speak a text file, you would simply execute

recite < example.txt

You can also simply execute recite, then start typing. Just remember that recite won't see the end of your typing until you hit Ctrl-D to mark the end of input. It will then speak the entire text you just typed in. The program festival starts to provide more options. To get a basic output of text to audio, you would execute

festival --tts example.txt

You also have the option of using other languages. The default is English, but you can also use Spanish, Russian or Welsh by giving the command line option "--language" followed by the language name. Beyond this, festival also uses Scheme as a scripting language, which opens it up to a huge amount of modification. To take advantage of that, you'll need to spend some time reading the manual at http://www.cstr.ac.uk/projects/festival/manual/.

The last utility is espeak. Here you have several other command line options available. You can set the amplitude with "-a" (0 to 200, default is 100), the pitch with "-p" (0 to 99, default is 50) and the speed with "-s" (in words per minute, default is 160). Each of these takes a number as its argument. There are several voices available. You can find out which ones are installed on your system by using the command line option "--voices". Once you select one, use the command line option "-v" followed by the voice name to use it. You can then write the speech out to a WAV file by using the command line option "-w" followed by a file name. A full command line would look like

espeak -v en-scottish -w example.wav -f example.txt

Then you can play it using

paplay example.wav

Now your computer will speak in a Scottish accent, sort of. Speech synthesis is still not perfect.

Now that you have all of these audio files lying around, you may want to do some processing on them. One of the more useful tools to do this is sox. One utility included in the package is called soxi. It gives you file information about your audio files. In its most basic form, you can use sox to change the file format of an audio file by simply running

sox example.wav example.au

sox is smart enough to use the file extensions to figure out that you are intending to convert the file example.wav (in wav format) to example.au (in the Sun au format). You can also do processing on the audio through command line options. The number of bits per sample to use can be set through "-b" followed by a number. The number of channels to use can be set through "-c" in the same way. So, setting it to mono would be "-c 1", or stereo would be "-c 2". You can set the sample rate, in Hertz, through "-r". There are also many options that apply filters to the audio files. Please go read the manual page to see just how much you can do with sox. One example command line you might use is

sox example.wav -b 8 -c 1 -r 8000 example.au

This example command takes an input file called example.wav and converts it to 8 bits per sample, mono (or 1 channel), with a sample rate of 8000 Hz and writes it out to a file with an AU format. The sox package also contains two other utilities, play and rec. With these, you have another way of playing audio files and recording audio to a file. You also have the full spectrum of processing and filters available from sox.
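
If you want to check the result of a conversion like this, soxi can report individual properties of the output file. The sketch below is guarded so that it does nothing on a machine without sox installed. To avoid depending on an existing recording, it first generates a one second test tone with the synth effect, which is not covered above.

```shell
sox_demo=skipped
if command -v sox >/dev/null 2>&1 && command -v soxi >/dev/null 2>&1; then
    workdir=$(mktemp -d)

    # Generate a one second, 440 Hz stereo test tone to work with.
    sox -n -r 44100 -c 2 "$workdir/example.wav" synth 1 sine 440 gain -3

    # The same conversion as above: 8 bits, mono, 8000 Hz, AU format.
    sox "$workdir/example.wav" -b 8 -c 1 -r 8000 "$workdir/example.au"

    # Ask soxi for the sample rate and the channel count of the result.
    sox_demo="$(soxi -r "$workdir/example.au") $(soxi -c "$workdir/example.au")"

    rm -rf "$workdir"
fi
echo "$sox_demo"
```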

You may ask why I picked this particular example. This is so that I can show you one last interesting trick. On Linux systems that still provide the device file /dev/audio, you can simply cat this output file (with this specific file format) directly to it. This dumps the output directly to the sound card. So, if you want to be sneaky, you can convert some files to the AU file format with the sox command above, copy them over to your target machine, and simply cat them to /dev/audio when you want to make a nuisance of yourself. Hopefully you now have some ideas on how to enjoy your music, and how to play with audio files, without the overhead of the GUI applications available.

Local Subversion Repositories

Everyone should be using some kind of version control system for their own files. There are tons of options available, including mercurial, git, CVS and subversion. I've recently had to move some code to subversion so that I can share code with some other groups. You don't need to have a server set up out there somewhere on the interwebs. You can set up your own private repository on your own machine and use that to track changes to your code and documents. The first step is to create space on your drive

mkdir ~/svn

After that, you need to initialize the actual repository

svnadmin create ~/svn

That's it. Now you have a repository that you can use. The very first thing is to import your current code.

svn import /home/jbernard/temprepos/my_sources file:///home/jbernard/svn/my_sources -m "Initial Import"

After this initial import, you can checkout a working copy of the code to work with.

svn checkout file:///home/jbernard/svn/my_sources

All of the other commands assume that you are actually in the subdirectory holding the checked out code. You can check the current status of your code with

svn status

Any edited files will be flagged. You can check in these changes with

svn commit

where it will ask you for a comment about the commit. You can grab any changes committed by someone else with

svn update
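
To see the whole cycle in one place, here is a sketch that runs it against a throwaway repository in a temporary directory. All of the paths and file names are made up for the demonstration. The commit message is given with "-m" so that no editor opens, and the guard at the top skips everything on machines without Subversion installed.

```shell
svn_demo=skipped
if command -v svnadmin >/dev/null 2>&1 && command -v svn >/dev/null 2>&1; then
    work=$(mktemp -d)

    # Create the repository and some sources to import.
    svnadmin create "$work/repo"
    mkdir "$work/my_sources"
    echo 'first draft' > "$work/my_sources/notes.txt"
    svn import -q "$work/my_sources" "file://$work/repo/my_sources" -m "Initial Import"

    # Check out a working copy, edit a file and look at the status.
    svn checkout -q "file://$work/repo/my_sources" "$work/wc"
    echo 'second draft' >> "$work/wc/notes.txt"
    svn status "$work/wc"

    # Commit the change, then bring the working copy up to date.
    svn commit -q "$work/wc" -m "Update notes"
    svn update -q "$work/wc"

    # Count the revisions recorded in the repository.
    svn_demo=$(svn log -q "file://$work/repo" | grep -c '^r')

    rm -rf "$work"
fi
echo "revisions: $svn_demo"
```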

I'll be adding more commands here as I think of them.

WiFi on the Command Line

More people than ever are using wireless networks as their primary networking medium. There are great programs available under X11 that give users a graphical interface to their wireless cards. Both Gnome and KDE include network management utilities, and there is a desktop-environment-agnostic utility called wicd which also offers great functionality. But what do you do if you aren't running X11 and you want to manage your wireless card? I won't be covering how to get your card installed and activated. For that, you should look at projects like madwifi or ndiswrapper, in order to properly configure your system and card. In the rest of this piece, I'll assume that you have your card properly installed and configured, and that it is called wlan0. Also, most of the utilities mentioned below have to talk directly to your wireless card, or at least the card driver, and so will need to be run with root privileges. Just remember to use sudo.

The first step is to see what wireless networks are available in your area. There is a utility called iwlist which provides this information. This utility can give you all sorts of information about your wireless environment. The first thing to do is to scan your environment for available networks. You would run

sudo iwlist wlan0 scan

and get output resembling

Cell 01 - Address: 00:11:22:33:44:55
ESSID:"network-essid"
Mode:Master
Channel:11
Frequency:2.462 GHz (Channel 11)
Quality=100/100 Signal level:-47dBm Noise level=-100dBm
Encryption key:off
.
.
.

The details (address and essid) have been changed to protect the guilty. Also, the ... is extra output that may or may not be available, depending on your hardware. You will get a separate cell entry for each access point within range of your wireless card. For each access point, you can find out the hardware address, the essid and the channel it is operating on. Also, you can find out what mode the access point is operating in (whether master or ad-hoc). In most cases, you will be most interested in the essid, and what encryption is being used.
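
Since the essid and the encryption setting are usually what you are after, a little awk can boil the scan down to one line per access point. The sketch below works on a saved sample rather than a live scan, so it runs without a wireless card. The second cell in the sample is invented to show more than one entry, and the function name summarize_scan is made up as well.

```shell
# Sample scan output; the second cell is made up for illustration.
scan_sample='Cell 01 - Address: 00:11:22:33:44:55
ESSID:"network-essid"
Mode:Master
Channel:11
Encryption key:off
Cell 02 - Address: 66:77:88:99:AA:BB
Encryption key:on
ESSID:"other-net"'

# Print "essid encryption" for every cell, whatever order the fields come in.
summarize_scan() {
    awk '
        function emit() { if (seen) print essid, enc }
        /Cell [0-9]+ - Address/ { emit(); essid = ""; enc = ""; seen = 1 }
        /ESSID:/ {
            essid = $0
            sub(/^[^"]*"/, "", essid)   # drop everything up to the opening quote
            sub(/"$/, "", essid)        # drop the closing quote
        }
        /Encryption key:/ {
            enc = $0
            sub(/.*Encryption key:/, "", enc)
        }
        END { emit() }
    '
}

summary=$(printf '%s\n' "$scan_sample" | summarize_scan)
printf '%s\n' "$summary"
```

On a real machine you would feed it the live output instead, along the lines of sudo iwlist wlan0 scan | summarize_scan.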

Once you have information about what is available in your immediate environment, you need to configure your wireless card to use one of these access points. You can use the utility iwconfig to set these parameters for your wireless card. The first thing you will want to set is the essid, which identifies the network access point you are interested in. You would run

sudo iwconfig wlan0 essid network-essid

Depending on your card and its driver, you may have the option to set the essid to the special value "any". In this case, your card will pick the first available access point. This is called ESSID "promiscuous mode".

You may also need to set the mode to be used by your wireless card. This will depend on your network topology. You may have a central access point that all of the other devices connect to. Or you may have an ad-hoc wireless network, where all of the devices communicate as peers. You may wish to have your computer act as an access point. If so, you can set the mode to master using iwconfig. Or, you may want to simply sniff what is happening in the air around you. You can do this by setting the mode to monitor and passively monitor all packets on the frequency your card is set to. You can set the frequency, or channel, by running

sudo iwconfig wlan0 freq 2.422G

or

sudo iwconfig wlan0 channel 3

You can also set other parameters, but you should only consider doing so if you have a really good reason. One option of interest is the sensitivity threshold, set with the sens parameter. This defines how sensitive the card is to noise and signal strength. You can also set the behavior of the retry mechanism for the wireless card. You may need to play with this in very noisy environments. You can set the maximum number of retries with

sudo iwconfig wlan0 retry 16

or set the maximum lifetime to keep retrying to 300 milliseconds with

sudo iwconfig wlan0 retry lifetime 300m

In a very noisy environment you may also need to play with packet fragmentation. If entire packets can't make it from point to point without corruption, your wireless card may have to break packets down into smaller chunks to try and avoid this. You can tell the card what to use as a maximum fragment size with

sudo iwconfig wlan0 frag 512

This value can be anything less than the size of a packet. The last thing you may need to run is

sudo iwconfig wlan0 commit

Some cards may not apply these settings changes immediately. In these cases, you'll need to run this command to flush all pending changes to the card and get them applied.

Two other commands that may prove useful are iwspy and iwpriv. If your card supports it, you can collect wireless statistics by using

sudo iwspy wlan0

The second command, iwpriv, gives you access to optional parameters for your particular card, while iwconfig is used for the generic options available. If you run iwpriv without any parameters

sudo iwpriv wlan0

it will list all available options for the card. If there are no extra options, then you will get an output looking like

wlan0 no private ioctls

To set one of these private options, you would run

sudo iwpriv wlan0 private-command [private parameters]

Now that you have your card configured and connected to the wireless network, you'll need to configure your networking options to actually use it. If you are using DHCP on the network, you can simply run dhclient to query the DHCP server and get your IP address and other network settings. If you wish to set these options manually, you can do so through the command ifconfig. I would suggest that you give the man page for ifconfig a read. Hopefully this article will help those on the command line to use their wireless cards and networks.
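
Putting the pieces together, a connection on a DHCP network comes down to a short sequence of commands. The sketch below collects them in a function. The names run, connect_wifi and DRY_RUN are made up for this example. By default it only prints what it would do, so you can read the plan before letting it touch your card by setting DRY_RUN=0.

```shell
# Hypothetical helper: print each command instead of running it unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Associate with a network, apply the settings, then ask DHCP for an address.
connect_wifi() {
    iface=$1
    essid=$2
    run sudo iwconfig "$iface" essid "$essid"
    run sudo iwconfig "$iface" commit
    run sudo dhclient "$iface"
}

plan=$(connect_wifi wlan0 network-essid)
printf '%s\n' "$plan"
```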

Matlab OS Checks

Matlab, at least on Linux, is pretty firm about which distribution versions it will run on. You may see glibc warnings when you try to run it on older distributions. The catch is that, as long as the distro isn't too old, Matlab will probably run fine. When you run Matlab interactively, you can simply accept the warning and continue on. But this doesn't work if you want to run scripts in batch mode. To get around this, you can set an environment variable to skip the OS check. If you use bash, simply use

   export OSCHECK_ENFORCE_LIMITS=0

This should let you go ahead and run your script without pausing. Hope this helps.
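
In a batch job, the export and the Matlab call would normally sit together in one script. Here is a rough sketch of what that might look like. The function name run_matlab_batch and the script name my_script.m are made up, and the exact startup options you want may differ between Matlab versions.

```shell
# Skip the OS check for every Matlab started from this shell.
export OSCHECK_ENFORCE_LIMITS=0

# Hypothetical helper: run one script without a display, then quit.
run_matlab_batch() {
    matlab -nodisplay -nosplash -r "run('$1'); exit" < /dev/null
}

# Example call (my_script.m is a made-up name):
# run_matlab_batch my_script.m
```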

PuTTY timeouts

Some of my users were having problems with SSH connections dropping. If you use PuTTY, don't forget to go into the Connection category of the configuration dialog and set "Seconds between keepalives" to a non-zero value.