Monday 31 October 2011

~/bin ~/lib and ~/include

I'm taking a brief interlude to explain fully what is going on with this setup. After this I will get going on actually using the system and (quite importantly) setting up python with easy-install etc. But first a little exposition on our motivation...

One thing that makes Unix different to Windows is that places matter. In fact this a pretty deep difference whose knock-on effects account for many of the differences between Windows and Unix-style systems. To simplify horribly, it's all to do with how the Operating System knows how to find things. Specifically header files and libraries, and even more specifically, libraries that are loaded at run time: dynamically loaded libraries ("dll"s) or shared object libraries ("so"s).

In Unix based systems, the os looks for so files and header files in certain "well known" places. Here I am using "well known" in a specific sense, it means that anyone writing software for a Unix system can rely on these folders / files existing. So when Unix needs a configuration file, or a program file or a library, it knows where to look - in one of these "well known" places. An example of this is /etc where all the configuration files live.

In Windows based systems, the files can be anywhere. There are certain "well known" places (such as c:\windows\system32) but other than that pretty much anything goes. So how does windows find things? Well, you can put your files anywhere, but you need to record where you put the file in a central location known as The Registry. So when Windows wants a file, it looks it up in the registry and then knows where to go and get it.

Note: anyone with serious in-depth knowledge of current os internals will be tearing their hair out and screaming in frustration at this out-of-date over simplification of things, but hopefully they will admit that it is a simple explanation of the basics. If not, feel free to correct me in the comments.

Now, when you are developing software in windows, you are probably using Microsoft Visual Studio. If not, you are probably using one of a handful of well known IDE's that is at least compatible with MSVS. This means that when you install a library that you want to use in your program, the installer can tell the registry where the library is, and it can tell MSVS where the headers are. Then in your project you select the headers you want to use, specify the library you want and bingo - it all compiles and runs.

Now Unix-based systems are different. For a start, there are more development environments than you can shake a stick at, and none of them use the same configuration. This is partly because of the way Unix works. You see, nothing needs to tell anything where to look, everyone knows that for a header file you look in /usr/include and for a library file you look in /lib or /usr/lib. So there is no need for a common configuration. Also for a distribution of Unix pretty much everything will use the same compiler - gcc. So if you configure gcc via environment variables, that will work whatever IDE you use.

So where am I going with all this - well, this was all well and good when the computer you used was administered by a sysadmin, who would make sure that all the libraries you needed were installed on the system, but nowadays you are most likely the admin of your computer.

"Great" you say, so I can just put the library I want into /usr/lib and /usr/include and everything will work!

Well, yes and no. What if you can't put things in /usr/lib, or what if you don't want to?

Why wouldn't you be able to? If you are working somewhere where they don't (for security reasons) give you admin rights to the machine, then maybe you can't write to those folders.

Why wouldn't you want to? If you are running a Unix distribution such as Ubuntu, you will have loads of tools for installing libraries on your machine (such as the fantastic Synaptic and Aptitude), and they will automatically put all the right files in the right places. So if you want to install a library and it comes in an Ubuntu package then everything is hunky dory. But what if the library isn't packaged? What if you want a more recent version of a library. Most advice on the internet will tell you to use sudo to put the files in /usr/lib but this is a bad idea. You can put stuff there and it will work, but Ubuntu doesn't know about or understand the changes you have made. So when a new version comes out Ubuntu won't be able to update things properly. It won't be able to clean up properly. In fact a million and one things could wrong down the line.

So I suggest avoiding the issue entirely.

The way to do this is to create your own "well known" places that you will look after yourself. These will live in your home directory and will act just like the real "well known" places except they will only ever be edited by you and it will be up to you to keep them clean and tidy. They will have precedence over the real places, so if you put some_program in ~/bin and there is already a version in /usr/bin, then the version in ~/bin will get run. This means you can install a library on the machine using Synaptic, and then also use a more recent version for your own code.

So how do we do this? If you follow the instructions that I have been giving you then you are already set up and ready to go!

But now you know why you were doing it!