5.5. Using Psyco

Psyco (see []) is a kind of specialized compiler for Python that typically accelerates Python applications with no change in source code. You can think of Psyco as a kind of just-in-time (JIT) compiler, a little bit like Java's, that emits machine code on the fly instead of interpreting your Python program step by step. The result is that your unmodified Python programs run faster.

Psyco is very easy to install and use, so in most scenarios it is worth giving it a try. However, it only runs on Intel 386 architectures, so if you are using another architecture you are out of luck (at least until Psyco supports yours).

As an example, imagine that you have a small script that reads and selects data over a series of datasets, like this:


	    from tables import openFile

	    def readFile(filename):
	        "Select data from all the tables in filename"

	        fileh = openFile(filename, mode = "r")
	        result = []
	        # Walk every Table node hanging from the root group
	        for table in fileh("/", 'Table'):
	            # Accumulate the selected values from each table
	            result.extend([ p['var3'] for p in table if p['var2'] <= 20 ])

	        fileh.close()
	        return result

	    if __name__=="__main__":
	        print readFile("myfile.h5")
	  

In order to accelerate this piece of code, you can rewrite your main program to look like:


	    if __name__=="__main__":
	        import psyco
	        psyco.bind(readFile)
	        print readFile("myfile.h5")
	  

That's all! From now on, each time you execute your Python script, Psyco will apply its sophisticated algorithms to accelerate your calculations.
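Because Psyco is not available on every platform, you may want your script to keep working, just without acceleration, where Psyco cannot be imported. The following is only a minimal sketch of that idea: the helper name `bind_if_available` and the stand-in function `select_values` (which mimics the selection in `readFile` on in-memory rows) are hypothetical and not part of Psyco or PyTables; `psyco.bind` is the only Psyco call used.

```python
def bind_if_available(func):
    """Ask Psyco to compile func; return True if Psyco was found, else False."""
    try:
        import psyco  # only installable on the platforms Psyco supports
    except ImportError:
        return False  # no Psyco here: func simply runs interpreted
    psyco.bind(func)
    return True


def select_values(rows):
    """Hypothetical stand-in for readFile: same selection, on in-memory rows."""
    return [r['var3'] for r in rows if r['var2'] <= 20]


if __name__ == "__main__":
    accelerated = bind_if_available(select_values)
    rows = [{'var2': 10, 'var3': 1.5}, {'var2': 30, 'var3': 2.5}]
    print(select_values(rows))
```

With this arrangement the same script can be shipped to machines with and without Psyco, and the return value tells you whether acceleration was actually requested.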

Figures 5.15 and 5.16 show how much I/O speed improvement you can get by using Psyco. Looking at these figures, you can judge whether these improvements are of interest to you. In general, if you are not going to use compression, you will benefit from Psyco if your tables are medium sized (from a thousand to a million rows), and this advantage will progressively disappear as the number of rows grows well beyond one million. However, if you use compression, you will probably see improvements even beyond this limit (see section 5.3). As always, there is no substitute for experimentation with your own dataset.
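One way to run such an experiment is to time the same function before and after binding it with Psyco. The sketch below is an assumption-laden illustration rather than the benchmark behind the figures: `best_time` and `select_values` are hypothetical names, the rows are synthetic in-memory dictionaries instead of a real table, and the Psyco branch is skipped when Psyco is not installed.

```python
import time


def best_time(func, args=(), repeat=3):
    """Return the shortest wall-clock time (in seconds) over `repeat` calls."""
    best = None
    for _ in range(repeat):
        start = time.time()
        func(*args)
        elapsed = time.time() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


def select_values(rows):
    """Hypothetical stand-in for the selection done in readFile."""
    return [r['var3'] for r in rows if r['var2'] <= 20]


if __name__ == "__main__":
    # Synthetic data; replace with a read of your own dataset.
    rows = [{'var2': i % 40, 'var3': float(i)} for i in range(100000)]
    plain = best_time(select_values, (rows,))
    try:
        import psyco
        psyco.bind(select_values)
        bound = best_time(select_values, (rows,))
        print("plain: %.4f s, with Psyco: %.4f s" % (plain, bound))
    except ImportError:
        print("plain: %.4f s (Psyco not installed)" % plain)
```

Taking the best of several runs reduces the influence of one-off costs, such as the time Psyco spends compiling the function on its first call.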

Figure 5.15. Writing tables with/without Psyco.

Figure 5.16. Reading tables with/without Psyco.