Friday, July 24, 2009


Ive just uploaded psyco.leo, a file containing the major C sources for psyco. This is the file I will use to study the psyco sources. So can you. You can find the file here.

This is a large (1.3 mb), self-contained file: it will not affect any of actual psyco files.

P.S. I created this file using a script that scans a directory and its sub-diretories, creating @auto nodes for all files in a given list of extensions, in this case, .c and .h files. Reloading psyco.leo populated the @auto nodes with the contents of the corresponding external files. I then removed the @auto prefixes. Finally, saving psyco.leo caused the contents of all files to be written to psyco.leo itself.

P.P.S. Those of you who are Leo users know that @auto imports the external file every time the .leo file is opened. @auto has a series of strict consistency checks to ensure that writing an imported @auto file will yield the same file. These checks failed for some of the files in the psyco sources because of intermixed blanks and tabs. When those checks fail, the @auto import code puts an @ignore directive in the top-level @auto node.

Eventually, I may want to regularize the whitespace so that @auto doesn't complain. This could be done simply by removing the @ignore directives and saving the .leo file. That is, assuming that the version of psyco.leo still contained @auto nodes...

Wednesday, July 22, 2009

Hello world in psyco

Aristotle said (presumably in Greek), "that which we learn, we learn by doing". In other words, there comes a point, rather quickly, when reading the manual (or even the code) is mostly waste of time :-) The only way, imo, to understand a complex program like pypy or psyco, is to run the program. So installing from sources is imperative, because we want to change the source and see what happens :-)

I've just built psyco with a "hello world" message at startup. Some notes:

- There is no point in trying to build the docs. You will simply get the psyco guide And a good thing too. The make file for the docs using something called mkhowto. Googling shows this may be part of Python tools, but there is no obvious way to get mkhowto on ubuntu.

- It looks like running "sudo python" install will do a proper build (make) of changed sources. The docs say this:

As usual, other commands are available, e.g.

    python build_ext -i
will compile the C source and put the result directly into the py-support/ subdirectory (no administrator priviledge is required).
[end quote]

In my case, I don't mind destroying psyco by mistake, so maybe I'll just go with the plain vanilla install.

Anyway, I now have a version of psyco that I can play with. After 8 years working only with Python, I had forgotten quirks of C and Make, but the irritations, though real, are minor.

The next step will be to investigate the existing tracing and debugging capabilities of psyco.

One "significant" (if hardly earth-shaking) improvement to psyco would be to add Sherlock tracing. This would be good for development: it allows enabling and disabling of traces from the command line with minimal overhead during testing and no overhead in production.

Tuesday, July 21, 2009

What success would look like

This post will summarize what I consider the criteria for success for projects such as pypy and psyco. I was going to say that this post will be a brief summary, but interesting ideas intruded :-)

You could say that this post distills a series of my questions and answers given by the psyco folks several years ago. As I consider these questions now, I'm surprised by how my intuitions seem to have changed in the interval.

To state the conclusions first: to be considered successful, both psyco and pypy would at least have to meet the performance of CPython, Python's default interpreter.

As the jit article should make clear, there is every reason to believe that this goal can be accomplished. Indeed, jits routinely succeed in this sense. And its not hard to see why. Even though CPython is written in C, it executes the exact same code every time it interprets a particular bytecode. It's ignorant of context, and stays ignorant. This ignorance is costly: it means that it must execute potentially unnecessary tests on the type of of arguments to the bytecode, including arguments on the stack. Worse, there is no way to optimize operations for arguments of known types, as discussed in the runtime-specialization article.

So it seems that out-performing CPython is a fairly reasonable goal to attempt.

However, there same strategies could conceivably produce gains that seem counter-intuitive to old-timers like me. That is, one can imagine situations in which runtime specialization will do better than even C or assembly languages codes that do no take advantage of information available at run time.

I have a hunch there is an important Aha lurking here. Before trying to say what that is, let me recommend to you the following paper:
Transparent Dynamic Optimization:The Design and Implementation of Dynamo

When I first read this paper, I thought that what they had accomplished was clearly impossible, namely to use software to speed up hardware. All my intuitions said that hardware operations were orders of magnitude faster than software operations, and for that reason, trying to intervene with software was doomed to failure. When I read the paper, my first thought was, "how did anyone think to do something this clever?"

But my intuitions were quite wrong. True, software is much slower than hardware, but in some sense this is irrelevant. In some cases, optimizing the instructions to be presented to the CPU is so valuable that it is worth significant extra effort, even if that effort takes place in software.

This is pretty close to the heart of the matter, so let's look at this from several other perspectives. Originally, I had thought of the situation as a matter of accounting. This isn't wrong, but it may be misleading. Considered as an accounting problem, one asks whether the optimization effort (done by software) can be "amortized" over the potential savings. Considered in this way, perhaps the answer is less clear than it might be. But notice, there is ample experimental evidence that the effort can be amortized.

But what if I simply stop fixating on the notion that software is much slower than hardware? Is there another point of view that makes the desired speedups seem not so outlandish? My intuition tells me there is, although it's not completely clear.

Let's try some ideas and see where they lead.

First, let's focus on the improved code, not on how long it takes to improve the code. Clearly, the improved code could be arbitrarily better than the unimproved code. For example, the improved code could be in a loop that is executed a huge number of time.

Second, let's not assume that "dispatching" to improved code or caching the improved code, or stitching together the improved code are necessarily time consuming operations. We might find clever ways of doing these operations very quickly. We might even find ways of eliminating these "overhead" operations entirely.

Lets go back again and focus on the improved code. As a thought experiment, lets ask, "How good could the improved code get?" The beauty of thought experiments is that we can rely on magic as needed. So let's assume some garden-variety magic and assume that all the overhead that formerly we were so worried about has magically just disappeared. Is this such an outrageous thing to assume? I think not. You could imagine it as being a species of loop unwinding combined with type specific specializations.

Furthermore, lots of information can be held implicitly in specialized code. The more I think about this, the more it seems like the master key to optimization. Indeed, all specialized code could be considered to be (or have) implicit information simply by virtue of the fact that the specialized code is a valid code improvement. In other words, if we execute the improved code there are a raft of underlying assumptions about context that must, in fact, be true for the improved code to be valid.

But this implicit information might indeed be sufficient to perform further operations. In particular, it may be sufficient to eliminate explicit overhead operations such as fetching code from a cache, linking to other improved code snippets, etc., etc.

In other words, we might be able to make any knowledge-based "magic" a reality.

I'm not entirely convinced that this is the best demonstration that large speedups are possible, but I think it may be good enough. OTOH, I have one of those feelings that an even better point of view is lurking nearby. For example, one could propose the following hypothesis:

Given enough optimization, we can treat all implicit knowledge as if it were explicitly available at no additional cost.

This still doesn't seem like a perfectly convincing argument, but it will have to do for now...I think you can sense the excitement promised by psyco and (even more) in pypy.


pypy and psyco

The first exploration will concern two projects that aim to speed up Python, namely pypy and psyco. The plan is to learn one or both of these projects in enough detail to make at least one significant contribution. It's a big task, and there will presumably be many posts on this topic.

The first step is to get some idea of what these projects are all about. The Wikipedia entries give excellent introductions:

Each project might be called a jit:

The two projects are very different. There are many "fancy" ways to describe the differences, but the most obvious difference is:

- psyco is written in C (mostly)
- pypy is pure python.

With psyco, what you see is what you get (and all that you get). In contrast, pypy works on any python code (including itself), so all optimizations within pypy will improve pypy itself.

That's enough for now. The next post will discuss what we might reasonably expect from these projects. That is, we will look at what "success" might mean.


About this blog

This blog will chronicle various explorations re computers, science, engineering and whatever else interests me.