This post will conclude the discussion of how I converted Leo to Python 3k. In fact, the work was mostly complete several weeks ago. Vacation got in the way of writing up these notes.
The biggest surprise was that almost all the surprises were good surprises. Aside from the complications previously discussed, porting Leo was a matter of running Leo and fixing the obvious problems reported by Python itself.
To recap, it took just a few days to get Leo to the point at which it could run its own unit tests with Python 3k. After that, it took just a few more hours, spread over several days, to get all the unit tests to pass.
Yesterday I completed the one and only "tricky" part of the port. This was a library problem. I won't bore you with the details: the problem arose from a mis-reading of the Python 3k docs. Today, as I was writing this post, I discovered my mistake and was able to clean up an ugly workaround.
So that's it. Porting a major app like Leo from Python 2.x to Python 3.x was almost completely straightforward.
Edward
Friday, December 18, 2009
Sunday, December 6, 2009
Converting Leo to Python 3k: tactics
In this posting I'll discuss how to create code that will compile on both Python 2.x and Python 3.x. This is mostly straightforward, but there are some interesting tactical details.
1. The g.isPython3 constant
Leo uses the isPython3 constant to distinguish between old (Python 2.x) and new (Python 3.x) code:
isPython3 = sys.version_info >= (3,0,0)
This constant is defined in only one place, in Leo's leoGlobals module, known as 'g' throughout Leo's code. All of Leo's modules import this module before any other imports as follows:
import leo.core.leoGlobals as g
Leo distinguishes old and new code this way:
if g.isPython3:
    << new code >>
else:
    << old code >>
Naturally, we want to keep this kind of code to a minimum by refactoring common instances of this pattern into new functions. In Leo, the obvious place for these new functions is in g, the leoGlobals module.
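For example, a version-dependent detail like dict iteration can be hidden behind one tiny function. The following is a hypothetical sketch, not Leo's actual code:

if isPython3:
    def dictKeys(d):
        return list(d.keys())  # keys() returns a view in Python 3.x.
else:
    def dictKeys(d):
        return d.keys()  # keys() returns a list in Python 2.x.

Callers then write g.dictKeys(d) everywhere and never test g.isPython3 themselves.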
2. Print statements and g.pr
Print statements with a trailing comma pose a special problem. Such statements are syntactically invalid in Python 3.x, so there is *no way* to use the pattern above. Late last year I converted almost all of Leo's print statements to the g.pr function. This conversion was the single biggest change to Leo's code base for Python 3.x.
Theoretically, I only needed to eliminate print statements with trailing commas, but even "normal" print statements pose problems. In Python 3.x, 'print(a,b)' produces the same output as 'print a,b' produces in Python 2.x. But in Python 2.x, 'print(a,b)' prints the *tuple* (a,b).
Yes, there is a workaround: translate 'print a,b' to 'print("%s %s" % (a,b))'. But I thought this was too ugly. Besides, defining g.pr gave me the opportunity to use a clever convention that supports automatic translation of some, but not all, arguments. The convention is just this: g.pr translates only *odd* arguments. For example:
g.pr(a,b,c,d,e)
prints the translation of a, c and e, but prints b and d as they are, untranslated. To suppress the translation of the first argument, use an empty string as the first argument:
g.pr('',a)
Imo, this convention is much better than the typical convention of using _(a) to denote the translation of a. Besides being cleaner visually, it is often exactly what is desired anyway. For example:
g.pr('can not open', fileName)
does exactly the right thing: we want to translate 'can not open' but we should not translate fileName!
In short, removing print statements from Leo makes a virtue out of necessity. A few print statements do remain in Leo, but they are just for debugging and are seldom used. It doesn't matter whether those statements produce the same results in Python 3.x as they do in Python 2.x.
One more detail. There is no way to use 'print' in the g.pr function so that it compiles on both Python 2.x and Python 3.x. Instead, the g.pr function uses sys.stdout.write. Alas, sys.stdout.write is not precisely equivalent to print in all situations, but this can not be helped.
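For the curious, here is a minimal sketch of what g.pr might look like. It is not Leo's actual code; in particular, the translate function below is a placeholder for whatever translation machinery (gettext, say) is actually in use:

import sys

def translate(s):
    return s  # Placeholder: real code would call a gettext-style function.

def pr(*args):
    result = []
    for i, arg in enumerate(args):
        s = '%s' % (arg,)
        if i % 2 == 0:  # Even index == odd (1st, 3rd, 5th...) argument.
            s = translate(s)
        result.append(s)
    sys.stdout.write(' '.join(result) + '\n')

Because pr uses sys.stdout.write rather than print, this compiles unchanged on both Python 2.x and Python 3.x.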
3. Unicode constants
Unicode constants of the form u'whatever' are invalid in Python 3.x. The workaround is code such as the following:
if isPython3:
    f = str
else:
    f = unicode
aConstant = f('whatever')
This kind of code is common enough to have been refactored:
aConstant = g.toUnicode('whatever')
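Here is a sketch of how such a helper could be written so that the same file compiles on both versions (a simplified sketch; Leo's actual toUnicode handles more cases):

if isPython3:
    def toUnicode(s, encoding='utf-8'):
        if isinstance(s, bytes):
            return s.decode(encoding)
        return s  # Already a (unicode) str in Python 3.x.
else:
    def toUnicode(s, encoding='utf-8'):
        if isinstance(s, unicode):
            return s
        return unicode(s, encoding)

The reference to unicode in the second branch is harmless in Python 3.x: that branch is never executed, and names inside a def are only resolved when the function is called.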
I'll be saying more about unicode issues in later posts.
Friday, December 4, 2009
Converting Leo to Python 3k: strategy
In the first post of this series, I stated the goal of this project: to create a common code base that will run Leo on both Python 2.x and Python 3.x. I am confident this goal will be accomplished because the following 3-step strategy guarantees success:
Step 1: Create a common code base that compiles on both Python 2.6 and Python 3.1. I completed this step yesterday, although one or two minor issues remain. Leo continues to run with Python 2.6 as it always has, and all of Leo's existing unit tests pass with the new code base when run with Python 2.6.
Step 2: Run the new code using Python 3.1, fixing bugs as they are found, until Leo executes its startup code without crashing. I completed this step yesterday as well.
Step 3: Run all of Leo's unit tests using Python 3.1. When this step is complete, there will be little or nothing left to do: Leo's unit tests cover almost all the essential features of Leo, except for some gui-related issues that already appear to work properly.
This strategy is guaranteed to work: porting Leo to Python 3.x is a low-risk project.
Only tactical details remain. Most details need no further comments, but I'll discuss two tactical issues in further posts:
A. How Leo's code can be made to compile in Python 2.6/3.1.
B. How Leo handles unicode in both Python 2.6/3.1.
The second is more important theoretically, the first more difficult practically.
P.S. A word or two about timing. Leo depends on Qt, so I had to wait until Qt itself supported Python 3.x before I could attempt to finish steps 2 and 3 above. PyQt first supported Python 3.x in version 4.6, and it is now at version 4.6.2, so I can be fairly confident that Qt will handle Python 3.x well.
I restarted this project yesterday after reading a comment praising Python 3.x. My guess is that 2010 will be the year the Python world moves decisively to Python 3.x.
Converting Leo to Python 3k: goals
This is the first in a series of posts that will describe how I am converting my Leo app to run on Python 3k. I will be writing these posts before the project is complete so that the details are fresh in my mind. But the intention is to say something that will be generally useful to anyone contemplating a similar project: I'll keep Leo-specific details to a minimum. My emphasis will be on strategy and tactics, not code-level details.
The one and only goal of this project is straightforward: the final product will be code that compiles and runs on both Python2k and Python3k without any modification whatsoever. A common code base is an absolute requirement, for two reasons:
1. Leo is under active development using bzr. It would be intolerable to attempt to support multiple code bases for any length of time.
2. Python's 2to3 code-conversion tool does not begin to have the sophistication needed to automatically convert Leo's code. But even if 2to3 did have the smarts, it would be odious to add an intermediate code-conversion step every time anyone changed Leo's code base.
This goal is just a bit ambitious, but I have absolute confidence that it can be accomplished. I'll tell you why in the next post.
Friday, August 14, 2009
Announcing Leo 4.6.2 final
Leo 4.6.2 final is now available here. For more info, see Leo's web site.
The highlights of Leo 4.6
--------------------------
- Caching external files *greatly* reduces the time to load .leo files.
- Leo supports both the Tk and Qt gui interfaces.
- New --config, --file and --gui command-line options.
- Leo tests syntax of .py files when saving them.
- Leo can now open any kind of file into @edit nodes.
- @auto-rst nodes allow easy editing of reStructuredText files.
- Properties of commanders, positions and nodes simplify programming.
- Improved Leo's unit testing framework.
- Leo now requires Python 2.5 or later.
Saturday, August 1, 2009
Better Python logging
This is a response to
http://sayspy.blogspot.com/2009/07/results-of-informal-poll-about-standard.html
Back when Python's logging module was being designed, I asserted that there is a much better way to enable and disable logs than a linear sequence of logging levels. That proposal went nowhere, perhaps because people heard it as dismissing the whole notion of having a standard logging module.
In fact, though, it would be easy to graft a small extension onto Python's logging module that would make logging much more flexible. Here's how:
1. Add another logging level, say 'custom'. When custom logging is in effect, log messages are generated if they match a **tracepoint name**. Such names are simply strings.
2. Add the following method to the logging module::
logging.custom("abc",'This a custom message logged when "abc" is enabled.')
3. Enable or disable custom tracepoint names with::
logging.enable_custom(pattern1,pattern2,...)
logging.disable_custom(pattern1,pattern2,...)
Each pattern is a string, possibly containing the wildcard characters '*' and '?', that matches zero or more tracepoint names. The enable/disable_custom methods simply add or delete entries in a **pattern database** of enabled and disabled names.
When logging.custom(s,message) is executed, it searches the pattern database for a match with the string s. It prints the message to the log if a match is indeed found.
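Here is a minimal sketch of how such an extension could be grafted onto the logging module. The function names follow this post; none of them exist in the standard library, and a real implementation would live inside logging itself:

import fnmatch
import logging

_enabled_patterns = []
_disabled_patterns = []

def enable_custom(*patterns):
    _enabled_patterns.extend(patterns)

def disable_custom(*patterns):
    _disabled_patterns.extend(patterns)

def custom(name, message):
    # Log the message only if name matches an enabled pattern
    # and no disabled pattern.
    if any(fnmatch.fnmatch(name, z) for z in _disabled_patterns):
        return
    if any(fnmatch.fnmatch(name, z) for z in _enabled_patterns):
        logging.getLogger().warning(message)

With this, enable_custom('ab*') makes custom('abc', ...) produce output, while custom('xyz', ...) stays silent.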
That's about it. It's easy to do, compatible with the present logging system, but much more flexible.
Many small tweaks to this scheme are possible, but I'll save those in case there is some real interest.
The Age of Entanglement
Here is a quote from the preface of "The Age of Entanglement", by Louisa Gilder,
http://www.amazon.com/Age-Entanglement-Quantum-Physics-Reborn/dp/1400044170
one of the few books I have read lately from start to finish:
QQQ
"Science rests of experiments," wrote Heisenberg, but "science is rooted in conversations."
Nothing could be further from the impression physics textbooks give to students. There, physics seems to be a perfect sculpture sitting in a vacuum-sealed case, as if brains, only tenuously connected to bodies, had given birth to insights fully formed...
Physics, in actuality, is a never-ending search made by human beings...The schoolbook simplifications obscure the crooked, strange and fascinating paths that stretch out from each idea, not only back into the past but onwards into the future...
Conversations are essential to science. But the off-the-cuff nature of conversation poses a difficulty. It is rare, even in these digital times, to have a complete transcript of every word spoken between two people on a given day, even if that conversation someday leads to a new understanding of the world. The result is that history books rarely have much of the to-and-fro of human interaction. Heisenberg's statement suggests that something is therefore lost.
QQQ
Being part of the Python world via its many conversations is one of my life's great joys.
Friday, July 24, 2009
psyco.leo
I've just uploaded psyco.leo, a file containing the major C sources for psyco. This is the file I will use to study the psyco sources. So can you. You can find the file here.
This is a large (1.3 MB), self-contained file: it will not affect any of the actual psyco files.
P.S. I created this file using a script that scans a directory and its sub-directories, creating @auto nodes for all files in a given list of extensions, in this case, .c and .h files. Reloading psyco.leo populated the @auto nodes with the contents of the corresponding external files. I then removed the @auto prefixes. Finally, saving psyco.leo caused the contents of all files to be written to psyco.leo itself.
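A hedged sketch of such a script, as it might be run from Leo's script pane (c and p are Leo's usual script globals; the directory name is illustrative, and the exact position methods may differ from the script I actually used):

import os

extensions = ('.c', '.h')
root = p.insertAsLastChild()
root.h = 'psyco sources'
for dirpath, dirs, files in os.walk('/path/to/psyco'):
    for filename in sorted(files):
        if filename.endswith(extensions):
            child = root.insertAsLastChild()
            child.h = '@auto %s' % os.path.join(dirpath, filename)
c.redraw()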
P.P.S. Those of you who are Leo users know that @auto imports the external file every time the .leo file is opened. @auto has a series of strict consistency checks to ensure that writing an imported @auto file will yield the same file. These checks failed for some of the files in the psyco sources because of intermixed blanks and tabs. When those checks fail, the @auto import code puts an @ignore directive in the top-level @auto node.
Eventually, I may want to regularize the whitespace so that @auto doesn't complain. This could be done simply by removing the @ignore directives and saving the .leo file. That is, assuming that the version of psyco.leo still contained @auto nodes...
Wednesday, July 22, 2009
Hello world in psyco
Aristotle said (presumably in Greek), "that which we learn, we learn by doing". In other words, there comes a point, rather quickly, when reading the manual (or even the code) is mostly a waste of time :-) The only way, imo, to understand a complex program like pypy or psyco is to run the program. So installing from sources is imperative, because we want to change the source and see what happens :-)
I've just built psyco with a "hello world" message at startup. Some notes:
- There is no point in trying to build the docs. You will simply get the psyco guide. And a good thing, too: the make file for the docs uses something called mkhowto. Googling shows this may be part of Python's tools, but there is no obvious way to get mkhowto on Ubuntu.
- It looks like running "sudo python setup.py install" will do a proper build (make) of changed sources. The docs say this:
[quote]
As usual, other commands are available, e.g. 'python setup.py build_ext -i' will compile the C source and put the result directly into the py-support/ subdirectory (no administrator privilege is required).
[end quote]
In my case, I don't mind destroying psyco by mistake, so maybe I'll just go with the plain vanilla install.
Anyway, I now have a version of psyco that I can play with. After 8 years working only with Python, I had forgotten quirks of C and Make, but the irritations, though real, are minor.
The next step will be to investigate the existing tracing and debugging capabilities of psyco.
One "significant" (if hardly earth-shaking) improvement to psyco would be to add Sherlock tracing. This would be good for development: it allows enabling and disabling of traces from the command line with minimal overhead during testing and no overhead in production.
Tuesday, July 21, 2009
What success would look like
This post will summarize what I consider the criteria for success for projects such as pypy and psyco. I was going to say that this post will be a brief summary, but interesting ideas intruded :-)
You could say that this post distills a series of questions I asked, and answers the psyco folks gave, several years ago. As I consider these questions now, I'm surprised by how my intuitions seem to have changed in the interval.
To state the conclusion first: to be considered successful, both psyco and pypy would have to at least match the performance of CPython, Python's default interpreter.
As the jit article should make clear, there is every reason to believe that this goal can be accomplished. Indeed, jits routinely succeed in this sense. And it's not hard to see why. Even though CPython is written in C, it executes exactly the same code every time it interprets a particular bytecode. It is ignorant of context, and stays ignorant. This ignorance is costly: it means that CPython must execute potentially unnecessary tests on the types of the arguments to the bytecode, including arguments on the stack. Worse, there is no way to optimize operations for arguments of known types, as discussed in the runtime-specialization article.
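To make this concrete, here is the flavor of the problem, sketched in Python rather than in CPython's actual C code (purely illustrative):

def binary_add(x, y):
    # Every execution of the BINARY_ADD bytecode must re-discover
    # the operand types, even in a loop where they never change.
    if isinstance(x, int) and isinstance(y, int):
        return int.__add__(x, y)  # Fast path for ints.
    return generic_add(x, y)  # Slow, fully generic dispatch.

def generic_add(x, y):
    return x + y  # Placeholder for CPython's full dispatch protocol.

A specializer that has already observed x and y to be ints can emit just the fast path and skip the type tests on every subsequent execution.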
So it seems that out-performing CPython is a fairly reasonable goal to attempt.
However, these same strategies could conceivably produce gains that seem counter-intuitive to old-timers like me. That is, one can imagine situations in which runtime specialization will do better than even C or assembly language code that does not take advantage of information available at run time.
I have a hunch there is an important Aha lurking here. Before trying to say what that is, let me recommend to you the following paper:
Transparent Dynamic Optimization: The Design and Implementation of Dynamo
When I first read this paper, I thought that what they had accomplished was clearly impossible, namely to use software to speed up hardware. All my intuitions said that hardware operations were orders of magnitude faster than software operations, and for that reason, trying to intervene with software was doomed to failure. When I read the paper, my first thought was, "how did anyone think to do something this clever?"
But my intuitions were quite wrong. True, software is much slower than hardware, but in some sense this is irrelevant. In some cases, optimizing the instructions to be presented to the CPU is so valuable that it is worth significant extra effort, even if that effort takes place in software.
This is pretty close to the heart of the matter, so let's look at this from several other perspectives. Originally, I had thought of the situation as a matter of accounting. This isn't wrong, but it may be misleading. Considered as an accounting problem, one asks whether the optimization effort (done by software) can be "amortized" over the potential savings. Considered in this way, perhaps the answer is less clear than it might be. But notice, there is ample experimental evidence that the effort can be amortized.
But what if I simply stop fixating on the notion that software is much slower than hardware? Is there another point of view that makes the desired speedups seem not so outlandish? My intuition tells me there is, although it's not completely clear.
Let's try some ideas and see where they lead.
First, let's focus on the improved code, not on how long it takes to improve the code. Clearly, the improved code could be arbitrarily better than the unimproved code. For example, the improved code could be in a loop that is executed a huge number of times.
Second, let's not assume that "dispatching" to improved code or caching the improved code, or stitching together the improved code are necessarily time consuming operations. We might find clever ways of doing these operations very quickly. We might even find ways of eliminating these "overhead" operations entirely.
Let's go back again and focus on the improved code. As a thought experiment, let's ask, "How good could the improved code get?" The beauty of thought experiments is that we can rely on magic as needed. So let's assume some garden-variety magic and assume that all the overhead that formerly we were so worried about has magically just disappeared. Is this such an outrageous thing to assume? I think not. You could imagine it as being a species of loop unwinding combined with type-specific specializations.
Furthermore, lots of information can be held implicitly in specialized code. The more I think about this, the more it seems like the master key to optimization. Indeed, all specialized code could be considered to be (or have) implicit information simply by virtue of the fact that the specialized code is a valid code improvement. In other words, if we execute the improved code there are a raft of underlying assumptions about context that must, in fact, be true for the improved code to be valid.
But this implicit information might indeed be sufficient to perform further operations. In particular, it may be sufficient to eliminate explicit overhead operations such as fetching code from a cache, linking to other improved code snippets, etc., etc.
In other words, we might be able to make any knowledge-based "magic" a reality.
I'm not entirely convinced that this is the best demonstration that large speedups are possible, but I think it may be good enough. OTOH, I have one of those feelings that an even better point of view is lurking nearby. For example, one could propose the following hypothesis:
Given enough optimization, we can treat all implicit knowledge as if it were explicitly available at no additional cost.
This still doesn't seem like a perfectly convincing argument, but it will have to do for now... I think you can sense the excitement promised by psyco and (even more) by pypy.
Edward
pypy and psyco
The first exploration will concern two projects that aim to speed up Python, namely pypy and psyco. The plan is to learn one or both of these projects in enough detail to make at least one significant contribution. It's a big task, and there will presumably be many posts on this topic.
The first step is to get some idea of what these projects are all about. The Wikipedia entries give excellent introductions:
http://en.wikipedia.org/wiki/PyPy
http://en.wikipedia.org/wiki/Psyco
Each project might be called a jit:
http://en.wikipedia.org/wiki/Just-in-time_compilation
http://en.wikipedia.org/wiki/Run-time_algorithm_specialisation
The two projects are very different. There are many "fancy" ways to describe the differences, but the most obvious difference is:
- psyco is written in C (mostly)
- pypy is pure Python.
With psyco, what you see is what you get (and all that you get). In contrast, pypy works on any Python code (including itself), so all optimizations within pypy will improve pypy itself.
That's enough for now. The next post will discuss what we might reasonably expect from these projects. That is, we will look at what "success" might mean.
Edward
About this blog
This blog will chronicle various explorations re computers, science, engineering and whatever else interests me.