July 18, 2004

Python, Pie-thon, and lexical scopes

So here I am, trying to puzzle out how Python handles parameter passing, since passing parameters into functions is, after all, really useful. For a while this afternoon I was really nervous that they all went in on the stack. That'd be bad, since it'd really have messed up the translator program, which works with the assumptions that the stack starts out clean, can be statically analyzed, and is effectively empty at branch destinations. Needless to say, a variably-sized starting stack kinda kills static analysis.

Luckily... that turned out not to be the case. Plus it turns out that the pie-thon benchmark code only uses two of the four call types, which makes life a bit easier, as I don't have to get CALL_FUNCTION_KW and CALL_FUNCTION_VAR_KW implemented. (Though they'll probably be reasonably simple to do, and I might anyway. Probably not right away though) There's also no mix--it's either all positional or all keyword. (Dunno if it's even legal to mix them, but bluntly I think that's a bad idea anyway, so I'm just as glad to not have to bother with that case)

I also found an interesting implementation feature of Python that ties into one of the bits of the language.

Now Python, as you probably know, doesn't really do lexical variables. You've really got two types--function-local variables and global variables. The locals could be considered lexical if you squinted really hard, but... nah, not really. It does do named parameters, though, unlike Perl's "slam everything into @_" hack, and the named parameters can be passed in by name. They're also considered locals to the function. Now, as I was digging in, I was thinking "how the heck do the parameters get passed in?" Well, it turns out that they don't. What Python does instead is build up a frame (sorry, a dict) that contains all the function's locals, fill them in, and pass that frame into the called function. The caller does this. Which, I've gotta admit, is clever. Granted, for a fully general solution for me it's a major pain in the ass, but I don't need a fully general solution. Not yet, at least. Maybe in a few weeks.

Anyway, this is relatively easy to do. (Relatively...) Each code object is tagged with the number of locals it takes as well as the names of the variables it has, so allocating the hash is simple. For the vararg case the overflow goes into a list which is the last element of the parameter list. Or so I think. I haven't yet found where in the code it handles those overflow things. Gotta be in there somewhere, but I'll get to that when I get to it.
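A rough sketch of that allocation step, written against current Python 3, where the code object hangs off `__code__` (the 2.x interpreter discussed here calls it `func_code`). The `allocate_frame` helper and the `greet` function are made up for illustration; this is not what ceval.c does internally, and it ignores defaults, keywords, and varargs.

```python
def greet(name, greeting):
    message = greeting + ", " + name
    return message

def allocate_frame(func, args):
    """Hypothetical helper: make one slot per local, then fill in the arguments."""
    code = func.__code__
    # co_varnames lists the parameters first, then the other locals.
    frame = dict.fromkeys(code.co_varnames)
    # The first co_argcount names are the positional parameters.
    for slot, value in zip(code.co_varnames[:code.co_argcount], args):
        frame[slot] = value
    return frame

frame = allocate_frame(greet, ["Dan", "Hello"])
print(greet.__code__.co_nlocals)   # how many local slots the code object asks for
print(frame)                       # parameters filled in, 'message' still empty
```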

It's odd. I've been digging through ceval.c to figure out what this stuff all does, and I'm finding that, in general, it's almost not worth digging too deeply. Yeah, all of the python engine's grubby secrets are in there but, honestly, I don't care how it does what it does, just what it actually does. Which is to say that I care about what the stack looks like and how the environment behaves, not how the engine flips its bits and does its tests to get there.

Hopefully when I'm done with this, functions will all work. Woohoo for that, but I'm waiting until it actually works.

Probably the single biggest annoyance in all this is the lambda functions, mainly because they're anonymous.

Posted by Dan at July 18, 2004 12:36 AM
Comments

About excess positional arguments, the name they get bound to is fxn.func_code.co_varnames[fxn.func_code.co_argcount] if ``fxn.func_code.co_flags & 4`` is true. You can read http://docs.python.org/lib/inspect-types.html for all the gory details of the object attributes of code objects (and other basic execution objects). As always, dir() at the interpreter prompt is your friend.
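For what it's worth, here is a small sketch of that lookup in current Python 3, where `func_code` is spelled `__code__`. The function `fxn` and its parameter names are invented for the example, and the indexing assumes a function with no keyword-only parameters.

```python
import inspect

def fxn(a, b, *extras):
    return a, b, extras

code = fxn.__code__
# The flag value 4 has a name in the inspect module.
has_varargs = bool(code.co_flags & inspect.CO_VARARGS)
# The *args name sits right after the named positional parameters.
varargs_name = code.co_varnames[code.co_argcount] if has_varargs else None
print(has_varargs, varargs_name)
```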

And as for mixing positional and keyword arguments, yes you can::

parrot("Dan", alive=True)

is a legit call. You just have to have keyword arguments come last after all positional arguments.
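A quick way to check both halves of that rule, sketched for Python 3. The `parrot` definition here is a guess at a matching signature, not something taken from the benchmark.

```python
def parrot(name, alive=False):
    return name, alive

# Positional first, keyword last: accepted.
result = parrot("Dan", alive=True)

# Keyword first, positional last: the compiler refuses it outright.
try:
    compile('parrot(alive=True, "Dan")', "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True

print(result, rejected)
```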

Posted by: Brett at July 19, 2004 02:35 AM

Interesting. I'll have to go look and see what happens when the keyword parameters overlay the positional parameters. I'd presume the keyword version wins, but I should go check.

I've been making friends instead with ceval.c and frameobject.c. It's been... interesting. I still need to sort out exactly what the heck the block stack's used for, though I'm assuming it's for nesting exception handlers.

Posted by: dan at July 19, 2004 09:35 AM

Either i'm misunderstanding what you wrote (quite possible!) or you're unaware, and need to know, that python has real lexical scoping (with a flaw).

You wrote:

> Now Python, as you probably know, doesn't really do lexical
> variables. You've really got two types--function-local variables and
> global variables. The locals could be considered lexical if you squinted
> really hard, but... nah, not really.

Jeremy Hylton added statically nested lexical scoping to Python at v 2.1. See PEP 0227, "Statically Nested Scopes".

For example:

 python
 Python 2.3.3 (#1, May 16 2004, 19:18:10) 
 [GCC 3.3.2 20031218 (Gentoo Linux 3.3.2-r5, propolice-3.3-7)] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> x = 101
 >>> def around(x):
 ...   def enclosed():
 ...     print "Enclosed 'x' ==", x
 ...   enclosed()
 ... 
 >>> around("from the enclosing function locals!")
 Enclosed 'x' == from the enclosing function locals!
 >>> 

I apologize if this was already obvious to you, and i just misunderstood what you said. I don't know the substance of the benchmark, so don't know whether it would make any material difference to your pie-thon directed efforts, anyway.

In any case, good luck with the contest! I hope you have enough time so you don't feel cheated by fate, get to show parrot in the best light! As a python partisan, i want to thank you for your work - i appreciate the effort, whatever the relative measurements show.

Posted by: ken at July 20, 2004 01:52 PM

(Oh yeah - i mentioned a flaw in the lexical scoping provisions. It so happens that the syntax conspires to prevent assignment to variables gotten from intermediate scopes. Assignments signal local bindings, unless counteracted by 'global' variable declaration - which makes it global, not connected to the intermediate scope.

It may not be right to call it a flaw, though - depending on whether you think it's appropriate to foster assignments to variables from enclosing local scopes - opinions can vary.)
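A minimal sketch of what that looks like in practice, along with the usual workaround of mutating a container instead of rebinding the name. The function names are made up for the example; it is written to run on Python 3, though nothing in it depends on features newer than nested scopes.

```python
def around():
    x = "outer"
    def enclosed():
        x = "inner"      # assignment makes a brand-new local; the outer x is untouched
        return x
    inner_value = enclosed()
    return x, inner_value

def counter():
    count = [0]          # workaround: share a list and mutate it in place
    def bump():
        count[0] += 1    # no rebinding of 'count', so the enclosing one is used
        return count[0]
    return bump

bump = counter()
values = (bump(), bump())
print(around(), values)
```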

Posted by: ken at July 20, 2004 04:32 PM

Oh, right, those things. They're kinda-sorta lexicals, aren't they? Luckily for me it doesn't actually matter, since the bytecode itself has no real concept of scoping. Those statically scoped things translate down to access some entries in the free & cell variable store by offset, which is a snap to compile. (Though I don't think I did get the version of the translator with support for them uploaded)
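The free and cell variable stores can be poked at from Python itself. This sketch reuses the shape of ken's example and assumes Python 3 attribute names (`__code__`, `__closure__`); the 2.x spellings are `func_code` and `func_closure`.

```python
def around(x):
    def enclosed():
        return x
    return enclosed

enclosed = around("from the enclosing function locals!")

# Names the outer function has to keep in cells because inner code uses them.
cellvars = around.__code__.co_cellvars
# Names the inner function borrows from an enclosing scope.
freevars = enclosed.__code__.co_freevars
# The actual cell, looked up by position rather than by name.
value = enclosed.__closure__[0].cell_contents
print(cellvars, freevars, value)
```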

Posted by: Dan at July 20, 2004 04:44 PM

Overlay how? If you mean repeat yourself for the same variable, such as ``bunk(1, when=2)`` with the function of ``def bunk(when): pass``, you get an error about trying to assign a value more than once in the parameter list.
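That error is easy to reproduce. A short sketch, assuming Python 3 (the exact wording of the message differs between versions, so the code only captures it rather than comparing it):

```python
def bunk(when):
    pass

try:
    bunk(1, when=2)          # 'when' is supplied twice: once by position, once by name
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
```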

Keyword arguments tend to be used only either to be explicit about what values you are assigning to parameters or to assign to parameters that come after others with default values: a function defined as ``def bunk(a, b, c=3, d=4, e=5): pass`` might be called as ``bunk(1, 2, d=-6)`` or ``bunk(1, 2, 3, -6)``, both of which lead to the same assignment of values to the parameters.
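To make the skipping-over-defaults case concrete, here is a small sketch in which `bunk` returns its parameters so the bindings can be compared. The return statement is added purely for the demonstration.

```python
def bunk(a, b, c=3, d=4, e=5):
    return a, b, c, d, e

by_keyword = bunk(1, 2, d=-6)      # skips c, which keeps its default
by_position = bunk(1, 2, 3, -6)    # has to spell out c in order to reach d

print(by_keyword, by_position)
```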

As for the block stack, I believe that is mostly used by 'for' loops. Not really sure, though, since I have not messed with those yet. I would just search Python/compile.c for the block stack opcodes and see what uses them.

Posted by: Brett at July 20, 2004 10:50 PM

FYI: keep in mind that MAKE_FUNCTION is where most of the lexical variable magic happens. Specifically, a Python "function" object consists of:

1. A code object
2. A reference to the module globals
3. Default values for the function
4. The closure, consisting of a tuple of "cells" representing variables from enclosing scope(s).

The same code object can in principle be used with multiple globals (e.g. via 'exec codeobject in somedict'), and it definitely can be used with different cells, since each time the enclosing function is invoked there will be new cells.
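A sketch of those four pieces as seen from Python 3, where they are exposed as `__code__`, `__globals__`, `__defaults__`, and `__closure__`. The `make_adder` example is invented; rebuilding a function by hand with `types.FunctionType` is only meant to show how the parts fit together, not how MAKE_FUNCTION is implemented.

```python
import types

def make_adder(n):
    def add(x, bonus=0):
        return x + n + bonus
    return add

add1 = make_adder(1)
add10 = make_adder(10)

# One code object is shared by every function the enclosing call produces...
same_code = add1.__code__ is add10.__code__
# ...but each invocation of make_adder creates fresh cells.
different_cells = add1.__closure__[0] is not add10.__closure__[0]

# Assemble a function from the four parts, mixing and matching on purpose.
rebuilt = types.FunctionType(
    add1.__code__,        # 1. the code object
    add1.__globals__,     # 2. the module globals
    "rebuilt",            # a name for the new function
    (100,),               # 3. default values (bonus=100)
    add10.__closure__,    # 4. the closure cells (n=10)
)

print(same_code, different_cells, rebuilt(1))
```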

I don't know if the challenge code uses lexical scopes or not, but if it does, you'll need to deal with this in your MAKE_FUNCTION opcode, along with any default values that need to be set up as well.

Posted by: Phillip J. Eby at July 22, 2004 10:25 PM