osa1

Micro-Manual for Lisp in Python

March 23, 2012 - Tagged as: python, lisp, tr.

I was reading some papers about first Lisps and their implementations, and I came across with John McCarthy’s A Micro-Manual for Lisp - Not the Whole Truth. I had heard about it some time ago but never read the paper, and as a programming exercise, I implemented in Python within an hour or so.

If you’re into Lisps, I highly recommend you Recursive Functions of Symbolic Expressions and their Computation by Machine (Part I) and the micro-manual. What I really love about micro-manual is that it’s a great way to see simplicity in original Lisp. With just 9 forms you have a working Lisp system. It also introduces it’s own eval function in Lisp.

In the rest of the post, when I use Lisp, I’ll be mentioning the Lisp in micro-manual, not the modern dialects.

Lisp programs are formed by symbolic expressions(s-exps, sexps, sexprs), and a sexp is either a list or an atom. Before interpreting our Lisp, we should read sexps and convert them into a form that we can work on with our host language(in my case, it’s Python).

I’ll convert Lisp lists into ordinary Python lists, and atoms into Python strings. My reader has two parts, the first part is getting the input and parsing tokens:

class Reader:
    def __init__(self, form):
        self.form = form
        self.index = 0
    def seek_char(self):
        if self.index >= len(self.form):
            return None
        return self.form[self.index]
    def unread_char(self):
        self.index -= 1
    def read_char(self):
        if self.index >= len(self.form):
            return None
        self.index += 1
        return self.form[self.index-1]
    def read_token(self):
        ch = self.read_char()
        if ch == '(':
            return '('
        elif ch == ')':
            if self.seek_char() == ' ':
                self.read_char()
            return ')'
        else:
            buf = ''
            while ch != ' ' and ch != None:
                if ch == ')':
                    self.unread_char()
                    return buf
                else:
                    buf += ch
                    ch = self.read_char()
            return buf

Well, it may not be the best way to parse an input to tokens, but it works great. It return a (, ), or a string each time you call read_token. When we read (, we start collecting a list, until reading a )(we should also consider lists into lists):

def read_list(reader):
    result = []
    token = reader.read_token()
    while token:
        if token == '(':
            result.append(read_list(reader))
        elif token == ')':
            return result
        else:
            result.append(token)
        token = reader.read_token()
    return result

read_atom functions is easier, since each each atom is also a token:

def read_atom(reader):
    return reader.read_token()

At this point, we have a Lisp reader that reading s-expressions, converting lists to Python lists and atoms to Python strings. Now we can define our 9 primitive procedures(quote, car, cdr, cons, equal, atom, cond, lambda, label):

def quote(e):
    return e
def car(e):
    assert isinstance(e, list)
    return e[^0]def cdr(e):
    assert isinstance(e, list)
    return e[1:]
def cons(e1, e2):
    if isinstance(e2, list):
        return [e1] + e2
    return [e1, e2]
def equal(e1, e2):
    return e1 == e2
def atom(e):
    return not isinstance(e, list)

These are obvious. Since I’ve already converted sexp lists into Python lists, all I need to do is to call some Python list methods.

def cond(*cases):
    for case in cases:
        if eval_(case[^0] is not None:
            for expr in cdr(case)[:-1]:
                eval_(expr)
            return eval_(case[-1])
    return None

An important point here is that in cond, I’m not evaluating all expressions, but I’m evaluating the test forms until I find a form that evaluates something that’s not nil, and then evaluating it’s form.

def lambda_(args, *exprs):
    def fn(*arg_vals):
        fn_env = {k:v for (k, v) in zip(args, arg_vals)}
        fn_env['parent_env'] = env
        for expr in exprs[:-1]:
            eval_(expr, fn_env)
        return eval_(exprs[-1], fn_env)
    return fn
def label(name, lambda_exp, *exprs):
    func = eval_(lambda_exp)
    label_env = env.copy()
    label_env[name] = func
    for exp in exprs[:-1]:
        eval_(exp, label_env)
    return eval_(exprs[-1], label_env)

label is a way to name lambdas, so you can create recursive functions. I’m creating a new environment for each label, and connecting it to the parent environment.

env = {'quote': quote,
       'car': car,
       'cdr': cdr,
       'cons': cons,
       'equal': equal,
       'atom': atom,
       'cond': cond,
       'lambda': Lambda,
       'label': label,
       'defun': defun}

The global environment. When an atom is evaluated, it’s value is searched in here, with this function:

def search_in_env(env, key):
    val = env.get(key)
    if val:
        return val
    if env.has_key('parent_env'):
        return search_in_env(env['parent_env'], key)
    raise KeyError(key)

Since each environment may be connected to a parent environment(the case of label), we should search all the chain of environments.

So now we have the Lisp described in the micro-manual, we only need the eval:

def eval_(exp, env=env):
    # print "evaluating: %s" % str(exp)
    if isinstance(exp, list):
        if isinstance(exp[^0] list):
            op = eval_(exp[^0]
        else:
            op = search_in_env(env, exp[^0]
        if op in [quote, cond, lambda_, label, defun]:
            return apply(op, exp[1:])
        return apply(op, [eval_(e, env) for e in exp[1:]])
    return search_in_env(env, exp)

And some helpers for REPL:

def eval_from_string(string):
    return eval_(read(Reader(string)))
def expr_repr(expr):
    if isinstance(expr, list):
        return '(' + ' '.join([expr_repr(e) for e in expr]) + ')'
    elif isinstance(expr, bool):
        if expr:
            return 'T'
        return 'nil'
    return str(expr)
def repl():
    while True:
        try:
            input = raw_input("> ")
            print expr_repr(eval_(read(Reader(input))))
        except (KeyboardInterrupt, EOFError):
            return

I also wrote some unit-tests based on examples in the paper. You can read the implementation and tests from the gist. To run the interpreter, just save the code and run python lisp.py.

(Show comments)