Sunday, 20 December 2009

Doing SQL with applicative functors in Python

An example implementation of an applicative functor in Python

I haven't had much time lately, so I have to break the promise I made in my previous post: no JavaScript decorators this time. I am studying Haskell at the moment, so I changed the subject and will instead show an example implementation of a small combinator library based on applicative functors.

First I will briefly explain how to read Haskell type signatures. Then I will talk about functors and applicative functors, and finally I will show how to implement them in Python.

Haskell type system

Haskell is a statically and strongly typed language, unlike Python, which is dynamically and strongly typed. Every function in Haskell therefore has a type signature that is known at compile time. The programmer can write it in the source, or omit it and let the compiler infer it. A signature looks like this:
func :: (Num a) => a -> a -> a
func is a function that takes two numeric arguments and returns a numeric value. All functions in Haskell are first class, so this is also possible:
func :: (a -> b) -> a -> b
This is a function that takes a function from a to b and a value of type a, and returns a value of type b.

The -> in a signature is right associative, so this:
func :: (a -> b) -> a -> b
is the same as this:
func :: (a -> b) -> (a -> b)
This means that if I call func with one argument, it returns a function. Thus all functions are curried!
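Python functions are not curried automatically, but the same behaviour can be imitated with nested functions. The following is only a sketch of the signature (a -> b) -> a -> b; the names waiting_for_value and add_five are made up for the illustration.

```python
def func(f):
    # Curried version of func :: (a -> b) -> a -> b.
    # Calling it with only the function returns another function
    # that is still waiting for the value.
    def waiting_for_value(x):
        return f(x)
    return waiting_for_value


add_five = func(lambda x: x + 5)  # partially applied: still a function
result = add_five(10)             # fully applied: a plain value
```

Here add_five plays the role of func called with one argument in Haskell.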

This is a rather short explanation, but it is enough to understand the rest of the story. Let's go on to functors.

What are functors

In most languages, "functor" means a callable object. In Haskell functors are something else: they come from mathematics. I won't and can't dive too deep into this matter, because I never studied category theory, but I can try to explain the idea. A functor in Haskell is a data structure (again, this is rather simplified) that has a special function attached to it with the following type signature:
fmap :: (Functor f) => (a -> b) -> f a -> f b
It takes a function that maps a to b and a functor holding an a, and it returns a functor holding a b. Most programming languages have functors; Haskell just names them explicitly. For example, lists in Python are a functor, and fmap for them is map:
map(function, iterable, ...)
So map takes a function and an iterable (like a list), applies the function to each member, and returns the modified list. Almost anything can be a functor; we only have to define its own fmap:
class BoringFunctor:
        def __init__(self, res):
                self.res = res
        @staticmethod
        def fmap(func, bf):
                return BoringFunctor(func(bf.res))


t = BoringFunctor(5)
p = BoringFunctor.fmap((lambda x: x + 5), t)
# prints 10
print(p.res)
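Going back to the list example for a moment: the sketch below spells out fmap for lists. It assumes Python 3, where map returns a lazy iterator that has to be wrapped in list. The helper name fmap_list is made up for this example.

```python
def fmap_list(func, xs):
    # fmap for lists: apply func to every element and keep the list shape.
    return list(map(func, xs))


doubled = fmap_list(lambda x: x * 2, [1, 2, 3])
# Mapping the identity function leaves the list unchanged.
same = fmap_list(lambda x: x, [1, 2, 3])
```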
Well, that was easy. Let's go on to applicative functors.

Applicative functors

Applicative functors have a couple of extra functions. These are:
pure :: a -> f a 
and
<*> :: f (a -> b) -> f a -> f b
pure simply wraps a value in a new applicative functor. <*> is somewhat more puzzling. It takes a functor that carries a function of type a -> b and a functor that carries a value of type a, and returns a functor carrying a value of type b. What can we do with this? First I have to add that <*> is a left-associative infix operator.

We can define fmap:
fmap f x = pure f <*> x
We can define an fmap-like function that takes three functor arguments (the backticks in `fmap` make fmap infix):
fmap3 :: (Applicative f) => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
fmap3 f a b c = f `fmap` a <*> b <*> c
This looks rather difficult. Let's take the left associativity into account:
fmap3 f a b c = (((f `fmap` a) <*> b) <*> c) 
We could break this into steps:
-- (a -> b -> c -> d) -> f a -> f (b -> c -> d)
fmap3' f a = f `fmap` a
-- f (b -> c -> d) -> f b -> f (c -> d)
fmap3'' f b = f <*> b
-- f (c -> d) -> f c -> f d
fmap3''' f c = f <*> c
fmap3 f a b c = fmap3''' (fmap3'' (fmap3' f a) b) c
With each step the wrapped function is applied to one more argument, until only the final result is left inside the functor.
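The same steps can be followed in Python. This is a minimal sketch, not the class used later in this post: Box and volume are hypothetical names, and the three-argument function has to be curried by hand because Python does not do that for us.

```python
class Box:
    """A minimal applicative functor holding a single value (illustrative only)."""

    def __init__(self, value):
        self.value = value

    @staticmethod
    def pure(value):
        # pure :: a -> f a
        return Box(value)

    def ap(self, other):
        # <*> :: f (a -> b) -> f a -> f b
        return Box(self.value(other.value))


def fmap(func, box):
    # fmap f x = pure f <*> x
    return Box.pure(func).ap(box)


def fmap3(func, a, b, c):
    # fmap3 f a b c = ((f `fmap` a) <*> b) <*> c
    return fmap(func, a).ap(b).ap(c)


def volume(x):
    # Curried by hand: one argument per call.
    return lambda y: lambda z: x * y * z


result = fmap3(volume, Box(2), Box(3), Box(4))
```

After fmap the box holds a function waiting for y, after the first ap a function waiting for z, and after the second ap the plain number.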

Putting it into practice

We are going to set up a combinator library for communicating with a database. Let's first define our functor:
class AppFunctor:
        """Applicative functor (well it has grown to a monad)"""
        _inner = None
        def __init__(self,inner):
                self._inner = inner
        # accessor used by fmap and by the methods added below
        def getInner(self):
                return self._inner
        # (a -> b) -> f a -> f b
        @staticmethod
        def fmap(func, functor):
                inner = functor.getInner()
                inner = func(inner)
                return AppFunctor.pure(inner)
        # a -> f a
        @staticmethod
        def pure(inner):
                return AppFunctor(inner)
It is just a class with two static methods, fmap and pure. I have added the type signatures as comments so I don't get confused.
pure just returns a new AppFunctor. fmap unpacks the functor, applies the function, and returns the result in a new AppFunctor. That was rather easy. Now we create <*>. Unfortunately it isn't possible to define your own operators in Python, so I chose to overload -, which works.
        # f (a -> b) -> f a -> f b
        @staticmethod
        def ap(apfunctor, connfunctor):
                inner = connfunctor.getInner()
                func = apfunctor.getInner()
                return AppFunctor.pure(func(inner))
        def __sub__(self, other):
                return AppFunctor.ap(other, self)

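As we can see, f <*> b unpacks f and b, applies the function inside f to the value inside b, and returns the result in a new AppFunctor. Note that the operands of - are flipped compared to <*>: the value stands on the left and the function on the right, so x - f means f <*> x. Here are a couple of methods that aren't related to applicative functors, but can be handy: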
        # reverse application: f a -> (f a -> b) -> b
        # (hands the whole functor, still wrapped, to func)
        def __add__(self, func):
                return func(self)
        # normal bind: f a -> (a -> f b) -> f b
        # (not used in this tutorial, but can be very powerful)
        def __ge__(self, func):
                inner = self.getInner()
                return func(inner)
        # f (a -> b) -> a -> b
        # applies the wrapped function to a plain argument
        def inject(self, arg):
                func = self.getInner()
                return func(arg)
        def __mul__(self, arg):
                return self.inject(arg)
Now we need a simple decorator, which can turn functions into applicative functors:
def app(func):
        def new(*args):
                return func(*args)
        return AppFunctor.pure(new)

We can use this as follows:
@app
def printResult(obj):
       ...
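To see what the decorator buys us, here is a small self-contained sketch. It uses a trimmed copy of AppFunctor with only the methods the example needs, and app is shortened to a one-liner that behaves the same for this purpose. The functions addFive and double are hypothetical and only serve as examples.

```python
class AppFunctor:
    # Trimmed copy of the class from this post: only what the example needs.
    def __init__(self, inner):
        self._inner = inner

    def getInner(self):
        return self._inner

    @staticmethod
    def pure(inner):
        return AppFunctor(inner)

    @staticmethod
    def ap(apfunctor, valfunctor):
        # f (a -> b) -> f a -> f b
        return AppFunctor.pure(apfunctor.getInner()(valfunctor.getInner()))

    def __sub__(self, other):
        # value - function  means  function <*> value
        return AppFunctor.ap(other, self)


def app(func):
    # Wrap a plain function in an AppFunctor.
    return AppFunctor.pure(func)


@app
def addFive(x):  # hypothetical example function
    return x + 5


@app
def double(x):   # hypothetical example function
    return x * 2


# After decoration, addFive and double are AppFunctor instances, so a
# wrapped value can be piped through them from left to right.
result = AppFunctor.pure(1) - addFive - double
```

The value inside result is first increased by five and then doubled.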

The whole class

def app(func):
        def new(*args):
                return func(*args)
        return AppFunctor.pure(new)

class AppFunctor:
        """Applicative functor (well it has grown to a monad)"""
        _inner = None
        def __init__(self,inner):
                self._inner = inner
        # (a -> b) -> f a -> f b
        @staticmethod
        def fmap(func, functor):
                inner = functor.getInner()
                inner = func(inner)
                return AppFunctor.pure(inner)
        # reverse application: f a -> (f a -> b) -> b
        def __add__(self, func):
                return func(self)
        # normal bind  f a -> (a -> f b) -> f b
        def __ge__(self, func):
                inner = self.getInner()
                return func(inner)
        # a -> f a
        @staticmethod
        def pure(inner):
                return AppFunctor(inner)
        # f (a -> b) -> f a -> f b 
        @staticmethod
        def ap(apfunctor, connfunctor):
                inner = connfunctor.getInner()
                func = apfunctor.getInner()
                return AppFunctor.pure(func(inner))
        def __sub__(self, other):
                return AppFunctor.ap(other, self)
        def getInner(self):
                return self._inner
        # f (a -> b) -> a -> b
        # injects an argument into functor
        def inject(self, arg):
                func = self.getInner()
                return func(arg)
        def __mul__(self, arg):
                return self.inject(arg)

Let's use it

We want to make a connection to the database, run a query and retrieve the results, and we want to do this with our new toy. I use MySQLdb to communicate with the database. First I need an object to hold the connection and the current cursor:
class ConnectionObject:
        _cursor = None
        _conn = None
        def __init__(self, conn):
                self._conn = conn
        def getConn(self):
                return self._conn
        def getCursor(self):
                return self._cursor
        def setCursor(self, cursor):
                self._cursor = cursor

And a function to connect to the database. This returns a functor:
def connectToDb():
        conn = connect(host="your host", passwd="your password", user="your user", db="yourdb")
        obj = ConnectionObject(conn)
        return AppFunctor.pure(obj)
I need a function to fetch a cursor. It should be an applicative functor, so we apply @app to it:
@app
def getCursor(obj):
        obj.setCursor(obj.getConn().cursor())
        return obj
And I need a function, which executes a query:
@app
def execute(sql):
        @app
        def inner2(obj):
                obj.getCursor().execute(sql)
                return obj

        return inner2
The inner function is needed to hold the SQL statement in a closure until a connection object is passed in.
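The closure can be seen in isolation with a plain function, without @app and without a database. In this sketch execute_plain is a hypothetical stand-in for execute, and an ordinary list stands in for the cursor.

```python
def execute_plain(sql):
    # The outer call only remembers the statement...
    def run(executed):
        # ...the inner call receives the object to run it against.
        # A plain list stands in for the cursor here.
        executed.append(sql)
        return executed
    return run


pending = execute_plain("select * from orders limit 10")  # nothing has run yet
log = pending([])                                          # now the statement is "executed"
```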

And a function which fetches the results and returns them together with the functor it was given:
def fetchAll(cfunc):
        @app
        def inner(obj):
                return obj.getCursor().fetchall()
        # the lambda must be parenthesized, otherwise this is a syntax error
        return (cfunc - inner) + (lambda x: dict(obj=cfunc, results=x))
This function is special in two ways: it receives the functor itself rather than the unwrapped value, and it is used at the end of our assembly line. Now let's use our new functions.

First we make the connection and request a cursor:
obj = connectToDb() - getCursor
The wrapped connection object is passed to getCursor with (-). obj is again an AppFunctor.

We also want to make a query:
obj = connectToDb() - getCursor - execute * "select * from orders limit 10"
The result of connectToDb() is fed to getCursor. Then execute * "select * from orders limit 10" is evaluated (* binds tighter than -), which yields another AppFunctor. Finally the result of connectToDb() - getCursor is passed into the result of execute * "...".

The last step is to fetch the results. fetchAll is different, because it doesn't want its argument to be unpacked: it is a plain function that contains an applicative functor of its own, namely inner. So we have to use the (+) operator:
test =  connectToDb() - getCursor - execute * 'select * from orders limit 10'  + fetchAll
Now we can see the result by unpacking test:
pp.pprint(test['results'].getInner())
As you can see, we have designed a really elegant way to pull data from the database, and it can be used in many places. It is clear what is happening: it connects to the database, gets a cursor, executes a query and then fetches everything, all in one line.

The whole file

import sys
from MySQLdb import *
import pprint


class ConnectionObject:
        _cursor = None
        _conn = None
        def __init__(self, conn):
                self._conn = conn
        def getConn(self):
                return self._conn
        def getCursor(self):
                return self._cursor
        def setCursor(self, cursor):
                self._cursor = cursor

def app(func):
        def new(*args):
                return func(*args)
        return AppFunctor.pure(new)

class AppFunctor:
        """Applicative functor (well it has grown to a monad)"""
        _inner = None
        def __init__(self,inner):
                self._inner = inner
        # (a -> b) -> f a -> f b
        @staticmethod
        def fmap(func, functor):
                inner = functor.getInner()
                inner = func(inner)
                return AppFunctor.pure(inner)
        # reverse application: f a -> (f a -> b) -> b
        def __add__(self, func):
                return func(self)
        # normal bind  f a -> (a -> f b) -> f b
        def __ge__(self, func):
                inner = self.getInner()
                return func(inner)
        # a -> f a
        @staticmethod
        def pure(inner):
                return AppFunctor(inner)
        # f (a -> b) -> f a -> f b 
        @staticmethod
        def ap(apfunctor, connfunctor):
                inner = connfunctor.getInner()
                func = apfunctor.getInner()
                return AppFunctor.pure(func(inner))
        def __sub__(self, other):
                return AppFunctor.ap(other, self)
        def getInner(self):
                return self._inner
        # f (a -> b) -> a -> b
        # injects an argument into functor
        def inject(self, arg):
                func = self.getInner()
                return func(arg)
        def __mul__(self, arg):
                return self.inject(arg)



@app
def getCursor(obj):
        obj.setCursor(obj.getConn().cursor())
        return obj

@app
def execute(sql):
        @app
        def inner2(obj):
                obj.getCursor().execute(sql)
                return obj

        return inner2

def fetchAll(cfunc):
        @app
        def inner(obj):
                return obj.getCursor().fetchall()
        # the lambda must be parenthesized, otherwise this is a syntax error
        return (cfunc - inner) + (lambda x: dict(obj=cfunc, results=x))

def connectToDb():
        conn = connect(host="assasa", passwd="sasasa", user="assasa", db="asas")
        obj = ConnectionObject(conn)
        return AppFunctor.pure(obj)


test =  connectToDb() - getCursor - execute * 'select * from orders limit 10'  + fetchAll

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(test['results'].getInner())

Update

The fetchAll function is somewhat unclean. It should return an AppFunctor. To make this possible, I change ConnectionObject as follows:
class ConnectionObject:
        _cursor = None
        _conn = None
        _res = None
        def __init__(self, conn):
                self._conn = conn
        def getConn(self):
                return self._conn
        def getCursor(self):
                return self._cursor
        def setCursor(self, cursor):
                self._cursor = cursor
        def setResults(self, res):
                self._res = res
        def getResults(self):
                return self._res
The fetchAll function looks a lot nicer now, and it returns a functor:
def fetchAll(cfunc):
        @app
        def inner(obj):
                obj.setResults(obj.getCursor().fetchall())
                return obj
        return cfunc - inner

To see the results we can do:
test =  connectToDb() - getCursor - execute * 'select * from orders limit 10'  + fetchAll

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(test.getInner().getResults())
We could even simplify the fetchAll function:
@app
def fetchAll(obj):
        obj.setResults(obj.getCursor().fetchall())
        return obj
The statement now becomes:
test =  connectToDb() - getCursor - execute * 'select * from orders limit 10'  - fetchAll
Which is much neater than my first try.
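For readers without a MySQL server at hand, the final pipeline can be tried with the sqlite3 module from the standard library. This is only a sketch under a few assumptions: sqlite3 is swapped in for MySQLdb, the orders table and its columns are invented, ConnectionObject is reduced to plain attributes, and AppFunctor is trimmed to the methods the pipeline uses.

```python
import sqlite3


class AppFunctor:
    # Trimmed copy of the class from this post.
    def __init__(self, inner):
        self._inner = inner

    def getInner(self):
        return self._inner

    @staticmethod
    def pure(inner):
        return AppFunctor(inner)

    @staticmethod
    def ap(apfunctor, connfunctor):
        return AppFunctor.pure(apfunctor.getInner()(connfunctor.getInner()))

    def __sub__(self, other):
        return AppFunctor.ap(other, self)

    def __mul__(self, arg):
        return self.getInner()(arg)


def app(func):
    return AppFunctor.pure(func)


class ConnectionObject:
    # Simplified: plain attributes instead of getters and setters.
    def __init__(self, conn):
        self.conn = conn
        self.cursor = None
        self.results = None


def connectToDb():
    # Assumption: an in-memory SQLite database with a made-up orders table.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table orders (id integer, item text)")
    conn.executemany("insert into orders values (?, ?)",
                     [(1, "apple"), (2, "pear"), (3, "plum")])
    return AppFunctor.pure(ConnectionObject(conn))


@app
def getCursor(obj):
    obj.cursor = obj.conn.cursor()
    return obj


@app
def execute(sql):
    @app
    def inner2(obj):
        obj.cursor.execute(sql)
        return obj
    return inner2


@app
def fetchAll(obj):
    obj.results = obj.cursor.fetchall()
    return obj


test = (connectToDb() - getCursor
        - execute * "select * from orders order by id limit 2"
        - fetchAll)
rows = test.getInner().results
```

The variable rows then holds the first two rows of the made-up table as a list of tuples.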

Full file

import sys
from MySQLdb import *
import pprint


class ConnectionObject:
        _cursor = None
        _conn = None
        _res = None
        def __init__(self, conn):
                self._conn = conn
        def getConn(self):
                return self._conn
        def getCursor(self):
                return self._cursor
        def setCursor(self, cursor):
                self._cursor = cursor
        def setResults(self, res):
                self._res = res
        def getResults(self):
                return self._res

def app(func):
        def new(*args):
                return func(*args)
        return AppFunctor.pure(new)

class AppFunctor:
        """Applicative functor (well it has grown to a monad)"""
        _inner = None
        def __init__(self,inner):
                self._inner = inner
        # (a -> b) -> f a -> f b
        @staticmethod
        def fmap(func, functor):
                inner = functor.getInner()
                inner = func(inner)
                return AppFunctor.pure(inner)
        # reverse application: f a -> (f a -> b) -> b
        def __add__(self, func):
                return func(self)
        # normal bind  f a -> (a -> f b) -> f b
        def __ge__(self, func):
                inner = self.getInner()
                return func(inner)
        # a -> f a
        @staticmethod
        def pure(inner):
                return AppFunctor(inner)
        # f (a -> b) -> f a -> f b 
        @staticmethod
        def ap(apfunctor, connfunctor):
                inner = connfunctor.getInner()
                func = apfunctor.getInner()
                return AppFunctor.pure(func(inner))
        def __sub__(self, other):
                return AppFunctor.ap(other, self)
        def getInner(self):
                return self._inner
        # f (a -> b) -> a -> b
        # injects an argument into functor
        def inject(self, arg):
                func = self.getInner()
                return func(arg)
        def __mul__(self, arg):
                return self.inject(arg)



@app
def getCursor(obj):
        obj.setCursor(obj.getConn().cursor())
        return obj

@app
def execute(sql):
        @app
        def inner2(obj):
                obj.getCursor().execute(sql)
                return obj

        return inner2

@app
def fetchAll(obj):
        obj.setResults(obj.getCursor().fetchall())
        return obj

def connectToDb():
        conn = connect(host="mysql1.sdfsdf.nl", passwd="dsfsdf", user="goodforall", db="fssdf_eu",ssl="true")
        obj = ConnectionObject(conn)
        return AppFunctor.pure(obj)



test =  connectToDb() - getCursor - execute * 'select * from orders limit 10'  - fetchAll

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(test.getInner().getResults())
