Discussion:
[Haskell-cafe] encoding for least fixpoint
ben
2009-03-17 23:36:21 UTC
Hello,

I am trying to understand the definition of (co)inductive types in
Haskell. After reading the beginning of Vene's thesis [1] I was happy,
because I understood the definition of the least fixpoint:

newtype Mu f = In (f (Mu f)).

But this definition coincides with his definition of the greatest fixpoint

newtype Nu f = Wrap (f (Nu f)),

which in reality is not true
(e.g. for
f x = a * x -- the 'stream functor'
the type Mu f should be empty, shouldn't it?).

Then I stumbled over a blog entry of Shin-Cheng Mu [2] and from there
over an article of Wadler [3], where the least fixpoint is encoded as

Lfix X. F X = All X. (F X -> X) -> X.

and the greatest fixpoint as

Gfix X. F X = Exists X. (X -> F X) * X.

I would like to understand these definitions, or get an intuition about
their meaning.
Could somebody give some explanation or a pointer to a place where I can
find one?

Thanks a lot in advance,
ben


[1] http://www.cs.ut.ee/~varmo/papers/thesis.pdf
[2]
http://www.iis.sinica.edu.tw/~scm/2007/encoding-inductive-and-coinductive-types-in-polymorphic-lambda-calculus/
[3]http://homepages.inf.ed.ac.uk/wadler/papers/free-rectypes/free-rectypes.txt
Dan Doel
2009-03-18 07:22:56 UTC
Post by ben
I am trying to understand the definition of (co)inductive types in
Haskell. After reading the beginning of Vene's thesis [1] I was happy,
newtype Mu f = In (f (Mu f)).
But this definition coincides with his definition of the greatest fixpoint
newtype Nu f = Wrap (f (Nu f)),
which in reality is not true
(e. g. for
f x = a * x -- the 'stream functor'
the type Mu f should be empty, isn't it?).
This is true in a categorical semantics where initial algebras and final
coalgebras are distinct, like in a total language that gets its semantics from
sets and total functions thereon. However, Haskell gets its semantics (modulo
some potential weirdness in IO that people have been discussing lately) from
categories of partial orders and monotone functions. It turns out that you can
show that initial algebras and final coalgebras coincide in such a category,
and so least and greatest fixed points are the same in Haskell (I don't know
how to demonstrate this; my fairly cursory searching hasn't revealed any very
good online references on DCPO categories and algebraic semantics), which is
why there's no distinction in the data declaration (as opposed to, say, Agda).
Post by ben
Then I stumbled over a blog entry of Shin-Cheng Mu [2] and from there
over an article of Wadler [3], where the least fixpoint is encoded as
Lfix X. F X = All X. (F X -> X) -> X.
and the greatest fixpoint as
Gfix X. F X = Exists X. (X -> F X) * X.
I would like to understand these definitions, or get an intuition about
their meaning.
Could somebody give some explanation or a pointer to a place where I can
find one?
You can glean some of this from the initial algebra/final coalgebra
definitions of (co)data. The least fixed point Mu F of F is an F-algebra,
which means there's an operation:

in : F (Mu F) -> Mu F

And Mu F is initial in the category of F-algebras, which means that for any
other F-algebra (X, f : F X -> X), there's a unique algebra homomorphism:

cata_f : Mu F -> X

such that: cata_f . in = f . map cata_f. And you can finesse that in Haskell
with higher-order functions to make it:

cata :: forall x. (F x -> x) -> Mu F -> x

And, I suppose, since initial algebras Mu F correspond uniquely with such
parameterized algebra homomorphisms, you're safe in *representing* them as
such, which gives you:

Mu F ~ forall x. (F x -> x) -> x

which is what the Lfix equation gets you above.
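
As a concrete sanity check, here is a minimal sketch (the names MuC, cataC,
zeroC, succC and toInt are mine, purely for illustration) with F = Maybe, so
that Mu Maybe is the natural numbers and the encoding above is their Church
representation:

{-# LANGUAGE RankNTypes #-}

newtype MuC f = MuC (forall x. (f x -> x) -> x)

cataC :: (f x -> x) -> MuC f -> x
cataC alg (MuC k) = k alg

zeroC :: MuC Maybe
zeroC = MuC (\alg -> alg Nothing)

succC :: MuC Maybe -> MuC Maybe
succC n = MuC (\alg -> alg (Just (cataC alg n)))

-- folding with the evident Maybe-algebra recovers an ordinary Int
toInt :: MuC Maybe -> Int
toInt = cataC (maybe 0 (+ 1))
-- toInt (succC (succC zeroC)) == 2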

In the other case you have final coalgebras Nu F, which come with a coalgebra
operation:

out : Nu F -> F (Nu F)

and for any other F-coalgebra (X, f : X -> F X), a unique homomorphism:

ana_f : X -> Nu F

which, parameterizing by f in the same way gets us:

ana :: forall x. (x -> F x) -> x -> Nu F

Which results in a Nu F this time. So suppose we took the definition:

ana = C

for some Haskell constructor C. Well, that means we need a constructor with
the type of ana, and that is exactly the existential:

data Nu f = forall x. C (x -> f x) x -- exists x. (x -> F x) * x

which gets you the Gfix equation above.
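
Dually, a minimal sketch of that existential encoding (again with made-up
names NuC, anaC, outC, and using the stream functor from the original
question):

{-# LANGUAGE ExistentialQuantification #-}

data NuC f = forall x. NuC (x -> f x) x

anaC :: (x -> f x) -> x -> NuC f
anaC = NuC

outC :: Functor f => NuC f -> f (NuC f)
outC (NuC step s) = fmap (anaC step) (step s)

-- the stream functor F x = (a, x) gives infinite streams, e.g. the naturals:
nats :: NuC ((,) Integer)
nats = anaC (\n -> (n, n + 1)) 0
-- repeated applications of outC peel off 0, 1, 2, ...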

I realize some of that was hand-wavey, and maybe someone with more expertise
could flesh it out a bit, but hopefully that helps.

Cheers,
-- Dan
Duncan Coutts
2009-03-18 09:28:35 UTC
Post by Dan Doel
Post by ben
I am trying to understand the definition of (co)inductive types in
Haskell. After reading the beginning of Vene's thesis [1] I was happy,
newtype Mu f = In (f (Mu f)).
But this definition coincides with his definition of the greatest fixpoint
newtype Nu f = Wrap (f (Nu f)),
which in reality is not true
(e. g. for
f x = a * x -- the 'stream functor'
the type Mu f should be empty, isn't it?).
This is true in a categorical semantics where initial algebras and final
coalgebras are distinct, like in a total language that gets its semantics from
sets and total functions thereon. However, Haskell gets its semantics (modulo
some potential weirdness in IO that people have been discussing lately) from
categories of partial orders and monotone functions. It turns out that you can
show that initial algebras and final coalgebras coincide in such a category,
and so least and greatest fixed points are the same in Haskell (I don't know
how to demonstrate this; my fairly cursory searching hasn't revealed any very
good online references on DCPO categories and algebraic semantics), which is
why there's no distinction in the data declaration (as opposed to, say, Agda).
You can explain it to yourself (not a proof) by writing out the example
for lists and co-lists along with fold for the list and unfold function
for the co-list. Then write conversion functions between them. You can
go from lists to co-lists using fold or unfold, but going from co-list
to list requires general recursion. And that's the key of course. In
Agda or something that distinguishes inductive and co-inductive types
you could do the first two conversions but not the conversion in the
other direction.
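
For reference, here is a rough sketch of that exercise (all names here are
mine; ListF is the usual base functor for lists):

{-# LANGUAGE RankNTypes, ExistentialQuantification #-}

data ListF a x = NilF | ConsF a x

instance Functor (ListF a) where
  fmap _ NilF        = NilF
  fmap g (ConsF a x) = ConsF a (g x)

newtype List a   = List (forall x. (ListF a x -> x) -> x)  -- inductive (Mu)
data    CoList a = forall s. CoList (s -> ListF a s) s     -- coinductive (Nu)

foldList :: (ListF a x -> x) -> List a -> x
foldList alg (List k) = k alg

unfoldCoList :: (s -> ListF a s) -> s -> CoList a
unfoldCoList = CoList

nilCo :: CoList a
nilCo = CoList (\_ -> NilF) ()

consCo :: a -> CoList a -> CoList a
consCo a (CoList step s0) = CoList step' Nothing
  where
    step' Nothing  = ConsF a (Just s0)
    step' (Just s) = fmap Just (step s)

consL :: a -> List a -> List a
consL a xs = List (\alg -> alg (ConsF a (foldList alg xs)))

-- list -> co-list: just a fold, no general recursion needed
listToCoList :: List a -> CoList a
listToCoList = foldList alg
  where
    alg NilF         = nilCo
    alg (ConsF a ys) = consCo a ys

-- co-list -> list: 'go' is general recursion, and may diverge on an
-- infinite co-list, which is exactly the point above
coListToList :: CoList a -> List a
coListToList (CoList step s0) = go s0
  where
    go s = case step s of
             NilF       -> List (\alg -> alg NilF)
             ConsF a s' -> consL a (go s')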

Duncan
Dan Doel
2009-03-18 17:52:49 UTC
Post by Duncan Coutts
You can explain it to yourself (not a proof) by writing out the example
for lists and co-lists along with fold for the list and unfold function
for the co-list. Then write conversion functions between them. You can
go from lists to co-lists using fold or unfold, but going from co-list
to list requires general recursion. And that's the key of course. In
Agda or something that distinguishes inductive and co-inductive types
you could do the first two conversions but not the conversion in the
other direction.
Yeah, I know how to show it can be done in Haskell. I meant more along the
lines of a proof (sketch) starting from a definition of a CPO category and
ending with the fact that initial algebras and terminal coalgebras are the
same.

Because, of course, in addition to going from "we have general recursion"
to "least and greatest fixed points are the same," we can do the opposite:

fix :: (a -> a) -> a
fix = cata (\(f,fixf) -> f fixf) . ana (\f -> (f,f))
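
Spelled out as a sketch (Fix, cata', ana' and fix' are my names; cata' and
ana' are themselves written with ordinary recursion here, the point is just
that their composite computes fix; the functor involved is ((,) (a -> a)),
i.e. F x = (a -> a, x), using the Functor ((,) e) instance from base):

newtype Fix f = In' { out' :: f (Fix f) }

cata' :: Functor f => (f a -> a) -> Fix f -> a
cata' alg = alg . fmap (cata' alg) . out'

ana' :: Functor f => (a -> f a) -> a -> Fix f
ana' coalg = In' . fmap (ana' coalg) . coalg

fix' :: (a -> a) -> a
fix' = cata' (\(f, fixf) -> f fixf) . ana' (\f -> (f, f))
-- e.g. take 3 (fix' (1:)) == [1,1,1]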

I suppose it's obvious that the least fixed point of 'F x = a*x' isn't empty
in a category like the one people use to model Haskell, because every type is
inhabited by at least one element, _|_. Once you have that as a base, you can
construct plenty of elements in ways that would be acceptable even in a total
language:

(1, _|_), (2, (1, _|_)), etc.

What remains is to show that you can get the infinite elements somehow (and
that the greatest fixed point also contains the above values, though that
seems less radical).

Anyhow, lastly I wanted to mention to ben that it's probably less hand-wavey
to explain where you get:

Mu F = forall x. (F x -> x) -> x

from by mentioning the paper "Build, Augment and Destroy. Universally". That
paper develops a slightly tweaked semantics that takes as its basis, for some
functor F, essentially pairs (Mu' F, fold') where Mu' F is an F-algebra, and
fold' is obviously similar in function to the cata you get from an initial
algebra. You define a data type as a limit for such things, which gets you a
unique morphism that, when you finesse it back into Haskell types, gives you:

build :: (forall x. (F x -> x) -> x) -> Mu F

And the paper of course goes on to show that initial algebras also fit with
this construction, and vice versa, so the two formulations are equivalent.
But, the above type looks a lot like the one we had for terminal coalgebras in
my last mail, so the definition:

newtype Mu f = Build (forall x. (f x -> x) -> x)

seems in order. I realize that was still pretty vague, but the paper gives a
rigorous treatment which should hopefully clear up the details.
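
As a small sketch of what that buys you (names mine): with this
representation, build is literally the constructor, and the familiar
cata/build fusion law holds by definition.

{-# LANGUAGE RankNTypes #-}

newtype Mu' f = Build' (forall x. (f x -> x) -> x)

build' :: (forall x. (f x -> x) -> x) -> Mu' f
build' = Build'

cataB :: (f x -> x) -> Mu' f -> x
cataB alg (Build' g) = g alg

-- fusion, definitionally:  cataB alg (build' g) = g alg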

(There's an analogous construction for greatest fixed points in the paper, but
it gets you

destroy :: (forall x. (x -> F x) -> x -> c) -> Nu F -> c

which gives you interesting ways to take apart coinductive objects, not build
them. Of course, I suppose you could recognize that this can be tweaked into:

destroy :: ((exists x. (x -> F x) * x) -> c) -> Nu F -> c

and further suppose that the above is just an identity function, modulo some
wrapping to make the type nice, which would get you to a similar place.)

Cheers,
-- Dan
Ryan Ingram
2009-03-18 09:15:32 UTC
Post by ben
Then I stumbled over a blog entry of Shin-Cheng Mu [2] and from there
over an article of Wadler [3], where the least fixpoint is encoded as
Lfix X. F X  =  All X. (F X -> X) -> X.
and the greatest fixpoint as
Gfix X. F X  =  Exists X. (X -> F X) * X.
I would like to understand these definitions, or get an intuition about
their meaning.
Could somebody give some explanation or a pointer to a place where I can
find one?
This is interesting. So, there are two things going on. First, we
only allow "x" to appear in positive position in f; for standard data
types in Haskell, this is equivalent to saying that you can write an
instance of Functor for f. (I have an interesting proof of this which
I think I will write up for the Monad.Reader)

Once you have that, then you can get to defining fixed points on those
types. Lfix encodes a fixed point by its fold. For example, consider
data Cell x xs = Nil | Cons x xs
Lfix (Cell a) is then equivalent to (forall b. (Cell a b -> b) ->
b), which, if you expand the function argument by case analysis on
Cell, becomes (forall b. b -> (a -> b -> b) -> b); that is, foldr f z
xs = xs z f.
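
Concretely, for lists (a sketch; ChurchList and the conversions are my
names):

{-# LANGUAGE RankNTypes #-}

type ChurchList a = forall b. b -> (a -> b -> b) -> b

fromList :: [a] -> ChurchList a
fromList xs = \z f -> foldr f z xs

toList :: ChurchList a -> [a]
toList xs = xs [] (:)
-- toList (fromList [1,2,3]) == [1,2,3]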

This is where the "for free" comes into the description; it's like
the encoding of datatypes in System F without actual datatypes/case.
Let "Pair a b" be an abbreviation for "forall c. (a -> b -> c)", then
you have:

pair :: forall A B. A -> B -> Pair A B
pair = /\A B -> \a b -> /\C -> \k -> k a b

But in this case, instead of building simple datatypes, we are instead
building *recursive* datatypes, without actually requiring fixed
points in the language!

Gfix is similar, but instead of encoding a type by its fold, it
encodes it as a state machine. So a Gfix ((,) Integer) stream of the
natural numbers could look something like this: Gfix (\x -> (x, x+1))
0; the first element of the pair is the stream result, and the second
is the existential state which is used to generate the next pair.

For reference, here's some code implementing this concept. Keep in
mind that we are working in a subset of Haskell without fixed points;
so no recursion is allowed.

{-# LANGUAGE ExistentialQuantification, RankNTypes #-}
module FixTypes where

newtype Lfix f = Lfix (forall x. (f x -> x) -> x)

l_in :: Functor f => f (Lfix f) -> Lfix f
l_in fl = Lfix (\k -> k (fmap (\(Lfix j) -> j k) fl))
-- derivation of l_in was complicated!

-- l_out :: Functor f => Lfix f -> f (Lfix f)

-- (note: current versions of base already provide this instance)
instance Functor (Either a) where
  fmap _ (Left a) = Left a
  fmap f (Right x) = Right (f x)

test_l :: Lfix (Either Int)
test_l = Lfix (\k -> k $ Right $ k $ Right $ k $ Left 2)


data Gfix f = forall x. Gfix (x -> f x) x

g_out :: Functor f => Gfix f -> f (Gfix f)
g_out (Gfix f s) = fmap (\x -> Gfix f x) (f s)

-- g_in :: Functor f => f (Gfix f) -> Gfix f

-- (note: current versions of base already provide this instance)
instance Functor ((,) a) where
  fmap f ~(a, x) = (a, f x)

test_g :: Gfix ((,) Integer)
test_g = Gfix (\x -> (x, x+1)) 0
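
A possible observation helper for test_g (takeG is my name, and it does use
ordinary recursion, so it lives outside the no-fixpoint subset above):

takeG :: Int -> Gfix ((,) a) -> [a]
takeG 0 _ = []
takeG n g = let (a, g') = g_out g in a : takeG (n - 1) g'
-- takeG 5 test_g == [0,1,2,3,4]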
Post by ben
[3]http://homepages.inf.ed.ac.uk/wadler/papers/free-rectypes/free-rectypes.txt
This introduces a new type, T = Lfix X. F X, satisfying the isomorphism
T ~ F T. Note that it is an isomorphism, not an equality: the type
comes equipped with functions in : T -> F T and out : F T -> T.
While I was able to derive "in" for Lfix, and "out" for Gfix, I don't
think it's possible to derive a generic "out" for Lfix or "in" for
Gfix; maybe such a function *can* always be generated (specific to a
particular type), but I don't think it's possible to do so while just
relying on Functor. Perhaps stronger generic programming methods are
necessary.

-- ryan
Ryan Ingram
2009-03-18 09:20:13 UTC
Post by Ryan Ingram
Let "Pair a b" be an abbreviation for "forall c. (a -> b -> c)",
This should say that Pair a b is an abbreviation for
forall c. (a -> b -> c) -> c

-- ryan
David Menendez
2009-03-18 15:10:23 UTC
Post by Ryan Ingram
newtype Lfix f = Lfix (forall x. (f x -> x) -> x)
l_in :: Functor f => f (Lfix f) -> Lfix f
l_in fl = Lfix (\k -> k (fmap (\(Lfix j) -> j k) fl))
-- derivation of l_in was complicated!
I found l_in easiest to write in terms of cata and build.

cata :: (f a -> a) -> Lfix f -> a
cata f (Lfix t) = t f

build :: (forall x. (f x -> x) -> c -> x) -> c -> Lfix f
build t c = Lfix (\f -> t f c)

l_in :: Functor f => f (Lfix f) -> Lfix f
l_in = build (\f -> f . fmap (cata f))

And, dually,

ana :: (a -> f a) -> a -> Gfix f
ana f a = Gfix f a

destroy :: (forall x. (x -> f x) -> x -> c) -> Gfix f -> c
destroy t (Gfix f x) = t f x

g_out :: Functor f => Gfix f -> f (Gfix f)
g_out = destroy (\f -> fmap (ana f) . f)
Post by Ryan Ingram
Post by ben
This introduces a new type, T = Lfix X. F X, satisfying the isomorphism
T ~ F T.  Note that it is an isomorphism, not an equality:  the type
comes equipped with functions in : T -> F T and out : F T -> T.
While I was able to derive "in" for Lfix, and "out" for Gfix, I don't
think it's possible to derive a generic "out" for Lfix or "in" for
Gfix; maybe such a function *can* always be generated (specific to a
particular type), but I don't think it's possible to do so while just
relying on Functor.  Perhaps stronger generic programming methods are
necessary.
l_out :: Functor f => Lfix f -> f (Lfix f)
l_out = cata (fmap l_in)

g_in :: Functor f => f (Gfix f) -> Gfix f
g_in = ana (fmap g_out)
--
Dave Menendez <***@zednenem.com>
<http://www.eyrie.org/~zednenem/>
Ryan Ingram
2009-03-18 16:15:17 UTC
Post by David Menendez
l_out :: Functor f => Lfix f -> f (Lfix f)
l_out = cata (fmap l_in)
g_in :: Functor f => f (Gfix f) -> Gfix f
g_in = ana (fmap g_out)
Well, you just blew my mind. I had an informal proof in my head of
why g_in shouldn't be possible to write, but I must be missing
something huge.

Looks like I need to get out some paper and reduce this by hand,
because there is some black magic going on here :)

-- ryan
Ryan Ingram
2009-03-18 21:47:48 UTC
Post by David Menendez
l_out :: Functor f => Lfix f -> f (Lfix f)
l_out = cata (fmap l_in)
g_in :: Functor f => f (Gfix f) -> Gfix f
g_in = ana (fmap g_out)
OK, I understand these now.

But they both seem to have significant performance implications, which
I think are related to the "tail-in-terms-of-foldr" problem.

Consider this definition:

safeTail [] = []
safeTail (x:xs) = xs

The simplest way to write this directly with foldr is:

safeTail' = fst . foldr f ([], []) where
  f x (t, l) = (l, x:l)

But the difference is that safeTail' here reconstructs the list from
scratch, instead of reusing the existing data as safeTail does. This
means that (safeTail' . safeTail' . safeTail' ...) takes O(n^2) time
instead of O(n).

Similarly, l_out and g_in both seem to add many administrative
redexes to every element of the object being manipulated; consider

gid = g_in . g_out

then consider
(gid . gid . gid . gid) x

The result gets filled up quickly with administrative redexes that look like
Gfix (fmap g_out) $ g_out $ Gfix (fmap g_out) $ g_out $ Gfix ...

Is there a way to solve this problem and still maintain the nice
totality guarantees that you get from System F without fix?

-- ryan

Dan Doel
2009-03-18 20:29:43 UTC
Post by Ryan Ingram
Post by ben
This introduces a new type, T = Lfix X. F X, satisfying the isomorphism
T ~ F T. Note that it is an isomorphism, not an equality: the type
comes equipped with functions in : T -> F T and out : F T -> T.
While I was able to derive "in" for Lfix, and "out" for Gfix, I don't
think it's possible to derive a generic "out" for Lfix or "in" for
Gfix; maybe such a function *can* always be generated (specific to a
particular type), but I don't think it's possible to do so while just
relying on Functor. Perhaps stronger generic programming methods are
necessary.
The isomorphism comes from the fact that f (Nu/Mu f) is an f-(co)algebra.

fmap l_in :: Functor f => f (f (Mu f)) -> f (Mu f)
fmap g_out :: Functor f => f (Nu f) -> f (f (Nu f))

Because of this and initiality/finality, there is a unique morphism from Mu f
to f (Mu f), and from f (Nu f) to Nu f, namely:

cata (fmap l_in) :: Functor f => Mu f -> f (Mu f)
ana (fmap g_out) :: Functor f => f (Nu f) -> Nu f

where

cata f (Lfix k) = k f
ana g x = Gfix g x

Of course, this isomorphism is subject to caveats about bottom and seq in
Haskell.

-- Dan
Wouter Swierstra
2009-03-18 09:53:22 UTC
Hi Ben,
Post by ben
But this definition coincides with his definition of the greatest fixpoint
In Haskell the least and greatest fixed points coincide.
(Categorically, this is called "algebraically compact" I think). You
can still "fake" coinductive types to some degree by consistently
using unfolds rather than folds.
Post by ben
Then I stumbled over a blog entry of Shin-Cheng Mu [2] and from there
over an article of Wadler [3], where the least fixpoint is encoded as
Lfix X. F X = All X. (F X -> X) -> X.
and the greatest fixpoint as
Gfix X. F X = Exists X. (X -> F X) * X.
I would like to understand these definitions, or get an intuition about
their meaning.
So here's my attempt at an explanation.

For every least fixed point of a functor:

data Mu f = In (f (Mu f))

you can define a fold:

fold :: Functor f => (f a -> a) -> Mu f -> a
fold algebra (In t) = algebra (fmap (fold algebra) t)

Now your definition of Lfix above basically identifies the data type
with all possible folds over it. (I suspect you will need some
parametricity result to show that this is really an isomorphism)

For codata, instead of having a fold you get an unfold:

unfold :: Functor f => (a -> f a) -> a -> Nu f
unfold coalg x = Wrap (fmap (unfold coalg) (coalg x))

And your Gfix type above identifies every codata type with its unfold.
To see this, you need to realise that:

forall a . (a -> f a) -> a -> Nu f

is isomorphic to:

forall a . (a -> f a , a) -> Nu f

is isomorphic to:

(exists a . (a -> f a, a)) -> Nu f

which gives you one direction of the iso.
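
In code, that direction looks roughly like this (Packed, Pack and fromPacked
are my names; it assumes the Nu/Wrap definition from the first mail, the
unfold above, and ExistentialQuantification):

data Packed f = forall a. Pack (a -> f a) a   -- exists a. (a -> f a, a)

fromPacked :: Functor f => Packed f -> Nu f
fromPacked (Pack coalg x) = unfold coalg x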

Now in case you think this is all airy-fairy category theory, there's
a really nice paper "Stream fusion: from lists to streams to nothing
at all" that shows how to use this technology to get some serious
speedups on all kinds of list-processing functions.
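
The stream type in that paper is essentially the Gfix encoding again; roughly
(my minimal rendering, with the paper's extra Skip constructor that makes
fusion work for filters and the like):

{-# LANGUAGE ExistentialQuantification #-}

data Step a s = Done | Yield a s | Skip s
data Stream a = forall s. Stream (s -> Step a s) s

stream :: [a] -> Stream a
stream xs0 = Stream next xs0
  where
    next []       = Done
    next (x : xs) = Yield x xs

mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream next s0) = Stream next' s0
  where
    next' s = case next s of
                Done       -> Done
                Skip s'    -> Skip s'
                Yield x s' -> Yield (f x) s'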

Hope this helps,

Wouter
Benedikt Ahrens
2009-03-18 16:41:04 UTC
Thanks a lot to all of you for your help.

It took me some time to realize that the only difference between
Vene and Wadler is in fact that Wadler gives an explicit representation
of the fixpoints - which answers the question of existence.

I will spend some more time on digesting all the information :-) and
will try to find some information about algebraic compactness.

Greetings
ben