Discussion:
[Haskell-cafe] Cyclic Inclusions
C.M.Brown
2008-08-11 21:03:09 UTC
Permalink
Hi,

I have a question about cyclic inclusions. It appears in the Haskell 98
report that mutually recursive modules are allowed, however GHC complains
at any Haskell project that has cyclic inclusions (implicit or explicit).
Am I right in thinking that this is a GHC limitation?


http://www.haskell.org/onlinereport/modules.html


Kind regards,
Chris.
Don Stewart
2008-08-11 21:07:00 UTC
Permalink
Post by C.M.Brown
Hi,
I have a question about cyclic inclusions. It appears in the Haskell 98
report that mutually recursive modules are allowed, however GHC complains
at any Haskell project that has cyclic inclusions (implicit or explicit).
Am I right in thinking that this is a GHC limitation?
http://www.haskell.org/onlinereport/modules.html
GHC provides a mechanism to support mutually recursive modules, but
you must break the cycle manually, via a boot file.

http://www.haskell.org/ghc/docs/latest/html/users_guide/separate-compilation.html#mutual-recursion
C.M.Brown
2008-08-11 22:02:52 UTC
Permalink
Hi Don,
Post by Don Stewart
GHC provides a mechanism to support mutually recursive modules, but
you must break the cycle manually, via a boot file.
http://www.haskell.org/ghc/docs/latest/html/users_guide/separate-compilation.html#mutual-recursion
Yes, I saw that, thanks! I guess this is because it's hard to compile a
mutually recursive module...


Chris.
a***@spamcop.net
2008-08-12 00:53:44 UTC
Permalink
G'day all.
Post by C.M.Brown
Yes, I saw that, thanks! I guess this is because it's hard to compile a
mutually recursive module...
It's because you don't need to declare the types of exported definitions.

Consider, this highly artificial example:

module A where

import B

f (x,y) = g (x,'A')


module B where

import A

g (x,y) = f (True,y)

To infer the types of f and g, you need to analyse both modules together.

And yes, some people think that this is a bug in the specification.

Cheers,
Andrew Bromage
C.M.Brown
2008-08-12 09:59:48 UTC
Permalink
Andrew,

Thanks very much for your reponse. It was very helpful; this makes a lot
of sense!

Regards,
Chris.
Post by a***@spamcop.net
G'day all.
Post by C.M.Brown
Yes, I saw that, thanks! I guess this is because it's hard to compile a
mutually recursive module...
It's because you don't need to declare the types of exported definitions.
module A where
import B
f (x,y) = g (x,'A')
module B where
import A
g (x,y) = f (True,y)
To infer the types of f and g, you need to analyse both modules together.
And yes, some people think that this is a bug in the specification.
Cheers,
Andrew Bromage
_______________________________________________
Haskell-Cafe mailing list
http://www.haskell.org/mailman/listinfo/haskell-cafe
Thomas Davie
2008-08-12 13:32:14 UTC
Permalink
Post by C.M.Brown
Andrew,
Thanks very much for your reponse. It was very helpful; this makes a lot
of sense!
Post by a***@spamcop.net
And yes, some people think that this is a bug in the specification.
I'm not sure that it does make a lot of sense -- we allow (mutually)
recursive functions, even though they come with an efficiency
penalty. Why should we not allow (mutually) recursive modules, even
though they too come with an efficiency penalty. This is even an
example where the efficiency loss is *only* at compile time, and only
happens once, so it's somewhat a better situation than allowing
mutually recursive functions.

I'd say it falls very heavily into the ghc-bug category, not the spec
bug category (even if there's reasons for the bug existing in ghc).

Bob
C.M.Brown
2008-08-12 13:43:37 UTC
Permalink
Post by Thomas Davie
I'm not sure that it does make a lot of sense -- we allow (mutually)
recursive functions, even though they come with an efficiency
penalty. Why should we not allow (mutually) recursive modules, even
though they too come with an efficiency penalty. This is even an
example where the efficiency loss is *only* at compile time, and only
happens once, so it's somewhat a better situation than allowing
mutually recursive functions.
I'd say it falls very heavily into the ghc-bug category, not the spec
bug category (even if there's reasons for the bug existing in ghc).
Perhaps it would be better for GHC to allow compilation of cyclic
inclusions via a flag? -fcyclic or something?
Or, to do it by default unless a -fno-cyclic flag is raised?
It does seem strange that the only way to compile cyclic modules is
to hack together a build using hi-boot files.

Regards,
Chris.
a***@spamcop.net
2008-08-12 14:01:06 UTC
Permalink
G'day all.
Post by Thomas Davie
I'm not sure that it does make a lot of sense -- we allow (mutually)
recursive functions, even though they come with an efficiency penalty.
Why should we not allow (mutually) recursive modules, even though they
too come with an efficiency penalty.
The problem is not mutually recursive modules. Plenty of statically
typed languages support mutually recursive modules.

The problem is that it's impossible in general to say what the
"interface" of a module is by examining the module alone. This is a
very unusual property as real-world programming languages go.

You could fix this by, for example, requiring that all symbols
exported from a module have an explicit type annotation. Or, if you
think that's not lazy enough, it could be an error to use an imported
symbol that doesn't have an explicit type annotation. You could even
formalise .hi-boot files if you were truly desperate.

If the Haskell spec requires that multiple modules be analysed
simultaneously (or multiple times to fixpoint), then that's a bug in
the spec. Separate compilation is too important.

Cheers,
Andrew Bromage
Henning Thielemann
2008-08-12 14:28:04 UTC
Permalink
Post by a***@spamcop.net
G'day all.
Post by Thomas Davie
I'm not sure that it does make a lot of sense -- we allow (mutually)
recursive functions, even though they come with an efficiency penalty.
Why should we not allow (mutually) recursive modules, even though they
too come with an efficiency penalty.
The problem is not mutually recursive modules. Plenty of statically
typed languages support mutually recursive modules.
The problem is that it's impossible in general to say what the
"interface" of a module is by examining the module alone. This is a
very unusual property as real-world programming languages go.
You could fix this by, for example, requiring that all symbols
exported from a module have an explicit type annotation.
As far as I know the real difficulties come from mutually recursive class
definitions.
a***@spamcop.net
2008-08-13 03:27:17 UTC
Permalink
G'day all.
Post by Henning Thielemann
As far as I know the real difficulties come from mutually recursive
class definitions.
I wouldn't be surprised, because that's a more blatant instance of the
same problem. With classes and instances, there is no way to specify
whether or not they are exported or not, which makes it that much
harder to identify what the interface of a module is.

Cheers,
Andrew Bromage
Thomas Davie
2008-08-12 14:35:41 UTC
Permalink
Post by a***@spamcop.net
G'day all.
Post by Thomas Davie
I'm not sure that it does make a lot of sense -- we allow (mutually)
recursive functions, even though they come with an efficiency
penalty.
Why should we not allow (mutually) recursive modules, even though they
too come with an efficiency penalty.
The problem is not mutually recursive modules. Plenty of statically
typed languages support mutually recursive modules.
The problem is that it's impossible in general to say what the
"interface" of a module is by examining the module alone. This is a
very unusual property as real-world programming languages go.
You could fix this by, for example, requiring that all symbols
exported from a module have an explicit type annotation. Or, if you
think that's not lazy enough, it could be an error to use an imported
symbol that doesn't have an explicit type annotation. You could even
formalise .hi-boot files if you were truly desperate.
If the Haskell spec requires that multiple modules be analysed
simultaneously (or multiple times to fixpoint), then that's a bug in
the spec. Separate compilation is too important.
Why is separate compilation important? I can understand your point
about a module on it's own not being analyzable, and that that's an
odd property -- on the other hand, the rapidly emerging "atomic" unit
in Haskell is the package, so I see no reason why modules within one
package shouldn't depend on each other.

Finally, as chris suggests, if separate compilation is important to
you, why not have a flag in ghc -frequire-hi-boot or something?

Bob
a***@spamcop.net
2008-08-13 03:06:48 UTC
Permalink
G'day all.
Post by Thomas Davie
Why is separate compilation important?
I'm a little shocked that anyone on this list should have to ask this
question. Two people have asked it now.

The simplest answer is that unless your program fits in cache, it
takes longer to compile two modules concurrently than it takes to
compile them separately. This is even more true if one of those
modules does not actually need recompilation.

The longer answer is that in the Real World, programmer time is
far, far more precious than computer time. Every second that the
programmer waits for the computer to finish some task, there is a
priority inversion.

It's therefore highly desirable that jobs done by computer are as
fast as possible or, failing that, as predictable as possible, so
the programmer knows to do something else while waiting.

Separate compilation puts a predictable upper bound on the amount
of recompilation that has to be done given some set of changes.
True, so does requiring recompilation of a package as a whole, but
Haskell development would then be notorious for its sluggishness.
Post by Thomas Davie
I can understand your point
about a module on it's own not being analyzable, and that that's an odd
property -- on the other hand, the rapidly emerging "atomic" unit in
Haskell is the package, so I see no reason why modules within one
package shouldn't depend on each other.
Implicit in my working definition of "module" is that a module has a
well-defined interface. Am I the only one who has that understanding?
(Well, me and David Parnas, at any rate.)

For the record, I have no problem with modules depending on each other,
so long as they only depend on their well-defined interfaces.
Post by Thomas Davie
Finally, as chris suggests, if separate compilation is important to
you, why not have a flag in ghc -frequire-hi-boot or something?
Well, if I wanted separate header files, and the inevitable multiple-
maintenance headaches associated with them, I'd program in C. Except
for mutually recursive modules, GHC can and does generate header files
automatically, so I don't see why my time should be wasted doing the
job of a compiler.

If something is preventing the compiler from doing that job, then that
something should be fixed.

Cheers,
Andrew Bromage
Thomas Davie
2008-08-13 08:26:57 UTC
Permalink
Post by a***@spamcop.net
G'day all.
Post by Thomas Davie
Why is separate compilation important?
I'm a little shocked that anyone on this list should have to ask this
question. Two people have asked it now.
The simplest answer is that unless your program fits in cache, it
takes longer to compile two modules concurrently than it takes to
compile them separately. This is even more true if one of those
modules does not actually need recompilation.
The longer answer is that in the Real World, programmer time is
far, far more precious than computer time. Every second that the
programmer waits for the computer to finish some task, there is a
priority inversion.
Really? So you're using YHC then? It after all compiles *much*
faster than GHC, but produces slower binaries. To be honest, ghc
compiles things so fast (at least on any of my systems) that I
couldn't care less if it took 10 times as long (I would however like
some added convenience for that time spent)
Post by a***@spamcop.net
It's therefore highly desirable that jobs done by computer are as
fast as possible or, failing that, as predictable as possible, so
the programmer knows to do something else while waiting.
Separate compilation puts a predictable upper bound on the amount
of recompilation that has to be done given some set of changes.
True, so does requiring recompilation of a package as a whole, but
Haskell development would then be notorious for its sluggishness.
It does? If I compile a module on which lots of other modules depend,
I have to do lots of recompilation... If I compile a module which is
in a cyclic dependancy group, I have to do lots of recompilation, I
don't see that there's a difference here.
Post by a***@spamcop.net
Post by Thomas Davie
I can understand your point
about a module on it's own not being analyzable, and that that's an odd
property -- on the other hand, the rapidly emerging "atomic" unit in
Haskell is the package, so I see no reason why modules within one
package shouldn't depend on each other.
Implicit in my working definition of "module" is that a module has a
well-defined interface. Am I the only one who has that understanding?
(Well, me and David Parnas, at any rate.)
For the record, I have no problem with modules depending on each other,
so long as they only depend on their well-defined interfaces.
That's a fair point about programming style, otoh, I don't think it's
a reason to restrict users to not using cyclic dependancies.
Post by a***@spamcop.net
Post by Thomas Davie
Finally, as chris suggests, if separate compilation is important to
you, why not have a flag in ghc -frequire-hi-boot or something?
Well, if I wanted separate header files, and the inevitable multiple-
maintenance headaches associated with them, I'd program in C. Except
for mutually recursive modules, GHC can and does generate header files
automatically, so I don't see why my time should be wasted doing the
job of a compiler.
If something is preventing the compiler from doing that job, then that
something should be fixed.
Something *does* prevent the compiler doing that job -- the fact that
ghc can't deal with cyclic module includes without an hi-boot file.
This is *exactly* my point -- I don't see why my time should be wasted
doing the job of the compiler just because I happen to have a cyclic
dependancy, that the compiler could quite happily sort out, by making
my compile time in this situation slightly longer.

If I *really* think I'm going to save more time by writing a hi-boot
file, then I would be able to turn on the option!

Bob
Henning Thielemann
2008-08-13 09:10:38 UTC
Permalink
Post by a***@spamcop.net
Post by Thomas Davie
Why is separate compilation important?
I'm a little shocked that anyone on this list should have to ask this
question. Two people have asked it now.
Really? So you're using YHC then? It after all compiles *much* faster than
GHC, but produces slower binaries. To be honest, ghc compiles things so fast
(at least on any of my systems) that I couldn't care less if it took 10 times
as long (I would however like some added convenience for that time spent)
It's the ubiquitous "computers are fast enough today" argument. I don't
buy it. We don't have compile time to waste. There will always be
computers that are much slower and have less memory than the current
customer desktop computers, there are always tasks that a computer can do
instead of doing slowed down compilation. I'm glad that we have overcome
C's way of concatening all header files together before starting
compilation.
Thomas Davie
2008-08-13 10:02:11 UTC
Permalink
Post by Henning Thielemann
Post by Thomas Davie
Post by a***@spamcop.net
Post by Thomas Davie
Why is separate compilation important?
I'm a little shocked that anyone on this list should have to ask this
question. Two people have asked it now.
Really? So you're using YHC then? It after all compiles *much*
faster than GHC, but produces slower binaries. To be honest, ghc
compiles things so fast (at least on any of my systems) that I
couldn't care less if it took 10 times as long (I would however
like some added convenience for that time spent)
It's the ubiquitous "computers are fast enough today" argument. I
don't buy it. We don't have compile time to waste. There will always
be computers that are much slower and have less memory than the
current customer desktop computers, there are always tasks that a
computer can do instead of doing slowed down compilation. I'm glad
that we have overcome C's way of concatening all header files
together before starting compilation.
If you don't want to be slowed down by the compiler, why aren't you
writing machine code directly? The point is that time saved with
convenience (often) comes at the cost of time spent while compiling.
Adding an option to ghc to allow you to not waste time writing hi-boot
files allows me to make an informed decision about whether it will
take me longer to (a) figure out *how* to write an hi-boot file (b)
actually do the writing, or if it'll take less time/effort to just let
the compiler do it for me.

Bob
a***@spamcop.net
2008-08-13 11:18:01 UTC
Permalink
G'day all.
To be honest, ghc compiles things so fast (at least on any of
my systems) that I couldn't care less if it took 10 times as long (I would
however like some added convenience for that time spent)
Have you ever compiled GHC itself? Just curious what you'd think about
a 10x speed hit there.

If it helps, think about the lifetime of a program. If you assume that
a program grows linearly over time, and that recompilations occur at
a roughly constant rate, it follows that the time spent recompiling
is O(n^2). Constant factors matter.
If I compile a module on which lots of other modules depend,
I have to do lots of recompilation... If I compile a module which is in
a cyclic dependancy group, I have to do lots of recompilation, I don't
see that there's a difference here.
If you only change the implementation of a module, not its interface,
you don't need to recompile anything that imports it. (At least, this
is true at -O0, which if you care about fast recompilation because
you're deep in development, you're probably doing.)
That's a fair point about programming style, otoh, I don't think it's a
reason to restrict users to not using cyclic dependancies.
As previously noted, cyclic dependencies alone aren't the problem.

Cheers,
Andrew Bromage
Thomas Davie
2008-08-13 11:29:39 UTC
Permalink
Post by a***@spamcop.net
G'day all.
To be honest, ghc compiles things so fast (at least on any of
my systems) that I couldn't care less if it took 10 times as long (I would
however like some added convenience for that time spent)
Have you ever compiled GHC itself? Just curious what you'd think about
a 10x speed hit there.
On the machine I'm sat at now, it takes 20 minutes. Secondly, you
would *only* pay the speed penalty when you had cyclic includes, and
you didn't use the flag to use an hi-boot file, so it would continue
to take 20 minutes.
Post by a***@spamcop.net
If it helps, think about the lifetime of a program. If you assume that
a program grows linearly over time, and that recompilations occur at
a roughly constant rate, it follows that the time spent recompiling
is O(n^2). Constant factors matter.
But the penalty would be no where near that big -- unless you happened
to have *all* your modules depend on *all* other modules.
Post by a***@spamcop.net
If I compile a module on which lots of other modules depend,
I have to do lots of recompilation... If I compile a module which is in
a cyclic dependancy group, I have to do lots of recompilation, I don't
see that there's a difference here.
If you only change the implementation of a module, not its interface,
you don't need to recompile anything that imports it. (At least, this
is true at -O0, which if you care about fast recompilation because
you're deep in development, you're probably doing.)
Well, if this turned out to take a long time, then I'd probably switch
to writing an hi-boot file... Something that I wouldn't have to waste
time doing at all if it happened that ghc was still "fast enough"
without providing one.

Bob
Aaron Denney
2008-08-20 12:23:32 UTC
Permalink
Post by a***@spamcop.net
If you only change the implementation of a module, not its interface,
you don't need to recompile anything that imports it. (At least, this
is true at -O0, which if you care about fast recompilation because
you're deep in development, you're probably doing.)
This is only true if the interface can be tracked separately from the
implementation. Which, despite the flaws, C's header files can be
coaxed into doing.
--
Aaron Denney
-><-
C.M.Brown
2008-08-13 10:43:59 UTC
Permalink
Andrew,

[...]
Post by a***@spamcop.net
For the record, I have no problem with modules depending on each other,
so long as they only depend on their well-defined interfaces.
Post by Thomas Davie
Finally, as chris suggests, if separate compilation is important to
you, why not have a flag in ghc -frequire-hi-boot or something?
Well, if I wanted separate header files, and the inevitable multiple-
maintenance headaches associated with them, I'd program in C. Except
for mutually recursive modules, GHC can and does generate header files
automatically, so I don't see why my time should be wasted doing the
job of a compiler.
If something is preventing the compiler from doing that job, then that
something should be fixed.
But isn't this exactly the point I was trying to make!? The whole point,
to me, in functional programming, is that we shouldn't have to worry about
the underlying implementation. What you've listed above are the restrictions
of a particular compiler implementation. All of which, if needed, can be
controlled by a flag if compilation performance is so important.

Regards,
Chris.
a***@spamcop.net
2008-08-13 11:03:45 UTC
Permalink
G'day all.
Post by C.M.Brown
But isn't this exactly the point I was trying to make!? The whole point,
to me, in functional programming, is that we shouldn't have to worry about
the underlying implementation.
It is not exposing an underlying implementation detail to mandate that
modules should have well-defined interfaces. If anything, it's
enforcing good programming practice.

Cheers,
Andrew Bromage
C.M.Brown
2008-08-13 11:15:08 UTC
Permalink
Andrew,
Post by a***@spamcop.net
Post by C.M.Brown
But isn't this exactly the point I was trying to make!? The whole point,
to me, in functional programming, is that we shouldn't have to worry about
the underlying implementation.
It is not exposing an underlying implementation detail to mandate that
modules should have well-defined interfaces. If anything, it's
enforcing good programming practice.
I agree absolutely that having well-defined interfaces is a good
thing. I wasn't actually referring to that, I apologise for not being
clear.

However I saw no real argument for not having cyclic inclusions. You
say we shouldn't have to spend time writing hi-boot files, and yet you also think
that GHC should not do it automatically. So we have to restrict all
programmers to never writing cyclic inclusions? :)


Kind Regards,
Chris.
a***@spamcop.net
2008-08-13 11:30:27 UTC
Permalink
G'day.
Post by C.M.Brown
However I saw no real argument for not having cyclic inclusions. You
say we shouldn't have to spend time writing hi-boot files, and yet you also think
that GHC should not do it automatically. So we have to restrict all
programmers to never writing cyclic inclusions? :)
GHC generates .hi files for most modules automatically. The only reason
why hi-boot files are needed for cyclic imports is because of the
possibility that you can't generate a .hi file from the module alone. If
you could do that, then you could support cyclic imports without needing
hi-boot files.

Cheers,
Andrew Bromage
Iavor Diatchki
2008-08-13 18:46:44 UTC
Permalink
Hello,

The Haskell'98 report does not specify if/how recursive modules should
work. I wrote a paper a long time ago that formalizes and implements
this feature (http://www.purely-functional.net/yav/publications/modules98.pdf).
I very much doubt that separate compilation is much of a problem in
practise because you only need to compile modules that are _recursive_
at the same time, and usually these tend to be fairly small. Figuring
out the interface of the modules is a bit trickier in some
pathological cases involving module re-exports, but this is already
the case with non-recursive modules.

One real technical problem that I remember was implementing
"defaulting", which is specified in terms of a single module. With
recursive modules, one could get mutually recursive functions from
different modules, in which case it is not clear which set of
defaulting rules to apply or how to combine them.

Hope that this helps,
-Iavor
Post by a***@spamcop.net
G'day.
Post by C.M.Brown
However I saw no real argument for not having cyclic inclusions. You
say we shouldn't have to spend time writing hi-boot files, and yet you
also think
that GHC should not do it automatically. So we have to restrict all
programmers to never writing cyclic inclusions? :)
GHC generates .hi files for most modules automatically. The only reason
why hi-boot files are needed for cyclic imports is because of the
possibility that you can't generate a .hi file from the module alone. If
you could do that, then you could support cyclic imports without needing
hi-boot files.
Cheers,
Andrew Bromage
_______________________________________________
Haskell-Cafe mailing list
http://www.haskell.org/mailman/listinfo/haskell-cafe
C.M.Brown
2008-08-12 17:38:11 UTC
Permalink
Post by a***@spamcop.net
The problem is not mutually recursive modules. Plenty of statically
typed languages support mutually recursive modules.
The problem is that it's impossible in general to say what the
"interface" of a module is by examining the module alone. This is a
very unusual property as real-world programming languages go.
You could fix this by, for example, requiring that all symbols
exported from a module have an explicit type annotation. Or, if you
think that's not lazy enough, it could be an error to use an imported
symbol that doesn't have an explicit type annotation. You could even
formalise .hi-boot files if you were truly desperate.
I don't really see this as being any kind of real issue at all. Surely all
GHC needs to do is to concatenate all the modules together, alpha-reduce
the import/export relations and do a compile/type check over the
concatenated module.
Post by a***@spamcop.net
If the Haskell spec requires that multiple modules be analysed
simultaneously (or multiple times to fixpoint), then that's a bug in
the spec. Separate compilation is too important.
Why is it a bug in the spec exactly? And why is separate compilation so
important?

Chris.
Isaac Dupree
2008-08-12 19:40:09 UTC
Permalink
Post by C.M.Brown
I don't really see this as being any kind of real issue at all. Surely all
GHC needs to do is to concatenate all the modules together, alpha-reduce
the import/export relations and do a compile/type check over the
concatenated module.
FWIW, I agree (in principle -- I haven't looked at the GHC
implementation enough to see whether its internal
representations could easily do type inference involving
symbols in multiple modules, and GHC HQ understandably isn't
interested in implementing it themselves
http://hackage.haskell.org/trac/ghc/ticket/1409 ).

There is a bit of a question in my mind about what to do
with .hi/.o files for the combined blob.

There's also a scalability / memory use issue, that is also
the plug for separate compilation: if you make really big
modules (whether manually or programmatically), compiling
them takes lots of memory and time, significantly more total
time and maximum memory than you need for via separate
compilation. I consider this a bug. (Well, I don't know if
what I said is true -- it's just an excuse that's often
given for why GHC compiling things together is a bad idea.
And there are several compile-time performance bugs in
Trac.) Admittedly, it's a challenging problem to
automatically segment a large module so you only need to do
a little work at once. Essentially, the module system
partly has the purpose of telling the compiler how not to
try to optimize, even if it would be beneficial; and thereby
you get reasonable compilation times. This would be
inexcusable in my mind if the same sort of module
segmentation didn't also help *humans* understand the
contents of the modules :-). I wonder if there can be some
sort of system with pragmas that accomplishes the same
purpose of guiding the compiler... probably not for Haskell
so much as in general



-Isaac
Loading...