Discussion:
[Haskell-cafe] semantics of concurrent program depends on -O level, -f[no-]omit-yields
Johannes Waldmann
2018-11-29 18:42:42 UTC
Permalink
Dear Cafe,

I am surprised by the behaviour of the program below
(the interesting property is whether it will output "foo").

Behaviours (plural) actually: it seems to depend
on optimisation level, on omit-yields,
and on very small changes in the source code:

It does print (mostly), when compiled with -O0.
It does not, when compiled with -O2.
With -O2 -fno-omit-yields, it will print.
With -O0 -fno-omit-yields, and when I remove the two newTVar
in the beginning, it will mostly not print.

How come?

These differences already occur with the
last two lines replaced by "forever $ return ()",
so the STM stuff may be inessential here.
But that's the context where I came across the problem.

- J.W.


import Control.Concurrent.STM
import Control.Concurrent ( forkIO )
import Control.Monad ( forever )
import System.IO

main = do

atomically $ newTVar "bar"
atomically $ newTVar False

forkIO $ putStrLn "foo"

x <- atomically $ newTVar False
forever $ atomically $ writeTVar x True
Brandon Allbery
2018-11-29 18:48:21 UTC
Permalink
This is undoubtedly nothing more than timing issues. Remember that the main
thread exiting will kill the entire process, automatically killing all
other threads as side effect. So the question is how much the thread
manages to get done before that happens.

If you disable output buffering, you may find that "f" or "fo" sometimes
gets written before process exit.

On Thu, Nov 29, 2018 at 1:43 PM Johannes Waldmann <
Post by Johannes Waldmann
Dear Cafe,
I am surprised by the behaviour of the program below
(the interesting property is whether it will output "foo").
Behaviours (plural) actually: it seems to depend
on optimisation level, on omit-yields,
It does print (mostly), when compiled with -O0.
It does not, when compiled with -O2.
With -O2 -fno-omit-yields, it will print.
With -O0 -fno-omit-yields, and when I remove the two newTVar
in the beginning, it will mostly not print.
How come?
These differences already occur with the
last two lines replaced by "forever $ return ()",
so the STM stuff may be inessential here.
But that's the context where I came across the problem.
- J.W.
import Control.Concurrent.STM
import Control.Concurrent ( forkIO )
import Control.Monad ( forever )
import System.IO
main = do
atomically $ newTVar "bar"
atomically $ newTVar False
forkIO $ putStrLn "foo"
x <- atomically $ newTVar False
forever $ atomically $ writeTVar x True
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
--
brandon s allbery kf8nh
***@gmail.com
Johannes Waldmann
2018-11-29 18:51:19 UTC
Permalink
main thread exiting will kill the entire process,
The main thread does "forever $ something".

The process does not exit (I am not getting
the console prompt). I observe that the process
either prints and then hangs, or it hangs immediately.

- J.
Bryan Richter
2018-11-29 19:25:07 UTC
Permalink
On Thu, Nov 29, 2018, 20:51 Johannes Waldmann <
Post by Johannes Waldmann
main thread exiting will kill the entire process,
The main thread does "forever $ something".
The process does not exit (I am not getting
the console prompt). I observe that the process
either prints and then hangs, or it hangs immediately.
Printing to the console still isn't a great test to see if something has
"run", because of buffering that seems to behave unintuitively in these
situations. Maybe try flushing stdout within the forked thread, to ensure
the runtime is doing what you think it's doing?
Brandon Allbery
2018-11-29 19:28:14 UTC
Permalink
I also note that the "forever" is in fact writing to a TVar. I'd be curious
as to whether it's retrying for some reason, possibly related to the "lost"
TVars confusing the STM machinery. I seem to recall it has some
infelicities currently; and I have no idea how (or if) STM retries interact
with thread yielding.
Post by Bryan Richter
On Thu, Nov 29, 2018, 20:51 Johannes Waldmann <
Post by Johannes Waldmann
main thread exiting will kill the entire process,
The main thread does "forever $ something".
The process does not exit (I am not getting
the console prompt). I observe that the process
either prints and then hangs, or it hangs immediately.
Printing to the console still isn't a great test to see if something has
"run", because of buffering that seems to behave unintuitively in these
situations. Maybe try flushing stdout within the forked thread, to ensure
the runtime is doing what you think it's doing?
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
--
brandon s allbery kf8nh
***@gmail.com
Johannes Waldmann
2018-11-29 19:37:35 UTC
Permalink
... try flushing stdout within the forked thread,
I did. The behaviour is still as described:
depends on -O0/2, [no]omit-yield,
and small changes in the source.

While I agree with the general point -
why would I need to hFlush after putStrLn?
hGetBuffering stdout tells me it's LineBuffering,
and putStrLn does write a line?

- J.
Brandon Allbery
2018-11-29 19:40:07 UTC
Permalink
The idea is that putStrLn iterates putChar over the String, then putChar
'\n'; so thread scheduling would be more obvious with individual characters
being output instead of a single flush triggered by the final putChar.

On Thu, Nov 29, 2018 at 2:37 PM Johannes Waldmann <
Post by Johannes Waldmann
... try flushing stdout within the forked thread,
depends on -O0/2, [no]omit-yield,
and small changes in the source.
While I agree with the general point -
why would I need to hFlush after putStrLn?
hGetBuffering stdout tells me it's LineBuffering,
and putStrLn does write a line?
- J.
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
--
brandon s allbery kf8nh
***@gmail.com
Johannes Waldmann
2018-11-29 19:49:03 UTC
Permalink
Post by Brandon Allbery
so thread scheduling would be more obvious with individual
characters being output instead of a single flush triggered by the final
putChar.
Yes but in my example program, there is no contention for stdout,
as only one thread is using it.

I am inclined to enter this into the GHC issue tracker
as it seems there's no obvious explanation,
and "lost TVars confusing the STM machinery" was mentioned.
Do you mean that this a known thing? Searching the tracker
for "lost TVar" does not turn up anything.

- J.W.
Brandon Allbery
2018-11-29 19:53:36 UTC
Permalink
What does this have to do with contention for stdout? Thread switching is
unrelated; seeing individual output operations just gives more hints about
when the thread switches happen. And with -fno-omit-yields it presumably
can happen when putChar is evaluated, not because of I/O but because of
function entry.

On Thu, Nov 29, 2018 at 2:49 PM Johannes Waldmann <
Post by Johannes Waldmann
Post by Brandon Allbery
so thread scheduling would be more obvious with individual
characters being output instead of a single flush triggered by the final
putChar.
Yes but in my example program, there is no contention for stdout,
as only one thread is using it.
I am inclined to enter this into the GHC issue tracker
as it seems there's no obvious explanation,
and "lost TVars confusing the STM machinery" was mentioned.
Do you mean that this a known thing? Searching the tracker
for "lost TVar" does not turn up anything.
- J.W.
--
brandon s allbery kf8nh
***@gmail.com
Sven Panne
2018-11-29 20:28:24 UTC
Permalink
Am Do., 29. Nov. 2018 um 19:43 Uhr schrieb Johannes Waldmann <
Post by Johannes Waldmann
I am surprised by the behaviour of the program below
(the interesting property is whether it will output "foo").
Behaviours (plural) actually: it seems to depend
on optimisation level, on omit-yields,
and on very small changes in the source code: [...]
IMHO there is nothing very surprising here: You have 2 threads with no
synchronization between them whatsoever, so you get what you deserve:
Undefined behavior. :-) This is the behavior you get in basically all
programming languages/execution environments I know of, *unless* they make
a very strong guarantee about their scheduling behavior (whichis very rare,
for good reasons). Do we have such a guarantee somewhere in the GHC/base
documentation? I don't think so, but if we had, I would be interested to
see a reference to that.

Cheers,
S.
Johannes Waldmann
2018-11-29 20:54:15 UTC
Permalink
Post by Sven Panne
IMHO there is nothing very surprising here: You have 2 threads with no
Undefined behavior. :-)
Well, yes.

It feels as if the scheduler is mighty unfair here
(delaying the printing indefinitely)
but apparently it is allowed to do so - mainly since there is
no specification that would require otherwise.

But then (seconding your question) what guarantees *do* we have?
For a single-threaded program, it would certainly not be OK
to execute "main = print ()" as "block immediately"?
But when we forkIO this, then it can happen?

Possibly related: discussion about (state of formal
specification of) GHC RTS memory model at
https://mail.haskell.org/pipermail/ghc-devs/2018-November/016583.html

- J.W.
Ian Denhardt
2018-11-29 20:52:06 UTC
Permalink
Quoting Johannes Waldmann (2018-11-29 13:42:42)
Post by Johannes Waldmann
These differences already occur with the
last two lines replaced by "forever $ return ()",
so the STM stuff may be inessential here.
But that's the context where I came across the problem.
There's another thread right now with a subject line of "Timing out a
pure evaluation of an expression I did not write myself," that seems
like it might be related: I would expect forever $ return () to not
allocate, which would mean it would never hit any yields, and thus never
be rescheduled, and hogging the CPU.

I've been able to reproduce your results, and if I change the last line
to:

forever $ do
yield
atomically $ writeTVar x True

..it always prints -- so the culprit is definitely a failure to yield.

-Ian
Johannes Waldmann
2018-11-29 21:45:09 UTC
Permalink
Post by Ian Denhardt
forever $ do
yield
atomically $ writeTVar x True
..it always prints -- so the culprit is definitely a failure to yield.
A-ha.

So my implicit assumption was
that a run of the transaction manager (because "atomically")
is also a yield - but this example shows that it isn't.

If this is indeed the case, then this deserves to be mentioned
in the documentation of Control.Concurrent.STM ?

- J.W.
Sven Panne
2018-11-30 10:05:20 UTC
Permalink
[...] I've been able to reproduce your results, and if I change the last
line
forever $ do
yield
atomically $ writeTVar x True
..it always prints -- so the culprit is definitely a failure to yield.
But even that is not enough from a specification POV: After the yield, the
same thread might be schedule immediately again, and again, ... Or do we
have some specification of the scheduler? I don't think so, but perhaps I'm
wrong in this respect. If we have one, it has to state explicitly that the
scheduling is fair in the sense that every runnable thread actually runs
after a finite amount of time, otherwise you are in undefined land again...

The question where scheduling can actually happen is a totally different
issue, and I don't know of a specification here, either. In GHC, this seems
to be tied to allocations, but this is a bit brittle and unintuitive. To
guarantee that you hit a scheduling point after a finite amount of time is
easy in principle, e.g. do this on every backwards branch and on every
function entry. But this has an associated cost, so we have a tradeoff here.

In general, I wouldn't worry too much about the semantics of unsynchronized
threads, if you rely on this somehow, you will sooner or later enter a
world of pain. Add e.g. thread priorities to the mix, and you will suffer
even more, experiencing wonderful things like priority inversion etc. :-P
Johannes Waldmann
2018-11-30 10:50:03 UTC
Permalink
Hi,
the same thread might be schedule immediately again, and again, ... Or
do we have some specification of the scheduler?
My working assumption is that the scheduler tries to be fair.
So all strange behaviour could be explained with the scheduler
not running at all, because threads weren't yielding.
The question where scheduling can actually happen is a totally different
issue, and I don't know of a specification here, either. In GHC, this
seems to be tied to allocations, but this is a bit brittle and
unintuitive.
Yes, especially if the compiler might (re)move allocations
due to some code transformations.

Given that, it now feels strange that the following *does* work:

main = do
forkIO $ do threadDelay 1000000 ; putStrLn "foo"
forever $ putStr ""

I am seeing the "foo" output. I expect the last line
to be non-allocating. But it does still yield? Why?

- J.W.
Ian Denhardt
2018-11-30 18:33:22 UTC
Permalink
Quoting Johannes Waldmann (2018-11-30 05:50:03)
Post by Johannes Waldmann
main = do
forkIO $ do threadDelay 1000000 ; putStrLn "foo"
forever $ putStr ""
I am seeing the "foo" output. I expect the last line
to be non-allocating. But it does still yield? Why?
putStr has to acquire a lock on stdout, so that's probably enough to
allow the scheduler to run.
Brandon Allbery
2018-11-30 19:06:01 UTC
Permalink
Hm, has that been optimized to output all at once? The implementation I
recall is more or less mapM_ putChar, deferring the lock to putChar which
never gets invoked because the list is empty.

Okay, just checked; it reserves the handle up front, and then the above
implementation (albeit directly instead of via mapM_) is used only in the
NoBuffering case, using an internal function that doesn't reserve. Which
will complicate understanding what's going on, although my suggestion
earlier about unbuffering output still applies.
Post by Ian Denhardt
Quoting Johannes Waldmann (2018-11-30 05:50:03)
Post by Johannes Waldmann
main = do
forkIO $ do threadDelay 1000000 ; putStrLn "foo"
forever $ putStr ""
I am seeing the "foo" output. I expect the last line
to be non-allocating. But it does still yield? Why?
putStr has to acquire a lock on stdout, so that's probably enough to
allow the scheduler to run.
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
--
brandon s allbery kf8nh
***@gmail.com
Loading...