[Haskell-cafe] property testing with context

Discussion:

Ben Franksen

2018-10-19 07:19:35 UTC

Hi everyone

it seems to be the season for new variations on the "property testing"
theme, so I would like to chime in... not to announce a new library,
sadly, but with a rough idea how the existing ones could perhaps be
improved, based on practical experience in Darcs.

The problem I have is that there is a tension between

(a) stating a property in a clear and simple way, so its code doubles as
a formal specification

and

(b) writing the property in such a way that when it fails, the reported
value(s) give enough information about the context to be useful for
finding the cause of the problem.

Let me give an example to demonstrate what I mean.

There is a simple law that says if a sequential pair of patches A;B
commutes to B';A' then B';A' commutes back to A;B. In code this looks
(more or less) like this:

prop_recommute :: Commute p => (p :> p) wA wB -> Bool
prop_recommute (x:>y)
| Just (y':>x') <- commute (x:>y) =
case commute (y':>x')of
Just (x'':>y'') -> x==x'' && y==y''
Nothing -> False
| otherwise = True

This is a bit more verbose than the informal spec but still quite readable.

Now suppose this property fails. So quickcheck reports the counter
example pair (X:>Y) for some X and Y. But that isn't too useful in
itself. We'd like to know a bit more:

* did the second commute fail?
* or did it succeed but x/=x'' or y/=y''?
* and in the latter case, which of the two?

So in practice our property code looks more like this:

prop_recommute :: (ShowPatch p, Commute p) => (p :> p) wA wB -> Bool
prop_recommute (x :> y)
| Just (y' :> x') <- commute (x :> y) =
case commute (y' :> x') of
Nothing ->
failed (redText "failed, where x" $$ displayPatch x $$
redText ":> y" $$ displayPatch y $$
redText "y'" $$ displayPatch y' $$
redText ":> x'" $$ displayPatch x')
Just (x'' :> y'') ->
if y'' /= y
then
failed (redText "y'' =/\\= y failed, where x" $$
displayPatch x $$
redText ":> y" $$ displayPatch y $$
redText "y'" $$ displayPatch y' $$
redText ":> x'" $$ displayPatch x' $$
redText "x''" $$ displayPatch x'' $$
redText ":> y''" $$ displayPatch y'')
else
if x'' /= x
then
failed (redText "x'' /= x, where x" $$
displayPatch x $$
redText ":> y" $$ displayPatch y $$
redText "y'" $$ displayPatch y' $$
redText ":> x'" $$ displayPatch x' $$
redText "x''" $$ displayPatch x'' $$
redText ":> y''" $$ displayPatch y'')
else True
| otherwise = True

Now this code tells us exactly what went wrong when the property fails
but it is ugly and repetitive and the additional code obscures the
simple logical content. So this is no longer quite as suitable for a
(human readable) formal spec.

I wonder if displaying

(1) all relevant contextual variables and
(2) "where in the code it fails"

could be automated away, somehow. I guess this is not trivial and may
require syntactic analysis, so perhaps expecting a /library/ to solve
the problem is unrealistic, except perhaps by using Template Haskell.
I'd also go with a separate tool that extracts properties from a module
and enriches them with code that displays the additional information.

Tackling this problem might be an interesting theme for a master
thesis... ;-)

Cheers
Ben

Olaf Klinke

2018-10-19 20:53:38 UTC

Permalink

Ben,

the solution might be simple: Don't express the property as an atomic function (a -> Bool).
This reminds me a bit of the Cutty-Howard correspondence, only that proofs are replaced by counterexamples. A testing library should have an operator for each of the logical connectives, where

- a counterexample to True is never found
- a counterexample to False is ()
- a counterexample to (not x) is an example for x
- a counterexample to (x && y) is either a counterexample to x or a counterexample to y, whichever is found first
- a counterexample to (x || y) is a pair of counterexamples to both x and y
- a counterexample to (x ==> y) is an example for x together with a counterexample for y

Then, if you build your property out of smaller properties and the connectives above, the counterexample-finder would be able to inform you which part of the property failed. I'm pretty sure at least one of the testing libraries has a system like this.

Cheers,
Olaf

Ben Franksen

2018-10-24 07:44:48 UTC

Permalink

Post by Olaf Klinke
the solution might be simple: Don't express the property as an atomic
function (a -> Bool). This reminds me a bit of the Cutty-Howard
correspondence, only that proofs are replaced by counterexamples. A
testing library should have an operator for each of the logical
connectives, where
- a counterexample to True is never found
- a counterexample to False is ()
- a counterexample to (not x) is an example for x
- a counterexample to (x && y) is either a counterexample to x or a counterexample to y, whichever is found first
- a counterexample to (x || y) is a pair of counterexamples to both x and y
- a counterexample to (x ==> y) is an example for x together with a counterexample for y
Then, if you build your property out of smaller properties and the connectives above, the counterexample-finder would be able to inform you which part of the property failed. I'm pretty sure at least one of the testing libraries has a system like this.

Hi Olaf

while this approach works in principle it has practical disadvantages.

One charming feature of property testing in Haskell is that expressing
properties does normally /not/ require to import the testing library.
This allows me to write my properties in a generic style that works with
e.g. quickcheck, leancheck, smallcheck etc. With your approach I would
be tied to a single property testing library.

I guess this would be less of a problem if in Haskell the boolean
operators were overloaded i.e. part of a class. I think this was an
oversight when the Prelude and the standard libraries were designed. IME
boolean algebras (or perhaps more generally lattices) appear
/everywhere/. I can't count the number of times I wanted to use these
operators on structures containing Bools or functions returning Bool and
couldn't. IMHO such a generalization would be a lot more useful and less
controversial than the one from List to Foldable.

(I can't remember a single instance where I missed this generalization
of lists. In fact I think is completely useless. Foldable essentially
allows me to say: "give me all your elements, one by one, in /your/
preferred order, so I can combine them to something new". Due to the
laziness of lists this is really the same as having a toList function
(modulo list fusion to take care of the constant factor overhead). Only
that a container type may offer many different versions of toList but
only a single Foldable instance. I understand that foldr and foldl have
been added to the class to allow at least two different "preferred"
traversal orders, but adding methods to a class in this way obviously
doesn't scale and is poor design, again IMHO.)

Cheers
Ben

Oliver Charles

2018-10-19 21:25:45 UTC

Permalink

Perhaps the work in my "assert-explainer" project is relevant here -
https://github.com/ocharles/assert-explainer.

The idea is your tests should just be writing "assert
someArbitraryExpressionEvaluatingToBool", and then having some
compiler magic recover the context for you when it goes wrong. To cite
my own README:

You write:

assert (length xs == 4)

No need for lots of special assertEqual etc functions.

When the assertion fails, you will get much more context:

✘ Assertion failed!
length xs == 4 /= True (at Test.hs:18:12-25)

I found the following sub-expressions:
- length xs = 3
- xs = [1,2,3]

This is done via a GHC source plugin that rewrites "assert (length xs
== 4)" into an expression that - if False - prints much more
information.

It's very early days for this plugin, but the goal is to reach parity
with https://docs.pytest.org/en/latest/example/reportingdemo.html#tbreportdemo.

Hope this helps,
Ollie

Post by Ben Franksen
Hi everyone
it seems to be the season for new variations on the "property testing"
theme, so I would like to chime in... not to announce a new library,
sadly, but with a rough idea how the existing ones could perhaps be
improved, based on practical experience in Darcs.
The problem I have is that there is a tension between
(a) stating a property in a clear and simple way, so its code doubles as
a formal specification
and
(b) writing the property in such a way that when it fails, the reported
value(s) give enough information about the context to be useful for
finding the cause of the problem.
Let me give an example to demonstrate what I mean.
There is a simple law that says if a sequential pair of patches A;B
commutes to B';A' then B';A' commutes back to A;B. In code this looks
prop_recommute :: Commute p => (p :> p) wA wB -> Bool
prop_recommute (x:>y)
| Just (y':>x') <- commute (x:>y) =
case commute (y':>x')of
Just (x'':>y'') -> x==x'' && y==y''
Nothing -> False
| otherwise = True
This is a bit more verbose than the informal spec but still quite readable.
Now suppose this property fails. So quickcheck reports the counter
example pair (X:>Y) for some X and Y. But that isn't too useful in
* did the second commute fail?
* or did it succeed but x/=x'' or y/=y''?
* and in the latter case, which of the two?
prop_recommute :: (ShowPatch p, Commute p) => (p :> p) wA wB -> Bool
prop_recommute (x :> y)
| Just (y' :> x') <- commute (x :> y) =
case commute (y' :> x') of
Nothing ->
failed (redText "failed, where x" $$ displayPatch x $$
redText ":> y" $$ displayPatch y $$
redText "y'" $$ displayPatch y' $$
redText ":> x'" $$ displayPatch x')
Just (x'' :> y'') ->
if y'' /= y
then
failed (redText "y'' =/\\= y failed, where x" $$
displayPatch x $$
redText ":> y" $$ displayPatch y $$
redText "y'" $$ displayPatch y' $$
redText ":> x'" $$ displayPatch x' $$
redText "x''" $$ displayPatch x'' $$
redText ":> y''" $$ displayPatch y'')
else
if x'' /= x
then
failed (redText "x'' /= x, where x" $$
displayPatch x $$
redText ":> y" $$ displayPatch y $$
redText "y'" $$ displayPatch y' $$
redText ":> x'" $$ displayPatch x' $$
redText "x''" $$ displayPatch x'' $$
redText ":> y''" $$ displayPatch y'')
else True
| otherwise = True
Now this code tells us exactly what went wrong when the property fails
but it is ugly and repetitive and the additional code obscures the
simple logical content. So this is no longer quite as suitable for a
(human readable) formal spec.
I wonder if displaying
(1) all relevant contextual variables and
(2) "where in the code it fails"
could be automated away, somehow. I guess this is not trivial and may
require syntactic analysis, so perhaps expecting a /library/ to solve
the problem is unrealistic, except perhaps by using Template Haskell.
I'd also go with a separate tool that extracts properties from a module
and enriches them with code that displays the additional information.
Tackling this problem might be an interesting theme for a master
thesis... ;-)
Cheers
Ben
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Ben Franksen

2018-10-22 09:22:53 UTC

Permalink

Hi Ollie

This is fantastic! It's certainly much more than I hoped for.

I haven't looked into the details of your plugin yet, but from your
description it looks as if this is currently geared towards "classical"
unit tests. Perhaps this is not a problem: I think quickcheck & co
report exceptions as a failed property, and if they don't we'll just
have to catch the exception, print it and return False.

I will definitely try it out and report back.

Cheers
Ben

Post by Oliver Charles
Perhaps the work in my "assert-explainer" project is relevant here -
https://github.com/ocharles/assert-explainer.
The idea is your tests should just be writing "assert
someArbitraryExpressionEvaluatingToBool", and then having some
compiler magic recover the context for you when it goes wrong. To cite
assert (length xs == 4)
No need for lots of special assertEqual etc functions.
✘ Assertion failed!
length xs == 4 /= True (at Test.hs:18:12-25)
- length xs = 3
- xs = [1,2,3]
This is done via a GHC source plugin that rewrites "assert (length xs
== 4)" into an expression that - if False - prints much more
information.
It's very early days for this plugin, but the goal is to reach parity
with https://docs.pytest.org/en/latest/example/reportingdemo.html#tbreportdemo.
Hope this helps,
Ollie

_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Oliver Charles

2018-10-22 21:42:48 UTC

Permalink

Post by Ben Franksen
Hi Ollie
This is fantastic! It's certainly much more than I hoped for.
I haven't looked into the details of your plugin yet, but from your
description it looks as if this is currently geared towards "classical"
unit tests. Perhaps this is not a problem: I think quickcheck & co
report exceptions as a failed property, and if they don't we'll just
have to catch the exception, print it and return False.

Yea, everything is very proof-of-concept at the moment. Right now, we
have assert :: Bool -> IO (), but that could be changed to have a
richer return type (and likely will as this project progresses).

Post by Ben Franksen
I will definitely try it out and report back.

Super, please feel free to open issues/questions on GitHub!

a***@gmail.com

2018-10-22 09:59:24 UTC

Permalink

I think that this is possible by simply using QuickCheck's === and ==>
(if you have Show and Eq instances for :>):

prop_recommute :: Commute p => (p :> p) wA wB -> Bool
prop_recommute (x:>y)
= isJust commuted ==> commute commuted === Just (x:>y)
where
commuted = commute (x:>y)

See
https://hackage.haskell.org/package/QuickCheck-2.11.3/docs/Test-QuickCheck-Property.html
for information on ==> and ===.

This is more readable and quite similar to your example above. It would
print both left and right side of === when a counter-example is found.

Depending on your implementation of Show for :>, it could look
like: Nothing /= Just (...A... :> ...B...) or Just (...A... :> ...B...)
/= Just (...C... :> ...D...).

I did not try this myself, so I could have made a mistake or I may have
missed why this is not good enough for your case.

Post by Ben Franksen
prop_recommute :: (ShowPatch p, Commute p) => (p :> p) wA wB -> Bool
prop_recommute (x :> y)
| Just (y' :> x') <- commute (x :> y) =
case commute (y' :> x') of
Nothing ->
failed (redText "failed, where x" $$ displayPatch x $$
redText ":> y" $$ displayPatch y $$
redText "y'" $$ displayPatch y' $$
redText ":> x'" $$ displayPatch x')
Just (x'' :> y'') ->
if y'' /= y
then
failed (redText "y'' =/\\= y failed, where x" $$
displayPatch x $$
redText ":> y" $$ displayPatch y $$
redText "y'" $$ displayPatch y' $$
redText ":> x'" $$ displayPatch x' $$
redText "x''" $$ displayPatch x'' $$
redText ":> y''" $$ displayPatch y'')
else
if x'' /= x
then
failed (redText "x'' /= x, where x" $$
displayPatch x $$
redText ":> y" $$ displayPatch y $$
redText "y'" $$ displayPatch y' $$
redText ":> x'" $$ displayPatch x' $$
redText "x''" $$ displayPatch x'' $$
redText ":> y''" $$ displayPatch y'')
else True
| otherwise = True
Now this code tells us exactly what went wrong when the property fails
but it is ugly and repetitive and the additional code obscures the
simple logical content. So this is no longer quite as suitable for a
(human readable) formal spec.
I wonder if displaying
(1) all relevant contextual variables and
(2) "where in the code it fails"
could be automated away, somehow. I guess this is not trivial and may
require syntactic analysis, so perhaps expecting a /library/ to solve
the problem is unrealistic, except perhaps by using Template Haskell.
I'd also go with a separate tool that extracts properties from a module
and enriches them with code that displays the additional information.
Tackling this problem might be an interesting theme for a master
thesis... ;-)
Cheers
Ben
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Ben Franksen

2018-10-24 06:31:45 UTC

Permalink

Post by a***@gmail.com
I think that this is possible by simply using QuickCheck's === and ==>
prop_recommute :: Commute p => (p :> p) wA wB -> Bool
prop_recommute (x:>y)
= isJust commuted ==> commute commuted === Just (x:>y)
where
commuted = commute (x:>y)
See
https://hackage.haskell.org/package/QuickCheck-2.11.3/docs/Test-QuickCheck-Property.html
for information on ==> and ===.
This is more readable and quite similar to your example above. It would
print both left and right side of === when a counter-example is found.

Thanks for the hint wrt (===) which I wasn't aware of.

In the example I posted this would work, but not in more complex ones.
There are properties that get more patches as input and require many
combinations of e.g. commute to succeed. Any one of them can fail and
the rest can only be performed if the previous ones have succeeded.

Cheers
Ben

a***@gmail.com

2018-10-24 13:44:23 UTC

Permalink

Agreed, I think QuickCheck is already up to the task. The simpler prop_recommute could be rewritten with (===), (.&&.), and maybe "counterexample" to label which branch failed.

Tom

Post by a***@gmail.com

I think that this is possible by simply using QuickCheck's === and ==>
prop_recommute :: Commute p => (p :> p) wA wB -> Bool
prop_recommute (x:>y)
= isJust commuted ==> commute commuted === Just (x:>y)
where
commuted = commute (x:>y)
See
https://hackage.haskell.org/package/QuickCheck-2.11.3/docs/Test-QuickCheck-Property.html
for information on ==> and ===.
This is more readable and quite similar to your example above. It would
print both left and right side of === when a counter-example is found.
Depending on your implementation of Show for :>, it could look
like: Nothing /= Just (...A... :> ...B...) or Just (...A... :> ...B...)
/= Just (...C... :> ...D...).
I did not try this myself, so I could have made a mistake or I may have
missed why this is not good enough for your case.

Oliver Charles

2018-10-24 20:51:31 UTC

Permalink

I agree with Ben's point though that you have to "buy in" to the testing
framework and write tests in a DSL, though. This seems unfortunate.

Post by a***@gmail.com
Agreed, I think QuickCheck is already up to the task. The simpler
prop_recommute could be rewritten with (===), (.&&.), and maybe
"counterexample" to label which branch failed.
Tom

Post by a***@gmail.com

https://hackage.haskell.org/package/QuickCheck-2.11.3/docs/Test-QuickCheck-Property.html

Post by a***@gmail.com
for information on ==> and ===.
This is more readable and quite similar to your example above. It would
print both left and right side of === when a counter-example is found.
Depending on your implementation of Show for :>, it could look
like: Nothing /= Just (...A... :> ...B...) or Just (...A... :> ...B...)
/= Just (...C... :> ...D...).
I did not try this myself, so I could have made a mistake or I may have
missed why this is not good enough for your case.