[Haskell-cafe] Faster set intersections?

Discussion:

Jeffrey Brown

2018-12-09 00:45:49 UTC

S.intersection (S.fromList [1,2]) (S.fromList [1..])

fromList ^CInterrupted.

S.intersection (S.fromList [1..]) (S.fromList [1,2])

fromList ^CInterrupted.
Is there a smarter way to take the intersection of sets when at least one
of them is small (but I don't know which one that is)?

--
Jeff Brown | Jeffrey Benjamin Brown
Website <https://msu.edu/~brown202/> | Facebook
<https://www.facebook.com/mejeff.younotjeff> | LinkedIn
<https://www.linkedin.com/in/jeffreybenjaminbrown>(spammy, so I often miss
messages here) | Github <https://github.com/jeffreybenjaminbrown>

Olaf Klinke

2018-12-09 19:30:47 UTC

Permalink

That does seem to be what the source code says. The code only handles non-empty sets, but I *believe* the Maybe-wrapped version can handle intersections, albeit inefficiently. Can a version that looks like (a -> Bool) -> Maybe a work?

I think so. The empty set is then represented by the function that returns Nothing even on (const True).
Indeed, with Maybe on the outside the intersection function requires the Eq constraint on the base type: We have
member :: Eq a => a -> ((a -> Bool) -> a) -> Bool
member x find = x == (find (x==))

Then the intersection of find and find' can be implemented using
\p -> find (\x -> p x && member x find')

But I fail to see how having Maybe on the inside remedies this situation. Furthermore, on Eq types these sets are not so interesting, for the following reason.

One can write a function
Eq a => ((a -> Bool) -> a) -> [a]
that enumerates the elements of the set. Because we have universal quantification, this list can not be infinite. Which makes sense, topologically: These so-called searchable sets are topologically compact, and the Eq constraint means the space is discrete. Compact subsets of a discrete space are finite.

Olaf

My guess — by finding a member that satisfies a predicate, if it's at all possible, and any member if the predicate is const False. It's actually pretty awesome.

Naïvely, a set implemented as a predicate determining membership?
I don't understand, how does
(a -> Bool) -> a
model a set?
Thanks
Siddharth

Note that a concrete set "concretizes" anything it touches. Don't take
unions of these sets, though, it'll just be a mess.
Won't a union just be the same as intersection but using || instead of && ?
-Jan-Willem Maessen

union (Pred p) (Concrete s) = Pred (\k -> p k || member k s)
What you can not do, of course, is enumerate and fold these sets.
Set a = Maybe ((a -> Bool) -> a)
It has unions, intersections and a Monad instance and can represent infinite sets. If the base type has an Ord instance, the set can be enumerated. If the base type has an Eq instance, so has (Set a). Some functions usually implemented using Foldable are also possible, e.g. minimum and maximum.
Caveat: Performance can be poor, depending on how the function inside the set was defined.
Cheers,
Olaf
[1] http://hackage.haskell.org/package/infinite-search
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
--
Sending this from my phone, please excuse any typos!
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
--
brandon s allbery kf8nh
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

David Feuer

2018-12-09 20:56:12 UTC

Permalink

I don't understand how that's finite and not just countable.

Ah, I guess because there's no way to look at all elements of an infinite
set. D'oh.

Siddharth Bhat

2018-12-10 07:11:14 UTC

Permalink

According to this construction, enumeration will need some SAT enumeration
like process, where you search with a new predicate barring all previous
candidates?

S.intersection (S.fromList [1,2]) (S.fromList [1..])
S.intersection (S.fromList [1..]) (S.fromList [1,2])
Is there a smarter way to take the intersection of sets when at least one
of them is small (but I don't know which one that is)?

"Small" is not enough. If all you know is that the lists
represent sets of integers, then
S.intersection (S.fromList [-1,-2]) (S.fromList [1..])
must diverge. A sufficient condition for success in such examples
is that the sets come from a well-ordered domain and are presented
in order. Then it's easy to write a merge-like algorithm.
Both sets can be infinite; every finite prefix of the intersection
will eventually appear.
(Note that the integers are not well-ordered, though the positive
integers are.)
Doug
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

--
Sending this from my phone, please excuse any typos!

Brandon Allbery

2018-12-09 00:51:13 UTC

Permalink

I suspect not. Maybe with fromAscList or fromDistinctAscList so it can know
the list won't pop a smaller (here, duplicate) value on it, if it's lazy
enough (I haven't checked). Lists don't record that they were generated
from enumerations.

Post by Jeffrey Brown

S.intersection (S.fromList [1,2]) (S.fromList [1..])

fromList ^CInterrupted.

S.intersection (S.fromList [1..]) (S.fromList [1,2])

fromList ^CInterrupted.
Is there a smarter way to take the intersection of sets when at least one
of them is small (but I don't know which one that is)?
--
Jeff Brown | Jeffrey Benjamin Brown
Website <https://msu.edu/~brown202/> | Facebook
<https://www.facebook.com/mejeff.younotjeff> | LinkedIn
<https://www.linkedin.com/in/jeffreybenjaminbrown>(spammy, so I often
miss messages here) | Github <https://github.com/jeffreybenjaminbrown>
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

--
brandon s allbery kf8nh
***@gmail.com

Olaf Klinke

2018-12-09 18:48:46 UTC

Permalink

I don't understand, how does
(a -> Bool) -> a
model a set?

MigMit was right. When I learned about this, I thought is was black magic. Suppose
find :: (a -> Bool) -> a
This function 'find' models a hypothetical non-empty set S. The specification is that for every predicate
p :: a -> Bool
that terminates on every element of S (this condition is important!), find p will be a member of S that satisfies p, if there is such a member, and any member of S otherwise. Since find is specified to always return a member of S, the set S is guaranteed to be non-empty. You can select some element from S by calling 'find' on (const True).

For example, the singleton x is defined as the constant function
\p -> x
The doubleton {x,y} is defined as
\p -> if p x then x else y
Binary union is defined as
union find find' = \p -> if p (find p) then find p else find' p
Existential quantification over S:
exists p = p (find p)
Universal quantification over S:
forall p = not (exists (not.p))

In order to represent the empty set as well, I stuck in the Maybe, so that Nothing represents the empty set and (Just find) represents a non-empty set.

Olaf

Thanks
Siddharth

2018-12-09 18:49:37 UTC

Permalink

Bloom Filters?
If some false positives are OK

--
Sent from an expensive device which will be obsolete in a few months
Casey

Post by Jeffrey Brown

S.intersection (S.fromList [1,2]) (S.fromList [1..])

fromList ^CInterrupted.

S.intersection (S.fromList [1..]) (S.fromList [1,2])

Ivan Lazar Miljenovic

2018-12-09 08:21:47 UTC

Permalink

On Dec 9, 2018 11:36 AM, "Jan-Willem Maessen" <***@alum.mit.edu> wrote:

Note that a concrete set "concretizes" anything it touches. Don't take
unions of these sets, though, it'll just be a mess.

Won't a union just be the same as intersection but using || instead of && ?

-Jan-Willem Maessen

Post by Jeffrey Brown
--
Jeff Brown | Jeffrey Benjamin Brown
Website <https://msu.edu/~brown202/> | Facebook
<https://www.facebook.com/mejeff.younotjeff> | LinkedIn
<https://www.linkedin.com/in/jeffreybenjaminbrown>(spammy, so I often
miss messages here) | Github <https://github.com/jeffreybenjaminbrown>
_______________________________________________
Haskell-Cafe mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

David Feuer

2018-12-09 02:36:32 UTC

Permalink

Not like that, no. The Set type is explicitly for *finite* sets only.
fromList [1..] is bottom and will run out of memory. You'd need a *very*
different implementation to be able to support infinite sets at all, and
even then you'd only catch certain special cases.

Post by Jeffrey Brown

S.intersection (S.fromList [1,2]) (S.fromList [1..])

fromList ^CInterrupted.

S.intersection (S.fromList [1..]) (S.fromList [1,2])