Final Encodings, Part 1: A Quick Demonstration

Jacques Carette, Oleg Kiselyov and Chung-chieh Shan’s paper Finally Tagless, Partially Evaluated describes a technique for encoding embedded DSLs. They advertise a number of nice properties, most of which I won’t get into for now. I think the greatest feature, once you are used to this style of thinking, is how easy it is to reason about.

We’re going to implement a library for matching strings against some glob patterns. Ultimately, we would like to do stuff like this:

GHCi> let pat = compile "foo?bar*"
GHCi> match pat "foo bar"
True
GHCi> match pat "foo!bar blah blah blah"
True
GHCi> match pat "poo bar"
False
GHCi> match pat "foo  bar"
False

If you ignore all matters of implementation, a pattern is essentially a function from Strings to Bools, and match is our means of observing it. This is our semantics. The finally tagless approach is essentially to literally use the semantic interpretation as our representation:

> type Pattern = String -> Bool
>
> match :: Pattern -> String -> Bool
> match = id

Let’s define some combinators. The main three features we want our patterns to support are literal characters, any one character (typically ? in glob pattern syntax), and any number of any characters (typically * in glob pattern syntax):

> char :: Char -> Pattern
> char c = ([c] ==)
>
> anyChar :: Pattern
> anyChar [_] = True
> anyChar _   = False
>
> whatever :: Pattern
> whatever = const True

We need a way to combine them, so we’ll define a function that takes two patterns and creates just one from them. A string matches a glob pattern if there is any way to assign substrings to the components of the pattern such that it matches, as long as we don’t reorder them, so to combine two patterns, we try every way of splitting the string:

> append :: Pattern -> Pattern -> Pattern
> append f g str =
>   any (\i -> let (a, b) = splitAt i str
>              in f a && g b)
>   [0 .. length str]

Finally, the empty pattern only matches empty Strings:

> empty :: Pattern
> empty = null

We already have a working glob pattern DSL!

GHCi> :{
| let pat = char 'f' `append` char 'o' `append` char 'o' `append`
|           anyChar `append` char 'b' `append` char 'a' `append`
|           char 'r' `append` whatever
:}
GHCi> match pat "foo bar"
True
GHCi> match pat "foo!bar blah blah blah"
True
GHCi> match pat "poo bar"
False
GHCi> match pat "foo  bar"
False

It was awkward to define the pattern, though, so let’s write a function to parse one from a String:

> compile :: String -> Pattern
> compile []          = empty
> compile ('?'   :cs) = anyChar   `append` compile cs
> compile ('*'   :cs) = whatever  `append` compile cs
> compile ('\\':c:cs) = char c    `append` compile cs
> compile (     c:cs) = char c    `append` compile cs

Now the original session, using compile, works fine.

This was pretty easy to design and implement. All it took was recognizing at the beginning what our semantic model is and running with it. It helped, as always, that the semantic model is simple.

Some of this is unsatisfying. For one, it’s quite inefficient for some cases, especially when the match is destined to fail. In my next post, I will show a way to write a more efficient implementation of this.

Another issue some might have with this approach is that you can pretty much only do one thing with a Pattern once you’ve created it, which is match some strings against the pattern. In later posts, I will explain how to get more out of a glob pattern than just matching; for example, you might want to pretty print it or perform some static optimizations.

Finally, I didn’t compare this with a more traditional implementation using ADTs. I’m saving that for last, after I have finished demonstrating more of this approach.

About these ads

Posted on November 28, 2013, in Final Encodings. Bookmark the permalink. 1 Comment.

  1. Cool. Looking forward to the next post.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.