Dueling Rhetoric of Clojure and Haskell

by @eborden on November 01, 2017

Recently Lispcast wrote a post interpreting Rich Hickey’s controversial statements on static types. This post had some very interesting perspectives and some unfortunate misinformation. Within the post the author proposed a challenge to static typing communities.

Is it possible with our current understanding to build a statically typed language that rivals Clojure for its intended purpose? What would such a language look like? What types would it have? Who will build it?

This is an interesting question, and modern type systems are addressing this. These systems exist and are being improved daily. Row types in PureScript are one example.

I’m not going to answer the question directly, but abundant resources are available. You may find these concepts interesting: row type polymorphism, data generic programming, and extensible records. Instead, I’m going to respond to another challenge.

Of course you could build a type with all those properties and more in Haskell. A universal information model…That model would be something like the JSON datatype or a richer one like an EDN…Your library of functions…would look something like Clojure’s standard library. At some point you’re just re-implementing Clojure.

Screw it, let’s re-implement1 Clojure in Haskell.

Write you a Clojure for questionable good

To write a Clojure first we need to define our EDN data type. EDN is technically a subset of Clojure, but it’ll do. It has the following types:

The EDN data type looks like:

data EDN
  = Nil
  | Boolean Bool
  | String Text
  | Character Char
  | Symbol Text
  | Keyword Text
  | Integer Int
  | EdnFloat Float
  | List [EDN]
  | EdnVector (Vector EDN)
  | Map (HashMap EDN EDN)
  | EdnSet (HashSet EDN)
  | Tag Text EDN
  deriving (Eq, Hashable)

That is a pretty expansive sum type, which is our first sign that EDN might be complicated.

Std liberate me

Now let’s write a standard library function. How about map? We can do that by pattern matching on our input and handling the cases.

clmap :: (EDN -> EDN) -> EDN -> EDN
clmap f edn =
  case edn of
    Nil -> Nil
    List xs -> List $ map f xs
    EdnVector xs -> List . map f $ toList xs
    EdnSet xs -> List . map f $ toList xs

Awesome, we can map over our data, but what about all the other cases? You can’t map over many Clojure data types. Our clmap is “partial”. Typically we’d change the type of clmap to return a Maybe EDN, but Rich isn’t fond of Maybe, so let’s do something closer to Clojure’s semantics.

clmap :: (EDN -> EDN) -> EDN -> EDN
clmap f edn =
  case edn of
    Nil -> Nil
    List xs -> List $ map f xs
    EdnVector xs -> List . map f $ toList xs
    EdnSet xs -> List . map f $ toList xs
    Boolean{} -> error "can't clmap a boolean"
    String{} -> error "guess you can't do this either"
    Character{} -> error "this is starting to get silly"
    Symbol{} -> error "you see where I'm headed?"
    Keyword{} -> error "YOLO"
    Integer{} -> error "everything is awful"
    EdnFloat{} -> error "we are floating in a pit of despair"
    Map{} -> error "if only there was a safer way to do this"
    Tag{} -> error "please save me"

That is a lot of failure, our function is now matching all our cases, but it blows up when we make a mistake.

What about a simple function on one of the author’s favorite data types Map?

clget :: EDN -> EDN -> EDN
clget key edn =
  case edn of
    Map ms -> case HashMap.lookup key ms of
      Just x -> x
      Nothing -> error "crap we didn't find it"
    List{} -> error "oh no"
    Nil -> error "NilPointerException, I guess?"
    EdnVector{} -> error "here we go again."
    EdnSet{} -> error "bear with me"
    Boolean{} -> error "this will only take a second"
    String{} -> error "we're half way there"
    Character{} -> error "still feeling pretty depressing"
    Symbol{} -> error "we can keep it together"
    Keyword{} -> error "we are strong and disciplined"
    Integer{} -> error "our resolve isn't cracking..."
    EdnFloat{} -> error "we are floating in a pit of despair!!!!!!"
    List{} -> error "whhyyyyyyyyyyyyyyy"
    Tag{} -> error "time is a flat circle"

That was even rougher than the last time. That function fails in way more cases than it succeeds. So maybe, just maybe, Maybe isn’t that bad. Let’s refactor.

clmap :: (EDN -> EDN) -> EDN -> Maybe EDN
clmap f edn =
  case edn of
    Nil -> Just Nil
    List xs -> Just . List $ fmap f xs
    EdnVector xs -> Just . List . toList $ fmap f xs
    EdnSet xs -> Just . List . fmap f $ toList xs
    -- we are going to use a shortcut and utilize wild card pattern matching
    _ -> Nothing

clget :: EDN -> EDN -> Maybe EDN
clget key edn =
  case edn of
    Map ms -> HashMap.lookup key ms
    _ -> Nothing

Okay, we are no longer completely exploding. Our semantics have diverged from Clojure proper, but they are safer, and more visible. So what does this look like when we use it?

let
  mkMap pairs = Map $ HashMap.fromList pairs
  -- clj: (def cl-map {:foo (list "bar")})
  clMap = mkMap [(Symbol "foo", List [String "bar"])]

-- clj: (map str/upper-case (get cl-map :foo))
clmap clToUpper =<< clget (Symbol "foo") clMap
-- Just (List [String "BAR"])

It isn’t terribly pretty, but it does what we want. We could write some utility functions to clean it up, but our proof of concept works, and we’ve made our failure cases more visible.

What have we learned?

The previous author raises some very valid concerns with utilizing Haskell for “situated” programs. In Haskell we typically “concrete” data with record types, but we don’t have to. With only a few lines of code we’ve encoded EDN’s open dynamic information model and shown that we can compositionally de-construct it with a few functions. This lets us avoid having to pattern match over and over again, but we still have to implement a lot of functions to get there. We can probably do better, but we have shown we can quickly encode Clojure in Haskell. Encoding Haskell in Clojure would very likely be a much more difficult task.

In which we do better

Let’s take this further.

Can we set keys?

Our MVP is pretty ugly and it quickly falls apart when we attempt to set something. We’d have to wrap and unwrap tons of types to do something trivial. This is the experience that our friends at Lispcast had and it is unfortunate. Luckily the Haskell ecosystem has a solution to getting/setting in nested types, it is called lens.

Lenses give us a way to talk about getters and setters, compose them, nest them and they also allow us to talk about partiality in a composable way. They are conceptually similar to Clojure’s transducers. Our EDN -> Maybe EDN types fall into the realm of Traversals. These are getters and setters over values that have a failure case. Let’s write our clget function as a Traversal and rename it clkey since we’ll be getting and setting.

clkey :: EDN -> Traversal' EDN EDN
clkey key f edn =
  case edn of
    Map ms -> case HashMap.lookup key ms of
      Nothing -> pure edn
      Just val ->
        let optionalInsert v = Map $ HashMap.insert key v ms
        in optionalInsert <$> f val
    _ -> pure edn

That wasn’t so hard. If the key exists we apply a function and wrap an insertion over it. Otherwise we return our original structure. It may seem a bit magical, but that function allows us to get or set the value if it exists. It took a few people solving “puzzles” to create this solution, but now we have its ambitious power.

Do we need clmap?

It turns out we don’t need to reinvent the wheel here. Haskell has its own map function and we can leverage it. We are utilizing Haskell data types under the hood; let’s create Traversals that let us peek into our data type and modify things in regular old Haskell. These are more powerful Traversals called Prisms and their creation is even simpler.

_List :: Prism' EDN [EDN]
_List = prism' List $ \case
  List xs -> Just xs
  _ -> Nothing

_String :: Prism' EDN Text
_String = prism' String $ \case
  String xs -> Just xs
  _ -> Nothing

Cool, that pattern is pretty simple and now we can utilize Clojure’s information model while safely leveraging Haskell’s robust and reliable standard library.

let
  -- clj: (def cl-map {:foo (list "bar")})
  clMap = mkMap [(Symbol "foo", List [String "bar"])]

  foo = clkey (Symbol "foo") . _List . ix 0 . _String

-- clj: (nth 0 (get cl-map :foo))
preview foo clMap
-- Just "bar"

-- clj specter: (setval [:foo (nthpath 0)] "baz" cl-map)
set foo "baz" clMap
-- Map (HashMap.fromList [(Symbol "foo", List [String "baz"])])

-- clj specter: (transform [:foo (nthpath 0)] str/upper-case cl-map)
over foo Text.toUpper clMap
-- Map (HashMap.fromList [(Symbol "foo", List [String "BAR"])])

This is starting to look a lot more enjoyable and in the real world we can use Haskell’s type safe “macro” system to generate all these Prisms for us, saving us even more time and boilerplate.

What about being generic?

Clojure map works on sequence like structures and our existing Prisms force us to be overly specific. Well there is a simple solution, write a new Prism for sequence like structures. We’ll use Haskell’s list as a canonical sequence, we could probably be fancier, but let’s keep it simple like Clojure.

_Seq :: Prism' EDN [EDN]
_Seq = prism' List $ \case
  List xs -> Just xs
  EdnVector xs -> Just $ toList xs
  EdnSet xs -> Just $ toList xs
  Map ms -> Just $ toList ms
  _ -> Nothing

let
  clList = List [String "bar"]
  clVector = EdnVector $ fromList [String "foo"]
  clSet = EdnSet $ fromList [String "baz"]
  clMap = Map $ HashMap.fromList [(Symbol "biz", String "bats")]

  foo = _Seq . traverse . _String

over foo Text.toUpper clList
-- List [String "BAR"]
over foo Text.toUpper clVector
-- List [String "FOO"]
over foo Text.toUpper clSet
-- List [String "BAZ"]
over foo Text.toUpper clMap
-- List [String "BATS"]

Awesome, not only can we encode Clojure’s information model, we can also elegantly support its generic programming model with a higher level of safety and visible failure semantics. With a bit more work we could clean this up and have Clojure’s information model available as a DSL in a compact form, leveraging our existing ecosystem, with very little cost and high assurance. In fact I’ve done just that and so have others. Still others have endeavored to encode Haskell in a lisp like Racket.

What does it mean?

Much of the rhetoric that is currently flying around is a false dichotomy. The Clojure ecosystem is interesting and its information model is trivial to compose and use in a modern static type system. We can now utilize EDN as opposed to JSON while still enjoying strong guarantees of the Haskell ecosystem. But lets ask an important question. What do we gain with EDN?

If EDN is an improvement over JSON, then it is marginal at best. At worst it is an explosion in complexity and failure cases that provides no implicit method to tame it. We can tame these in our DSL, but the Clojure ecosystem has no visible recourse. Utilizing EDN also promotes a lot of invisible coupling. Some may tell you that dynamic types don’t couple, but that is incorrect and shows a lack of understanding of coupling itself. Many functions over Map exhibit external and stamp coupling. Even generic functions produce their own form of coupling. With an explicit type system we don’t couple more, but we make that coupling visible, understandable, and maintainable, even addressable. Those types help us grow and evolve with our understanding of our data instead of living in the opaque world of EDN -> EDN -> EDN where failure is implicit.

If we decide to “concrete” a world view where everything is awful and only marginal improvements are possible then I’d rather take an optimistic path; even if it comes at the cost of some navel gazing. I’d rather find boundless possibilities in my bellybutton lint than stare into the universe and ignore the majestic chance of a better world around every star.

But Seriously

This post is inflammatory and somewhat hyperbolic, but the truth is I don’t want to pick a fight with Clojurians. They have interesting ideas, cool technology and a lovely ecosystem. Let’s share the joy of discovery and creativity that made us all fall in love with programming in the first place.

Notes

  1. No irony is lost here in fulfilling Greenspun’s 10th rule. As well we recognize the existence of Greenspun’s 10th Dual:

    Any sufficiently complicated dynamically typed program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of a type system.

Back to the home page.