Queensland FP Lab - Quick and easy user-defined operators with Plated

Quick and easy user-defined operators with Plated

Posted on November 17, 2017

All infix operators have precedence and associativity. A language that supports user-defined operators should also give the user a way to control these attributes for their custom operators. Modern languages do this in a variety of ways. In Swift, all operators are associated with a precedence group. User-defined operators in F# get their precedence and associativity from the combination of characters that make up the operator. In Haskell-like languages, the user explicitly states the precedence and associativity using special syntax. For example, infixl 5 + says “the + operator is left-associative and has precedence level 5”.

I like Haskell-style operators out of all these options — I think they’re a more elegant solution to the problem. However, elegance comes at the cost of implementation difficulty.

Haskell-style infix operators are generally implemented like this:

Define a set of characters that are allowed in operators.
Add syntax for declaring operator precedence and associativity.
Parse all operators right-recursively.

The grammar might look something like this: ``` operator_char ::= ‘~’ | ‘!’ | ‘@’ | ‘#’ | ‘$’ | ‘%’ | ‘^’ | ‘&’ | ’*’ | ‘?’ | ‘>’ | ‘<’ | ‘.’ | ‘|’ | ‘-’ | ‘+’ | ‘=’ operator ::= operator_char+

fixity ::= ‘infixr’ | ‘infixl’ infix_decl ::= fixity [0-9]+ operator declaration ::= infix_decl | …

non_operator_expr ::= ‘(’ expr ‘)’ | … expr ::= non_operator_expr (operator non_operator_expr)* ```
Collect the operator precedences and associativities from the parsed output
Re-write the parsed expressions according to this information

This is done in two parts:
- Associativity correction
- Precedence correction

Re-association

Consider the input 2 - 3 + 4. We implicitly read this as [2 - 3] + 4, because - and + have the same precedence and are left-associative. According to our grammar this expression will always be parsed as 2 - [3 + 4], which has a completely different meaning. Changing the position of the brackets accomplished by changing which operator is at the top of the tree. (-) is at the top of the tree in the expression 2 - [3 + 4], but (+) is at the top in [2 - 3] + 4. Notice that the order of the leaves — 2, 3, 4 — stays the same regardless of how the tree is transformed.

The re-association algorithm for a tree of height two looks like this:

Left associative operators:

  OP1(prec=X, assoc=Left)
  /             \
 /               \
A      OP2(prec=Y, assoc=Left)
       /                 \
      /                   \
     B                     C


if X == Y, becomes


            OP2(prec=Y, assoc=Left)
            /                \
           /                  \
  OP1(prec=X, assoc=Left)      C
  /                 \
 /                   \
A                     B

Right associative operators

            OP1(prec=X, assoc=Right)
            /                \
           /                  \
  OP2(prec=Y, assoc=Right)     C
  /                 \
 /                   \
A                     B


if X == Y, becomes


  OP2(prec=Y, assoc=Right)
  /             \
 /               \
A      OP1(prec=X, assoc=Right)
       /                 \
      /                   \
     B                     C

If a parent and child node have different precedences and different associativities, then they won’t be re-associated. If they have the same precedence but different associativities, an “ambiguous parse” error is raised.

If we have two operators ! and ^, with the same precedence and different associativity, any unparenthesised expression that uses them has two valid parenthesisations — 5 ^ 4 ! 3 could be [5 ^ 4] ! 3 or 5 ^ [4 ! 3].

Precedence Correction

Consider the input 2 * 3 + 4. This is read as [2 * 3] + 4, because * has higher precedence than +, but it will be parsed as 2 * [3 + 4].

The precedence-correction algorithm looks like this:

  OP1(prec=X, assoc=?)
  /             \
 /               \
A      OP2(prec=Y, assoc=??)
       /                 \
      /                   \
     B                     C


if X > Y, becomes


            OP2(prec=Y, assoc=??)
            /                \
           /                  \
  OP1(prec=X, assoc=?)         C
  /                 \
 /                   \
A                     B


and



            OP1(prec=X, assoc=?)
            /                \
           /                  \
  OP2(prec=Y, assoc=??)        C
  /                 \
 /                   \
A                     B


if X > Y, becomes


  OP2(prec=Y, assoc=??)
  /             \
 /               \
A      OP1(prec=X, assoc=?)
       /                 \
      /                   \
     B                     C

Putting it together

This seems fairly straightforward when thinking about trees of height 2, but how do you generalise it to trees of any height?

This is where Plated comes in. To make your datatype an instance of Plated, you write a Traversal that operates over its immediate self-similar children. rewriteM then allows you to write transformations on trees as deep or as shallow as you wish, interleaving a monadic context, and will recursively apply those transformations from the bottom of a tree upwards until it can no longer be transformed.

Good abstractions reduce boilerplate and help you focus on what’s important. The “recursive bottom-up tree rewriting” algorithm has already been written for us. Using Plated, we need only consider the simplest case for re-ordering the tree, and then it scales for free.

Let’s write some code.

module Operators where

import Control.Applicative ((<|>), liftA2)
import Control.Lens.Plated (Plated(..), rewriteM)
import Data.Maybe (fromMaybe)

Our syntax tree will consist of binary operators and numbers. The Parens node is for explicit parenthesisation.

data Expr
  = Parens Expr
  | BinOp String Expr Expr
  | Number Int
  deriving (Eq, Ord, Show)

The wonderful Plated instance.

instance Plated Expr where
  plate f (Parens a) = Parens <$> f a
  plate f (BinOp n a b) = BinOp n <$> f a <*> f b
  plate _ a = pure a

Operators have an associativity and a precedence.

data Associativity
  = AssocL
  | AssocR
  deriving (Eq, Ord, Show)

data OperatorInfo
  = OperatorInfo
  { opAssoc :: Associativity
  , opPrecedence :: Int
  }

An OpTable is a map from operator names to their OperatorInfo.

type OpTable = [(String, OperatorInfo)]

Our reordering function could fail due to ambiguity

data ParseError = AmbiguousParse
  deriving (Eq, Show, Ord)

Re-association. This code crashes when an operator is missing from the table.

If the input node is an operator, and:
- It is left-associative:
  - Inspect its right child
    - If the right child node is an operator and has equal precedence to the input node
      - And is also left-associative, then re-order the tree to be left-associative
      - And is right-associative, then report an ambiguity
  - Inspect its left child
    - If the left child node is an operator, has equal precedence to the input node and is right-associative, then report an ambiguity
- It is right-associative
  - Inspect its left child
    - If the left child node is an operator and has equal precedence to the input node
      - And is also right-associative, then re-order the tree to be right-associative
      - And is left-associative, then report an ambiguity
  - Inspect its right child
    - If the right child node is an operator, has equal precedence to the input node and is left-associative, then report an ambiguity
Otherwise, do nothing

associativity :: OpTable -> Expr -> Either ParseError (Maybe Expr)
associativity table (BinOp name l r)
  | Just entry <- lookup name table =
      case opAssoc entry of
        AssocL
          | BinOp name' l' r' <- r
          , Just entry' <- lookup name' table
          , opPrecedence entry == opPrecedence entry' ->
              case opAssoc entry' of
                AssocL -> Right . Just $ BinOp name' (BinOp name l l') r'
                AssocR -> Left AmbiguousParse
          | BinOp name' _ _ <- l
          , Just entry' <- lookup name' table
          , opPrecedence entry == opPrecedence entry'
          , AssocR <- opAssoc entry' ->
              Left AmbiguousParse
          | otherwise -> Right Nothing
        AssocR
          | BinOp name' l' r' <- l
          , Just entry' <- lookup name' table
          , opPrecedence entry == opPrecedence entry' ->
              case opAssoc entry' of
                AssocL -> Left AmbiguousParse
                AssocR -> Right . Just $ BinOp name' l' (BinOp name r' r)
          | BinOp name' _ _ <- r
          , Just entry' <- lookup name' table
          , opPrecedence entry == opPrecedence entry'
          , AssocL <- opAssoc entry' ->
              Left AmbiguousParse
          | otherwise -> Right Nothing
associativity _ _ = Right Nothing

Precedence correction. This code also crashes when operators are missing from the operator table.

This is broken down into two phases- making sure the left branch is precedence-correct with respect to the input node, and then doing the same for the left branch.

For each branch, if that branch contains an operator with a lower precedence than the input node, re-order the tree so the lower-precedence operator is at the top.

precedence :: OpTable -> Expr -> Maybe Expr
precedence table e@(BinOp name _ _)
  | Just entry <- lookup name table =
      checkR entry $ fromMaybe e (checkL entry e)
  where
    checkL entry (BinOp name l c) =
      case l of
        BinOp name' a b 
          | Just entry' <- lookup name' table
          , opPrecedence entry' < opPrecedence entry ->
              Just $ BinOp name' a (BinOp name b c)
        _ -> Nothing
    checkL _ _ = Nothing

    checkR entry (BinOp name a r) =
      case r of
        BinOp name' b c
          | Just entry' <- lookup name' table
          , opPrecedence entry' < opPrecedence entry ->
              Just $ BinOp name' (BinOp name a b) c
        _ -> Nothing
    checkR _ _ = Nothing
precedence _ _ = Nothing

precedence and associativity have type Expr -> Maybe Expr because eventually the transformations will no longer be applicable. We can use liftA2 (liftA2 (<|>)) to combine the two rewrite rules, and rewriteM will run until one produces a Left, or until both always produce Right Nothing

reorder :: OpTable -> Expr -> Either ParseError Expr
reorder table =
  rewriteM $
  liftA2 (liftA2 (<|>)) (Right . precedence table) (associativity table)

Let’s try it on the expression 5 - 4 + 3 * 2 + 1. It will be parsed as 5 - [4 + [3 * [2 + 1]]], but after re-ordering should become [[5 - 4] + [3 * 2]] + 1.

ghci> let o = [("+", OperatorInfo AssocL 5), ("-", OperatorInfo AssocL 5), ("*", OperatorInfo AssocL 6)]
ghci> let input = BinOp "-" (Number 5) (BinOp "+" (Number 4) (BinOp "*" (Number 3) (BinOp "+" (Number 2) (Number 1))))
ghci> reorder o input
Right (BinOp "+" (BinOp "+" (BinOp "-" (Number 5) (Number 4)) (BinOp "*" (Number 3) (Number 2))) (Number 1))

We can also use Parens to explicitly parenthesise the expression. If we input 5 - (4 + (3 * (2 + 1))), it will not be re-ordered at all.

ghci> let input = BinOp "-" (Number 5) (Parens $ BinOp "+" (Number 4) (Parens $ BinOp "*" (Number 3) (Parens $ BinOp "+" (Number 2) (Number 1))))
ghci> reorder o input
Right (BinOp "-" (Number 5) (Parens (BinOp "+" (Number 4) (Parens (BinOp "*" (Number 3) (Parens (BinOp "+" (Number 2) (Number 1))))))))

ghci> reorder o input == Right input
True

Ambiguous expressions are reported. Here’s the example from earlier — 5 ^ 4 ! 3:

ghci> let o = [("^", OperatorInfo AssocL 5), ("!", OperatorInfo AssocR 5)]
ghci> let input = BinOp "^" (Number 5) (BinOp "!" (Number 4) (Number 3))
ghci> reorder o input
Left AmbiguousParse

ghci> let input = BinOp "!" (BinOp "^" (Number 5) (Number 4)) (Number 3)
ghci> reorder o input
Left AmbiguousParse

And are resolved by adding explicit parentheses — 5 ^ (4 ! 3) and (5 ^ 4) ! 3 respectively:

ghci> let o = [("^", OperatorInfo AssocL 5), ("!", OperatorInfo AssocR 5)]
ghci> let input = BinOp "^" (Number 5) (Parens $ BinOp "!" (Number 4) (Number 3))
ghci> reorder o input
Right (BinOp "!" (Number 5) (Parens (BinOp "^" (Number 4) (Number 3))))

ghci> let input = BinOp "!" (Parens $ BinOp "^" (Number 5) (Number 4)) (Number 3)
ghci> reorder o input
Right (BinOp "!" (Parens (BinOp "^" (Number 5) (Number 4)) (Number 3)))

> Isaac Elliott

Isaac really likes types