# Andrew Gibiansky   ::   Math → [Code]

Saturday, July 12, 2014

This is the third of several blog posts meant to serve as a crash course in Haskell for someone already familiar with programming and somewhat familiar with functional programming. The previous post in the series was here, and the next post can be found here.

In this guide, we’ll learn about how another way Haskell handles polymorphism. We’ll look at a few of the uses cases for this sort of polymorphism.

Let’s begin by considering the polymorphism we’ve already seen and seeing what polymorphic type signatures for functions tell us about the functions.

### Polymorphism via Type Variables

Suppose you’d like to create a reverse function which reverses the order of a list:

reverse [1, 2, 3] == [3, 2, 1]
reverse "abc" == "cba"

This function should compltely ignore the elements of the list. It should not evaluate them or use them in any way. It should work for lists of any type of element. All of these are enforced by the type signature of reverse:

-- | Reverse the elements in a list.
reverse :: [a] -> [a]

Since the element type is just a (a type variable), reverse cannot make any assumptions about it. The same exact code must work for Int, Char, String, Handle (a file handle), or any other type.

These sorts of type signatures help eliminate many bugs and can help you figure out what code you want to write. However, while sometimes we can operate on any type variable a, sometimes we’d like to have some constraints on what a can be. For example, consider a hypothetical function elem, which returns whether or not a some element is in a list. Could elem have the following type signature?

-- | Return whether an element exists in a list.
elem :: a -> [a] -> Bool
elem element list = ...

A brief examination of the type signature should make it clear that, no, a working elem function cannot have that type signature. A function with that type signature cannot check whether one value of type a is equal to another value of type a! Thus, there’s no way that it could check that the element is equal to the values in the list. We could, however, create the following function, which is elem specialized to Int values:

-- | Return whether an element exists in a list.
elemInt :: Int -> [Int] -> Bool
elemInt element list = ...

We could implement a valid elemInt, because we know we can compare Int values.

Exercise 1: Implement elemInt

As an exercise, go ahead and implement elemInt. Make sure the following test cases pass:

elemInt 0 []                 == False
elemInt (error "Nothing") [] == False
elemInt 3 [10, 20, 30]       == False
elemInt 3 [1, 2, 3]          == True

We could also implement elemChar, elemString, elemFloat, and many other versions of elem. In order to implement elem, however, we need to have a way to write a type signature that allows polymorphism over the list element type (via a type variable a) but also requires that we can somehow compare values of type a for equality. Haskell provides typeclasses as a mechanism for constrained polymorphism.

Why We Can’t == On Any Type

In some languages, all types will support the equality operator. For example, in C or Java, two values of the same type may always be compared using ==. This comparison tests for pointer or reference equality, not the fact that the underlying objects are equal. (If you’ve programmed in Java, you will have been taught early on that you must use .equals() for objects if you mean to compare their values, because == will only check for reference equality; this can be a cause of many subtle bugs.)

Haskell does not allow testing for equality by pointer comparison for a variety of reasons. (Among others, pointer comparison prevents some optimizations and breaks expected language semantics in some cases.) Instead of using pointer comparions, == will always compare the values of objects.

We cannot have == work on all types because some objects might not have a meaningful comparison operator. For example, what does it mean to check if two file handles are equal? Are the equal if they have the same filename? If they have the same filename and point to the same location on disk? If they point to the same node/location in a file system but have different names (due to hard links), are they equal?

As another example, consider equality of functions of type Int -> Int — it is impossible to check if two functions are equal without running them on all possible inputs, which would take forever. Thus, the == operator only makes sense and only exists for a subset of all types in Haskell.

### Typeclasses

Typeclasses are a mechanism for overloading the meaning of names (values and functions) for different types. This allows us to write code that works on multiple types while using values of those types — for example, we can use the == operator to test many different types for equality. The == function is defined as part of the Eq typeclass, which could be written as follows:

class Eq a where (1)
(==) :: a -> a -> Bool (2)
 1 This line declares the typeclass. class and where are keywords, and between them you put the name of the class (Eq) and a type variable (a) that will represent the Eq-able type for the remainder of a declaration. Eq is the name of a constraint on types: in the remainder of the declaration, we write the type signatures of values and functions that must be implemented for a type for it to be part of the Eq typeclass. The type variable a is used in those type signatures. 2 This type signature is the declaration of a method of the typeclass. (Be very careful — typeclasses and typeclass have very little to do with object-oriented classes and methods, although the terminology may be decieving and there may be some surface similarities.) With this type signature, we state that in order to create an instance of the Eq typeclass for some type we must write a function called == that satisfies the given type signature. For example, in order to create an instance of Eq for Int, we would have to write a function with type signature Int -> Int -> Bool which tests two integers for equality.

Haskell already defines instances of Eq for all commonly used data types, which means you can use == on common types such as Int, String, [Int], Maybe String, and most others defined in the standard library. For the sake of understanding typeclasses, let’s define our own tree data structure:

-- | A binary tree containing Int values.
data IntTree = Leaf Int | Tree Int IntTree IntTree

If we tried to write code that used == on IntTree values, GHC would spit out an error message complaining about the lack of the Eq IntTree instance. To satisfy GHC, we can write an instance of Eq for IntTree:

instance Eq IntTree where (1)
Leaf x == Leaf x' = x == x' (2)
Tree x left right == Tree x' left' right' =
x == x' && left == left' && right == right'
_ == _ = False (3)
 1 This line is the beginning of the instance declaration and generally mirrors the class declaration. In this case, we’re declaring an instance Eq IntTree, so you can replace all occurrences of the a type variable in the original class with IntTree. 2 This is the definition of the == operator. To the left of the =, we have match the arguments to == with two patterns, Leaf x and Leaf x' and return True if and only if x == x'. Note that x and x' are of type Int, which means we can use == on them, because we have the instance Eq Int provided for us by Haskell. 3 In order to make sure that == works for all IntTree values, we provide a fall-through pattern match which will match anything the previous patterns haven’t. Since the previous patterns tested leaves against leaves and branches against branches, we know that this pattern is only matched if the structures of the trees are different (there’s a leaf in one tree where there is a branch in another), so we return False because these trees cannot be equal.
Exercise 2: Eq IntList

Consider the following linked list data structure:

data IntList = Nil | Cons Int IntList

Implement the Eq typeclass for the IntList type. Then, verify that the following code works and typechecks:

value1 :: IntList
value1 = Cons 3 (Cons 10 Nil)

value2 :: IntList
value2 = Nil

main = print (value1 == value1,
value2 == value2,
not (value1 == value2))

In both the example above (IntTree) and the exercise (IntList), you must use recursion to implement ==. In addition to recursing in the definition of ==, you must eventually invoke the == for the Int type, to compare the values at the leaves of the tree and nodes of the linked list. In the line Leaf x == Leaf x' = x == x', the usage of == on the right hand side refers to == for Int values; this is not a case of recursion, because we aren’t calling == for IntTree values.

In addition to defining their required methods, typeclasses can define auxiliary methods with default implementations. For example, the Eq typeclass is actually defined as follows:

class Eq a where
(==) :: a -> a -> Bool

(/=) :: a -> a -> Bool (1)
x /= y = not $x == y  1 The (/=) method is not required by the Eq typeclass. If an implementation of /= is not provided, the default implementation not$ x == y is used. Instances are allowed to provide their own custom implementations of /=; custom implementations are often used to provide more efficient implementations of typeclass methods.

Many of the typeclasses in the standard library have several methods but only require one or two of them for a complete implementation.

## Common Typeclasses

Typeclasses are fundamental to the Haskell language, and the standard library ships with several very commonly used typeclasses. In this section, we’ll go over several of the simpler typeclasses; we’ll see how they’re defined, how they’re used, and how to write simple instances for them. We skip the Eq typeclass, as it is reviewed in the previous section.

### Ord

Types which implement the Ord typeclass can be compared to each other; their values must have a total order imposed on them (for any values x and y, we can compare the two values and determine which one is greater, if any). In order to be a member of the Ord typeclass, a type must have a compare function which returns an ordering. In some languages (C, Java, Python) the compare function must return an integer which is zero if the two values are equal, a positive integer if the first value is greater than the second, and a negative integer if the first value is smaller than the second. In Haskell, orderings are instead expressed using the Ordering type:

data Ordering = LT | EQ | GT

The Ord typeclass then has a compare function which takes two values and returns an Ordering:

compare :: a -> a -> Ordering

In addition, the Ord typeclass includes a few functions that have default implementations using compare but can be overriden for efficiency, such as <, >, max, and min. The full Ord typeclass declaration is as follows:

class Eq a => Ord a where
-- Required for implementing Ord.
compare :: a -> a -> Ordering

-- Functions with default implementations.
(>) :: a -> a -> Bool
x > y = compare x y == GT

(<) :: a -> a -> Bool
x > y = compare x y == LT

(>=) :: a -> a -> Bool
x >= y = compare x y == GT || compare x y == EQ

(<=) :: a -> a -> Bool
x <= y = compare x y == LT || compare x y == EQ

max :: a -> a -> a
max x y = if x > y then x else y

min :: a -> a -> a
min x y = if x < y then x else y

The Ord typeclass, unlike Eq, has a context. Contexts come before type declarations or typeclass heads and can specify that type variables implement some specific typeclass:

class Eq a => Ord a where ...

In the above declaration, the context is Eq a, and is separated from the typeclass head (which is Ord a) using a "fat arrow", =>. The context specifies that the type variable a must be a member of the Eq typeclass in order to implement the Ord typeclass for that variable. In this case, Eq a is required for Ord a because it is nonsensical to have an ordering unless we have equality, since clearly compare can be used to implement (==).

In general, it is wise to make sure that all instances of Ord follow a few rules. First of all, they should agree with instance of Eq; that is, if x == y, then compare x y should return EQ. Instances of Ord should also define a reasonable total order: if compare x y == LT, then compare y x == GT, and if compare x y == EQ then compare y x == EQ as well.

### Show

The Show and Read typeclasses allow types to be converted to and from strings. They are not meant for user input and output, but rather for programmer viewing and debugging. (For example, the Show instance for String outputs newlines as \n and quotes as \", which makes sense for programmers but does not for user output.). The Show typeclass has three methods: show, showsPrec, and showList.

Most of the time, knowing about show is enough; the other two are somewhat specialized methods that you will rarely need to implement. show has the type show :: a -> String; it can convert any type a which implements the Show typeclass into a String. For example, in order to convert an integer to a string, you could write show (1 :: Int); in this context, show would be specialized to show :: Int -> String.

For the sake of demonstration, let’s create our own character-like type that can only hold uppercase As, Bs, Cs, as well as a special character representing non-printable character:

data ABC = A | B | C | Other

If we want to be able to print ABC values, we can create a Show instance for it:

instance Show ABC where
show A = "A"
show B = "B"
show C = "C"
show Other = "<Not printable>"

We can then write programs that print values of type ABC to standard output. The following program will simply print the letter "A" to the screen:

Show1.hs
a :: ABC
a = A

main :: IO ()
main = putStrLn (show a) (1)
 1 Instead of writing putStrLn (show x), we can write print x. print is a function defined as print = putStrLn . show.

For most use cases, show is all you need to know about the Show typeclass; for the sake of completeness, we discuss showsPrec and showList, even though these functions come up rarely in practice.

To motivate showsPrec, consider the following code:

main = putStrLn (show Other ++ show Other ++ show Other ++ show Other)

How long does this program take to run? Not very long, because we only have four Strings we’re concatenating. However, in general, concatenating n Strings can take O(n^2) time, since each time we append a string to the end of a list, we must first traverse the entire list. If we were to run this program with a thousand ABC values instead of four, this might take quite a while due to this quadratic growth! This quadratic growth is the first problem that ShowS solves.

The fundamental issue is that Show relies on String values, which take a long time to append. To rectify this, showsPrec uses a different type with the alias ShowS:

type ShowS = String -> String

A ShowS value is a function that, when given a String, prepends another String to it and returns the sum. The type String and ShowS are isomorphic in meaning, which we can show by providing conversion functions between them. We can convert a String into a ShowS by writing a function which prepends the given string to its input:

showString :: String -> ShowS
showString str = \next -> str ++ next

Converting from String to ShowS is fast. Since we don’t actually do any work (we just create a function), we don’t need to iterate over the characters, so it is done in constant time. We can also convert from ShowS to a String by using the ShowS to prepend to an empty string:

fromShowS :: ShowS -> String
fromShowS prepender = prepender ""

Unlike showString, fromShowS is not a constant time operation. In order to prepend a string to "", the ShowS must traverse the entire string it’s appending and then add "" onto the end of it. Thus, the runtime of fromShowS grows linearly with the number of characters in the output.

Let’s compare appending String values and ShowS values. In order to append String values, you use the ++ operator, which traverses over the first string character by character and then adds the second string onto the end. As you append more and more characters to a string, appends take longer and longer, because each append must traverse all previous characters; thus, the running time grows quadratically in the length of the string. In constract, in order to append ShowS values, you just use the . function composition operator. If you have a ShowS which prepends the string "x" and a ShowS which prepends the string "y", you can make a ShowS which prepends "xy" by composing your two ShowS values to first prepend "y" and then prepend "x". Since function composition is done in constant time, combining ShowS values only takes as long as the number of values you are combining.

As long as showsPrec outputs a ShowS instead of a String, we can write code that efficiently concatenates the string representations of many things. Using ShowS yields better performance, but it is not as convenient as show for common uses, which is why show is included in the typeclass.

The second problem that showsPrec solves is one of parenthesizing. For example, if we write show (Just [1, 2, 3]), we expect the result to be Just [1, 2, 3]; however, if we write show (Just (Just [1, 2, 3])), we expect the result to be Just (Just [1, 2, 3]). Consider the following attempt at an implementation:

instance Show a => Show (Maybe a) where (1)
show Nothing = "Nothing"
show (Just x) = "Just " ++ show x
 1 This example uses instance contexts; see Exercise 1 for more information on this.

If you pay attention to what this example does, though, you will notice that show (Just (Just [1, 2, 3])) does not work! Instead of outputting what we want, it outputs Just Just [1, 2, 3], which is missing a set of parentheses.

Using the type alias ShowS, the type of showsPrec for a type a is written as

showsPrec :: Int -> a -> ShowS

The Int that showsPrec is passed is the operator precedence of the enclosing context, which is a number between zero and eleven. Function application has precedence ten; infix data constructors can have lower precedences. This integer allows the showsPrec implementation to decide whether or not to include the parentheses. The following is a proper implementation of Show Maybe, this time using showsPrec:

instance Show a => Show (Maybe a) where
showsPrec _ Nothing = showString "Nothing"(1)
showsPrec precedence (Just x) =
if precedence > 10 (2)
then showString "(Just " . showsPrec 11 x . showString ")" (3)
else showString "Just " . showsPrec 11 x (4)
 1 showString is the same convenience function we defined earlier, of type showString :: String -> ShowS. 2 10 is the precedence of function application, so a precedence context greater than means that this value is being printed as an argument to some function and thus we need parentheses. 3 We use . to concatenate ShowS values (instead of ++, which is only used for String values). 4 Since the Just constructor looks like a function, we must print the argument to it in a precedence context greater than function application; thus, we pass 11 as the precedence context to showsPrec for whatever comes after the Just.

showsPrec can be thought of as a low-level interface to the capabilities of the Show typeclass. Although the complexity may seem daunting, it is necessary for printing all the possible values that you can define in Haskell.

The last method of the Show typeclass is showList:

-- Give the method a specialized way of showing lists of values.
showList :: [a] -> ShowS

The showList method can be used to override the default of printing lists with square brackets and commas. This is rarely necessary, but is used by the Haskell standard library to print String values using quotes instead of square brackets and to omit the commas.

Exercise 1: Show for lists

Consider the following linked list data type, isomorphic to Haskell’s [a]:

data List a = Nil | Cons a (List a)

Implement the Show typeclass for List a, provided that a implements Show. To do so, fill in the following template:

instance Show a => Show (List a) where
show xs = ...

This code has another example of a context, this time used in an instance instead of a class declaration. The context Show a with the instance head Show [a] says that for any type a that implements Show, [a] implements Show (with the implementation provided below).

Your implementation of show should act identically to show for Haskell lists, but use {} instead of []. For example, show Nil should be {} and show (Cons 'X' (Cons 'Y' Nil)) should be {'X', 'Y'}.

Exercise 2: showList for Characters

Recall the data type and Show instance we defined earlier:

data ABC = A | B | C | Other

instance Show ABC where
show A = "A"
show B = "B"
show C = "C"
show Other = "<Not printable>"

Modify this instance to use showsPrec. You can use showString to do so.

Once you have rewritten this instance to use showsPrec, add an implementation for showList to it such that lists of ABC values are printed surrounded by vertical bars, without commas, and skipping Other values. For example, you should have showList [A, B, C, Other, C, B] return a string containing |ABCCB|.

The opposite of the Show typeclass is the Read typeclass. While Show is used to convert Haskell data structures to Strings, Read provides methods to parse Strings into Haskell data structures. Since converting Strings to data structures requires fairly complex parsing, the methods of the Read typeclass are actually almost never used. However, the Haskell Prelude provides the read function:

read :: Read a => String -> a

This is not a method of the Read typeclass, but it requires that the type that’s being output implements Read. To use read, just pass it a String, as in the following example:

value :: Int

main :: IO ()
main = print value

Since the output of read is a type variable, it is polymorphic in its output. This can often cause problems, as GHC’s type inference engine will be unable to determine exactly what type is meant to be read. For example, compiling the following program will yield an error, complaining that the type variable a is ambiguous:

main :: IO ()

### Defaulting rules

In the previous section, we discussed the following faulty program:

-- | Display a value prettily.
class Pretty a where
pretty :: a -> String

-- | Integers are already pretty.
instance Pretty Int where
pretty = show

main :: IO ()
main =  print (pretty 3)

Compiling this program yielded an error, as the compiler could not determine what type the literal 3 should be. However, consider the following program:

main :: IO ()
main =  print (show 3)

If you try compiling this, you will find that even though show has almost the same type signature as pretty, this program compiles just fine! The compiler does not issue any errors, and does exactly what you would expect it to do.

The secret behind this mysterious behaviour is called defaulting. Defaulting is a small addition to the Haskell language that eliminates these type ambiguity errors in the most common cases. Although defaulting is certainly somewhat of a hack, it’s necessary in order to make working with numbers remotely convenient; without it, entering something like 1 + 1 into an interactive prompt would result in a type ambiguity error.

Defaulting allows the compiler to assume some "default" type whenever it encounters an otherwise ambiguous type (C1 a, C2 a, C3 a, …​) => a. In standard Haskell, defaulting kicks in if and only if the following conditions are met:

• There are no other constraints on the type variable a.

• All the classes C1, C2, and so on are standard classes from the Prelude.

• At least one of the clases C1, C2, and so on is numeric (Num, Fractional, etc).

When defaulting kicks in, it looks for a default declaration. Default declarations look like this:

default (Int, Double, Float, Integer, Rational)

The compiler will look through this list left to right and find the first type in the list that matches all the required constraints. That type will then be inferred for the type variable.

Most Haskell programs do not include their own default statements. When no default declaration is present, the compiler assumes the following defaulting order:

-- Default defaulting rules, if no others are present.
default (Integer, Double)

Defaulting may be completely turned off with default (); however, note that a default declaration’s effects are only limited to the module in which it was declared.

GHCi (the interactive Haskell interpreter that ships with GHC) extends these rules somewhat. When at an interactive prompt, defaulting kicks in if and only if the following (modified from before) conditions are met:

• There are no other constraints on the type variable a.

• All the classes C1, C2, and so on are single-parameter classes.

• At least one of the clases C1, C2, and so on is numeric (Num, Fractional, etc) or is Show, Eq, or Ord.

With these relaxations of the rules, code such as show (reverse []) will work at the interactive prompt. However, if you try to compile the following program, you will get an ambiguity error:

main :: IO ()
main = print (reverse [])

The GHCi rules may be turned on by enabling the ExtendedDefaultRules extension.

Saturday, July 12, 2014 - Posted in haskell