In this blog post I will describe the concept of a palindrome more precisely than in my last blog post, by giving the property that determines whether or not a sequence of symbols is a palindrome in the functional programming language Haskell.
Why is a specification of a palindrome as a program more precise than the specification given in text in the previous blog post, which was taken from the Oxford English Dictionary (OED)? And why should someone interested in the concept of palindromes be interested in a program that describes the concept? These are philosophical questions, to which computer science helps to give an answer. The purpose of a definition is to distinguish certain things from other things. The definition of a palindrome distinguishes palindromic words or sequences of words from non-palindromic ones. How do I determine if a sequence of words is a palindrome? The textual definition in the OED says that it should read the same backwards as forwards, letter for letter. The first example in the OED is "Lewd did I live, and evil I did dwel", which doesn't read the same backwards and forwards, unless capital and underscore letters are considered equal, and comma's are ignored. Hmmm. Are there any more exceptions I should know of?
A much better way to define the concept of palindromes is to provide a program, which takes a string as input, and which returns either yes or no, determining whether or not the input string is a palindrome. The OED definition leads to three different programs for determining whether or not an input string is a palindrome: a definition which compares strings letter for letter, and only accepts string which are literally the same forwards and backwards, a definition which ignores punctuation and capitalization, and a definition which compares DNA strings. Such a program is a much more precise way to define what it means to be a palindrome. To determine if a string is a palindrome it suffices to run the program on the string. To study the concept of palindromes, it suffices to study the program. No surprises.
Not just palindromes profit from being defined by means of a program, many other concepts, ideas, and regulations would be better off being specified as a program. Besides the obvious anagrams, pangrams, ambigrams, and so on, tax laws, testaments, exam regulations, and anything that follows some rules is best described by means of a program. A program is precise about the rules, the exceptions, and when they apply. If a concept cannot be described precisely, using a program becomes harder. I wouldn't know how to construct a Turing test: a program to determine whether or not a species is intelligent. And even a simple concept like a chair is not easily captured in a program.
Haskell is a programming language celebrating its 25th anniversary in 2012. It was conceived at an academic conference on functional programming in 1987, in an attempt to combine efforts in developing a purely functional programming language. In the nineties of the last century it was used for research on advanced programming language concepts, and in teaching at many universities. Since a number of years it is becoming more popular in industry, in particular in the investment banking industry, where it is used to specify and implement involved mathematical financial models. I use Haskell because it allows me to describe ideas precisely and concisely. I will introduce the necessary Haskell concepts as I go.
This blog post itself is a literate Haskell program. If you save it as a file and make sure its name ends in .lhs, you can load it in ghci, an interpreter for Haskell programs, and use the definitions given in this post. To make this work I need the following two lines.
> import Prelude hiding (reverse) > import Data.Char hiding (isLetter)The first line says that I can use all the standard Haskell functions except the function
reverse, which I am going to define in this blog post. The second line says that I can use all basic functions on characters, such as the function
isSpace, which tells whether or nor a character is a space. The function
isLetteris excluded, since I am going to define that function in this blog post too.
I use the following notation for a list of characters. The empty list of characters is denoted by
 and a symbol
x followed by a list of characters
xs is denoted by
x:xs. An individual character is surrounded by quotes to distinguish it from arbitrary variables like
x. For example, in this notation, I can write "refer" as
'r':'e':'f':'e':'r':. Furthermore, I can concatenate two lists of characters by means of the operator
++, so that
"Madam, " ++ "I'm Adam" is
"Madam, I'm Adam".
How can I determine whether or not a string (a list of characters) is a palindrome? The simplest method is to reverse the string and to compare it with itself. So the string
xs is a palindrome (
palindrome xs), if
xs is equal to its reverse:
xs == reverse xs, where
xs == ys is
True only when the strings
ys are exactly equal.
> palindrome xs = xs == reverse xsThis equation is actually a program in Haskell. We can load this program in ghci and run it. For example,
palindrome "abba"evaluates, unsurprisingly, to
palindrome "yabadabadoo"evaluates to
How do I compute the reverse of a string? For this purpose, I define a recursive function: if I know how to calculate the reverse of the empty string, and if I know how to calculate the reverse of
x:xs, given that I know how to calculate the reverse of
xs, then I can use a computer to calculate the reverse of any string. The two cases for reverse are not hard: the reverse of the empty string is the empty string, and the reverse of
x:xs is the reverse of
xs followed by the string consisting of the single character
> reverse  =  > reverse (x:xs) = reverse xs ++ [x]Here
[x]is the list containing the single element
x, which can also be written
"x". This completely defines what it means to be a palindrome.
The above definition only qualifies strings as palindromes when they are exactly equal when reversed. So
"Madam, I'm Adam" does not pass this test. For this string to also pass the
palindrome test, I slightly adapt the definition. I now say that a string is a palindrome if it is equal to its reverse after throwing away all punctuation symbols such as spaces, comma's, periods, etc, and after turning all characters into lower case characters. I call such a palindrome a
> textPalindrome xs = let ys = filter isLetter xs > zs = map toLower ys > in zs == reverse zs > > isLetter l = not (isPunctuation l) && not (isSpace l)Haskell's
let ... in ...construct allows me to introduce new definitions after the
let, which I can use in the definitions after the
in. Here I use two intermediate results
zs. By filtering the original string
xswith the predicate
isLetter, I obtain a new string
ysin which no punctuation or space characters appear anymore. So
"Madam, I'm Adam"is turned into
filteris a standard function in Haskell which takes a predicate
pand a list
xs, and keeps all the elements of
xsthat satisfy the predicate
p. The function
isLetteruses two basic Haskell functions on characters,
Truefor all punctuation characters such as the dot, comma, space, and exclamation mark, and
Falsefor all non-punctuation characters, such as letters. Function
isSpaceworks similarly on space characters. Function
isLettertakes a character, and checks that it is
nota punctuation character, and (denoted by
nota space character. After filtering out all non-letter characters from
ys, I map each letter in
ysto its lower case by means of
map toLower ys. So
"MadamImAdam"is turned into
mapis also a standard Haskell function, which takes a function as argument,
toLowerin this example, and applies this function to all characters in a string. The function
toLowerturns a capital letter into lowercase, and does nothing to lowercase letters. Function
textPalindromeaccepts all palindromes, irrespective of their punctuation.
textPalindrome cannot be used to check if a DNA sequence is a palindrome, because the symbol
'A' should be considered equal to
'G', and the equality
== used in the definition of
palindrome considers them different. We need to change the equality function to compare characters in DNA strings. We define the DNA character equality function
> 'A' =:= 'T' = True > 'T' =:= 'A' = True > 'C' =:= 'G' = True > 'G' =:= 'C' = True > _ =:= _ = FalseWe use this new equality function in a definition of
dnaPalindromefor sequences of DNA symbols. We pairwise combine the elements of
reverse xswith the equality function
=:=using the function
zipWithis another standard function, which takes an operator and two lists, and `zips' the two lists with the operator. Thus we obtain a list of comparisons, which we fold to a single result by requiring that each element in the list is
andtakes a list of boolean values, and returns
Trueonly if all elements in the list are
> dnaPalindrome xs = and (zipWith (=:=) xs (reverse xs))Thus we have three functions for checking whether or not a string is a palindrome:
palindromefor strings that are exactly equal when reversed,
textPalindromefor strings that are exactly equal when reversed modulo punctuation and space characters, and
dnaPalindromefor DNA strings.