Algebraic Specification of Abstract Data Types

Algebraic specifications provide a mathematical framework for describing abstract data types. This framework offers:

Syntax specification: Presentations

sort IntSet imports Int, Bool

signatures
  new    : -> IntSet
  insert : IntSet × Int -> IntSet
  member : IntSet × Int -> Bool
  remove : IntSet × Int -> IntSet
Such a syntactic specification is sometimes called a presentation. Compare with Ada package specification, Modula-2 definition module, etc.

Semantics specification: Axioms

variables i, j : Int;  s : IntSet
axioms
  member(new(), i) = false
  member(insert(s, j), i) = 
     if i = j then true 
     else member(s, i)
Is this complete? How do we know?

Classify functions:

Identify the key constructors, capable of constructing all possible IntSets: insert, new. Identify others as auxiliary, e.g., remove is a destructive constructor.

Completeness requires (at least): every inspector and auxiliary constructor is defined by one equation for each key constructor. We therefore need equations for remove.

remove(new(), i) = new()
remove(insert(s, j), i) = 
  if i = j then remove(s, i)
  else insert(remove(s, i), j)

Are we done yet? The completeness criterion (an equation defining member and remove for each of the new and insert constructors) is satisfied.

But does this really specify sets? Do the following properties hold?

Can these properties be inferred from the specification? Yes and No. Algebraic specifications can be interpreted in two different ways.
  1. The final algebra approach: two values of a sort are equal if they cannot be distinguished by application of inspector functions. That is, the only properties of an ADT should be those that are accessible by the ADT functions. In our example, two sets are equal if they cannot be distinguished by application of member. Both of the required properties can be proven under this assumption.
  2. The initial algebra approach: two values of a sort are equal only if they are provably equal by the axioms. The properties cannot be proven under this assumption.

A good specification should provide equivalent interpretations under both the initial and final algebra approaches. Thus the properties mentioned above should be added in some fashion.

Changes we could make:

  1. Add the two equations giving a complete set specification.
  2. Add order-independence, but allow duplicates. Introduce count to count occurrences of an element.
  3. Add neither equation and interpret as lists. Add head and tail or retrieve to retrieve by list position.

Algebraic Specification and Functional Programming

Algebraic Specifications have a parallel in functional programming. Algebraic data types can be used to directly implement a set.

data IntSet = New  | Insert IntSet Int

member New i = False
member (Insert s j) i
  | i == j = True
  | otherwise = member s i

remove New i = New 
remove (Insert s j) i
  | i == j = remove s i
  | otherwise = Insert (remove s i) j
Note that this implementation has strong parallels to the specification.

One aspect of the above implementation is that there can be different representations of the same set. For example, the following data structures all represent the same set.

An alternative approach is to define insert as a function that converts a set to a canonical form. It does so by maintaing the set as an ordered list and avoiding insertion of duplicate elements.

data IntSet = New | Insert IntSet Int

new :: IntSet
insert :: IntSet -> Int -> IntSet

new = New 

insert New i = Insert New i
insert (Insert s j) i  
  | i == j = Insert s j
  | i < j = Insert (Insert s j) i
  | otherwise = Insert (insert s i) j

member New i = False
member (Insert s j) i
  | i == j = True
  | i < j = False
  | otherwise = member s i

remove New i = New 
remove (Insert s j) i
  | i == j = s
  | i < j = Insert s j
  | otherwise = Insert (remove s i) j
Using this implementation the following three expressions have all generate the same representation, namely Insert (Insert New 4) 3 . Note also that member and remove have been written to take advantage of the ordered representation.