ML: User-Defined Types, Constructors, Modules


User-defined TYPES

First, we look at type abbreviations... names for types expressions
using existing (built-in) types and type constructors

base types: int, string, bool, real, unit, exn, instream, ostream
type constructors: *, ->, list, array, ref

if T1 and T2 are types, so are 
  T1 * T2,   T1 -> T2,   T1 list, etc.

These type expressions can be named and later used as follows:

  type vector = real list ;
  val v = [1.0, 2.0] : vector ;
  >> val v = [1.0, 2.0] : vector

  val x = [1.0, 2.0] ; 
  >> val v = [1.0, 2.0] : real list 

  v = x ;
  val it = true : bool

Notice that the type def "vector" is type equivalent to "real list"
so this is not a NEW type, just an abbreviation
Polymorphic Type Definitions

  type ('d,'r) mapping = ('d * 'r) list ;

This is make a type name for a list of pairs, each pair will 
contain two elements of types determined when the type is
instantiated.

  val w1 = [("in",6), ("a",1)];
  >> val words = [("in",6),("a",1)] : (string * int) list

  val w2 = [("in",6), ("a",1)] : (string,int) mapping ;
  >> val words = [("in",6),("a",1)] : (string,int) mapping

These abberviations may be useful, but they are limited in power
since they are just shorthand notation.
Datatypes and Data Constructors

Instead of merely defining abbreviations for type expressions, we
want to create new structures (like trees, records, etc.)

Datatype definitions create NEW types that are not abbreviations
for any other type

  datatype fruit = Apple | Pear | Grape ;
  >> datatype  fruit
  >>   con Apple : fruit
  >>   con Grape : fruit
  >>   con Pear : fruit

Note the interpreter says that Apple, Pear, Grape are all 
data constructors

  fun isApple(x) = (x=Apple) ;
  >> val it = fn : fruit -> bool

  isApple(Pear) ;
  >> val it = false : bool

  isApple(Banana) ;
  >> error: unbound variable: Banana

This is essentially making Pascal-style enumerated types
Data constructors are expressions that build values for a type.

The data constructors in the previous example are simple, essentially
named constants.

Here is a more general form of datatype definition:

  datatype ('a,'b) elt =
    P of 'a * 'b  |
    S of 'a ;
  >> datatype  ('a,'b) elt
  >>   con P : 'a * 'b -> ('a,'b) elt
  >>   con S : 'a -> ('a,'b) elt

This is something like a C union type, except it's polymorphic.
The type "('a,'b) elt" is actually composed of elements from two 
dissimilar subtypes, denoted by constructors "P" and "S".
Here, the constructors look like functions (they are not) in
that they require data items of certain types in order to make
items of type "elt"... this is more like the REPRESENTATION 
from ADT implementations

  val m = [("hi",1),("there"),("big",4)];
  >> Error: operator and operand don't agree (tycon mismatch)
  >>   operator domain: string * string list
  >>   operand:         string * (string * int) list
  >>   in expression: 
  >>     ("there") :: ("big",4) :: nil

  val m = [P("hi",1), S("there"), P("big",4)];
  >> val m = [P ("hi",1),S "there", P ("big",4)] 
       : (string,int) elt list

The interpreter picked proper instantiations for type vars 'a 
and 'b in the definition of ('a,'b) elt
Using the datatype 

  fun sumEltList(nil) = 0
   |  sumEltList(S(x)::L) = sumEltList(L)
   |  sumEltList(P(x,y)::L) = y + sumEltList(L) ;
  >> val sumEltList = fn : ('a,int) elt list -> int 

  sumEltList m ;
  >> val it = 5 : int

The data constructors are used in the function code where needed to
distinguish the various versions of the elements in "('a,'b)elt"

This is a common programming pattern with datatypes:
one alternative in the fun defn for each data constructor... that
way the fun has an alternative to match any element of the type.
Another Example: Binary trees 

  datatype 'label btree =
    Empty |
    Node of 'label * 'label btree * 'label btree ;
  >> datatype  'a btree
  >>   con Empty : 'a btree
  >>   con Node : 'a * 'a btree * 'a btree -> 'a btree

Note the interpreter renaming the type variable...

  Node("as",Node("a",Empty,Empty),Node("in",Empty,Empty)) ;
  >> val it = Node ("as", Node ("a",Empty,Empty),
       Node ("in",Empty,Empty)) : string btree

  Node(2,Node(1,Empty,Empty),Empty) ;
  >> val it = Node (2,Node (1,Empty,Empty),Empty) : int btree
Mutually Recursive Datatypes

Similar to defining mutually recursive functions:

  datatype
    'lab eTree = Empty |
                 eNode of 'lab * 'lab oTree * 'lab oTree
  and 
    'lab oTree = oNode of 'lab * 'lab eTree * 'lab eTree 
  ;

eTree has even length paths from root to each leaf.  
oTree has odd length paths.  
The Empty tree is an eTree, since path length is 0.

  val t1 = oNode(1,Empty,Empty);
  val t2 = oNode(2,Empty,Empty);
  val t3 = eNode(3,t1,t2);
  val t4 = oNode(4,t3,Empty);
  val t5 = eNode(5,t4,t4);
Extended Example: Binary Search Tree

datatype 'label btree =
  Empty |
  Node of 'label * 'label btree * 'label btree;

(*=================================================================*)

fun lower(nil) = nil
 |  lower(c::cs) = 
      if c>="A" andalso c<="Z" then chr(ord(c)+32)::lower(cs) 
      else c::lower(cs);

fun lt(x:string,y) =
  implode(lower(explode(x))) < implode(lower(explode(y)));

exception EmptyTree;

(*=================================================================*)

(* lookup(x,T) tells whether element x is in tree T *)
fun lookup(x,Empty) = false
 |  lookup(x,Node(y,left,right)) =
      if x=y then true
      else if lt(x,y) then lookup(x,left)
      else (* lt(y,x) *) lookup(x,right);

(* insert(x,T) returns tree T with x inserted *)
fun insert(x,Empty) = Node(x,Empty,Empty)
 |  insert(x, T as Node(y,left,right)) =
      if x=y then T (* do nothing; x was already there *)
      else if lt(x,y) then Node(y,insert(x,left),right)
      else (* lt(y,x) *) Node(y,left,insert(x,right));

(* deletemin(T) returns a pair consisting of the least element y 
   in tree T and the tree that results from deleting y from T.  
   It is an error if T is empty *)
fun deletemin(Empty) = raise EmptyTree
 |  deletemin(Node(y,Empty,right)) = (y,right) 
    (* This is the critical case.  If the left subtree is empty,
       then the element at the current node is the min. *)
 |  deletemin(Node(w,left,right)) =
      let val (y,L) = deletemin(left)
      in  (y, Node(w,L,right))
      end;

(* delete(x,T) returns tree T with element x deleted *)
fun delete(x,Empty) = Empty
 |  delete(x,Node(y,left,right)) =
      if lt(x,y) then Node(y,delete(x,left),right)
      else if lt(y,x) then Node(y,left,delete(x,right))
      else (* x=y *)
        if left = Empty then right
        else if right = Empty then left
        else let val (z,R) = deletemin(right)
             in Node(z,left,R)
	     end;

fun sum(Empty) = 0
 |  sum(Node(a,left,right)) =
      a + sum(left) + sum(right);

fun preOrder(Empty) = nil
 |  preOrder(Node(a,left,right)) =
      [a] @ preOrder(left) @ preOrder(right) ;


ML Module System

ML has several features for encapsulating/hiding information, encouraging the reuse of code and limit propogation of bugs
Structures

General definition:

  structure < id > = struct < elements of structure > end

Reconsider the mapping from previous examples

  structure Mapping = struct
  
    exception NotFound ;

    val create = nil ;
 
    fun lookup (d,nil) = raise NotFound
     |  lookup (d, (e,r)::es) =
          if d=e then r else lookup(d,es) ;

    fun insert (d,r,nil) = [(d,r)]
     |  insert (d,r,(e,s)::es) =
          if d=e then (d,r)::es
          else (e,s)::insert(d,r,es) ;
  end;

  >> structure Mapping :
  >>   sig
  >>     exception NotFound
  >>     val create : 'a list
  >>     val insert : ''a * 'b * (''a * 'b) list -> (''a * 'b) list
  >>     val lookup : ''a * (''a * 'b) list -> 'b
  >>   end
Signatures

General form:

  sig < specs > end

  signature < id > = sig < specs > end

Example:

  signature SIMAPPING = sig
    exception NotFound ;
    val create : (string * int) list ;
    val insert : string * int * (string * int) list -> 
                   (string * int) list
    val lookup : string * (string * int) list -> int
  end ;

We can now use SIMAPPING signature to restrict the structure Mapping
previously defined:

  structure SiMapping: SIMAPPING = Mapping

Could have done it directly as well:

  structure SiMapping: SIMAPPING = struct
    exception NotFound ;
    val create = nil ;
    fun lookup (d,nil) = raise NotFound
     |  lookup (d, (e,r)::es) =
          if d=e then r else lookup(d,es) ;
    fun insert (d,r,nil) = [(d,r)]
     |  insert (d,r,(e,s)::es) =
          if d=e then (d,r)::es
          else (e,s)::insert(d,r,es) ;
  end;
  
There are 2 ways a signature can restrict a structure:

  (1) The sig can specify a more restrictive type for an identifier
      than is implicit in the structure defn (as done in SiMapping)

  (2) The sig can omit certain identifiers, hiding them.
      For example, we could create a mapping in which users could
      not modify the elements, by hiding "create" and "insert"

        sig  exception NotFound;
             val lookup : string * (string * int) list -> int
        end ;
Accessing Names in Structures

Assume we have done this:

  structure SiMapping: SIMPAAING = Mapping

We can now access it operations and elements this way:

  val m = SiMapping.create ;
  >> val m = nil : (string * int) list

  val m = SiMapping.insert("in",6,m) ;
  >> val m = [("in",6)] : (string * int) list

  SiMapping.lookup("in",m) ;
  >> val it = 6

We can also "open" a structure and not need the dot notation:

  open SiMapping ;

Any identifiers defined in SiMapping are now hiding previous 
definitions they might have had.  You can still get others with
explicit dot notation.

Or you can selectively make abbreviations without opening:

  val create = SiMapping.create ;
  val insert = SiMapping.insert ;
Importing into Structures

We can use in a structure any identifiers that are defined at the
time the structure is defined.  Nice, but leads to problems:

  fun lt (x:string, y) = (* some defn *)
  >> val lt = fn : string * string -> bool

  structure StringBST = struct
    datatype 'label btree = 
      Empty | 
      Node of 'label * 'label btree * 'label btree ;
    val create = Empty ;
    fun insert(x,T) = (* code *) 
    fun lookup(x,T) = (* code *) 
    exception EmptyTree ;
    fun deletemin(T) = (* code *) 
    fun delete(x,T) = (* code *) 
  end ;

In this case, the "lt" function is not part of the structure,
but it is used by the various lookup functions, etc.... if
it were not defined when StringBST was defined, error.

We can now instantiate StringBST different ways (restrict with
sigs) and have it work with different versions of lt...

Note: if "lt" were changed after a version of StringBST were
defined, the version of "lt" that existed when structure
is defined stays in force for the structure.

Thus, the meaning of a structure could differ, depending on what
code preceeds it... not the best way


Signature Example


Functors

Functors help us with encapsulation and generality of structures.

They address the anomalies just seen for importing into structures,
for example.

  signature TOTALORDER = sig
    eqtype elt ;
    val lt : elt * elt -> bool ;
  end ;

  functor MakeBST ( Lt: TOTALORDER ) : 
    sig
      open Lt ;
      type 'label btree ;
      exception EmptyTree ;
      val create : elt btree ;
      val lookup : elt * elt btree -> bool ;
      val insert : elt * elt btree -> elt btree ;
      val deletemin : elt btree -> elt * elt btree ;
      val delete : elt * elt btree -> elt btree
    end 
  = struct
    open Lt ;
    datatype 'label btree =
      Empty |
      Node of 'label * 'label btree * 'label btree ;
    val create = Empty ;
    fun insert(x,T) = (* code *) ;
    fun lookup(x,T) = (* code *) ;
    exception EmptyTree ;
    fun deletemin(T) = (* code *) ;
    fun delete(x,T) = (* code *) ;
  end ;
  >> functor MakeBST: 

This approach combines a sig (restriction) with a structure defn.
You open the Lt signature parameter so you can see the "lt" function
defined in it. 

Let's keep going... we need something of type TOTALORDER that we
can pass to this functor:

  structure String: TOTALORDER = struct
    type elt = string ;
    fun lt (x:string, y) =
      let fun lower(nil) = nil
           |  lower(c::cs) = if c>="A" andalso c<="Z"
                then chr(ord(c)+32)::lower(cs)
                else c::lower(cs) ;
      in implode(lower(explode(x))) < implode(lower(explode(y)))
      end ;
  end ;

  structure StringBST = makeBST(String) ;