
Types
CptS 355 - Programming Language Design
Washington State University


What is a type?
  • A collection of values that share a common property - usually a common set of operations.

Importance

  • Types provide a vehicle for organizing the concepts used in a program.
    Naming types helps with this use.
    This use is primarily intended for people (it serves as documentation). For example, in ML:
    type <id> = declaration
  • Types make sure that bit sequences are interpreted correctly.
    The language checks that the programmer isn't writing nonsense such as:
    x = 17 
    x ( );
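  • Types provide information to the compiler about the data being manipulated.
    Here the types of x and 3.0 tell the compiler which addition is meant and whether x must first be converted: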
    x = 17 
    y = 3.0 + x

What is a type error?

A type error is an inconsistent use of the bit sequence representing a value: an attempt to use a value in an operation inconsistent with the value's type.
Some type errors may cause hardware exceptions. If you could write
x = 17 
x ( );
and execute it, it would produce a run-time exception. Other type errors, if unchecked, could produce wrong answers:
int_add(3,4.5)
(In most assembly languages, either of these is possible.)

Use of types in compilation

  • Offsets of fields
  • Size of array elements
  • Think about what you would have to do for array addressing if the elements had no types
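
The two uses above can be sketched in C++. This is only an illustration: the struct Point and the helpers element_address, same_address and check_addressing are hypothetical names, not part of the course material. The helper spells out by hand the base + index * size arithmetic that the compiler normally derives from the element type.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Hypothetical record type used only for this illustration.
struct Point {
    int    id;
    double x;
    double y;
};

// The declared field types fix each field's offset at compile time.
static_assert(offsetof(Point, id) == 0, "first field starts at offset 0");
static_assert(offsetof(Point, y) == offsetof(Point, x) + sizeof(double),
              "y is laid out directly after x");

// Manual array addressing: without an element type, the programmer would
// have to supply elem_size by hand at every access.
inline const void* element_address(const void* base, std::size_t index,
                                   std::size_t elem_size) {
    return static_cast<const unsigned char*>(base) + index * elem_size;
}

// With a typed array the compiler performs the same computation for a[i].
inline bool same_address(const double* a, std::size_t i) {
    return element_address(a, i, sizeof(double)) == static_cast<const void*>(&a[i]);
}

// Usage example: the hand-computed address matches the compiler's.
inline void check_addressing() {
    double a[4] = {1.0, 2.0, 3.0, 4.0};
    assert(same_address(a, 3));
}
```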

Use of types at run-time

  • "Marshalling"/"Pickling"/"Serialization" - Converting a value to a contiguous bit stream to send to another computer
  • Garbage collection
  • Safe casts
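
As one illustration of a safe cast, C++'s dynamic_cast consults type information kept at run time. The classes Shape, Circle and Square and the function radius_or_negative are hypothetical names introduced for this sketch.

```cpp
#include <cassert>

// Hypothetical class hierarchy used only for this illustration.
struct Shape {
    virtual ~Shape() {}   // a virtual member makes objects carry run-time type info
};
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 2.0; };

// Returns the radius if s really is a Circle, or -1.0 if the checked cast fails.
inline double radius_or_negative(Shape* s) {
    // dynamic_cast inspects the object's run-time type and yields nullptr on mismatch.
    if (Circle* c = dynamic_cast<Circle*>(s)) {
        return c->radius;
    }
    return -1.0;
}

// Usage example: the cast succeeds for a Circle and is refused for a Square.
inline void check_safe_cast() {
    Circle c;
    Square q;
    assert(radius_or_negative(&c) == 1.0);
    assert(radius_or_negative(&q) == -1.0);
}
```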

Type Safety: strong typing

Type safety, or strong typing, is assurance by a language that no program can use a value in a manner inconsistent with its type without causing either a compile-time error (static strong typing) or a run-time error (dynamic strong typing).

Sources of type unsafety:
  • type casts
  • pointer arithmetic
  • explicit de-allocation leading to dangling pointers
  • union types (in C)
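
The following C++ sketch shows what goes wrong when a bit sequence is read at the wrong type, which is what an unchecked cast or a C union permits. It uses std::memcpy so that the demonstration itself has defined behavior, and it assumes a 32-bit IEEE 754 float (checked by the static_asserts). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>   // std::memcpy
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559, "sketch assumes IEEE 754 floats");

// Copies the bits of a float into an integer without any conversion.
inline std::uint32_t bits_as_int(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

inline void check_reinterpretation() {
    // A type-respecting conversion preserves the value ...
    assert(static_cast<std::uint32_t>(1.0f) == 1u);
    // ... while reinterpreting the same bit sequence yields an unrelated integer.
    assert(bits_as_int(1.0f) == 0x3F800000u);
}
```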

Run-time (dynamic) versus Compile-time (static) type checking

  • Compile-time type checking is necessarily conservative: it may flag as an error something that would not ever cause a run-time error.
    "Conservative" in the preceding statement means "erring on the side of safety."
    (A checker that failed to flag an error would not be very useful.)
  • Run-time type checking is expensive: it must be done for every operation.
    It allows certain programming styles that are not possible with compile-time type checking.
    For example, lists in Scheme may contain values of any type, whereas in ML the values in a list must all be of the same type.
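
A minimal sketch of run-time type checking, written in C++ with hypothetical names (Tag, Value, make_int, make_real, int_add_rejects). Every value carries a tag, and int_add must inspect the tags on every call, which is the cost referred to above.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical tagged value: every value carries its type at run time.
enum class Tag { Int, Real };

struct Value {
    Tag    tag;
    long   i;   // meaningful only when tag == Tag::Int
    double r;   // meaningful only when tag == Tag::Real
};

inline Value make_int(long n)    { return Value{Tag::Int, n, 0.0}; }
inline Value make_real(double x) { return Value{Tag::Real, 0, x}; }

// Integer addition with a dynamic type check on every call.
inline Value int_add(const Value& a, const Value& b) {
    if (a.tag != Tag::Int || b.tag != Tag::Int) {
        throw std::runtime_error("int_add: operand is not an integer");
    }
    return make_int(a.i + b.i);
}

// Reports whether int_add refuses the given operands.
inline bool int_add_rejects(const Value& a, const Value& b) {
    try { int_add(a, b); } catch (const std::runtime_error&) { return true; }
    return false;
}

inline void check_dynamic_typing() {
    assert(int_add(make_int(3), make_int(4)).i == 7);
    assert(int_add_rejects(make_int(3), make_real(4.5)));   // run-time type error
}
```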

Type checking and type inference algorithms

fun f (g, h) = g (h  0)
   'e 'c 'a   'c 'a  int
                 -------
                  'b
              ----------
                 'd
where,
  'a = int -> 'b            
  'c = 'b  -> 'd
  'e = ('c * 'a) -> 'd
     = ('b  -> 'd) * (int -> 'b) -> 'd
So the final type for function f is ('b -> 'd) * (int -> 'b) -> 'd. Note that in the book, the type for this function is inferred to be
  ('a  -> 'b) * (int -> 'a) -> 'b     
which is the same type. That is, if by consistently renaming the type variables in one type you can produce the other type, then they are the same type.
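
For comparison only, here is a loose C++ counterpart of f. C++ template deduction is not ML's inference algorithm, but it also derives the result type from the expression g(h(0)) rather than from an annotation. The argument functions half and is_positive are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// C++ counterpart of the ML function  fun f (g, h) = g (h 0).
// The return type is derived by the compiler from the expression g(h(0)).
template <typename G, typename H>
auto f(G g, H h) -> decltype(g(h(0))) {
    return g(h(0));
}

// Hypothetical arguments: half plays h : int -> 'b, is_positive plays g : 'b -> 'd,
// with 'b = double and 'd = bool.
inline double half(int n)           { return n / 2.0; }
inline bool   is_positive(double x) { return x > 0.0; }

// The result type of f is 'd, here bool.
static_assert(std::is_same<decltype(f(is_positive, half)), bool>::value,
              "result type of f is the result type of g");

inline void check_inference() {
    assert(f(is_positive, half) == false);   // half(0) is 0.0, which is not > 0
}
```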

Type correctness

If the equations derived during type inference are solvable, the program is type correct; otherwise it is not.
A program can be type-correct but may still compute the incorrect result. Sometimes the inferred type of a function will suggest that something is wrong. For example:
            'b list      'a list        
fun reverse []         = []
  | reverse (x::xs)    = reverse xs;
            'b  'b list         'b list
            -----------  ------------
              'b list       'a list


'b list -> 'a list

In this function, which is supposed to reverse a list, the fact that the inferred type is 'b list -> 'a list suggests that the implementation is incorrect: we would certainly expect that a function for reversing lists would not change the type of the elements!

Polymorphism

Polymorphic means "having multiple forms." (The opposite is monomorphic.) In ML both functions and datatypes may be polymorphic. An example polymorphic function type: 'a list -> 'a. 'a here is called a type variable and the thing to remember is that in a type, each occurrence of a given type variable must be replaced by the same type. So, the type int list -> int is an instance of 'a list -> 'a but int list -> real is not.

An example of a polymorphic datatype is

datatype 'a Tree = Leaf of 'a | Interior of ('a Tree * 'a Tree)
ML type declarations may also be polymorphic, but type declarations just introduce new names for types that you could write anyway (see "Type declarations and type equality" below). For example:
type ('a,'b) Pair = 'a * 'b

Three kinds of polymorphism in programming languages

parametric polymorphism
as in ML. Recognized by the occurrence of type parameters, such as 'a, 'b. Also recognized by seeing functions that have a single definition that works for multiple types.
ad hoc polymorphism
also known as overloading. Identified by seeing multiple function definitions, one for each argument type. In ML (and many other languages) arithmetic operators are an example of ad hoc polymorphism. C++ and Java also have programmer-definable ad hoc polymorphic functions.
sub-type polymorphism
also known as inheritance polymorphism. Characterized by values having many different types related by a hierarchy. Typically found in object-oriented programming languages. We will study more about it when we look at object-oriented languages.
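
The three kinds can be seen side by side in C++. This is a sketch with hypothetical names (first_of, show, Animal, Dog, speak), not an example from the book.

```cpp
#include <cassert>
#include <string>

// Parametric polymorphism: one definition that works for any element type T.
template <typename T>
T first_of(const T* items) { return items[0]; }

// Ad hoc polymorphism (overloading): one definition per argument type.
inline std::string show(int)    { return "int"; }
inline std::string show(double) { return "real"; }

// Subtype polymorphism: a Dog may be used wherever an Animal is expected.
struct Animal {
    virtual ~Animal() {}
    virtual std::string sound() const { return "..."; }
};
struct Dog : Animal {
    std::string sound() const override { return "woof"; }
};
inline std::string speak(const Animal& a) { return a.sound(); }

inline void check_three_kinds() {
    int    ints[]  = {7, 8};
    double reals[] = {1.5, 2.5};
    assert(first_of(ints) == 7);      // T = int
    assert(first_of(reals) == 1.5);   // T = double, same definition
    assert(show(3) == "int");         // resolved to show(int)
    assert(show(3.0) == "real");      // resolved to show(double)
    Dog d;
    assert(speak(d) == "woof");       // Dog used as an Animal
}
```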

Implementations of parametric polymorphism

Consider a C++ template function and some code that uses it:
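template <typename T>
void swap (T& x, T& y) {
   T tmp = x; x = y; y = tmp;
}
int main() {
   int i = 1, j = 2;
   float a = 1.0f, b = 2.0f;
   swap(i,j);   // uses the code generated for T = int
   swap(a,b);   // uses the code generated for T = float
}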
In C++ the compiler/linker will generate code implementing swap for each different type on which swap is used. This is necessary because C++ values can require different amounts of storage and space must be allocated for the tmp variable. The result is that use of templates in C++ can result in a lot of code being generated to handle all the different cases.

Let's look at the swap in ML. Remember that ML doesn't have variables but does have ref values. Swap only makes sense for ref values.

fun swap (x,y) = let val tmp = !x in x:= !y; y := tmp end
(Note that the ML swap code on p. 149 of the book is incorrect. Make sure you know why.)

ML uses uniform data representation, which means that all values take the same amount of space. (The implementation uses hidden pointers to make this possible.) As a result, only a single body of machine code has to be generated for a polymorphic ML function, no matter how many different types it is used for. On the other hand, there is a time and space cost associated with uniform data representation, because more values have to be dynamically allocated and referring to them may require an extra pointer dereference.
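
The idea of uniform representation can be imitated in C++ by handling every value through a pointer. This is only an analogy, not how an ML compiler is implemented, and the names swap_boxed and check_swap_boxed are hypothetical. Because the function only moves pointers, one non-template body serves every type. The casts in the usage example show the extra dereference needed to get at the values.

```cpp
#include <cassert>

// One function body for all types: it swaps which object each pointer refers to.
inline void swap_boxed(void*& x, void*& y) {
    void* tmp = x;
    x = y;
    y = tmp;
}

inline void check_swap_boxed() {
    int    i = 1, j = 2;
    double a = 1.5, b = 2.5;
    void* pi = &i; void* pj = &j;
    void* pa = &a; void* pb = &b;
    swap_boxed(pi, pj);   // the same function ...
    swap_boxed(pa, pb);   // ... is used for both types
    // Reaching the values costs a cast and a dereference.
    assert(*static_cast<int*>(pi) == 2 && *static_cast<int*>(pj) == 1);
    assert(*static_cast<double*>(pa) == 2.5 && *static_cast<double*>(pb) == 1.5);
}
```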

Overloading

Consider the expressions
3+2
3.0+2.0
In both C and ML the '+' operation used in these two expressions is different, the first referring to the integer + operation, the latter to the floating point + operation. The compiler simply looks for the function having the given name that matches the types of the arguments. This is called resolving the overloading.

In C, but not ML, you can also write 3+2.0. This is not overloading but rather an instance of operand coercion. There is no + function that takes an integer first operand and a floating point second operand. Rather, the compiler coerces the first operand to floating point and then uses the floating point +.
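
The difference between overloading and coercion can be checked in C++, which follows C for the built-in arithmetic operators. In this sketch plus_kind is a hypothetical pair of overloaded functions standing in for the two + operations, and the static_asserts ask the compiler for the type of each expression.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Two functions share one name; the compiler picks one by the argument types.
inline std::string plus_kind(int, int)       { return "integer +"; }
inline std::string plus_kind(double, double) { return "floating point +"; }

// Overloading: the built-in + is resolved by the operand types.
static_assert(std::is_same<decltype(3 + 2), int>::value, "integer +");
static_assert(std::is_same<decltype(3.0 + 2.0), double>::value, "floating point +");

// Coercion: the int operand 3 is converted to double, then floating point + is used.
static_assert(std::is_same<decltype(3 + 2.0), double>::value, "3 is coerced to 3.0");

inline void check_overloading() {
    assert(plus_kind(3, 2) == "integer +");
    assert(plus_kind(3.0, 2.0) == "floating point +");
    assert(3 + 2.0 == 5.0);
}
```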

Type declarations and type equality

The basic question here is "does declaring a type create a brand new type, different from all other types, or does it create a type that is the same as some other type?" The question is answered differently in different languages and even for different kinds of declarations within a single language. In ML, datatype declarations introduce new types, but type declarations merely introduce new names for existing types. Consider
   type Celsius = real; type Fahrenheit = real
   val cTemp : Celsius = 30.0; 
   val fTemp : Fahrenheit = 30.0; 
   if fTemp=cTemp ... (* type correct but algorithmically wrong *)
Schemes (such as type declarations in ML and typedefs in C) in which declarations only give new names to existing types are said to be transparent. Another term for this is structural type equality.

Consider now:

   datatype celsiusTemp = C of real
   datatype fahrenheitTemp = F of real
   val cTemp = C 30.0
   val fTemp = F 30.0
   if fTemp=cTemp ... (* type error *)
Schemes (such as datatype declarations in ML and struct defs in C) in which declarations introduce totally new types are termed opaque, another term being name type equality.
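
The same contrast can be sketched in C++, where typedef is transparent and struct is opaque. The type names mirror the ML examples above. The conversion function to_fahrenheit is a hypothetical addition showing that opaque types force the programmer to convert explicitly.

```cpp
#include <cassert>
#include <type_traits>

// Transparent: typedef only introduces new names for double.
typedef double Celsius;
typedef double Fahrenheit;
static_assert(std::is_same<Celsius, Fahrenheit>::value,
              "both names denote the same type");

// Opaque: each struct declaration creates a distinct type.
struct CelsiusTemp    { double degrees; };
struct FahrenheitTemp { double degrees; };
static_assert(!std::is_same<CelsiusTemp, FahrenheitTemp>::value,
              "same structure, different types");

// Mixing the two opaque types requires an explicit conversion.
inline FahrenheitTemp to_fahrenheit(CelsiusTemp c) {
    return FahrenheitTemp{c.degrees * 9.0 / 5.0 + 32.0};
}

inline void check_type_equality() {
    Celsius    cTemp = 30.0;
    Fahrenheit fTemp = 30.0;
    assert(cTemp == fTemp);   // type correct but algorithmically wrong
    assert(to_fahrenheit(CelsiusTemp{100.0}).degrees == 212.0);
}
```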
(c) 2003 Curtis Dyreson, (c) 2004-2006 Carl H. Hauser           E-mail questions or comments to Prof. Carl Hauser