Abstract data types
CptS 355 - Programming Language Design
Washington State University

Names, Modularity, and Maintainability

In large programs, unless they are designed correctly, it is difficult to anticipate the results of modifying or adding code. Subprograms are a way of breaking up processes to abstract a commonly used piece of code and to isolate the effect of changes or modifications to code. Objects, packages, and abstract data types are ways of abstracting data, to encapsulate data structures. First we will look at ADTs, then at methods to help declutter name spaces.

Abstraction

What is abstraction? In general, it is the elimination of details. In programming, abstraction often means describing the "what", not the "how". The notion of "interface" is one way that abstraction appears in programming. Informally, the interface of a component is the set of rules for using the component. It tells you what operations the component has, how to use them, and what they will do. In some languages the notion of interface appears directly and concretely as syntax in the language.

Note: Interfaces in Java are a bit at odds with the above definition. Java interfaces are used in place of multiple inheritance so no one Java interface necessarily describes the whole interface, in the sense I'm using it here, of a class.

Example of how abstraction helps: compare how strings are handled in Python and C. In Python, strings are abstract: you don't know or care about their representation. There is just a set of "string-ish" operations that can be applied to them. In C, strings are arrays of characters. There are library routines to do "string-ish" operations, but the "array-ness" shows through everywhere, making string manipulation in C much more tedious and error-prone than in Python.
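A rough sketch of the same contrast, written in C++ because it offers both styles side by side (std::string stands in here for Python's abstract strings). The helper names join_c and join_abstract are hypothetical, invented only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// C style: the "array-ness" shows through. The caller supplies the buffer,
// and the code must account for its size and the terminating '\0'.
// join_c is a hypothetical helper; it returns false if the result does not fit.
bool join_c(char *out, std::size_t out_size, const char *a, const char *b) {
    assert(out && a && b);
    std::size_t needed = std::strlen(a) + std::strlen(b) + 1;  // +1 for '\0'
    if (needed > out_size) return false;
    std::strcpy(out, a);
    std::strcat(out, b);
    return true;
}

// Abstract style: representation and storage management are hidden
// behind the "string-ish" operation +.
std::string join_abstract(const std::string &a, const std::string &b) {
    return a + b;
}
```

The C-style version needs a buffer size, a length calculation, and a failure path; the abstract version needs none of these because the string type manages its own representation.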

Abstract Data Types

An abstract type consists of a set of values and a set of allowable operations on those values. To use the type, only the operations are needed, not the details of the value representation. An example is a floating point type, such as the Pascal type real:
  • Historically, different architectures have supported different floating point representations
  • Some operations are supported, such as +, -, *
  • The actual layout of bits is (or should be) hidden
What about the ability to define new types? An abstract data type is
  • The declarations of the type and the operations on the type, defined in a single syntactic unit.
  • The representation is hidden.
An example of an abstract data type: a stack.
  • values - a stack of integers
  • operations - push, pop, create, destroy, full
An example C Stack ADT. First there is the definition of the ADT, in a header file.
/* In stack.h */
typedef struct {
  int *values;   /* storage for the elements */
  int top;       /* position of the top of the stack */
  } stack_type;

/* operations: the stack is passed by pointer so that changes are visible to the caller */
void push(stack_type *stack, int value);
int pop(stack_type *stack);
int isEmpty(const stack_type *stack);
stack_type newStack(int capacity);
void destroy(stack_type *stack);
Next there is the implementation of the stack.
/* In stack.c */
#include "stack.h"

/* operations */
void push(stack_type *stack, int value) {
 ....
 }
int pop(stack_type *stack) {
 ...
 }
...
Finally, there is the use of the stack.
/* In main.c */
#include "stack.h"

int main(void) {
  stack_type stack;

  push(&stack, 23);        /* Should be an Error! */
  stack = newStack(50);
  
  stack.values[30] = 100;  /* Shouldn't be able to do this! */
  return 0;
  }
This is unsatisfactory because
  • the internals of the stack type are not hidden: main.c knows about them. This dramatically reduces reliability.
  • can't hide
  • stack objects are not "protected": operations other than those declared in stack.h can manipulate the internals of a stack object
  • stack use and definition are separate: a user could cut and paste stack.h into main.c, and then later changes to stack.h would not show up in main.c.
  • unlike an integer, the type cannot be enforced by the compiler: the user doesn't have to call newStack before using a stack!
  • name conflicts galore - what if user wants a list type, and has a pop operation on lists?
  • allocation/deallocation up to the user
  • the element type is not orthogonal to the stack type: why can't I have a stack of chars?
Kinds of operations in an ADT
  • constructor - create an object, usually heap-dynamic, but sometimes stack-dynamic
  • destructor - destroy the object, freeing space. The object should not be used after the destructor runs (how do we guarantee this?).
  • accessor - read a value in the object
  • iterator - enumerate values in the object
  • mutator - modify the object
C++ ADT.
/* In stack.h */
class stack {
 private: //** Visible only here (and to friends)
   int  *stackPtr;
   int  maxSize;
   int  *top; 
 public: 
   stack() {  //** constructor
     stackPtr = new int [100];
     ...
     }
   stack(int size) {  //** constructor
     stackPtr = new int [size];
     ...
     }
   ~stack() {delete [] stackPtr;}; //** Destructor
   void push(int number) { ... }
   ...
 }; //** class stack
/* In main.cc */
#include "stack.h"
int main() {
  stack stk;  //** Calls the constructor, create a new instance
  stk.push(42);
  ...
  } //** Destructor called for stk
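To connect the class above with the kinds of operations listed earlier, here is a minimal, self-contained sketch that labels each kind. It is not the class from these notes: the name IntStack, the members full, top, begin and end, the use of assert in place of real error handling, and the decision to forbid copying are all assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>

class IntStack {
 public:
    // constructor: acquires heap storage for up to `capacity` elements
    explicit IntStack(std::size_t capacity)
        : data_(new int[capacity]), capacity_(capacity), size_(0) {}

    // destructor: releases the storage
    ~IntStack() { delete[] data_; }

    // copying is disabled (an assumption) so two objects never own the same array
    IntStack(const IntStack &) = delete;
    IntStack &operator=(const IntStack &) = delete;

    // accessors: read state without changing it
    bool empty() const { return size_ == 0; }
    bool full() const { return size_ == capacity_; }
    int top() const { assert(!empty()); return data_[size_ - 1]; }

    // iterators: enumerate the values from bottom to top, read-only
    const int *begin() const { return data_; }
    const int *end() const { return data_ + size_; }

    // mutators: modify the object
    void push(int value) { assert(!full()); data_[size_++] = value; }
    int pop() { assert(!empty()); return data_[--size_]; }

 private:  // representation, invisible to clients
    int *data_;
    std::size_t capacity_;
    std::size_t size_;
};
```

Because a constructor always runs before any other operation, a client cannot push onto an uninitialized stack, and because the representation is private, a client cannot write into the array directly. Those are two of the complaints raised against the C version.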
Operator "overloading" refers to the meaning of an operator changing depending upon the types of its operands. Languages that support the definition of abstract data types should also allow operator overloading so that new types can be used with existing operators.
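As a small sketch of what this looks like in C++: the type Money below is hypothetical, invented for this example, and stores an amount in cents as its hidden representation. The existing operators + and == are given a meaning for the new type.

```cpp
#include <cassert>

// A hypothetical abstract type: clients see cents() but not the field itself.
class Money {
 public:
    explicit Money(long cents) : cents_(cents) {}
    long cents() const { return cents_; }

 private:
    long cents_;
};

// Overloads: + and == now also apply when both operands are Money.
Money operator+(Money a, Money b) { return Money(a.cents() + b.cents()); }
bool operator==(Money a, Money b) { return a.cents() == b.cents(); }
```

With these definitions, Money(150) + Money(250) == Money(400) reads just like the corresponding expression on integers, while 1 + 2 still uses the built-in integer addition. The compiler picks the meaning of + from the operand types.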

ADTs vs objects

ADTs and objects are similar. I see the primary difference being that objects typically have operations associated with an instance of the data where ADTs have operations associated with the data type.
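One possible reading of this distinction, sketched in C++. This is an interpretation offered as an assumption, not an example from the course, and all the names (Square, Shape, Rect, Dot, total_area) are hypothetical.

```cpp
#include <cassert>

// ADT style: area is defined once for the type Square,
// and the call is resolved from the static type at compile time.
class Square {
 public:
    explicit Square(int side) : side_(side) {}
    int side() const { return side_; }

 private:
    int side_;
};
int area(const Square &s) { return s.side() * s.side(); }

// Object style: each instance carries its own area operation,
// selected at run time through the virtual call.
class Shape {
 public:
    virtual ~Shape() = default;
    virtual int area() const = 0;
};
class Rect : public Shape {
 public:
    Rect(int w, int h) : w_(w), h_(h) {}
    int area() const override { return w_ * h_; }

 private:
    int w_;
    int h_;
};
class Dot : public Shape {
 public:
    int area() const override { return 0; }
};

// The caller does not know which area operation each element will run.
int total_area(const Shape *const shapes[], int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) total += shapes[i]->area();
    return total;
}
```

In total_area the same call, shapes[i]->area(), runs different code for different instances, which is the sense in which the operation belongs to the instance rather than to a single type.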

Modules

Module systems are a way that the ADT idea gets expressed in programming languages. A few programming language designs implement the mathematical, algebraic notion of abstract type rather purely, but many seem more ad hoc. As an example of the former, consider ML. ML has syntactic constructions for structures, signatures and functors. We'll talk about functors next time. Structures are collections of types and operations. An example:
   structure S = struct
      type t = int
      val x: t = 3
      val y: t = 5
      fun addx(i:t) = i+x
      fun addy(i:t) = i+y
   end
To refer to an element of a structure, prepend the structure name, e.g. S.x or S.addx(17). (Remember here that x and y are not variables and the program will not "construct" multiple instances of S while it executes.)

An ML signature describes the interface of a structure: the types and values that it provides. Signatures can, however, be written separately from structures. For example:

   signature SIG = sig
      type t
      val addx: t -> t
      val addy: t -> t
    end
By ascribing signature SIG to structure S like this
   structure S': SIG = S
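we can effectively hide the implementation details that occur in S from client programs that use structure S'. For instance, x and y do not appear in SIG, so clients cannot refer to S'.x or S'.y. (With the transparent ascription : used here, clients can still see that S'.t is int; the opaque ascription :> would hide that as well.) This gives the programmer of ML modules (structures) very explicit control over visibility and representation issues.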

Next time we will talk about how polymorphism and abstract data types interact.

(c) 2003 Curtis Dyreson, (c) 2004-2006 Carl H. Hauser           E-mail questions or comments to Prof. Carl Hauser