Concurrency Mechanisms
CptS 355 - Programming Language Design
Washington State University

Threads and Monitors approach to concurrency

Threads

A thread is an independent stream of execution. Each thread has its own stack where activation records for its function calls are placed, just as we've discussed before for what we will now term sequential programs. However, all the threads share an address space, so a pointer means the same thing in all the threads. (Contrast this with concurrent processes, where each process has a separate address space.) In particular, threads share global variables and heap memory. If pointers to a thread's stack leak to other threads, stack data is also accessible in the other threads--but this is generally considered evil programming and should be avoided.

Threads are created differently in different languages but the common idea is to start a function or procedure running in some specified state. In the POSIX pthreads library for C this looks like (fundamentally, anyway)

   pthread_create(procptr, argptr)
In a concurrent version of Standard ML called CML the primitive used to create a thread is a little different
   spawn(function)
How is the proper state passed to the new CML thread? Recall that in ML a function is actually a closure: it represents not only code but also state in which the code is to execute (see Mitchell textbook p. 183ff).

Threads in Java are built on objects. In Java there are no function or procedure pointers--only refs to objects. But an object is exactly what we want! It combines state (member data) with behavior (member functions). So creating a thread looks like

   class Foo extends Thread {
      public Foo(formal-args-for-thread) { save args-for-thread as member data }
      public void run () { main code of thread }
   }
   (new Foo(actual-args-for-thread)).start()
Make sure you know where the start method is located.
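
The schematic above can be filled in as a small runnable sketch. The class names SumThread and ThreadDemo, the fields, and the summing task are all hypothetical, chosen only for illustration. The constructor saves the arguments as member data, run() holds the thread's main code, and join() is used to wait for each thread to finish before its result is read.

```java
// Hypothetical example: SumThread, ThreadDemo and the summing task are
// illustrative names and choices, not part of the original notes.
class SumThread extends Thread {
    private final int lo, hi;   // arguments saved as member data
    private long result;        // written by run(), read after join()

    SumThread(int lo, int hi) { // constructor: no return type
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    public void run() {         // main code of the thread
        long sum = 0;
        for (int i = lo; i <= hi; i++) {
            sum += i;
        }
        result = sum;
    }

    long getResult() {
        return result;
    }
}

class ThreadDemo {
    // Sums 1..n using two threads, each handling half of the range.
    static long sumInParallel(int n) throws InterruptedException {
        SumThread a = new SumThread(1, n / 2);
        SumThread b = new SumThread(n / 2 + 1, n);
        a.start();              // arranges for run() to execute on a new thread
        b.start();
        a.join();               // wait for each thread to finish; after join()
        b.join();               // returns, its result is safe to read here
        return a.getResult() + b.getResult();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumInParallel(100)); // prints 5050
    }
}
```

Calling run() directly instead of start() would execute the code on the calling thread, so no concurrency would result.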

Synchronizing thread access to data

We saw last time that dealing with arbitrary interleaving of concurrent execution streams is practically intractable. Mechanisms are needed to allow us to reason about simpler cases. The most basic low-level mechanism is the lock. The idea of a lock is that once a thread has taken a lock no other thread can take it until the first gives it back. Any thread that attempts to take the lock while another holds it becomes blocked until the lock is released.
   .                                        .
   .                                        .
   .                                        .
   lock.take()                              .
   . /* C1 */                               .
   . /* C1 */                               lock.take()
   lock.release()                           . /* C2 */
   .                                        lock.release()
Lines labeled C1 and C2 above are called critical sections. Within them the two threads can access shared data without the programmer having to worry about arbitrary interleaving, because only one thread at a time can be inside a critical section guarded by the lock.

Problems with locks as a mechanism include forgetting to take a lock (or taking the wrong lock) and forgetting to release a lock. Forgetting to release a lock usually becomes apparent quickly because the system makes no further progress. Exception handling complicates the story, however: because an exception may be a rare occurrence, a failure to release the lock on the exception path may not be noticed until a product is in the field. Forgetting to take a lock is much more serious. It exposes the program to arbitrary interleaving of instructions, and a harmful interleaving may occur extremely rarely and then cause irreproducible behavior.
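
One common way to guard against the forgotten release on an exception path is to put the release in a finally clause. The sketch below assumes Java's java.util.concurrent.locks.ReentrantLock as the explicit lock; the LockedAccount class and its methods are hypothetical names used only for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example: LockedAccount is an illustrative class, not from the notes.
class LockedAccount {
    private final ReentrantLock lock = new ReentrantLock();
    private int balance = 0;

    void deposit(int amount) {
        lock.lock();                 // take the lock
        try {
            balance += amount;       // critical section
        } finally {
            lock.unlock();           // always give the lock back
        }
    }

    void withdraw(int amount) {
        lock.lock();
        try {
            if (amount > balance) {
                // The exception leaves the critical section early ...
                throw new IllegalArgumentException("insufficient funds");
            }
            balance -= amount;
        } finally {
            lock.unlock();           // ... but the lock is still released
        }
    }

    int balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    boolean isLocked() {             // true if any thread currently holds the lock
        return lock.isLocked();
    }

    public static void main(String[] args) {
        LockedAccount acct = new LockedAccount();
        acct.deposit(10);
        try {
            acct.withdraw(20);       // throws inside the critical section
        } catch (IllegalArgumentException e) {
            // the failed withdrawal is ignored in this demo
        }
        // prints "balance = 10, locked = false"
        System.out.println("balance = " + acct.balance() + ", locked = " + acct.isLocked());
    }
}
```

Without the finally clause, the failed withdrawal would leave the lock held and every later call would block.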

Monitors

To overcome these problems C. A. R. (Tony) Hoare (and others independently) invented the notion of a monitor. Monitors are abstract data types (or later, objects) with special support for concurrent programming. First, monitor procedures automatically acquire a lock when they are called and release it when they return. In Java every object has an implicit lock that is manipulated by methods with the synchronized attribute.
   synchronized fooMethod(...) {...}
This is equivalent to a lower-level construction using a synchronized block
   fooMethod(...) { synchronized(this) {...} }
(A synchronized block can synchronize on any object at all so the synchronized method syntax is usually better if it is applicable.) Putting all access to an object's shared state in synchronized methods ensures that arbitrary interleaving will not occur.
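
As a minimal sketch of this idea, the hypothetical SyncCounter class below keeps all access to its shared state in synchronized methods. The class names and the thread counts are assumptions made for the example, and the lambda syntax used to create the threads requires Java 8 or later.

```java
// Hypothetical example: SyncCounter and SyncCounterDemo are illustrative names.
class SyncCounter {
    private int count = 0;           // shared state, touched only by synchronized methods

    synchronized void increment() {  // takes this object's lock on entry,
        count++;                     // releases it on return
    }

    synchronized int get() {
        return count;
    }
}

class SyncCounterDemo {
    // Runs `threads` threads that each increment one shared counter `perThread` times.
    static int run(int threads, int perThread) throws InterruptedException {
        SyncCounter counter = new SyncCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();                // wait for every worker to finish
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(4, 10000)); // prints 40000
    }
}
```

If increment() were not synchronized, two threads could read the same old value of count and one update would be lost, so the final total could come out smaller than expected.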

Locks prevent arbitrary interleaving, but remember that threads are supposed to cooperate to get things done. Sometimes this entails one thread waiting for another to finish some work, and supporting such waiting is the second key feature of monitors. We looked in class at a producer-consumer example with a buffer between two threads. The producer must wait if the buffer is full; the consumer must wait if it is empty. So the code of the synchronized methods accessing the shared buffer would look something like:

   producer:                                consumer:
     if full this.wait();                      if empty this.wait();
     add item;                                 remove item;
     this.notify();                            this.notify();
Problem: if a thread waits while holding the lock the other thread will not be able to enter its synchronized procedure to change the buffer.

Therefore, wait() is defined to release the lock before the thread enters the not-ready-to-run state. When notify() from some other thread makes the waiting thread ready-to-run again, that thread re-acquires the lock before wait() returns.

Correct use of monitors in any particular situation requires understanding the invariant of the data structure. For a monitor the invariant is a property that the programmer intends to be true whenever no thread holds the lock. Ensuring that the invariant holds is a key responsibility of the monitor's programmer. To do so you may assume that the invariant holds on entry to the procedure and after any wait() completes. From this assumption you must prove that the invariant holds when the procedure returns and when it calls wait().

Notice that when wait() returns the only thing you may assume is the invariant. In the producer example it is not necessarily the case that the buffer has room when wait() returns: with two producers, the other producer may have run first and refilled the buffer. This observation leads to a particularly important rule for wait(): it should always be called in a while loop that rechecks the condition, not in an if statement.
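
The producer-consumer buffer can be sketched with the while-loop rule applied. This is one possible version under stated assumptions, not the code from class: the names BoundedBuffer and BufferDemo, the capacity of 2, and the use of integers as items are all hypothetical. It also uses notifyAll() rather than notify(), a substitution made here because producers and consumers wait on the same object and a single notify() might wake a thread of the wrong kind.

```java
import java.util.ArrayDeque;

// Hypothetical example: BoundedBuffer and BufferDemo are illustrative names.
class BoundedBuffer {
    private final ArrayDeque<Integer> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    // Invariant: 0 <= items.size() <= capacity whenever no thread holds the lock.
    synchronized void put(int x) throws InterruptedException {
        while (items.size() == capacity) { // recheck after every wakeup
            wait();                        // releases the lock while waiting
        }
        items.addLast(x);
        notifyAll();                       // wake waiting threads so they can recheck
    }

    synchronized int take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        int x = items.removeFirst();
        notifyAll();
        return x;
    }
}

class BufferDemo {
    // Each producer puts 1..perProducer; one consumer sums everything it takes.
    static long run(int producers, int perProducer) throws InterruptedException {
        BoundedBuffer buf = new BoundedBuffer(2);
        long[] total = new long[1];        // written only by the consumer thread
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < producers * perProducer; i++) {
                    total[0] += buf.take();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        Thread[] ps = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            ps[p] = new Thread(() -> {
                try {
                    for (int i = 1; i <= perProducer; i++) {
                        buf.put(i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            ps[p].start();
        }
        for (Thread t : ps) {
            t.join();
        }
        consumer.join();                   // after join(), total[0] is safe to read
        return total[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(2, 100));   // prints 10100
    }
}
```

With an if in place of each while, a producer woken by the consumer could find that the other producer had already refilled the buffer and would then add an item beyond the capacity, breaking the invariant.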

(c) 2003 Curtis Dyreson, (c) 2004-2006 Carl H. Hauser           E-mail questions or comments to Prof. Carl Hauser