Conventional Pipelining Scheme
Pipelining allows multiple unrelated data computations to overlap in execution. The technique does not reduce the computation time for a given data set, but it increases the rate at which new data is admitted at the input register and captured at the output register, which significantly improves throughput. The pipe stage with the longest delay, plus the register overhead, determines the clock cycle time, so stages with shorter delays may sit idle for a significant portion of each clock period. The conventional pipelining scheme is depicted in the figure above.
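The relationship between stage delays, clock period, and idle time can be illustrated with a small calculation. The sketch below is only an illustration: the function name, the stage delays, and the single lumped register overhead value are assumptions chosen for the example, not figures from the text.

```python
def conventional_pipeline_timing(stage_delays_ns, register_overhead_ns):
    """First-order timing of a conventional (registered) pipeline.

    stage_delays_ns: combinational delay of each stage (hypothetical values).
    register_overhead_ns: assumed lumped register cost (setup + clock-to-output).
    """
    # The slowest stage plus the register overhead sets the clock period.
    clock_period = max(stage_delays_ns) + register_overhead_ns
    # A data set needs one clock period per stage to pass through the pipe.
    latency = clock_period * len(stage_delays_ns)
    # Fraction of each clock period in which a stage's logic has nothing to do.
    idle_fraction = [
        (clock_period - register_overhead_ns - d) / clock_period
        for d in stage_delays_ns
    ]
    return clock_period, latency, idle_fraction


stage_delays = [2.0, 5.0, 3.0]   # hypothetical stage delays in ns
overhead = 1.0                   # hypothetical register overhead in ns

period, latency, idle = conventional_pipeline_timing(stage_delays, overhead)
unpipelined_period = sum(stage_delays) + overhead

print("clock period (ns):", period)
print("latency per data set (ns):", latency)
print("unpipelined period (ns):", unpipelined_period)
print("idle fraction per stage:", idle)
```

With these assumed numbers, the pipelined design accepts new data every 6 ns instead of every 11 ns, while a single data set takes 18 ns to pass through instead of 11 ns. The 2 ns stage is idle for half of every clock period, which is the inefficiency described above.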
The wave-pipelining scheme attempts to reduce the clocking overhead by eliminating the intermediate latches. It also reduces the idle time of the pipeline logic by keeping several unrelated data sets in the pipe at once. Care must be taken to balance the data path delays so that data arrival times at gate inputs do not vary. The result is a shorter clock cycle time, since new data can be admitted into the pipe as soon as the current data has traveled a safe "distance" to prevent data overrun.
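Why path balancing matters can be sketched with a simplified first-order model, in which the minimum clock period is bounded by the spread between the longest and shortest path delays plus a sampling overhead. This model is an assumption introduced for illustration: the function name, the delay values, and the single overhead term (standing in for setup time, hold time, and clock skew) are all hypothetical, and a real design would need a more detailed analysis.

```python
def wave_pipeline_timing(path_delays_ns, sampling_overhead_ns):
    """Simplified first-order wave-pipelining estimate (illustrative only).

    path_delays_ns: input-to-output delays of the logic paths (hypothetical).
    sampling_overhead_ns: assumed lumped margin for setup, hold, and skew.
    """
    d_max = max(path_delays_ns)
    d_min = min(path_delays_ns)
    # A new wave must not catch up with the previous one, so the period is
    # bounded by the delay spread rather than by the longest delay.
    min_period = (d_max - d_min) + sampling_overhead_ns
    # Rough count of data waves travelling through the logic at once.
    waves_in_flight = d_max / min_period
    return min_period, waves_in_flight


# Well-balanced paths: small delay spread, short clock period.
balanced_period, balanced_waves = wave_pipeline_timing([9.0, 10.0], 1.0)

# Poorly balanced paths: large delay spread, much longer clock period.
unbalanced_period, unbalanced_waves = wave_pipeline_timing([4.0, 10.0], 1.0)

print("balanced:   period (ns) =", balanced_period, " waves =", balanced_waves)
print("unbalanced: period (ns) =", unbalanced_period, " waves =", unbalanced_waves)
```

Under these assumptions, both circuits have the same 10 ns longest path, but the balanced one can admit new data every 2 ns while the unbalanced one needs 7 ns. The difference comes entirely from the variation in arrival times, which is why the delays must be balanced.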