"Syntactic sugar": More control statements

The C while statement and if statement are "complete" in the sense that with just these two flow of control mechanisms (along with subroutine calls), we can write arbitrarily complex programs in a structured fashion. Nevertheless, certain common tasks are cumbersome to write using just these two control statements. For example, sometimes a program needs to test multiple conditions to determine what to do next. Often, the sequence of tests will require a long series of nested if-else statements. We call such series multi-way branch statements. Such statements can be difficult to read and understand. Similarly, some types of loops are so common that programmers often feel a desire to make them more compact. For instance, count-controlled loops appear in programs with great frequency.

C, like most programming languages, offers some additional control statements to help "automate" these tasks. These extra statements are not necessary for good programming, but they do make the programmer's life easier. Just like sugar, they "sweeten" the programmer's task. Ultimately, the machine code that the compiler generates for these statements does not differ substantially from the machine code it would generate for the more general while and if statements. The difference is at the syntactic level, not at the semantic level. For this reason, it is common to call such extra control statements "syntactic sugar." C includes three statements of this sort. The switch statement makes multi-way branching easier, and the for and do-while loops make certain kinds of loops easier and more compact to write.

The switch statement

Many programs are menu-driven. The program displays a menu and requests the user to select one of the options. Based on the user response, the program will perform a particular task. The value that the user types becomes the condition variable for a multi-way branch statement. Given only the if statement, we could write a code fragment to do this as follows:

printf ("Welcome to the Weather Program\n");
printf ("Select from the following options to get");
printf (" information on the current weather conditions\n\n");
printf ("1. Wind\n2. Cloud cover\n3. Temperature\n");
printf ("4. Humidity\n5. Barometric pressure\n\n");
printf ("Type the number corresponding to your choice: ");
scanf  ("%d", &choice);
if (choice == 1) 
  printWindInfo ();
else 
  if (choice == 2)
    printCloudInfo ();
  else
    if (choice == 3) 
      printTemperature ();
    else 
      if (choice == 4)
        printHumidity ();
      else
        if (choice == 5)
          printPressure ();
        else 
          printf ("Invalid option number\n");

As you can see, this is fairly long to type, even though the program offers only five menu options. The more options, the longer the code will be. Another common situation that requires a multi-way branch occurs when we want to print a string given some sort of encoding. For example, we can specify months numerically as well as by their names. Typically, it is more convenient for a program that manipulates dates to use numbers. On the other hand, it is often desirable to print the name of a month for the user. Again, using only the if statement, we can accomplish this task, although rather clumsily. Assuming that month is an integer variable, the following code fragment will do (the first part of) what we require:

if (month == 1)
  printf ("January");
else 
  if (month == 2)
    printf ("February");
  else 
    if (month == 3)
      printf ("March");
    else 
      if (month == 4)
        printf ("April");
      else 
        if (month == 5)
          printf ("May");
        else 
          if (month == 6)
            printf ("June");
          else ...

Plainly, this code will end up being quite long. If we indent in this fashion, we may soon end up unable to fit the statement across the width of the screen.

This sort of task is common enough that most programming languages offer an alternative control statement just for this special case. In C, the switch statement fills this role. The syntax template for the switch statement is
switch (selector_expression)
{ 
   case label0: [statement_list];
   case label1: [statement_list];
   . . .
   case labeln: [statement_list];
   [default  : statement_list;]
}

The square brackets do not appear in a switch statement. They indicate that the code between the brackets is optional rather than necessary.

When a switch statement executes, the computer first evaluates the selector expression. It then tries to match the value with one of the labels that follow the case keyword. As soon as it finds a match, it begins to execute the statements in the statement_list(s) that follows the label. It will not try to match any remaining labels after finding a match. By default, the computer will execute every statement that follows the matching label, not just the statements that precede the next case keyword. In most situations, this will not be what you want to occur. Generally, you will want only the statements that you put between a particular case and the next case to execute. To effect this, you will use a break statement. This statement consists of nothing more than the break keyword followed by a semicolon. It will cause an exit from the switch statement and execution will proceed with the next statement after the switch.

If a default case is present and the value of the selector expression does not match any of the case labels, the statement list following the default keyword will execute. If the switch statement does not include a default case and none of the labels match the value of the selector expression, the computer will not execute any of the statements in the statement lists. The effect is the same as if no switch statement had been present at all. This may or may not be correct, given a particular situation. If the program should do nothing when the selector expression does not match any labels, you should not include a default case. If it makes more sense to do something (even if it is only to print an error message), your switch statement should contain a default case.

We can convert our two if statements to equivalent switch statements, so that you can see how this works. For the first example, we will omit here the code that prints the menu, although you should understand that in a full program, that code would need to appear first. First, notice that each condition in the original nested if statement compares the value of the variable choice to a particular integer value. This tells us that the selector expression must be choice as well, since it is the value of this variable that we want to match with one of the possible integer values. The case labels will be the five possible menu option numbers. Since the original if statement includes a final else clause with no condition check, we will also need a default case to get the same effect in the switch statement. The complete statement will be:

switch (choice) 
{
  case 1:  printWindInfo ();    break;
  case 2:  printCloudInfo ();   break;
  case 3:  printTemperature (); break;
  case 4:  printHumidity ();    break;
  case 5:  printPressure ();    break;
  default: printf ("Invalid option number\n");
}

Note that a break statement follows each of the first five case labels. If the break statements were missing and the value of choice were 1, for instance, the calls to all of the subroutines would execute. After the subroutines printed all that information, the default case statement would execute, printing a message that the user had typed an invalid option number. It should be clear that the break statements are critical for proper program behavior.
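
The effect of missing break statements can be checked with a small sketch. The function below is hypothetical (countExecutedWithoutBreak is not part of the weather program). It replaces each subroutine call with a counter increment, so the return value tells how many statement lists ran for a given value of choice.

```c
/* Hypothetical helper: counts how many statement lists run when the
   break statements are left out of a five-option switch. */
int countExecutedWithoutBreak (int choice)
{
  int executed = 0;

  switch (choice)
  {
    case 1:  executed++;   /* fall through */
    case 2:  executed++;   /* fall through */
    case 3:  executed++;   /* fall through */
    case 4:  executed++;   /* fall through */
    case 5:  executed++;   /* fall through */
    default: executed++;   /* reached by every path above */
  }
  return executed;
}
```

With a choice of 1, all six statement lists run, including the default case. With a choice of 5, only the last two run. Any value that matches no label runs the default case alone.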

The equivalent switch statement for the second example (shown only in part below) will be considerably shorter than the if statement. By now, you should be able to guess what to use for the selector expression and the case labels.

switch (month)
{
  case 1: printf ("January");  break;
  case 2: printf ("February"); break;
  case 3: printf ("March");    break;
  case 4: printf ("April");    break;
  case 5: printf ("May");      break;
  case 6: printf ("June");     break;
  ...                          /* cases for the remaining months */
}

Again, you should notice that the switch statement is more compact and easier to read than the equivalent if statement.

The selector expression

The selector expression takes the place of the condition variable or condition expression in an equivalent if statement. For instance, in our first example above, the condition variable is choice. In other words, it is the value of this variable that controls the flow of execution. In the second example, month is the condition variable. The selector expression can be a simple variable of this sort, or it can be an expression. Expressions would include arithmetic or logical expressions as well as "true" function calls. The selector variable or the result of evaluating the selector expression must be an ordinal type. Ordinal types have two properties. First, the values are ordered. This means that given two values from the type, you can tell which comes first and which comes later. Second, the values are discrete. This means that given a value from the type, you can give the next value and/or the previous value in the ordering. For example, given the value 3 (from the int data type) you can give the next value, 4, and the previous value, 2. Of the data types you know so far, only real number types are not ordinal. Thus, a selector expression such as x / 2.2 or sqrt (x), both of which yield a floating-point result, would not be legal.
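
If a floating-point value really must drive a switch, one option is to convert it to an integer first. The sketch below is only an illustration: the function name sizeCode, the parameter weight, and the size letters are all hypothetical, and weight is assumed to be non-negative. The cast to int discards the fractional part of the quotient, which makes the selector expression legal.

```c
/* Hypothetical helper: switch (weight / 2.2) would not compile because
   the quotient is a double. Casting the quotient to int truncates it
   and gives the switch a legal selector. Assumes weight >= 0. */
char sizeCode (double weight)
{
  char code;

  switch ((int) (weight / 2.2))
  {
    case 0:  code = 'S'; break;   /* quotient below 1 */
    case 1:  code = 'M'; break;   /* quotient from 1 up to 2 */
    case 2:  code = 'L'; break;   /* quotient from 2 up to 3 */
    default: code = 'X';          /* anything larger */
  }
  return code;
}
```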

The case labels

The case labels taken in conjunction give the values that the selector expression may have. Each label must be a named constant or constant literal of the same data type as the result of the selector expression. No variables or expressions involving variables may appear as case labels. It would be possible to use an expression as a case label, but since the expression can contain only constants, it is generally not useful to use an expression. For example, it would be syntactically valid to have a case label such as:
case 2 * 3 + 1:
but it would be more reasonable for the programmer to simply evaluate the expression and use its result:
case 7:

Because labels must be constants, it is not always possible or desirable to substitute a switch statement for certain kinds of nested if statements. In general, any if statement that uses a relational operator other than == in the condition will not translate well (if at all) to a switch statement. For instance, the following if statement will not work well as a switch statement because it compares the integer variable score to ranges of values:

if (score >= 90)
  grade = 'A';
else 
  if (score >= 80)
    grade = 'B';
  else 
    if (score >= 70)
      grade = 'C';
    else
      if (score >= 60)
        grade = 'D';
      else grade = 'F';

The case labels in an equivalent switch statement would need to account for every possible value, rather than the ranges in the if statement. This would mean the switch would require 41 different case labels (for values from 60 through 100, inclusive) and a default case! Clearly, the nested if structure is a better choice in this situation.
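
When the ranges happen to be evenly spaced, as they are here, a workaround sometimes seen is to make the selector an expression that maps each range onto a single value. This is a different technique from a direct translation of the if statement, and it does not apply to ranges of uneven width. The sketch below assumes score lies between 0 and 100; letterGrade is a hypothetical name.

```c
/* Hypothetical helper: integer division by 10 collapses each range of
   ten scores into one value, so six labels cover scores 60 through 100.
   Assumes 0 <= score <= 100. */
char letterGrade (int score)
{
  char grade;

  switch (score / 10)
  {
    case 10:                       /* a score of exactly 100 */
    case 9:  grade = 'A'; break;   /* 90 through 99 */
    case 8:  grade = 'B'; break;   /* 80 through 89 */
    case 7:  grade = 'C'; break;   /* 70 through 79 */
    case 6:  grade = 'D'; break;   /* 60 through 69 */
    default: grade = 'F';          /* everything else */
  }
  return grade;
}
```

Outside the assumed range the two versions disagree: a score of 110 yields 'A' in the nested if statement but 'F' here, so the if statement remains the safer choice when the input is not known to be valid.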

Other scenarios can also occur that make a switch statement inappropriate. Since the labels must be constant literals or named constants, anytime an if statement condition compares a value to a set of variables, it will be impossible to convert to a switch statement. For example, consider the following nested if statement, where all identifiers are integer variables.

if (i == x)
  printf ("i and x are the same\n");
else
  if (i == y)
    printf ("i and y are the same\n");
  else 
    if (i == z)
      printf ("i and z are the same\n");
    else 
      printf ("i is unique\n");

It is not possible to convert the preceding statement to a switch statement, since it would be necessary to have the variables x, y, and z as case labels.

Even though the case labels must be constants from an ordinal data type, they do not need to appear in any particular order for the switch statement to be valid. It will generally be easier for a human reader to understand the statement, however, if the labels appear in order. The exception to this rule of thumb occurs only if speed of execution is critical in a program. A simple compiler may generate machine code that tests the labels in the order they appear in the source code. With such a compiler, if you know that the probability of the selector expression matching certain labels is high, you could place those labels at the beginning of the switch statement. This would ensure that the comparisons with those labels would occur first. If the value of the selector expression matches one of the first labels, this would eliminate testing labels that have a lower probability of matching. Be aware, however, that the C language does not specify how the matching is carried out. Many compilers translate a switch statement into a jump table or a similar scheme, in which case the order of the labels makes no difference to speed.

The case labels in a single switch statement must be unique. If you have duplicate values among the labels, the compiler will generate an error message. It is unlikely that you would intentionally include duplicate values because of the way that switch statements work. If the value of the selector matches the first instance of a duplicate value, the computer will never compare the selector to any subsequent labels. Thus, the second occurrence of the label would be pointless.

C does not insist on a label for every possible value that a selector expression could have. In some instances, a default case will substitute for missing values, but even when no default case is present, the syntax of C does not require the programmer to include all possible values. If the selector expression evaluates to a value that does not appear among the case labels, the computer will not execute any of the statements in the switch statement. It is as if the switch statement did not exist.  Missing values may cause logic errors, however. If a program must perform a different action for each of the possible values, omitting one or more will cause incorrect behavior.
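
A short sketch shows what happens when the selector matches no label and no default case is present. The function applyCoupon and its coupon codes are hypothetical; only codes 1 and 2 have labels.

```c
/* Hypothetical helper: only codes 1 and 2 have case labels and there is
   no default case, so any other code skips the whole switch. */
int applyCoupon (int code, int price)
{
  switch (code)
  {
    case 1: price = price - 10; break;
    case 2: price = price - 20; break;
  }
  return price;   /* unchanged when code matches no label */
}
```

A call with code 3 returns the price untouched. Whether that is correct behavior or a silent logic error depends on what the program is supposed to do with an unknown code.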

Statement lists

Once the series of comparisons between a selector expression and the case labels results in a match, the computer begins to execute the statements following the colon. In most kinds of C statements, you have seen that a series of statements must generally appear in a block (enclosed in braces). In the switch statement, this is unnecessary. In fact, as previously mentioned, once the computer finds a match, it will execute all the statements remaining in the switch statement until it encounters a break statement or the closing brace of the switch statement itself.

Most often, you will want to have a statement or statement list for each case label, followed by a break statement. The statement lists and break statements are optional, however, and in certain situations, we can use this to our advantage. Sometimes we will want to execute the same statement or series of statements for two or more values of the selector expression. For example, suppose that we wanted to alter the menu in the first example above so that the user types the first letter of the menu option rather than a number. It would be reasonable to allow the user to type either upper or lower case letters. This means that for each possible action (a subroutine call), we have two label values. We can place the two labels in sequence and leave the statement list empty for the first value. Since no break statement exists to transfer control to the statement following the switch statement, the computer will execute subsequent statements until it finds a break statement. In cases like this, it is conventional to put both case labels on the same line, although it would certainly be perfectly correct to keep them on separate lines. Here is the modified code fragment:

printf ("Welcome to the Weather Program\n");
printf ("Select from the following options to get");
printf (" information on the current weather conditions\n\n");
printf ("Wind\nCloud cover\nTemperature\n");
printf ("Humidity\nBarometric pressure\n\n");
printf ("Type the first letter of the option you choose: ");
scanf  ("%c", &choice);
switch (choice) 
{
  case 'w': case 'W':  printWindInfo ();    break;
  case 'c': case 'C':  printCloudInfo ();   break;
  case 't': case 'T':  printTemperature (); break;
  case 'h': case 'H':  printHumidity ();    break;
  case 'b': case 'B':  printPressure ();    break;
  default:  printf ("Invalid option\n");
}

It would also be correct to type the cases as follows:

  case 'w': 
  case 'W':  printWindInfo ();    break;

The first method emphasizes the link between the case label values and takes less vertical space. The second method makes it easier to see what values the programmer expects and takes less horizontal space. If there were three or more labels that resulted in the same action, it might be better to choose the second method.
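
As a sketch of the case with many labels sharing one action, the hypothetical function below (isVowel is not part of the weather program) groups ten labels in front of a single statement list. It pairs the upper and lower case forms on each line, which combines the two layouts.

```c
/* Hypothetical helper: ten labels share one statement list. Every label
   before the assignment has an empty statement list, so control runs
   down to the assignment and stops at the break. */
int isVowel (char letter)
{
  int result;

  switch (letter)
  {
    case 'a': case 'A':
    case 'e': case 'E':
    case 'i': case 'I':
    case 'o': case 'O':
    case 'u': case 'U':
      result = 1;
      break;
    default:
      result = 0;
  }
  return result;
}
```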

Even though switch statements are generally more compact than an equivalent if statement, it is easy for them to get very long indeed. If the statement lists for each case label contain several statements, the switch can become long even if few case labels are present. Where many case labels are necessary, the problem is even worse. For example, if we elaborate the second example a little, you will see how quickly a switch statement may become unmanageably long. Suppose that instead of just printing the name of the month, we wanted to do some other operations for each month:

switch (month)
{
  case 1: printf ("January\n");
          printf ("Holidays in January include New Year's Day\n");
          numDays = 31;
          break;
  case 2: printf ("February\n");
          printf ("Holidays in February include President's Day and ");
          printf ("Valentine's Day\n");
          if (leapYear)
            numDays = 29;
          else
            numDays = 28;
          break;
  case 3: printf ("March\n");
          printf ("Holidays in March include St. Patrick's Day\n");
          numDays = 31;
          break;
  case 4: printf ("April\n");
          printf ("Holidays in April include April Fool's Day\n");
          numDays = 30;
          break;
  ...                          /* cases for the remaining months */
}

The fragment above shows only the case labels for the first four months. Obviously, by the time we finish writing the complete switch statement, it will be far too long. To correct the problem, we can write subroutines to perform the actions for each of the possible values of the selector expression. This is the idea behind the code for the "weather program." If we use this approach, we reduce the number of statements following each case label to two: the subroutine call and the break statement. This will give us a switch statement of a reasonable length. Wherever possible, we want each subroutine to fit on a single screen so that the programmer can see the complete code without scrolling up and down. If we have a long switch statement, this goal becomes unattainable. If you find that your switch statements are becoming too long because of the number of statements in the statement lists, always write subroutines to do the task and convert your statement list to a subroutine call.
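
One way the month example might look after this conversion is sketched below. The subroutine names (doJanuary and so on) and the wrapper monthInfo are hypothetical, and only the first four months are handled, as in the fragment above. Each subroutine prints its month's information and returns the number of days, so every case shrinks to a call and a break.

```c
#include <stdio.h>

/* Hypothetical subroutines, one per month. Each prints the month's
   information and returns the number of days in that month. */
int doJanuary (void)
{
  printf ("January\n");
  printf ("Holidays in January include New Year's Day\n");
  return 31;
}

int doFebruary (int leapYear)
{
  printf ("February\n");
  printf ("Holidays in February include President's Day and ");
  printf ("Valentine's Day\n");
  if (leapYear)
    return 29;
  else
    return 28;
}

int doMarch (void)
{
  printf ("March\n");
  printf ("Holidays in March include St. Patrick's Day\n");
  return 31;
}

int doApril (void)
{
  printf ("April\n");
  printf ("Holidays in April include April Fool's Day\n");
  return 30;
}

/* Hypothetical wrapper: returns the number of days, or 0 for a month
   that this sketch does not handle. */
int monthInfo (int month, int leapYear)
{
  int numDays = 0;

  switch (month)
  {
    case 1: numDays = doJanuary ();          break;
    case 2: numDays = doFebruary (leapYear); break;
    case 3: numDays = doMarch ();            break;
    case 4: numDays = doApril ();            break;
    /* cases for the remaining months would follow the same pattern */
  }
  return numDays;
}
```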

Guard statements

In some situations, it might be important never to execute a switch statement at all. This might be because calculations make it dangerous to evaluate the selector expression or because the programmer cannot guarantee that the value of the selector expression will be within the range of values represented in the case labels. For example, if the selector expression includes a division operation and it is uncertain whether the divisor is zero, evaluating the selector expression would result in a run time error. In such situations, you will want to prevent execution of the switch statement. To do this, you will use a guard statement. Guard statements are conditional statements designed to prevent execution of a block of code. Most often, a guard statement will take the form of an if statement, but sometimes a loop will be more appropriate. With an if statement, the condition determines whether to execute or skip a switch statement. If the guard statement is a loop, it typically iterates until the value(s) of variable(s) involved in the switch are such that no error can occur. The following example shows an if statement acting as a guard:

if (x != 0)
  switch (y / x) 
  {
    case 1: ... break;
    case 2: ... break;
    ...
  }

In this example, a value of 0 for x would result in a divide-by-zero error. The condition for the if statement prevents the evaluation of the selector expression when x is zero.
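
A complete version of this guard might look like the sketch below. The function name quotientCase and its return codes are assumptions made for illustration, since the example leaves the statement lists elided.

```c
/* Hypothetical helper: returns 1 or 2 for the matching case, 0 when the
   quotient matches neither label, and -1 when the guard skips the
   switch because x is zero. */
int quotientCase (int y, int x)
{
  int result = -1;

  if (x != 0)             /* guard: y / x is evaluated only when safe */
    switch (y / x)
    {
      case 1:  result = 1; break;
      case 2:  result = 2; break;
      default: result = 0;
    }
  return result;
}
```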

The next example shows a guard statement implemented as a loop:

validInput = FALSE;
while (!validInput) {
  printf ("Enter a positive number between 1 and 5\n");
  scanf ("%d", &number);
  validInput = number > 0 && number < 6;
}
switch (number) 
{
  case 1:         printf ("No calculation necessary\n"); break;
  case 2: case 4: printf ("%d is even\n", number);       break;
  default:        printf ("The square root of %d is %f",
                          number, sqrt (number));
}

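In the above example, a negative value for number would cause an error when the default case executes because of the call to the sqrt function. The square root of a negative number is not a real number, so the call is a domain error (in C, the result is typically a "not a number" value rather than a usable answer). The loop that precedes the switch statement prevents this from happening, since the loop will not terminate while number is outside the range 1 through 5.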

A few programming languages do not have the equivalent of the default case. If the value of the selector expression is not represented by one of the labels, the program will crash. These languages make the presence of a guard statement essential. Because you may not always program in C, it is a good habit to use guard statements whenever they are appropriate. The habit will save you time and effort when you learn other languages.