
hersto:  Source Code Indentation

 


Claim:

Source code indentation in languages like C/C++ or JavaScript does not go far enough.

Rationale:

The purpose of source code indentation is to convey the structure of a program to the human reader quickly. It does not influence the logic of the code. With indented code, human readers can quickly get an overview of the nesting structure of statements, and they can recognize where a function or loop body ends. However, the impact that statements like return, break or continue have on control flow is usually not visualized through indentation. This document suggests how to overcome this.

An Example:

 1:  int DoSomething(int NumberOfTimes)
 2:  {
 3:      if(NumberOfTimes < 0)
 4:          return -1;                                                   ← (1)
 5:
 6:      int Count= 0;
 7:
 8:      for(int C= 0; C<NumberOfTimes; C++ )
 9:      {
10:          printf("Loop iteration %d\n", C);
11:
12:          double Dings= GetDingsAt(C + 1);
13:
14:          if(Dings < 0)
15:              break;                                                   ← (2)
16:
17:          if(Dings > 1)
18:              continue;                                                ← (3)
19:
20:          if(Dings < 1./3)
21:              printf("Warning: value is in the lower third.\n");       ← (4)
22:
23:          if(Dings > 2./3)
24:              printf("Warning: value is in the upper third.\n");       ← (5)
25:
26:          Count+= ProcessDings(sqrt(Dings * (Dings + 1)), C * 2);
27:          Count+= ProcessDings(sqrt(Dings * (Dings - 1)), C * 2 + 1);
28:      }
29:
30:      return Count;
31:  }

This code snippet contains a for loop and, in lines 3 and 4, a little parameter validation.

Due to indentation one can see that:

  • The statements in lines 3..30 form the body of the function DoSomething(.).
  • The statement return -1 in (1) is subject to the if(.) clause one line above.
  • The for(.) loop executes the statements in lines 10..27.
  • The statements in (2), (3), (4) and (5) are subject to if(.) clauses concerning Dings.

That is all the information that formatting conveys to the human reader at a glance.

However, there is more information that is relevant to the structure of the code!

  • The execution of the for(.) loop is influenced by the statements in (2) and (3), but not by the ones in (4) and (5). Yet these break and continue statements look just like the printf calls. They are hidden multiple nesting levels deep inside the for(.) loop. The reader learns what they do only by actually reading the words.
  • The return -1 statement in line 4, if executed, aborts the function, so that most statements in the function are never executed. Although this statement has a major impact on the execution of the containing function, it sits nested inside the body of the function, just like any ordinary statement.

In my opinion, the indentation level should emphasize every statement that influences the structure of a piece of code.

Now one may say that the return -1 statement in (1) is also subject to the if(.) clause one line above. Yes, that's true, and there is actually a conflict: On the one hand, the return -1 should be indented one level deeper than the governing if(.) statement. On the other hand, its indentation should put it in relation to the function which it terminates when executed.

To resolve this conflict, let me ask one question:
What is more important? The fact that it aborts a function, or the fact that it is subject to an if(.) clause?

The fact that it aborts a function has the much bigger impact, and the fact that it is subject to an if(.) clause is obvious anyway. (An early return statement without an if(.) clause would always terminate the function, making the code after it unreachable, and is thus not common.) Therefore the indentation level should be chosen to express the relationship to the aborted function.

The same applies to break and continue statements. Their influence on the execution of the loop they govern is more important to express than the obvious fact that they are subject to some if(.) clause.

I therefore suggest adding the following rules for indenting lines of code:

  • Statements that abort a function should be placed on the same indentation level as the function they abort.
  • Statements that influence the control flow of loops should be put on the same indentation level as the loop they influence.

Here these rules have been applied:

 1:  int DoSomething(int NumberOfTimes)
 2:  {
 3:      if(NumberOfTimes < 0)
 4:  return -1;                                                           ← (1)
 5:
 6:      int Count= 0;
 7:
 8:      for(int C= 0; C<NumberOfTimes; C++ )
 9:      {
10:          printf("Loop iteration %d\n", C);
11:
12:          double Dings= GetDingsAt(C + 1);
13:
14:          if(Dings < 0)
15:      break;                                                           ← (2)
16:
17:          if(Dings > 1)
18:      continue;                                                        ← (3)
19:
20:          if(Dings < 1./3)
21:              printf("Warning: value is in the lower third.\n");       ← (4)
22:
23:          if(Dings > 2./3)
24:              printf("Warning: value is in the upper third.\n");       ← (5)
25:
26:          Count+= ProcessDings(sqrt(Dings * (Dings + 1)), C * 2);
27:          Count+= ProcessDings(sqrt(Dings * (Dings - 1)), C * 2 + 1);
28:      }
29:
30:  return Count;
31:  }

With this way of formatting code, one can simply scan the indentation level of a function for return statements in order to find out where the function is exited prematurely. Similarly, one only has to scan the indentation level of a loop in order to find all break and continue statements influencing it.
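
One case the example above does not cover is a return statement inside a loop. The following is a minimal sketch of how the two rules might be applied there: the return is placed on the level of the function it aborts, while continue is placed on the level of the loop it influences. The function FindFirstNegative and its parameters are hypothetical and serve only as an illustration; they are not part of the original example.

```c
#include <stddef.h>

/* Hypothetical illustration: returns the index of the first negative
   entry in Values, or -1 if there is none. */
int FindFirstNegative(const int *Values, int NumberOfValues)
{
    if(Values == NULL)
return -1;                  /* aborts the function: function level */

    for(int C= 0; C<NumberOfValues; C++ )
    {
        if(Values[C] == 0)
    continue;               /* influences the loop: loop level */

        if(Values[C] < 0)
return C;                   /* aborts the function from inside the loop */
    }

return -1;
}
```

Since indentation does not influence the logic, this compiles and behaves exactly like the conventionally indented version. Called with the array { 3, 0, 7, -2, 5 } and a length of 5, it returns 3.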


Herbert Stocker, www.hersto.com

 
