Wednesday, July 28, 2010

Over and Over Again

We started our tutorial with the while loop. It allows one to perform an action over and over again until a condition becomes true. Another construct that is similar in result but different in syntax is the for loop.
A lot of times when a loop is used, we are interested in having a initialization, a conditional check and an action that changes the value of that variable. The syntax of the for loop contains precisely these three parts.
for (; ; )
Suppose you were helping a younger friend to learn tables and you wanted to print out the table of any number given to your program as input. Here is how the program would look.
#include”stdio.h”
int main()
{
int i, num;
printf(“Enter a number to see its tables\n”);
scanf(“%d”, &num);
for (i=1; i<11; i++)
{
printf(“%d\n”, i*num);
}
}
Done. As simple as that. The initialization of the variable, the condition check and its increment is all done in one statement. The first time the execution reaches the printf function, the i=1 has been executed and the condition check i<11 has been performed. The increment has not yet been visited. After the printf is executed, the execution pointer returns to the for construct, increments the variable and then checks for the condition. Though most of the times one will use the for loop in a very mechanical manner without giving much thought to the sequence of things happening, one must never forget the path taken by the execution pointer. I will show a cool program that will demonstrate the validity of my previous statement and establish it without doubt.
#include “stdio.h”
int main()
{
int i=0;
for (printf("Initialization\n"); printf("Condition check\n"); printf("Action on variable\n"))
{
printf("Inside loop\n");
i++;
if (i>2)
{break;}
}
}
Output:
Initialization
Condition check
Inside loop
Action on variable
Condition check
Inside loop
Action on variable
Condition check
Inside loop

Hope this program has helped you understand the sequence of steps undertaken by the execution of the for loop. One must understand that the loop body is visited at least once because the condition is evaluated to true. This indicates that the printf function must return something. Yes, it does. It returns the number of characters output to the screen. Instead of “Condition check\n”, if I had written a blank string “” then printf would have returned 0 and the condition check would have failed. The body would have not been visited even once.
We have introduced a new keyword break. Execution of break causes the execution pointer to abruptly pre-empt the loop and come out of it irrespective of what the condition check says. break works with for, while and switch…case but not with if. Also, it only breaks the encompasing loop not any loop beyond that. The execution pointer will be at the closing } of the for loop as a result of the break statement being executed.
Okay, now that you have started feeling that for loops are simple, let us complicate things a bit more. Suppose you want the following output:

    *
   ***
  *****
 *******
*********

How can this be accomplished? By using double for loops. Here is the program:
#include”stdio.h”
int main()
{
 int i,j;
 for (i=0;i<5;i++)
 {
  for (j=5;j>i;j--)
  {
   printf(" ");
  }
  for (j=0;j<2*i+1;j++)
  {
   printf("*");
  }
  printf("\n");
 }
 scanf("%d",&i);
}
Yes, it is a bit complicated, but try to understand the working yourself. Look at the stars carefully and observe the repetition structure. First of all, there are repeated spaces. Then there are repeated stars. Finally this all is repeated in every row, but the number of spaces decreases and number of stars increases. If you find this program a bit too complex, let me give a simpler assignment:
*
***
*****
*******
*********
This program is relatively simpler. It only has two repetitions.
for (i=0;i < 5;i++)
{
 for (j=0;j<2*i+1;j++)
 {
  printf("*");
 }
 printf("\n");
}
This is how you can have a for loop inside another. It has allowed us to grow the number of stars on each consecutive line (next line is reached because of the printf(“\n”) statement). A loop inside another gives a 2d representation. If we increase it to one more loop inside, we will have a 3d structure. Of course it is very difficult to show such a structure on the screen. Beyond that I have not seen many programs.

Conditional Checks

The contractor whom you worked for is impressed by your work. Your pavement cost calculator system is very useful for his customers. One day, the contractor calls you to his office and asks if you could create an inventory system for him. You are not sure and come to me for advice. Yes, you can create an inventory system for the contractor. But before that you have a long way to go. I promise to teach you C/C++ but it is up to you to apply it.
We will look at the C/C++ constructs that you will need going ahead step by step. For starters let’s say that the contractor has two type of paving blocks; one with rounded edges and another with square edges. The rate depends on which type of blocks the customer chooses.
We need a modification in the cost calculation program which will give the customer a choice of the type of paving blocks. This brings us to the if statement. The if…else block is a condition checker. Statements after the if are performed only if the condition is true, otherwise those after else are performed. The else part is not compulsory. Its syntax is as follows:
if (condition)
{
//true part
}
else
{
//false part
}
Just as we mentioned for the while loop, a condition can be anything that can be evaluated to either a zero or a non-zero value. Also notice the scope defining {…}.
To show how the if construct is used, let us write the cost calculation program again.
#include”stdio.h”
int main()
{
int len1, width1, len2, width2, rate, area, typeOfPayment;
printf(“Enter the outside length\n”);
scanf(…);
…
printf(“Select one of the following pavement types\n”);
printf(“1. Rounded Pavements\n”);
printf(“2. Square Pavements\n”);
scanf(“%d”, &typeOfPayment);
…
if (1 == typeOfPayment)
{
rate = 150;
}
else
{
rate = 120;
}
…
}
Now what if there are three kinds of pavement blocks?

if (1 == typeOfPayment)
{ rate = 150; }
else if (2 == typeOfPayment)
{ rate = 120; }
else
{ rate=180; }
The keen observer would have noticed two unusual things. I always leave a white space between if and the “(” but there is no space between printf and “(”. That is just a good programming practice. It is a subtle way to differentiate functions from keywords. Keywords such as if, for, while, switch which have a parenthesis will always have a space between the word and the parenthesis. On the other hand, functions such as main, printf will have no whitespace between the end of the word and the beginning of the parenthesis.
The other thing which is less subtle is the condition inside the if parenthesis. I have always written the constant literal (1, 2 or 3) first and then equated it to the variable. It is not necessary to write in this form and if (typeOfPayment == 2) is equally correct. However, it is important that one not forget to use the double= sign. Instead if someone uses a single equal, typeOfPayment will be assigned the value of 2 rather than being checked if it is equal to one. Moreover, since this assignment will succeed, the result of the condition will be always evaluated to true! If however, one writes the literal as the first operand, and then inadvertently uses the single= the compiler will complain since assignments to literals is not possible. Your mistake of using single= will not go unnoticed. I understand that some people find it easier to remember that they must use == and not = instead of this roundabout manner of comparing variables to literals. Nonetheless many developers follow the style I will follow here. Java does not allow only = inside the if statement(condition must strictly evaluate to Boolean), hence this quirkiness does not apply to Java.

Wednesday, May 19, 2010

Accepting Input

Excited at your success of solving the problem your uncle talks to the contractor about your program. The contractor sees the value of having such a program at his disposal and contacts you for the same. However, since all the clients have different size of land, and also want different sized pavements, you wish to provide user configurable length and width parameters.
This brings us to the input part of the stdio.h header file. Getting user input is similar to showing the output.
#include "stdio.h"
int main()
{
 int length;
 scanf(“%d”, &length);
 return length;
}
Running the above program will result in the computer waiting for your input. It expects an integer and be sure to type in an integer only. It will store the value you enter in the variable “length”. Since the variable is being written to (its value is being changed) we cannot just pass the variable name, but have to append an & in front of the variable. Always remember to put the & in front of the variable when using the scanf function. Here is the stub of the program you will need to complete your assignment at the contractor’s place:
#include "stdio.h"
int main()
{
 int len1, len2, width1, width2;
 int area, rate;
 printf(“Enter outside length “);
 scanf(“%d”, &len1);
 printf(“Enter outside width “);
 scanf(“%d”, &width);
 …
 area = (len1*width1)-(len2*width2);
 scanf(“%d”, &rate);
 printf(“Cost of paving = Rs.%d\n”, area*rate);
}
[Exercise] The paving company has a dedicated computer to do this calculation. They do not wish to run the program again and again after each calculation. Encompass your program with something that will cause the first question to appear soon after the cost of the first pavement is done displaying.
[Exercise] Some customers may want to check with various rates. Modify your program such that the program will remember the previous inputs. Finally the program will ask for user input in the following fashion:
Enter outside length [enter 0 for 1200] _
Previously the user had entered 1200. This time she could just enter 0 to assume the same value as previous.
This problem also teaches a lesson in usability of systems. Lack of ease of inputs is a major reason that many banks, ATMs and vending machines have long queues.
Side note: We never entered the units. This is because arithmetically units don’t matter. However when interacting with the user, it is always best to make the units clear. You may also want to write a conversion program or modify the existing one to support both US units as well as metric units.

Producing output

Now that your uncle has got his answer, he is really excited. He contacts his contractor and asks if there are any other type of pavements. Of course, there are and they cost Rs.150, Rs.170 and Rs.200 per sq. mt. respectively. Your uncle again asks you to calculate the cost. This time you have to make changes to the rate one at a time and see the return value. You start wondering if there is any way to see all the output in one shot. You will write four different formula using various rates and you want their results displayed one after another.
You have come up with a requirement with which most C and C++ books start – a hello world program that starts with printing some output on the screen. I hope you appreciate how much you could already accomplish without ever having to print anything on the screen. But now, the time has come to show the stereotypical hello world program.
#include "stdio.h"
int main()
{
 printf(“Hello World!”);
return 0;
}
Printf is a function that outputs a formatted string to the screen. For the printf function to work, we need to include a file that comes with most C implementations. Hence we write #include . The # in this statement is a pre-processor directive. It calls upon the compiler to take some actions before the real compilation begins. In this case, it copies a file stdio.h to the text of the program. Stdio stands for standard input and output. .h indicates a header file, though the extension could really be anything.
The printf function outputs a formatted string, but also has some “magic” characters which are output differently. One of them is “\” known as the escape character or escape sequence. The character after the \ gives it a special meaning. \t stands for tab, \n stands for a new line \r for carriage return, \b for backspace, \% for percentage sign and \\ for backslash (more info). Another such character is the % sign. It has substitution power. The characters following the % will determine the type of substitution. For example, the quoted string inside printf for outputting the area will be “Area=%d”
The %d will be replaced by value of some integer variable which should follow the quoted string as in:
printf(“Area=%d”, area);
Similarly, %f stands for floating point numbers, %c for a single character, %s for a string, %l for long integers and %e for double floating point number. A complete list is here.
Now that we know how to print an output we can solve the problem of calculating cost for the paving of the area of your uncle’s garden:
#include "stdio.h"
int main()
{
 const int len1 = 10;
 const int len2 = 7;
 const int width1 = 8;
 const int width2 = 5;
 int rate = 100;
 int area = (len1 * width1) – ( len2 * width2);
 printf(“Cost @Rs.100 = %d\n”, rate * area);
 rate = 150;
 printf(“Cost @Rs.150 = %d\n”, rate * area);
 rate = 170;
 printf(“Cost @Rs.150 = %d\n”, rate * area);
 rate = 200;
 printf(“Cost @Rs.200 = %d\n”, rate * area);
 return 0;
}
Note that rate is not declared const since its value changes during the execution of the program. Area is also not defined but could have been since it is calculated only from other constants.
[Exercise] Modify the above program to support floating point lengths and widths. A float is defined as “float” instead of “int”.

Wednesday, May 12, 2010

A little more useful code

Now let us make our program a little bit more useful. Let us calculate the area of a square. Suppose the length of the side of this square is 5. Then we would write a program as below.
int main()
{
    return 5*5;
}
The * sign indicates multiplication. The program will return 25. However we will have to do something special to see the return value. Depending upon the OS and the implementation you are using, you may see the returned value directly (as in VC++) or may have to type echo $? (in Linux).
In any case, you have written your first program that does something useful! But there is a limitation (there are many limitations, but we will currently dwell over only one of them). If you now wish to calculate the area of a square with a side of length 6 instead of five, you will have to change the number five at two places. What’s the big deal? You might ask. But think if I had asked you to write a program that outputs the sum of the area and the volume of a square and a cube of a particular side length, (x*x + (x*x*x)), count how many x’s you would have to modify. To avoid such problems, we use constants or variables as the case might be. Just as in algebra we use named constructs such as x and y and then replace their values only when needed, we use constants and variables to do that in C and C++.
int main()
{
    const int side = 5;
    return side*side;
}
We are now using a named construct known as “side”. We assign a particular value to this named construct. During one execution of the program, the value of “side” does not change. Hence we define it as:
Const int side = 5;
Now if I asked you to calculate the area of another square with side equal to 6, all you have to do is replace one “5” with a “6” and you will get the required answer.
With this limited knowledge of the C world, you can now solve a lot of your primary school mathematics problems. Take this for example:
“Your uncle owns a rectangular piece of land with length=10m and width=7m. He wishes to construct a paved path at the boundary of this land (inside) with a width of 1m. His contractor informed him that it will cost him Rs 100 per square meter. How much money would your uncle require to get the pavement done?” Easy isn’t it?
int main()
{
    const int len1 = 10;
    const int width1 = 7;
    const int len2 = 8;
    const int width2 = 5;
    const int rate = 100;
    return rate*((len1*width1)-(len2*width2));
}

My Hello World

As I have mentioned in my previous post, I have not liked the way most C or C++ books begin. They start with a program that prints hello world and attempts to create enough enthusiasm among the readers to lure them to continue reading the fat book. This approach puts undue stress on experiencing the output without really speaking anything about computers or the C/C++ programming language. Moreover, the first program that the beginners write is so difficult and tricky that few will understand its true genius even after completing the entire book.

If I teach C, I want to start off differently. I want to include the least number of constructs in the first program. My first program will be an infinite while loop. It will look like this.

void main()
{
    while(1);
}

Now that is a beauty. I would tell that all C/C++ programs had one (and only one) main function. The main is the entry point into the program. The CPU will start executing the program from this point. The first word of the program represents what kind of output this program will produce. It is known as a return value. “void” means nothing. So we are declaring that this program will not produce any output. Some implementations of the C language may require that the main function produce some output. For such implementations we would replace the “void” with “int” (for integer value) and add one more statement just after the while(1); - return 0;

The round brackets after main have a special purpose. We will talk about those later. There are curly brackets after this. They define the scope. Think of a closed box with its name stamped on its cover and the kind of value (integer / void) that comes out of it, stamped before the name. Then, the name will be “main” the kind of value is the int or void written before main. The periphery of the box then, are the curly brackets.

Whatever is inside the curly brackets will be executed by the CPU, one statement at a time. In the simplistic demo program, we have a single statement that says while(1); The word while is different from the word main. "while” is a keyword. There are only a few number of keywords in C/C++ and they have a specific meaning. The “while” keyword is a looping construct. It asks the CPU to execute whatever comes after “while” for as many times as required until the condition inside the () becomes false. Of course, it may never become false at all in which case the CPU will keep executing the instructions ad infinitum.

True and False are tricky in conditional checks in C. Anything that evaluates to 0 is considered false and all non-zero values are considered true. Hence, the condition inside the round bracket will always evaluate to true (since it is non-zero) and the loop will continue forever. But there is nothing inside the loop. The loop boundary could have been defined by the same scope defining {} as for “main”. Instead, in their place, we have a semicolon (;). Hence no statements are to be executed inside the loop. All the CPU does is keep checking the condition, over and over again. Really, the computer is a dumb box.

Your first program does not produce any output. It causes the computer to hang! The condition inside the while loop () can be a more complex one. It could be a comparison such as 5>2, which would always evaluate to true or another such as 1!=1, which stands for one not equal to one which always evaluate to false. The C language supports many operators and we will explain then as and when we use them.

Sunday, January 17, 2010

How to find a mini-project

Every now and then, just as we near the end of a course we are entrusted with a mini project to be presented before the course finally comes to an end. Except to some entrenched souls who find this a relief from the monotone of assignments, most of us are immediately faced with the monstrous task of finding an appropriate project title and then see it to completion. A great many projects never get finished. A great many never really get started.

If we go back a few years and look into our mathematics text books, we will find that after every few chapters, a set of miscellaneous questions has been appended which requires us to apply the principles applied in the previous chapters. If your mathematics text books did not have this feature then probably those pages were glued together because some lawmakers felt that they were too much burden for the delicate minds. Jokes apart, these miscellaneous questions have a lot in common with the mini project that is expected of you today.

You have gone through a fair enough portion of the course and it is time to apply the knowledge that you have gained so far. I am not here to give you a list of projects to do. But I am going to provide you with ideas. I hope that you will think of similar projects on your own and not just copy my ideas.

A good feature of using TurboC in DOS is that you can create cool graphics by using the graphics.h header file. In actuality, a lot of work is being done for you by the header file, but we can remain blissfully ignorant about it. For someone who wants to know how the graphics header file accomplishes this feat, I have a project to offer. There is a great book by Peter Abel describing the DOS operating system interrupts and programming in the assembly language. With aid of this book, one may, inside TurboC create new functions to initialize the graphical screen, display pixels and change colors. Create a library of functions to draw simple shapes such as line, circle and rectangle using the most basic DOS interrupts and asm commands inside TurboC. Ask your friends to use your library instead of the graphics.h file. Tempt them with some additional features such as the ability to draw and fill tilted rectangles. This project will require knowledge of the algorithms to draw lines, circles etc. but they may be directly obtained from someone who has done the computer graphics course or from the internet. The intent of this project is to get the understanding that underlying every library is a set of basic constructs.

Someone has said that a good program is one which has a lot of inline documentation. Okay, there will be some argument about whether someone said it or nobody did, but now that I have, Google will surely find it. But how embarrassing comments would be to its author if he inadvertently slips in spelling mistakes? “But since the advent of the computers, who remembers spelling?” you want to ask. True. It is not necessary during writing your report because your word processor will do most of the work for you. But what about writing comments in code? Fortunately, editors such as Eclipse now do spell check in comments. What a God-send! But wait a minute. We don’t use Eclipse for writing our programs because our teachers feel that IDEs write too much code for you and leave less to learn. Too bad, except that this rationale of your teachers has created a great possibility for your project topic. Write a spell checker for comments in C or C++ programs. First of all, fish out a good enough dictionary from the web. An extremely complete dictionary is not required because most programmers do not use artistic language in their comments (if any one does, please remind them that the comments are for others to understand and it would serve its purpose better if written in plain English!) So just fish out a list of about a 100 thousand most commonly used words and store it in a file. Most such lists are sorted alphabetically. Now write a program to filter out comments from C/C++ programs and then check the spellings of all words in the comments. Simple, isn’t it? But there is a catch. Most programmers would freely make use of variables inside their comments to explain what they are doing. Since most variables are not proper English words, they would be flagged by your spell checker as incorrect and would irritate the user. So, your program must also store all declared variables and keywords and ignore them when applying the spell check. If this complexity was not too much for you, you will complete the program and observe something bitter. This program takes too long to execute even on small input texts. The reason is that you have stored the list of valid words on the disk and have to go through the entire list for finding each word. This computational effort can be reduced if you use something known as a hash-map. A simple way to do it would be to divide the list into 26 separate lists each for a symbol in the English alphabet and then search only the relevant list. Even inside a list one could use binary search to quickly converge on the correct word, or to report that there is a spelling mistake. The sharp minds amongst you will notice that dividing the list into 26 lists each for a symbol in the alphabet is not a very good idea because there are differing number of words starting from each letter. Some other, better hashing function can be devised, but I will leave you with that.

A common project done by many students is encryption of plain text. Substitution cipher is the most common method used. Professional cryptography requires that the algorithm of the cipher must be public, but these students writing the substitution cipher put too much trust in their formula of substitution and swear never to divulge it. Lo and behold! There are chits passing under your nose and you cannot understand a word of it. You cannot tolerate this and so you want to write a program that breaks this code. Remember the sentences passing under your nose are only small ones. Here is what you do. You first fetch a full word list from the web. Then you write a program that substitutes each letter in the message with another letter and check if the words appear in the word list. When a match is found, the program returns the association and a possible formula. Of course you know that if the messages become bigger or if the formula is changed, your program will take longer to run and figure out what the formula is. Also, if the formula changes per word / per letter, you are out of luck. Try running your program on more capable computers such as servers and you will see that the program returns results much faster than on a simple desktop. (If it does not happen so, do ask your system administrators to change the servers!) This kind of attack on ciphers is called brute force attack and can also be launched on standard encryption algorithms like AES. These algorithms use a key and a program that can guess the key can decipher the code. Keys are generally so long that it will take most computers a very long time to figure out the key and documents are encrypted with the appropriate length of key depending on their importance.

Your friend has keen interest in cryptography and one of the features in crypto that he is particularly intrigued by is the ability to storing and operating on big integers. These are numbers 128 or 256 bit long (or longer) and are to be stored accurately till the units place. Java provides an inbuilt package for operating on such large numbers, but you do not know if C allows it too. Instead of trying to find it out on the internet, your friend has asked you to take up the task and create a construct for the same in C / C++. Therefore you have taken it up to yourself to create a class called BigInt which will allow storage and operations on extremely large integers. You plan to store the integer in a linked list of integers and will support only the basic operations like addition, subtraction, multiplication and division. Your friend begged you, so you will also support modulo function. Enjoy! Is there some way of finding out if this big number is a prime?

A friend of yours is in aeronautical engineering and has called upon your computer expertise to solve a persistent problem. They frequently use integrals with limits and have to use calculators to solve them. But these problems are really only a part of a larger scheme of things and hence it would be best if they had a library that solved the integration problems. Think of some way to represent arbitrary integration problems for input and then solve those using numerical methods. You might consider keeping the log book as a look up table in the computer’s memory.

Confident that the computer can solve any problem, you have set out to see your future. For starters you plan to create a computer program that will predict the mass, position and velocity of all particles in a closed box at some future point in time given their mass, positions and velocity at some earlier time. Make all simplifying assumptions such as the walls of the box are perfectly rigid and the particles inside the box are under standard gravity or no gravity and are perfect spheres and also rigid. Create a program that will input an arbitrary number of particles, their positions and velocities in three dimensions and the time at which you seek to know their positions and velocities. Simulate all collisions with the walls and each other and predict the position at the required time. Be happy that you have predicted the future of these hypothetical particles after you complete the program, but bear in mind that Heisenberg’s uncertainty principle laughs at your simplifying assumptions.

You have learnt how a projectile travels and it can be easily demonstrated in a game. QBasic came with a game called gorilla. Two gorillas stood atop high rise buildings and threw banana skins at each other. It was a two player game and each player was expected to provide a velocity and angle to his gorilla for throwing the banana skin. There existed an element of wind and gravitation to make things more difficult. You may create some form of that game.

If you use Linux for writing C / C++ programs, sprint library may be used for graphics programs. I have only listed some projects that came to my mind of-hand. However, if you look around, or look in your own text books of other courses, you should be able to find dozens of interesting projects to do. If you belong to computer science, a project done for some other engineering field will be a humbling experience where you will understand that computers are really only aids for supporting the variety of branches of science and engineering. If you are from other fields, you will feel proud that you have made life easier for many of your friends by introducing a simple tool for everyone to use. Enjoy your project. Nothing prepares one for the real world as projects do. To see the projects that I undertook during my BE and MTech, please visit my webpage.