C
Introduction & History
Birth of C
•C is very often referred to as a “system
programming language” because it is used for
writing compilers, editors, and even operating
systems. C was developed by Dennis M. Ritchie
and first implemented on the UNIX operating
system on the DEC PDP-11.
•C evolved during 1971-73 from the B language
of Ken Thompson, which had itself evolved (in
1969-70) from the BCPL language of Martin Richards.
“Every good tree produces good fruit.”
Programming Languages Phylogeny
Fortran (1954)
Algol (1958)
LISP (1957)
CPL (1963), U Cambridge: Combined Programming Language
Scheme (1975)
Simula (1967)
BCPL (1967), MIT: Basic Combined Programming Language
B (1969), Bell Labs
C (1972), Bell Labs
C++ (1983), Bell Labs
Objective C
Java (1995), Sun
Strengths
Low-level nature makes it very efficient
Language itself is very readable
Many compilers available for a wide range of
machines and platforms
Standardization improves portability of code
Efficient in terms of coding effort needed for a wide
variety of programming applications
Weakness
Parts of syntax could be better
if ( i = 1 )
stmt ...
is probably error, should be
if ( i == 1 )
stmt ...
Low-level nature makes it easy to get into
trouble
e.g., infinite loops
Weakness
Illegal memory accesses / tromping over
memory
Obfuscated code can be written
main(n,i,a,m)
{
while(i=++n)for(a=0;a<i?a=a*8+i%8,
i/=8,m=a==i|a/8==i,
1:(n-++m||printf("%o\n",n))&&n%m;);
}
C Language
No support for:
Array bounds checking
Null dereferences checking
Data abstraction, inheritance
Exceptions
Automatic memory management
Program crashes (or worse) when something bad happens
Lots of syntactically legal programs have undefined behavior
So, why would anyone use C today?
Legacy Code
Linux, most open source applications are in C
Simple to write compiler
Programming embedded systems, often only have a C
compiler
Performance
Often far faster than interpreted Java (50x is sometimes quoted; the ratio depends heavily on the workload)
Smaller, simpler, lots of experience
What is Embedded C?
Embedded C is not a new language
It is not a separate standard
It is the toolchains that make the language target dependent;
today’s embedded applications are developed in C (a
shortcut to ASM)
Because applications are target dependent, the standard language will
never meet all expectations
Hence nonstandard keywords and concepts are added to
support developers
In practice, Embedded C is defined as a ‘variant’ of C by each vendor
C and Processor Target
C is more suited to RISC
Writing to a device is just a matter of addressing
C is more challenging on CISC
Nonstandard keywords are needed as extensions
C Programming Language
Developed to build Unix operating system
Main design considerations:
Compiler size: needed to run on PDP-11 with 24KB of memory
(Algol60 was too big to fit)
Code size: needed to implement the whole OS and
applications with little memory
Performance
Portability
Little (if any) consideration:
Security, robustness, maintainability
Assembly Language
Computer processors each have their own
built-in assembly language
E.g., Intel CPUs have their own language that
differs from the language of Motorola PowerPC
CPUs
Limited range of commands
Each command typically does very little
Arithmetic commands, load/store from memory
Code not totally unreadable, but close
C versus Assembly
Here’s some simple C code:
x = y + 2;
if ( (x > 0) || ( (y-x) <= z) )
x = y + z;
And now the same three lines of code
in Intel x86 assembly language...
MOV AX, y
MOV BX, y
MOV CX, Z
ADD AX, 2
CMP AX, 0
JG SET_X
MOV DX, BX
SUB DX, AX
CMP DX, CX
JLE SET_X
JMP DONE
SET_X: MOV AX, BX
ADD AX, CX
DONE: MOV x, AX
RET
The C Programming Language
C is a higher-level language compared to
assembly
Much more human-readable
Many fewer lines of code for same task
C is a lower-level language compared to
others (like Java)
Direct control over memory allocation and cleanup
(as we will see)
Compiled vs. Interpreted
For a program to run, the source code must
be translated into the machine code of
the machine the program will run on
Compiled language: the source code is translated
once, ahead of time, and the resulting executable is run (C, Pascal)
Interpreted language: an interpreter runs each
time the program is started and does the
translation on the fly (Perl, Tcl, other scripting
languages)
Procedural vs. Object-Oriented
Procedural Programming
A program viewed as a series of instructions to
the computer
Instructions are organized into functions and
libraries of functions
Object-Oriented Programming
A program viewed as a set of objects that interact
with each other
An object encapsulates functions with the data
that the object’s functions operate on
Procedural vs. Object-Oriented
OOP good for complex systems – nice to break down
into small independent objects
Procedural perhaps a bit more intuitive, esp. for
beginners
C is a procedural language
Java and C++ are object-oriented languages
Procedural Programming
Program is a set of sequential steps to be
executed by the computer, one after another
Not necessarily the same steps every time the
program runs
May want to skip steps under certain conditions
May want to repeat steps under certain conditions
May need to save some information to use later in
the program
Elements of a Program
Individual steps/commands to the computer
Statements
Technique to organize code so it’s easy to debug, fix,
and maintain
Functions
Way to save information that may change each time
the program runs
Variables
Way to change what steps get executed
Control flow
Declarations vs. Definitions
Both functions and variables MUST be declared
before they can be used
Declaration includes just:
Type
Name
Args (for functions)
Definition can come later!
For functions: body of function can be added later
Important Rules
All variables and functions must be declared
in the function/file where they are going to
be used BEFORE they are used
All variables must be initialized to some
starting value before being used in
calculations or being printed out
Statements
A statement is a single instruction, usually written on one line and ended with a semicolon
Can be:
Variable or function declarations
Arithmetic operations
x = 5+3;
Function calls
printf("Hello!");
Control flow commands
More to be seen…
Types
“Type” in C refers to the kind of data or
information
Functions have return types
Arguments to functions each have their own
type
Variables have types
Why Type?
Why do we have to specify types of variables,
functions, arguments?
Has to do with computer memory
Different kinds of data require different amounts of
memory to store
A single character (in ASCII) can have only one of 128 values (a-z, A-Z,
0-9, punctuation, some others)
Mathematically there are infinitely many integers, but a 16-bit integer
on a computer is limited to about 65,000 values
Therefore, more memory is needed to store an integer than a
character
Computer Memory
Memory in computers made up of transistors
Transistor: just a switch that can be either on
or off
“on” state corresponds to the value 1, “off”
state corresponds to the value 0
Everything in memory has to be made up of
0s and 1s – i.e., has to be stored as a
number in base 2 (binary)
Important to understand different bases
Back to Types
Types typically defined in pieces of memory that are
powers of 2
Smallest piece of memory: 1 bit
Can hold 0 or 1 (equiv to 1 transistor)
8 bits = 1 byte
1 byte can have any value between 0000 0000 and 1111
1111, i.e., between 0 and 255.
1111 1111 binary
hex digit: 1111 = 8+4+2+1 = 15 = F
0xFF = 15x16^1 + 15x16^0 = 240 + 15 = 255
More than number of values needed for characters – 1 byte
typically used for character type
Numeric Types
16 bits = 2 bytes can have any value from 0000 0000
0000 0000 to 1111 1111 1111 1111, or 0 to 65,535.
(Shortcut: 65,535 = 2^16 - 1)
Was used for a long time for integer values
If used to store negative integers, the range is
-32,768 to +32,767
32-bit integers now more common
Ever heard of a 32-bit operating system? Means that memory
addresses and the processor's main registers are 32 bits wide
Amount of memory given to each type dependent on
the machine and operating system
Type Sizes
Many different types in C (more to come)
char: a single character
Always 1 byte; 256 possible values (128 of them used by 7-bit ASCII)
int: an integer value
Often 32 bits/4 bytes, 4 billion values
float: a floating-point number
Often 4 bytes
double: a double-precision floating-point number
Double the size of a float, often 64 bits or 8 bytes
Operators
Arithmetic: + - * / % -(unary)
Increment/decrement: ++ --
Relational: > >= < <=
Equality: == !=
Assignment: = += -= *=
Logical: || && !
Bitwise: | & ^ << >> ~
Reference/Dereference: & *
Type conversions
Some types can automatically be converted to others
int, float, double can often be mixed in
calculations
int values are converted up to floating point when mixed with float or double operands
But major differences between integer and floating-point
arithmetic!
Fractions are simply dropped in integer division
Use mod (%) operator to get remainders
Integer arithmetic is often faster to perform
Start here
Hello, World – a first program
Hello, World – line by line
Basic rules for C
Tips for good programming style
Basic Input/Output
Hello, World
/* Hello, World program */
#include <stdio.h>
int main(void)
{
    printf("Hello, World!\n");
    return 0;
}
Programming Style
Put braces on separate lines by themselves
(some variations on this)
After each open brace, increase indentation
level by one tab
After each close brace, decrease
indentation level by one tab
Leave blank lines between sections
Don’t let lines get too long
Control Flow
Two basic types of control flow commands
Conditionals
Execute some commands if a given condition is
true/false
if, switch, ?: operator
Loops
Execute some commands repeatedly until a given
condition for stopping is met
while, for, do-while
Functions
Basic idea of functions is to divide code up
into small, manageable chunks
One way to get started designing funcs:
Write out the entire program with no functions
Look for sections of code that are almost exactly
duplicated
Create one function for each repeated section and
replace each repetition with the function call
Function Design
Look for places where several lines of
code are used to accomplish a single
task and move the code into a function
No function too small
Rule of thumb: no function (including
main) should be longer than a page
Goal: be able to see entire function at once
when editing program
So, what is a function?
Types of function
Standard (library) functions
User-defined functions (UDF)
Nature of function
Stack use
Recursive
Chained
…
Using variable number of arguments
•Macros that implement variable argument lists
• Declaration:
void va_start(va_list ap, lastfix);
type va_arg(va_list ap, type);
void va_end(va_list ap);
• Remarks:
•Some C functions, such as vfprintf and vprintf, take variable argument lists
in addition to taking a number of fixed (known) parameters.
•The va_arg, va_end, and va_start macros provide a portable way to access
these argument lists.
•va_start sets ap to point to the first of the variable arguments being passed
to the function.
•va_arg expands to an expression that has the same type and value as the
next argument being passed (one of the variable arguments). Because of
default promotions, you can't use char, unsigned char, or float types with
va_arg.
•va_end helps the called function perform a normal return.
Using variable number of arguments…
• ap
•Points to the va_list variable argument list being passed to the function.
The ap to va_arg should be the same ap that va_start initialized.
• lastfix
•The name of the last fixed parameter being passed to the called
function.
• type
•Used by va_arg to perform the dereference and to locate the following
item.
•You use these macros to step through a list of arguments when the called
function does not know the number and types of the arguments being
passed.
Using variable number of arguments…
• Calling Order
•va_start must be used before the first call to va_arg or va_end.
•va_end should be called after va_arg has read all the arguments.
•Failure to do so might cause strange, undefined behavior in your program.
•Stepping Through the Argument List
•The first time va_arg is used, it returns the first argument in the list.
•Each successive time va_arg is used, it returns the next argument in the
list.
•It does this by first dereferencing ap, then incrementing ap to point to the
next item.
Using variable number of arguments…
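#include <stdarg.h>
#include <stdio.h>
/* Adds up int arguments until a terminating 0, then prints the total using msg. */
void sum(const char *msg, ...)
{
    int total = 0;
    va_list ap;
    int arg;
    va_start(ap, msg);
    while ((arg = va_arg(ap, int)) != 0) {
        total += arg;
    }
    va_end(ap);
    printf(msg, total);
}
int main(void)
{
    sum("The total of 1+2+3+4 is %d\n", 1, 2, 3, 4, 0);
    return 0;
}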
Arrays
An array variable is just a set of variables all of the
same type, all linked together
Can make an array variable of ANY type – built-in
ones, or ones that you define with struct or enum
char string[100]; // 100 chars
int numArray[50]; // 50 ints
Date dateArray[10]; // 10 Dates
Array Elements
Each element of the array can be treated like an
individual object of the underlying array type
To access an element, use the subscript operator:
<arrayname>[<index>] : represents the element
at position <index> in the array named
<arrayname>
Array Indices
Indices into an array start at 0 and go up to
(array length – 1).
Very convenient when combined with for
loops
Can easily go through entire array to access
every element in just a few lines of code, no
matter how large the array
Example – with arrays
#include <stdio.h>
#define ARRAYSIZE 10
int main(void)
{
    int numArray[ARRAYSIZE];
    int sum = 0, i;
    for (i = 0; i < ARRAYSIZE; i++) {
        numArray[i] = i;
        sum += numArray[i];
        printf("%d ", numArray[i]);
    }
    printf("\nSum is: %d\n", sum);
    return 0;
}
Limitations
Can’t assign or compare arrays
int numArray[10];
int numArrayCopy[10];
// Not OK:
numArrayCopy = numArray;        // compile error: arrays are not assignable
if (numArray == numArrayCopy)   // compiles, but compares addresses, not contents
// OK:
for (i = 0; i < 10; i++)
    numArrayCopy[i] = numArray[i];
What is a pointer?
Pointer Syntax
Complicated – uses reference (&) and
dereference (*) operators
&<variable>
returns the memory address of that variable
result is a pointer to whatever type the variable is
*<pointer variable>
accesses the value stored at the memory address held in the
pointer
result is of whatever type the original variable was
Pointers
If all pointers are same size, why do we need to
specify pointer type?
Two reasons
Need to know what type of variable we are going to get
when we dereference the pointer
Sometimes, pointer is not pointing to just one variable – can
also point to first variable in an array
Overview
Pointers
Pointer to basic data types
Pointers and arrays
Pointers to function
Function returning pointers
Overview
Pointers
Pointers and arrays
Pointers and strings
Null pointer
Generic pointer (void *)
Dynamic Memory Allocation
malloc()
free()
NULL – the null pointer
A pointer with the address 0x0 is a null
pointer
NULL: a constant equivalent to the address
0x0
Must NEVER dereference the null pointer –
the behavior is undefined, and the program will usually crash
Major source of bugs (common cause of ‘the
blue screen of death’)
Protecting against null pointers
Always initialize all pointers to NULL when
declaring them
NULL and the null character both equivalent
to false
Can be used in conditional statements
Typical technique:
if (str) {
// dereference str somehow
} else {
// str is NULL!
}
Generic Pointer
Pointers to data are typically all the same size underneath
Sometimes convenient to treat all pointer
types interchangeably
Generic pointer type: void *
Function that expects void * can take any
pointer as argument
Important in malloc/free
Dynamic Memory Allocation
Declaring variables of fixed size allocates
memory in a static way – the sizes do not
change as the program runs
Can also allocate memory dynamically
Allocate different amounts of memory from run to
run of the program
Increase/reduce amount of memory as program
runs
Dynamic Memory Allocation
Have seen one example: using a
variable to determine size of an array
More flexible technique: combine
pointers with the function malloc()
malloc() and free() are in stdlib.h
Using Malloc
char *str = NULL;
// allocate 10 bytes to be used
// to store char values
str = (char *)malloc(10);
if (str == NULL) {
    // allocation failed; do not use str
}
// when finished, clean up
free(str);
str = NULL;
Malloc
malloc() takes the number of bytes to
allocate
returns void * pointer
Problems:
need to calculate the number of bytes
need to use the void * pointer as pointer of
type you want to use (eg, int *, char *)
Calculating Bytes
sizeof(<type>) – returns the number of bytes
used by a single variable of that type
multiply this value by however many
variables of this type you want to store
int intArray[10];
int * iarray =
(int *) malloc(sizeof(int)*10);
Casting
Casting allows you to force a value of one
type to be treated as another
Casting between numeric types converts the value
(e.g., (int) 3.7 gives 3)
Casting between pointer types does NOT convert anything – it just
interprets the binary data stored in memory
in a different way
If you try to treat data of one type as an
incompatible type, the behavior is undefined and your program will likely
crash
Casting
Syntax:
(<desired type>) <variable>
Used most commonly with malloc() and other
functions that return void *
Can also be used to get integer values of
character variables
int valueOfA = (int) 'A';
Overview
Dynamic Memory Allocation
malloc()
free()
realloc()
Dangling Pointers
Memory Leaks
Avoiding bugs
Malloc
malloc() takes the number of bytes to
allocate
returns void * pointer holding the
address of the chunk of memory that
was allocated
Free
free() takes a pointer of any type holding a
memory address previously returned by malloc() or realloc()
Deallocates the chunk of memory starting at
that address and gives it back to the
allocator so it can be reused
Realloc
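realloc() takes a pointer holding a
memory address and the new total number of
bytes to allocate
Grows or shrinks the chunk to the specified number of
bytes, keeping the existing contents; may move the chunk,
so always use the returned pointer
Returns NULL on failure, leaving the original chunk allocated
Will free up memory if fewer bytes than
originally specified!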
Memory Management
malloc() and free() must be used
carefully to avoid bugs
Potential problems
dangling pointers
memory leaks
Dangling Pointer
Pointer that is not pointing to a valid,
allocated chunk of memory
Typically happens when you free a pointer
and then forget to reset it to NULL, then try
to use the pointer again
Causes unpredictable behavior
Memory Leak
Chunk of memory that has been allocated
and will never be freed up in the course of
the program
Typically happens when you reset a pointer
to NULL or to a new memory address without
freeing it first
Consumes unnecessary system resources and
slows down program/computer
Avoiding Memory Mgmt Bugs
Must match up calls to malloc() and free()
Good technique – print out messages
#define DEBUG 1
...
if (DEBUG) printf("malloc\n");
// malloc
...
if (DEBUG) printf("free\n");
// free
...
if (DEBUG) printf("resetting pointer\n");
// assigning pointer variable
Avoiding Bugs Cont.
If more malloc() calls than free(), you have a
memory leak
If fewer pointer resets than free() calls, you
may have a dangling pointer
Using Dynamic Allocation
Only allocate the memory you need for
strings
Grow a database on the fly
Shrink even!!
Dynamic Memory Allocation & Functions
Often want to dynamically allocate memory within a
function
CAN be done
Memory allocated with malloc() doesn’t get freed up
automatically when a function exits the way local variables
do
Often requires passing *pointers* by reference
Pointers & Pass-By-Reference
Remember: when pointers are used to implement
pass-by-reference, a copy is still being made!
Copy holds same memory address as original pointer
– so both can be used to access the original mem
location
However, when memory address stored in the copy
is changed, the address stored in the original pointer
is NOT changed!
Pointers & Pass-By-Reference
Assuming a function that takes an argument
named “ptr”
WILL change the data that ptr points to, as seen by the caller:
*ptr = <value>;
strcpy(ptr, <value>);
ptr[0] = <value>;
Will NOT change the caller’s pointer itself:
ptr = (int *) malloc(...);
ptr = <another pointer variable>;
ptr = &<variable>;
Pointers & Pass-By-Reference
Rule of thumb: anything you do to a pointer
argument inside a function MUST involve a
dereference
If pointer is treated as an array, that’s
equivalent to a dereference
C Preprocessor
Runs on a source code file before it
gets to the compiler
Gets the source into compilable form
Strips out comments
Loads in header files
Replaces constant names with their values
Expands macros
Can chop out unneeded sections of code
C Preprocessor
Has its own very simple language
Four main kinds of directives:
#include
#define
conditionals (#ifdef, #ifndef, #if, #else,
#elif, #endif)
#pragma
Header Files and #include
Syntax: #include <library_name.h>
Syntax: #include "header_file.h"
Angle brackets used for shared, standard libraries
(string.h, stdlib.h)
Double-quotes used for header files you have
written yourself
Header Files
#include works by essentially copying &
pasting in all the contents of the header file
into your program
Many source code files can all share the same
header file
Useful for reusing code
Next time, will see use of header files
Constants with #define
Syntax: #define NAME VALUE
VALUE is pasted in wherever NAME appears
in the code
Can also just define a name
Syntax: #define NAME
Can later test this name with #ifdef
statements
Macros with #define
Can use same syntax to define macros
Macros are like less-full-featured functions;
just pasted into code
Example:
#define ADD(a,b) a+b
Usage:
int x = ADD(5,3); => int x = 5+3;
Macros
Can make extremely complex macros
Avoid function-call overhead, so can be faster than a function (though compilers often inline small functions anyway)
Much harder to debug than functions
Example:
#define ADD(a,b) a+b
int x = ADD(5,3)*2; =>int x = 5+3*2;
Result: 5+6=11 NOT 8*2=16
#define ADD(a,b) ((a)+(b)) must be used
Macros
Macro definitions must all be on one line
Can get long and messy
Use \ at very end of line to continue it onto
the next line
#define COMPLEX_MACRO(a,b) \
    printf("complex macro\n"); \
    printf("%d+%d=%d\n", \
    (a),(b),((a)+(b)));
Conditional Compilation
Several conditional statements present in the
preprocessor language
Used to block out (or activate) chunks of
code based on a single constant value
Syntax:
#ifdef <constant>
/* conditional code goes here*/
#endif
Conditional Compilation
Useful for debugging
#define DEBUG
#ifdef DEBUG
printf("Debugging info\n");
for (int i=0; i<10; i++)
    printf("Array[%d] = %d\n", i, array[i]);
#endif
Conditional Compilation
Also useful for compiling on different platforms
// #define WINDOWS
#define UNIX
...
#ifdef UNIX
#define END_OF_LINE "\n"
#endif
#ifdef WINDOWS
#define END_OF_LINE "\r\n"
#endif