10.2: Stages of Compilation
Exam Board: Eduqas / WJEC
Specification: 2020 +
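A compiler translates source code (written by a programmer in a high-level language) into machine code in a series of stages. This page covers them under five headings, with symbol table creation included as part of lexical analysis: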
1. Lexical Analysis
The term 'lexical' refers to words and phrases. Source code needs to be broken down into tokens that can later be analysed.
​
In lexical analysis:
- Spaces and comments are removed from the code.
- Identifiers, keywords and operators are replaced by tokens. A token is similar to a variable with a name and a value.
- A symbol table is created. This table stores the addresses of all variables, labels and subroutines used in the program.
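The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token categories, the tiny made-up language and the `lex` function are hypothetical, and the "addresses" in the symbol table are simple placeholder numbers rather than real memory locations.

```python
import re

# Hypothetical token categories for a tiny made-up language.
TOKEN_SPEC = [
    ("COMMENT",    r"#[^\n]*"),
    ("NUMBER",     r"\d+(?:\.\d+)?"),
    ("KEYWORD",    r"\b(?:if|while|print)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[+\-*/=<>]"),
    ("SPACE",      r"\s+"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))


def lex(source):
    """Return (tokens, symbol_table) for one piece of source code."""
    tokens = []
    symbol_table = {}
    pos = 0
    while pos < len(source):
        match = PATTERN.match(source, pos)
        if match is None:
            raise ValueError(f"Unrecognised character {source[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("SPACE", "COMMENT"):
            continue  # spaces and comments are removed
        tokens.append((kind, text))  # each token has a name (kind) and a value
        if kind == "IDENTIFIER" and text not in symbol_table:
            # Placeholder address: the next free slot number.
            symbol_table[text] = len(symbol_table)
    return tokens, symbol_table


tokens, symbol_table = lex("total = price + 5  # add postage")
print(tokens)
print(symbol_table)
```

Here the comment and spaces never reach the token list, and each new identifier is given the next free slot in the symbol table.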
​
2. Syntax Analysis
The term 'syntax' refers to sentence structure.
​
In syntax analysis:
- The tokens created in the first stage are checked to see if they follow the syntax (spelling and grammar) rules of the programming language. This process is called 'parsing'.
- During parsing, if a syntax error is found then an error message is displayed and compilation stops.
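As a rough sketch of parsing, the Python below checks a list of tokens against one made-up grammar rule: a variable name, then '=', then values separated by arithmetic operators. The rule, the `CompileSyntaxError` class and the `parse_assignment` function are all assumptions for illustration; a real language has many more rules.

```python
class CompileSyntaxError(Exception):
    """Raised when the tokens break a grammar rule, so compilation stops."""


OPERAND_KINDS = {"IDENTIFIER", "NUMBER"}
ARITHMETIC = {"+", "-", "*", "/"}


def describe(tokens, index):
    """Describe the token at index for use in an error message."""
    return repr(tokens[index][1]) if index < len(tokens) else "end of line"


def parse_assignment(tokens):
    """Check tokens against: IDENTIFIER '=' value (operator value)*"""
    if not tokens or tokens[0][0] != "IDENTIFIER":
        raise CompileSyntaxError(f"Expected a variable name but found {describe(tokens, 0)}")
    if len(tokens) < 2 or tokens[1] != ("OPERATOR", "="):
        raise CompileSyntaxError(f"Expected '=' but found {describe(tokens, 1)}")
    index = 2
    while True:
        # A value (variable or number) must come next.
        if index >= len(tokens) or tokens[index][0] not in OPERAND_KINDS:
            raise CompileSyntaxError(f"Expected a value but found {describe(tokens, index)}")
        index += 1
        if index == len(tokens):
            return True  # every token followed the rule
        # Anything after a value must be an arithmetic operator.
        kind, text = tokens[index]
        if kind != "OPERATOR" or text not in ARITHMETIC:
            raise CompileSyntaxError(f"Expected an operator but found {text!r}")
        index += 1


good = [("IDENTIFIER", "total"), ("OPERATOR", "="), ("IDENTIFIER", "price"),
        ("OPERATOR", "+"), ("NUMBER", "5")]
print(parse_assignment(good))

bad = [("IDENTIFIER", "total"), ("OPERATOR", "="), ("OPERATOR", "+")]
try:
    parse_assignment(bad)
except CompileSyntaxError as error:
    print("Syntax error:", error)
```

The second call stops at the first broken rule and reports it, which mirrors how compilation halts when a syntax error is found.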
3. Semantic Analysis
The term 'semantic' refers to meaning, so this stage checks that the code makes logical sense. Variables are checked in this stage to ensure they are used correctly:
- Variable checks ensure that variables are correctly declared and use a valid data type (for example, a decimal value is not assigned to an integer variable).
- Operation checks ensure that operations are correct for the data type used (for example, dividing a number must result in a real value).
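The two checks above might look something like the Python sketch below. The type names ("integer" and "real"), the `declared` dictionary and both functions are hypothetical, and the rule that division always gives a real value is an assumption taken from the example in the notes.

```python
class SemanticError(Exception):
    """Raised when code is grammatically valid but does not make sense."""


def result_type(operator, left_type, right_type):
    """Operation check: work out the data type an operation produces."""
    if operator == "/":
        return "real"  # assumption: division always gives a real value
    return "real" if "real" in (left_type, right_type) else "integer"


def check_assignment(declared, name, value_type):
    """Variable check: the variable must be declared and able to hold the value."""
    if name not in declared:
        raise SemanticError(f"Variable {name!r} has not been declared")
    if declared[name] == "integer" and value_type == "real":
        raise SemanticError(f"Cannot store a real value in integer variable {name!r}")
    return True


declared = {"count": "integer", "average": "real"}

# average = count / count is fine: a real result goes into a real variable.
print(check_assignment(declared, "average", result_type("/", "integer", "integer")))

# count = count / count is rejected: a real result does not fit an integer variable.
try:
    check_assignment(declared, "count", result_type("/", "integer", "integer"))
except SemanticError as error:
    print("Semantic error:", error)
```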
4. Code Generation
The machine code (data in a binary format) is generated.
0010 1011 0101 0101 0110 0111 0101 0001 0101 0101 0101 0110
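To give a feel for how instructions become binary, here is a small Python sketch. The instruction set is invented for this example (a 4-bit opcode followed by a 4-bit address), so the opcodes and the `generate` function are assumptions and do not match any real processor.

```python
# Hypothetical 4-bit opcodes for a made-up processor.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011}


def generate(instructions):
    """Turn (mnemonic, address) pairs into strings of binary digits."""
    machine_code = []
    for mnemonic, address in instructions:
        if not 0 <= address <= 15:
            raise ValueError("Address must fit in 4 bits")
        machine_code.append(f"{OPCODES[mnemonic]:04b} {address:04b}")
    return machine_code


# total = price + cost, assuming the symbol table placed
# total at address 0, price at address 1 and cost at address 2.
program = [("LOAD", 1), ("ADD", 2), ("STORE", 0)]
for line in generate(program):
    print(line)
```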
5. Code Optimisation
The code is optimised so it is fast, efficient and uses as little of the computer's resources as possible.
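One common optimisation is constant folding, where calculations that only involve fixed numbers are worked out once by the compiler instead of every time the program runs. The Python sketch below is a simplified illustration: it assumes expressions are stored as nested tuples of the form (operator, left, right), which is a representation chosen for this example, and it leaves out division to keep the code short.

```python
import operator

# Operators this sketch knows how to fold.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(expr):
    """Replace sub-expressions made only of numbers with their result."""
    if not isinstance(expr, tuple):
        return expr  # a single number or variable name: nothing to fold
    op, left, right = expr
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)  # both sides known, so calculate now
    return (op, left, right)


# x + (3 * 4) becomes x + 12, saving a multiplication at run time.
print(fold_constants(("+", "x", ("*", 3, 4))))
```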
Questo's Questions
10.2 - Stages of Compilation:
​
1a. List the 6 stages of compilation in order. [6]
1b. Create a poster or flowchart describing each of the 6 stages of compilation:
1. Lexical Analysis
2. Symbol Table Creation
3. Syntax Analysis
4. Semantic Analysis
5. Code Generation
6. Code Optimisation [10 total]