Mini-Compiler: Lexical to Target Code Generation
A lightweight, end-to-end compiler that demonstrates the core phases of compiler construction. It translates a custom high-level syntax into Three Address Code (TAC) and then into Simple Assembly (ASM).
Key Features
1. Scoped Symbol Table
Unlike a flat dictionary, this compiler uses a Stack-based Symbol Table to handle nested blocks {}.
Variable Visibility: Inner scopes can access parent variables.
Automatic Cleanup: A block's local variables are "popped" from the table when the block ends, so they are no longer visible outside it.
Shadowing: Supports declaring the same variable name in different scopes.
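The three behaviours above can be sketched with a stack of dictionaries. This is a minimal illustration, not the project's actual implementation: the class name `ScopedSymbolTable`, its method names, and the choice to reject a redeclaration inside the same scope are all assumptions made for the example.

```python
class ScopedSymbolTable:
    """Stack of dictionaries; the last entry is the innermost scope (hypothetical design)."""

    def __init__(self):
        self.scopes = [{}]  # start with a single outermost scope

    def enter_scope(self):
        # Called when the parser sees '{'.
        self.scopes.append({})

    def exit_scope(self):
        # Called on '}': dropping the innermost dict discards the block's names.
        self.scopes.pop()

    def declare(self, name, var_type):
        scope = self.scopes[-1]
        if name in scope:
            # Assumption: redeclaring within the SAME scope is an error.
            raise NameError(f"'{name}' already declared in this scope")
        scope[name] = var_type

    def lookup(self, name):
        # Search innermost to outermost, so inner names shadow outer ones.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"Undeclared Variable: '{name}'")


table = ScopedSymbolTable()
table.declare("a", "int")
table.enter_scope()
table.declare("b", "int")
table.lookup("a")      # parent variable is visible from the inner block
table.exit_scope()
# table.lookup("b") would now raise NameError, since b was popped with its block
```

Searching from the innermost scope outward is what makes shadowing fall out naturally: the first match wins, and the outer binding becomes visible again once the inner scope is popped.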
2. Code Optimization (Constant Folding)
To improve performance, the compiler performs Compile-Time Evaluation.
Example: int x = 10 + 20 * 2; is folded to x = 50 before the TAC is generated, so no arithmetic for it is executed at run time.
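A rough sketch of how such folding could work on an AST is shown below. The tuple-based node shapes (`("num", value)`, `("var", name)`, `("binop", op, left, right)`) and the function name `fold` are assumptions for illustration; the project's real AST classes may look different. Division is left out to keep the sketch free of divide-by-zero handling.

```python
import operator

# Operators the sketch knows how to evaluate at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Return an equivalent AST with constant subtrees replaced by literals."""
    kind = node[0]
    if kind in ("num", "var"):
        return node
    # kind == "binop": fold the children first, then try to combine them.
    _, op, left, right = node
    left, right = fold(left), fold(right)
    if left[0] == "num" and right[0] == "num":
        return ("num", OPS[op](left[1], right[1]))
    return ("binop", op, left, right)


# 10 + 20 * 2, with * binding tighter than +
expr = ("binop", "+", ("num", 10), ("binop", "*", ("num", 20), ("num", 2)))
print(fold(expr))  # ('num', 50)
```

Because the children are folded before the parent, a partially constant expression such as a + 2 * 3 still benefits: only the 2 * 3 subtree collapses, and the addition involving the variable is kept.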
3. Multi-Stage Output
The compiler prints each intermediate form so that the transformation of the code can be inspected:
TAC: Machine-independent intermediate representation.
Simple ASM: Accumulator-based assembly instructions (LOAD, STORE, MUL, ADD).
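To make the relationship between the two outputs concrete, here is a small sketch of lowering TAC to accumulator-style instructions. It uses only the four mnemonics listed above; the TAC tuple layout `(dest, left, op, right)`, the operand syntax, and the function name `tac_to_asm` are assumptions, not the project's actual format.

```python
# Map TAC operators to the mnemonics named in this README.
MNEMONIC = {"+": "ADD", "*": "MUL"}


def tac_to_asm(tac):
    """Lower (dest, left, op, right) tuples to accumulator instructions."""
    asm = []
    for dest, left, op, right in tac:
        asm.append(f"LOAD {left}")                 # accumulator = left
        if op is not None:
            asm.append(f"{MNEMONIC[op]} {right}")  # accumulator = accumulator op right
        asm.append(f"STORE {dest}")                # dest = accumulator
    return asm


tac = [
    ("t1", "b", "*", "c"),   # t1 = b * c
    ("x", "a", "+", "t1"),   # x = a + t1
]
for line in tac_to_asm(tac):
    print(line)
# LOAD b
# MUL c
# STORE t1
# LOAD a
# ADD t1
# STORE x
```

Every TAC statement becomes a LOAD, an optional arithmetic instruction, and a STORE, which is the usual shape for a single-accumulator machine. A plain copy is represented here with `op` set to `None`.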
Compiler Pipeline
The compiler follows a structured 6-phase pipeline:
Lexical Analysis: Tokenizes the input string.
Syntax Analysis: Builds an Abstract Syntax Tree (AST) while enforcing grammar.
Semantic Analysis: Manages the Scoped Symbol Table and checks for "Undeclared Variable" errors.
Optimization: Performs Constant Folding on the AST.
Intermediate Code Generation (ICG): Produces Three Address Code (TAC).
Code Generation: Produces Target Assembly.
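As an illustration of the first phase, the sketch below tokenizes the kind of input used in this README with a single regular expression. The token names, the keyword set, and the function name `tokenize` are assumptions chosen for the example and may not match the project's lexer.

```python
import re

# Order matters: earlier patterns win when several could match.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),
    ("NUMBER",  r"\d+"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("OP",      r"[+\-*/=]"),
    ("PUNCT",   r"[{}();]"),
    ("SKIP",    r"\s+"),
    ("ERROR",   r"."),
]
KEYWORDS = {"int", "print"}  # assumed keyword set, taken from the examples
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn a source string into a list of (kind, text) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind in ("SKIP", "COMMENT"):
            continue  # whitespace and comments carry no meaning for the parser
        if kind == "ERROR":
            raise SyntaxError(f"Unexpected character {text!r}")
        if kind == "ID" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


print(tokenize("int a = 10;"))
# [('KEYWORD', 'int'), ('ID', 'a'), ('OP', '='), ('NUMBER', '10'), ('PUNCT', ';')]
```

The resulting token list is what the syntax analysis phase would consume to build the AST.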
Usage & Examples
Nested Scope Test
Input:
C++
{
    int a = 10;
    {
        int b = 20;
        print(a + b);
    }
    print(b); // This will trigger a Semantic Error
}
Optimization Test
Input:
C++
int result = 5 * 10 + 2;
Output (TAC):
Plaintext
result = 52
Technical Details
Architecture: Accumulator-based.
Language: Python.
Error Handling: Detects Syntax Errors (missing semicolons/assignments) and Semantic Errors (scope violations).
Author
Tareque Rahman
4th Year, CSE
Sylhet International University (SIU)