Cimple C Compiler (ccc)
A Flex, Bison, and LLVM-based compiler for a subset of the C language. It generates LLVM IR.
The compiler supports the following C keywords:
- if
- while
- for
- break
- continue
- return
It supports the following types:
- int
- float
- bool (built-in true and false)
- void
It supports the following operations:
- Comparisons (!=, ==, <, >, <=, >=)
- Addition (+, +=)
- Subtraction (-, -=)
- Multiplication (*, *=)
- Division (/, /=)
- Modulo (%)
- Logical and (&&), logical or (||)
- Ternary expression (condition ? true expression : false expression)
- Bitwise operators: and (&), or (|), xor (^), and not (~)
- Shift operators << and >>
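To make the supported subset concrete, here is a small sketch that stays within the keywords, types, and operators listed above. The function names are hypothetical, and whether ccc accepts this exact text (for example the C-style cast) is an assumption. The include line is only there so that a standard C compiler also accepts it. A complete program would additionally need a main function, since the verification pass checks for one.

```c
#include <stdbool.h> /* needed only by standard C compilers before C23; the README lists bool as built in */

/* Greatest common divisor: while loop, != comparison, modulo. */
int gcd(int a, int b) {
    while (b != 0) {
        int rest = a % b;
        a = b;
        b = rest;
    }
    return a;
}

/* Larger of two values: ternary expression. */
int max_of(int a, int b) {
    return a > b ? a : b;
}

/* Number of set bits among the low 8 bits: for loop, shift, bitwise and. */
int popcount8(int x) {
    int count = 0;
    int i = 0;
    for (i = 0; i < 8; i += 1) {
        if (((x >> i) & 1) == 1) {
            count += 1;
        }
    }
    return count;
}

/* Sum of the odd numbers below limit, stopping once the sum exceeds cap:
   continue, break, and compound assignment. */
int sum_odd_below(int limit, int cap) {
    int sum = 0;
    int i = 0;
    for (i = 0; i < limit; i += 1) {
        if (i % 2 == 0) {
            continue;
        }
        if (sum > cap) {
            break;
        }
        sum += i;
    }
    return sum;
}

/* Mean of two ints: cast from int to float, division. */
float mean(int a, int b) {
    return (float)(a + b) / 2.0;
}

/* bool result built from comparisons and logical and. */
bool in_range(int x, int low, int high) {
    return x >= low && x <= high;
}
```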
ccc consists of the following components:
Preprocessor
- Removes comments from the original source code
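Comment removal can be sketched as a small state machine. This is not ccc's actual preprocessor: the function name strip_comments is hypothetical, and replacing a block comment with a single space and keeping newlines are assumptions borrowed from standard C behaviour. String literals are ignored because the README lists no string or char type.

```c
#include <stddef.h>

/* Copies src to dst with comments removed. dst must hold strlen(src) + 1 bytes.
   Newlines inside comments are kept so that line numbers reported later
   still match the original file. */
void strip_comments(const char *src, char *dst) {
    enum { CODE, LINE_COMMENT, BLOCK_COMMENT } state = CODE;
    size_t out = 0;
    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];
        char next = src[i + 1]; /* at worst the terminating '\0' */
        switch (state) {
        case CODE:
            if (c == '/' && next == '/') {
                state = LINE_COMMENT;
                i++; /* skip the second '/' */
            } else if (c == '/' && next == '*') {
                state = BLOCK_COMMENT;
                dst[out++] = ' '; /* a block comment acts as one space */
                i++; /* skip the '*' */
            } else {
                dst[out++] = c;
            }
            break;
        case LINE_COMMENT:
            if (c == '\n') {
                state = CODE;
                dst[out++] = c;
            }
            break;
        case BLOCK_COMMENT:
            if (c == '*' && next == '/') {
                state = CODE;
                i++; /* skip the closing '/' */
            } else if (c == '\n') {
                dst[out++] = c;
            }
            break;
        }
    }
    dst[out] = '\0';
}
```

The output is never longer than the input, so a buffer of the same size as the source is enough.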
Lexer
- Uses Flex to split the preprocessed file into lexemes
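ccc relies on Flex for this step. The hand-written scanner below is only a stand-in that illustrates what splitting source text into lexemes means. The token kinds, the Token struct, and next_token are hypothetical names. The keyword table is taken from the keyword and type lists above, and treating true and false as keywords is an assumption.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { TOK_KEYWORD, TOK_IDENT, TOK_INT, TOK_FLOAT, TOK_SYMBOL, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    char text[32]; /* the lexeme; longer lexemes are truncated in this sketch */
} Token;

/* Reads one token starting at *cursor and advances *cursor past it. */
Token next_token(const char **cursor) {
    static const char *keywords[] = {"if", "while", "for", "break", "continue", "return",
                                     "int", "float", "bool", "void", "true", "false"};
    static const char *two_char[] = {"!=", "==", "<=", ">=", "+=", "-=", "*=", "/=",
                                     "&&", "||", "<<", ">>"};
    const char *p = *cursor;
    Token tok = {TOK_END, ""};
    size_t len = 0;

    while (isspace((unsigned char)*p)) p++;
    if (*p == '\0') {
        *cursor = p;
        return tok;
    }

    if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)p[len]) || p[len] == '_') len++;
        tok.kind = TOK_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)p[len])) len++;
        tok.kind = TOK_INT;
        if (p[len] == '.' && isdigit((unsigned char)p[len + 1])) {
            len++; /* consume the decimal point */
            while (isdigit((unsigned char)p[len])) len++;
            tok.kind = TOK_FLOAT;
        }
    } else {
        /* Prefer a two-character operator over a one-character symbol. */
        len = 1;
        for (size_t i = 0; i < sizeof two_char / sizeof two_char[0]; i++)
            if (strncmp(p, two_char[i], 2) == 0) len = 2;
        tok.kind = TOK_SYMBOL;
    }

    size_t copy = len < sizeof tok.text ? len : sizeof tok.text - 1;
    memcpy(tok.text, p, copy);
    tok.text[copy] = '\0';
    if (tok.kind == TOK_IDENT)
        for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
            if (strcmp(tok.text, keywords[i]) == 0) tok.kind = TOK_KEYWORD;
    *cursor = p + len;
    return tok;
}
```

Calling next_token repeatedly until it returns TOK_END yields the token stream that a parser would consume.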
Parser
- Uses Bison to parse the lexemes according to the grammar rules
- Generates an abstract syntax tree (AST) that is used for further analysis and transformations
Verification
- Traverses our AST and performs semantic analysis of the program to ensure correctness
- Errors checked:
  - Missing main function
  - Redeclaration of a variable in the same scope
  - Type mismatch in a variable assignment or function return
  - Type mismatch in a binary, logical, or relational operation
  - Use of an undeclared variable or function
  - Missing return value in a non-void function
  - Redefinition of a function (multiple declarations are OK)
  - Returned type that does not match the function's declared type
  - If, for, while, or ternary condition that does not evaluate to a boolean
  - Cast between types other than int and float
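Two of the checks above, redeclaration in the same scope and use of an undeclared name, can be sketched with a scoped symbol table. This is an illustration under assumptions, not ccc's implementation: SymbolTable, declare, is_declared, enter_scope, and leave_scope are hypothetical names, and allowing an inner scope to shadow an outer name is assumed from the wording "in the same scope".

```c
#include <stdbool.h>
#include <string.h>

#define MAX_SYMBOLS 64

typedef struct {
    const char *name;
    int depth; /* scope nesting depth at which the name was declared */
} Symbol;

typedef struct {
    Symbol symbols[MAX_SYMBOLS]; /* used as a stack, innermost scope on top */
    int count;
    int depth; /* current scope depth, 0 is the global scope */
} SymbolTable;

void enter_scope(SymbolTable *t) {
    t->depth++;
}

/* Leaving a scope drops every symbol that was declared inside it. */
void leave_scope(SymbolTable *t) {
    while (t->count > 0 && t->symbols[t->count - 1].depth == t->depth) t->count--;
    t->depth--;
}

/* True if name is visible from the current scope or any enclosing scope. */
bool is_declared(const SymbolTable *t, const char *name) {
    for (int i = t->count - 1; i >= 0; i--)
        if (strcmp(t->symbols[i].name, name) == 0) return true;
    return false;
}

/* Returns false if name already exists in the current scope or the table is
   full. Shadowing a name from an outer scope is allowed. */
bool declare(SymbolTable *t, const char *name) {
    for (int i = t->count - 1; i >= 0 && t->symbols[i].depth == t->depth; i--)
        if (strcmp(t->symbols[i].name, name) == 0) return false;
    if (t->count == MAX_SYMBOLS) return false;
    t->symbols[t->count].name = name;
    t->symbols[t->count].depth = t->depth;
    t->count++;
    return true;
}
```

A verification pass would call enter_scope and leave_scope around each block while walking the AST, report a redeclaration when declare returns false, and report an undeclared name when is_declared returns false.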
Optimization
- Any binary, unary, or relational operation whose operands are all constants is folded into a single constant at compile time
- If statements with a constant predicate are removed entirely when the predicate is always false, or executed unconditionally when it is always true
- Ternary operations with a constant predicate are replaced with either their true expression or their false expression
- While statements whose condition is constantly false are removed entirely
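The folding rules above can be sketched on a tiny expression tree. The Node layout and all function names are hypothetical, and the sketch covers only int arithmetic and the ternary rule. It leaves division or modulo by a constant zero unfolded and does not handle integer overflow.

```c
#include <stdlib.h>

typedef enum { NODE_CONST, NODE_VAR, NODE_BINARY, NODE_TERNARY } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;              /* NODE_CONST */
    char op;                /* NODE_BINARY: '+', '-', '*', '/', '%' */
    struct Node *a, *b, *c; /* operands; a ternary is a ? b : c */
} Node;

static Node *new_node(NodeKind kind) {
    Node *n = calloc(1, sizeof *n);
    if (n == NULL) abort();
    n->kind = kind;
    return n;
}

Node *make_const(int v) { Node *n = new_node(NODE_CONST); n->value = v; return n; }
Node *make_var(void) { return new_node(NODE_VAR); }
Node *make_binary(char op, Node *l, Node *r) {
    Node *n = new_node(NODE_BINARY);
    n->op = op; n->a = l; n->b = r;
    return n;
}
Node *make_ternary(Node *cond, Node *t, Node *f) {
    Node *n = new_node(NODE_TERNARY);
    n->a = cond; n->b = t; n->c = f;
    return n;
}

void free_tree(Node *n) {
    if (n == NULL) return;
    free_tree(n->a); free_tree(n->b); free_tree(n->c);
    free(n);
}

/* Folds constants bottom-up and returns the (possibly replaced) subtree root. */
Node *fold(Node *n) {
    if (n == NULL) return NULL;
    n->a = fold(n->a); n->b = fold(n->b); n->c = fold(n->c);
    if (n->kind == NODE_BINARY && n->a->kind == NODE_CONST && n->b->kind == NODE_CONST) {
        int l = n->a->value, r = n->b->value, result;
        switch (n->op) {
        case '+': result = l + r; break;
        case '-': result = l - r; break;
        case '*': result = l * r; break;
        case '/': if (r == 0) return n; result = l / r; break;
        case '%': if (r == 0) return n; result = l % r; break;
        default: return n;
        }
        free_tree(n->a); free_tree(n->b);
        n->kind = NODE_CONST; n->value = result; n->a = n->b = NULL;
        return n;
    }
    if (n->kind == NODE_TERNARY && n->a->kind == NODE_CONST) {
        /* Keep only the branch selected by the constant condition. */
        Node *keep = n->a->value ? n->b : n->c;
        Node *drop = n->a->value ? n->c : n->b;
        free_tree(n->a); free_tree(drop); free(n);
        return keep;
    }
    return n;
}
```

Because children are folded before their parent, a tree such as (2 + 3) * 4 collapses to a single constant node, while x + (1 * 2) only has its right operand folded.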
Code generation
- Traverses the AST and emits the appropriate LLVM IR
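As an illustration of emitting IR from a tree walk, the sketch below prints textual LLVM IR for an integer expression. ccc itself presumably drives LLVM through its API, so this hand-rolled printer is a swapped-in technique. The Expr type, the function name @f, and its single parameter %x are hypothetical.

```c
#include <stdio.h>

typedef enum { EXPR_CONST, EXPR_PARAM, EXPR_BINARY } ExprKind;

typedef struct Expr {
    ExprKind kind;
    int value;                    /* EXPR_CONST */
    const char *opcode;           /* EXPR_BINARY: "add", "sub", "mul", "sdiv", "srem" */
    const struct Expr *lhs, *rhs; /* EXPR_BINARY operands */
} Expr;

/* Emits the instructions for e (post-order) and writes the operand that
   names its result, a literal or a register, into result. */
static void emit_expr(const Expr *e, FILE *out, int *next_temp, char *result, size_t size) {
    if (e->kind == EXPR_CONST) {
        snprintf(result, size, "%d", e->value);
        return;
    }
    if (e->kind == EXPR_PARAM) {
        snprintf(result, size, "%%x");
        return;
    }
    char lhs[32], rhs[32];
    emit_expr(e->lhs, out, next_temp, lhs, sizeof lhs);
    emit_expr(e->rhs, out, next_temp, rhs, sizeof rhs);
    snprintf(result, size, "%%t%d", (*next_temp)++);
    fprintf(out, "  %s = %s i32 %s, %s\n", result, e->opcode, lhs, rhs);
}

/* Wraps the expression in a function that returns its value.
   For (x + 3) * 2 this prints an add into %t0, a mul into %t1, and ret i32 %t1. */
void emit_function(const Expr *body, FILE *out) {
    char result[32];
    int next_temp = 0;
    fprintf(out, "define i32 @f(i32 %%x) {\n");
    emit_expr(body, out, &next_temp, result, sizeof result);
    fprintf(out, "  ret i32 %s\n}\n", result);
}
```

Passing stdout as the stream prints the IR to the terminal.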
Command line options
- --print-lex or -l, display the tokens generated by Flex
- --print-ir or -i, display the IR code generated by LLVM
- --print-ast or -a, display the abstract syntax tree generated by the parser
- --optimization-level NUM or -o NUM, set the optimization level; NUM is 0 (no optimization) or 1 (the default)
- --keep-preprocessed or -e, don't erase preprocessed file upon completion
- --help or -h, display help message
- --version or -v, display version information
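The options above map naturally onto getopt_long from <getopt.h> (available on GNU and most POSIX systems). Whether ccc parses its arguments this way is not stated, so the following is only a sketch. The Options struct and parse_options are hypothetical names, and taking the first non-option argument as the input file is an assumption.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool print_lex, print_ir, print_ast, keep_preprocessed, show_help, show_version;
    int optimization_level; /* 0 or 1, default 1 */
    const char *input;      /* assumed: first non-option argument, or NULL */
} Options;

/* Fills opts from argv. Returns 0 on success and -1 on an unknown option,
   a missing argument, or an optimization level other than 0 or 1. */
int parse_options(int argc, char **argv, Options *opts) {
    static const struct option long_opts[] = {
        {"print-lex", no_argument, NULL, 'l'},
        {"print-ir", no_argument, NULL, 'i'},
        {"print-ast", no_argument, NULL, 'a'},
        {"optimization-level", required_argument, NULL, 'o'},
        {"keep-preprocessed", no_argument, NULL, 'e'},
        {"help", no_argument, NULL, 'h'},
        {"version", no_argument, NULL, 'v'},
        {NULL, 0, NULL, 0},
    };
    *opts = (Options){.optimization_level = 1};
    optind = 1; /* restart scanning so the function can be called again */
    int c;
    while ((c = getopt_long(argc, argv, "liao:ehv", long_opts, NULL)) != -1) {
        switch (c) {
        case 'l': opts->print_lex = true; break;
        case 'i': opts->print_ir = true; break;
        case 'a': opts->print_ast = true; break;
        case 'e': opts->keep_preprocessed = true; break;
        case 'h': opts->show_help = true; break;
        case 'v': opts->show_version = true; break;
        case 'o':
            /* Accept exactly "0" or "1". */
            if ((optarg[0] != '0' && optarg[0] != '1') || optarg[1] != '\0') return -1;
            opts->optimization_level = optarg[0] - '0';
            break;
        default:
            return -1;
        }
    }
    if (optind < argc) opts->input = argv[optind];
    return 0;
}
```

Each short option shares its character code with the matching long option, so both spellings reach the same case label.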
Requirements:
- Bison 3.6.4
- CMake 3.18.2
- GCC 10.2.0
- Flex 2.6.4
- LLVM 10.0.1
Installation
Create a folder, clone the repository into it, make sure the requirements above are installed, and run make in ccc/src.