- About this project
- Tokens
- How Lexical Analyzer functions
- How to Run this Project
- Assumptions for Subset of C++
- Sample Source Code
- Screen Shots
This project is a lexical analyzer written in C++. Lexical analysis, also known as scanning, is the first phase of a compiler. It converts the high-level input program into a sequence of tokens.
A lexical token is a sequence of characters that can be treated as a unit in the grammar of a programming language.
Examples of tokens:
- Type tokens (id, number, real, ...)
- Punctuation tokens (;, {, }, ...)
- Alphabetic tokens (keywords such as if, void, return, ...)
The lexical analyzer performs the following tasks:
- Tokenization, i.e. dividing the program into valid tokens.
- Removing white space characters.
- Removing comments.
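The three tasks above can be sketched in a single pass over the input. This is only an illustrative sketch, not the project's actual code: the function name `splitLexemes` is hypothetical, and it assumes that comments are `//` line comments (as in the sample source) and that `<<` and `>>` are the only two-character operators.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not taken from the project): splits source text into
// raw lexemes while dropping white space and // line comments.
std::vector<std::string> splitLexemes(const std::string& src) {
    std::vector<std::string> lexemes;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;  // remove white space
        } else if (c == '/' && i + 1 < src.size() && src[i + 1] == '/') {
            // remove a comment: skip everything up to the end of the line
            while (i < src.size() && src[i] != '\n') ++i;
        } else if (std::isalnum(c) || c == '_') {
            // a run of letters, digits or underscores forms one lexeme
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            lexemes.push_back(src.substr(start, i - start));
        } else if ((c == '<' || c == '>') && i + 1 < src.size() && src[i + 1] == src[i]) {
            lexemes.push_back(src.substr(i, 2));  // two-character operator << or >>
            i += 2;
        } else {
            lexemes.push_back(std::string(1, src[i]));  // any other single character
            ++i;
        }
    }
    return lexemes;
}
```

Calling `splitLexemes` on the text of a source file would return the lexemes in order of appearance, with the white space and comments already gone. Each lexeme still has to be assigned a token category afterwards.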
To run this project:
- Clone the project using the command:
git clone https://github.com/Akshit6828/Lexical-Analyzer.git
- Change directory to Lexical-Analyzer using the command:
cd Lexical-Analyzer
- Make a text file in this folder and write C++ source code in it. Alternatively, try the sample source code listed under Sample Source Code.
- Open the Lex.exe file by double-clicking it.
- You'll see all the tokens associated with the source code.
Note: This is a lexical analyzer for a particular subset of the C++ language, explained under Assumptions. It may not be able to recognize every token of the C++ language.
Assumptions for the subset of C++:
- Special Symbol: ; { } ( ) , #
- Keyword: int, char, float, bool, cin, cout, main
- Pre-processor Directives: include, define
- Library: iostream, stdio, string
- Operators: *, +, >>, <<, >, <
- Numbers/Integers: 0 to 9.
- Identifiers/Variables: All alphabetic strings except the keywords, numbers, pre-processor directives and library strings.
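The assumptions above amount to a lookup: each lexeme is checked against the fixed sets first, and anything alphabetic that is left over is an identifier. The following is a minimal sketch of that idea, not the project's implementation. The function name `classify` and the returned category labels are hypothetical, and the library set assumes that the C standard I/O header `stdio` is meant.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_set>

// Hypothetical classifier (not taken from the project): maps one lexeme to a
// token category according to the assumptions listed above.
std::string classify(const std::string& lexeme) {
    static const std::unordered_set<std::string> symbols = {";", "{", "}", "(", ")", ",", "#"};
    static const std::unordered_set<std::string> keywords = {
        "int", "char", "float", "bool", "cin", "cout", "main"};
    static const std::unordered_set<std::string> directives = {"include", "define"};
    // "stdio" is an assumption here: the C standard I/O header name.
    static const std::unordered_set<std::string> libraries = {"iostream", "stdio", "string"};
    static const std::unordered_set<std::string> operators = {"*", "+", ">>", "<<", ">", "<"};

    if (lexeme.empty()) return "Unknown";
    if (symbols.count(lexeme)) return "Special Symbol";
    if (keywords.count(lexeme)) return "Keyword";
    if (directives.count(lexeme)) return "Pre-processor Directive";
    if (libraries.count(lexeme)) return "Library";
    if (operators.count(lexeme)) return "Operator";

    auto isDigit = [](unsigned char ch) { return std::isdigit(ch) != 0; };
    auto isAlpha = [](unsigned char ch) { return std::isalpha(ch) != 0; };
    if (std::all_of(lexeme.begin(), lexeme.end(), isDigit)) return "Number";
    // Alphabetic strings that matched none of the fixed sets are identifiers.
    if (std::all_of(lexeme.begin(), lexeme.end(), isAlpha)) return "Identifier";
    return "Unknown";
}
```

The order of the checks matters: the fixed sets are tested before the alphabetic check, so `int` is reported as a keyword and not as an identifier. Under this sketch, words outside the listed sets, such as `using` or `LIMIT`, fall through to the identifier category.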
Sample source code (MySourceCode.txt):
#include <iostream>
#define LIMIT 5
using namespace std ;
int main(){
// this comment is written by akshit mangotra for lexical analyzer to avoid reading the comments
int A , B ;
cin >> A >> B;
cout << A * B ;
}