This senior project is a thesis search system for Computer Science students. It provides document indexing, TF-IDF-based search, and user account management. Go to Frontend Link Here!
- Simple Search: String Matching.
- Advanced Search: TF-IDF (Term Frequency-Inverse Document Frequency) based search.
- Parsing: Extracts text from document files.
- Tokenization: Splits text into tokens (words/phrases).
- Stop Word Removal: Removes common words that carry little meaning (stop words) and extra whitespace.
- Term Frequency (TF): Measures word frequency within a document.
- Inverse Document Frequency (IDF): Measures how rare a word is across all documents; rarer words receive more weight.
- TF-IDF: Multiplies TF by IDF to weight each word's importance to a document for search.
Results are ranked based on their TF-IDF scores to provide the most relevant results to the user.
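The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual implementation: it assumes the text has already been extracted from the document files, and the stop-word list, the function names, and the smoothed IDF formula `log((1 + N) / (1 + df)) + 1` are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the project's real list is not shown in this README.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "is", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequency(tokens):
    """Relative frequency of each term within one document."""
    total = len(tokens)
    if total == 0:
        return {}
    return {term: n / total for term, n in Counter(tokens).items()}


def inverse_document_frequency(docs_tokens):
    """Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))  # count each term once per document
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def search(query, documents):
    """Return (document index, score) pairs, highest TF-IDF score first."""
    docs_tokens = [remove_stop_words(tokenize(d)) for d in documents]
    idf = inverse_document_frequency(docs_tokens)
    query_terms = remove_stop_words(tokenize(query))
    results = []
    for index, tokens in enumerate(docs_tokens):
        tf = term_frequency(tokens)
        score = sum(tf.get(t, 0.0) * idf.get(t, 0.0) for t in query_terms)
        if score > 0:  # skip documents that contain none of the query terms
            results.append((index, score))
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    theses = [
        "Deep learning for image classification",
        "A study of database indexing",
        "Image retrieval with deep features and image hashing",
    ]
    for index, score in search("image", theses):
        print(index, round(score, 3))
```

Summing TF-IDF weights over the query terms is one simple scoring choice; cosine similarity between query and document vectors is a common alternative.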
- Sign-up
- Log-in/Log-out
- Edit Profile
- Change Password
- Reset Password
- Clone the repository:
git clone https://github.com/mskerz/thesis_search_backend.git
cd thesis_search_backend
- Create and activate a virtual environment (isolating dependencies within the project):
- For Windows:
python -m venv venv
venv\Scripts\activate
- For Linux/MacOS:
python3 -m venv venv
source venv/bin/activate
- Install dependencies (these will be installed in the virtual environment):
pip install -r requirements.txt
- Run the server:
uvicorn main:app --reload
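In the command above, `main:app` tells uvicorn to import the module `main` and serve the ASGI application object named `app`, while `--reload` restarts the server when source files change. The sketch below shows what such an object looks like in its barest form. It is a hypothetical, dependency-free stand-in and not the project's `main.py`, which most likely builds `app` with a web framework; the JSON payload and the `call_app` helper are invented for illustration.

```python
import asyncio
import json


async def app(scope, receive, send):
    """Hypothetical stand-in for the `app` object that `uvicorn main:app` loads."""
    if scope["type"] != "http":
        return  # ignore non-HTTP scopes in this sketch
    body = json.dumps({"status": "ok", "path": scope["path"]}).encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


def call_app(path):
    """Drive the app once without a server and collect the messages it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    asyncio.run(app(scope, receive, send))
    return sent


if __name__ == "__main__":
    for message in call_app("/health"):
        print(message)
```

Saved as `main.py`, a file like this could be served with the same `uvicorn main:app --reload` command.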