mskerz/thesis_search_backend

Thesis Search System

This senior project is a thesis search system for Computer Science students. It provides document indexing, TF-IDF-based search, and user account management. A companion frontend is also available.

System Scope

1. Searching

  • Simple Search: String Matching.
  • Advanced Search: TF-IDF (Term Frequency-Inverse Document Frequency) based search.
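
As a rough illustration of the simple search mode, the sketch below does case-insensitive substring matching over an in-memory collection. The function name `simple_search` and the `documents` mapping are hypothetical and are not taken from the project's code.

```python
def simple_search(query: str, documents: dict[str, str]) -> list[str]:
    """Return the IDs of documents whose text contains the query string.

    Matching is case-insensitive; `documents` maps a document ID to its text.
    """
    needle = query.casefold()
    return [doc_id for doc_id, text in documents.items() if needle in text.casefold()]


# Hypothetical sample data for illustration only.
docs = {
    "t1": "Image classification with convolutional neural networks",
    "t2": "A search engine for thesis documents",
}
print(simple_search("search", docs))  # ['t2']
```

String matching only tells you whether a document contains the query; it cannot say which match is more relevant, which is what the TF-IDF mode addresses.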

2. Indexing

  • Parsing: Extracts text from document files.
  • Tokenization: Splits text into tokens (words/phrases).
  • Stop Word Removal: Removes common words that carry little meaning (stop words) and extra whitespace.
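
The tokenization and stop-word steps above can be sketched as follows. This assumes English text and a tiny hand-written stop-word list. The project's actual tokenizer, stop-word list, and document parser are not shown in this README, so every name here is hypothetical, and the parsing step is left out because it depends on the file format.

```python
import re

# Tiny illustrative stop-word list (an assumption); a real index would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to", "with"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens.

    Whitespace and punctuation are discarded because they never match the pattern.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens: list[str]) -> list[str]:
    """Drop every token that appears in STOP_WORDS."""
    return [token for token in tokens if token not in STOP_WORDS]


tokens = remove_stop_words(tokenize("A Search  Engine for the Thesis Archive"))
print(tokens)  # ['search', 'engine', 'thesis', 'archive']
```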

3. Scoring

  • Term Frequency (TF): Measures how often a word occurs within a document.
  • Inverse Document Frequency (IDF): Measures how rare, and therefore how distinctive, a word is across all documents.
  • TF-IDF: Combines TF and IDF into a per-document importance score for each word, used for search.
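
A minimal sketch of the three scores, assuming one common variant: TF as the term count divided by document length, and IDF as the natural log of (number of documents / number of documents containing the term). TF-IDF has several variants and this README does not say which one the project uses; the function names and sample corpus are hypothetical.

```python
import math
from collections import Counter


def term_frequency(tokens: list[str]) -> dict[str, float]:
    """TF: share of the document's tokens that are this term."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(corpus: dict[str, list[str]]) -> dict[str, float]:
    """IDF: log(N / df), where df is the number of documents containing the term."""
    n_docs = len(corpus)
    doc_freq = Counter(term for tokens in corpus.values() for term in set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(corpus: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Per-document TF-IDF score for every term in that document."""
    idf = inverse_document_frequency(corpus)
    return {
        doc_id: {term: tf * idf[term] for term, tf in term_frequency(tokens).items()}
        for doc_id, tokens in corpus.items()
    }


# Hypothetical, already-tokenized corpus.
corpus = {
    "t1": ["neural", "network", "image"],
    "t2": ["search", "engine", "thesis"],
    "t3": ["thesis", "search", "search"],
}
scores = tf_idf(corpus)
```

With this variant, a term that appears in every document gets an IDF of zero, so it contributes nothing to the score.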

4. Ranking

Results are ranked based on their TF-IDF scores to provide the most relevant results to the user.
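
One plausible way to turn per-term scores into a ranking is to sum the TF-IDF scores of the query terms for each document and sort in descending order. The sketch below assumes that approach; the `rank` function, its `top_k` parameter, and the sample scores are hypothetical, and the project may combine scores differently (for example with cosine similarity).

```python
def rank(
    query_terms: list[str],
    scores: dict[str, dict[str, float]],
    top_k: int = 10,
) -> list[tuple[str, float]]:
    """Rank documents by the sum of their TF-IDF scores over the query terms.

    `scores` maps document ID -> {term: TF-IDF score}. Documents that match
    none of the query terms are left out of the result.
    """
    totals = {}
    for doc_id, term_scores in scores.items():
        total = sum(term_scores.get(term, 0.0) for term in query_terms)
        if total > 0:
            totals[doc_id] = total
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical precomputed TF-IDF scores.
scores = {
    "t1": {"neural": 0.37, "network": 0.37},
    "t2": {"search": 0.14, "thesis": 0.14},
    "t3": {"search": 0.27, "thesis": 0.14},
}
ranked_ids = [doc_id for doc_id, _ in rank(["thesis", "search"], scores)]
print(ranked_ids)  # ['t3', 't2']
```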

5. User Account Management

  • Sign-up
  • Log-in/Log-out
  • Edit Profile
  • Change Password
  • Reset Password

Tech Stack - Backend

Frontend

Installation

  1. Clone the repository:
    git clone https://github.com/mskerz/thesis_search_backend.git
    cd thesis_search_backend
    
  2. Create and activate a virtual environment (isolating dependencies within the project):
    • For Windows:
      python -m venv venv
      venv\Scripts\activate
      
    • For Linux/MacOS:
      python3 -m venv venv
      source venv/bin/activate
  3. Install dependencies (these will be installed in the virtual environment):
    pip install -r requirements.txt
    
  4. Run the server:
    uvicorn main:app --reload
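
For readers unfamiliar with the `main:app` argument: Uvicorn imports the module `main` and serves the ASGI application object named `app` inside it. The project's own `main.py` is not reproduced in this README. The stand-alone sketch below is a bare ASGI callable with no framework, meant only to show the shape of object Uvicorn expects; the response text is made up.

```python
async def app(scope, receive, send):
    """Bare ASGI application: answers every HTTP request with a plain-text body."""
    if scope["type"] != "http":
        # Ignore non-HTTP scopes (for example lifespan or websocket) in this sketch.
        return
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"text/plain")],
        }
    )
    await send(
        {
            "type": "http.response.body",
            "body": b"thesis search backend is running",
        }
    )
```

The `--reload` flag restarts the server whenever source files change, which is convenient during development but not intended for production use.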
    
