
A simple movie retrieval system that displays a searchable list of movies. Users can search by title, summaries, stars, directors, and genres. The page dynamically fetches and filters movies based on the entered text, providing relevant results quickly. Ideal for finding movies by cast, genre, or keywords in their summaries.


AHNakbari/retrieval-system-for-movies


MIR-2024-Project


This is the repository for the Modern Information Retrieval course, taught by Dr. Mahdieh Soleymani Baghshah at Sharif University of Technology.

What is this project about?

One way to compare movies and decide which one suits you best is to use websites built for this purpose, backed by appropriate information retrieval methods.

In this phase of the project, we begin building an information retrieval system for the IMDb website. We crawl the required data from IMDb and preprocess it. IMDb has one of the richest movie datasets, including ratings, comments, actors, and more.

A note on forking this repository

Create a private repository in your personal GitHub account so that you can push your answers and the TAs can track your work. In this repository, choose Use this Template and then Create a new repository. Make sure the new repository is private.

To pull new changes and files from our main repository into your own, add this repository as a remote:

git remote add template [URL of the template repo]

Then run git fetch whenever you want to download the latest changes (fetching does not modify your working branch, so merge the fetched template branch afterwards to apply them):

git fetch --all

General Structure

The project contains 2 main modules: Logic and UI. The Logic module is responsible for doing the main tasks of the project and the UI module is responsible for providing a user interface for the user to interact with the system. In each task, you will be told to implement a part or a whole file in one of these modules. Please read the comments for each file and functions inside it to understand what you need to do.

Crawled Data

Raw crawled data for IMDb movies (which you should have collected yourself in Phase 1) is available here.

Contributions

Please create a new issue whenever you find a problem or have a suggestion regarding this project. You can also create PRs for issues with the "student" label.

How to run and test the code

  • First, download the required files:
    • Find the links in 'Logic/tests/file links'.
    • Download every file listed there.
    • Place each file at the location given for it in that text file.
  • Then run the Python file test_phase1.py to test everything in the first phase.
  • To launch the UI for the first phase, run these two commands in a PowerShell terminal, replacing the example path with the location of your own copy of the project:
$env:PYTHONPATH += ";E:\MIR\Project\MIR-Project"
streamlit run UI\main.py
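The two commands above use PowerShell syntax and a path specific to one machine. As a rough, platform-independent alternative, the same two steps can be sketched in Python. This is only an illustration, not part of the repository: the function name build_launch is hypothetical, and it assumes Streamlit is installed and that the UI entry point is UI/main.py under the project root, as in the command above.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_launch(project_root, base_env=None):
    """Return (command, env) for starting the UI with project_root on PYTHONPATH.

    build_launch is a hypothetical helper, not something the repository provides.
    """
    root = Path(project_root).resolve()
    # Copy the environment so the caller's mapping is never modified.
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    # os.pathsep is ";" on Windows and ":" elsewhere, mirroring the ";" in the
    # PowerShell command above.
    env["PYTHONPATH"] = os.pathsep.join(p for p in (existing, str(root)) if p)
    # "python -m streamlit run <file>" is equivalent to "streamlit run <file>".
    command = [sys.executable, "-m", "streamlit", "run", str(root / "UI" / "main.py")]
    return command, env


def launch_ui(project_root):
    """Start the Streamlit UI and return its exit code."""
    command, env = build_launch(project_root)
    return subprocess.call(command, env=env)
```

Calling launch_ui(".") from the project root would then start the UI without hard-coding a drive letter or path separator.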
