This project analyzes a movie dataset to surface insights about the global film industry, including performance metrics, trends, and success factors. The analysis is written as SQL queries and views in SQL Server Management Studio (SSMS) and reports findings across industries, languages, studios, actors, and financials.
a. Movies: Titles, industries, release years, IMDb ratings, studios, and languages.
b. Financials: Budgets and revenues in INR and USD, calculated in millions and billions.
c. Actors: Actor IDs, names, and their appearances in movies.
d. Languages: Movie languages and their associated IDs.
- Identify top-performing movies based on revenue, ROI, and IMDb ratings.
- Compare Bollywood and Hollywood in terms of financials, IMDb ratings, and output.
- Analyze language-specific trends in popularity, budgets, and ratings.
- Highlight the studios and actors driving movie success.
- Explore year-over-year trends in budgets, revenues, and ratings.
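
The year-over-year objective can be sketched with a per-year aggregate followed by a `LAG` window function. This is a minimal illustration, not the project's actual query: the table and column names (`movies`, `financials`, `release_year`, `budget`, `revenue`, `imdb_rating`) and the three rows are assumptions, and SQLite (through Python's `sqlite3`, which needs SQLite 3.25 or later for window functions) stands in for SQL Server so the example runs anywhere. The `WITH ... LAG(...) OVER (ORDER BY ...)` syntax is also valid T-SQL.

```python
import sqlite3

# Hypothetical schema and toy rows; not the project's real tables or data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, title TEXT,
                     release_year INTEGER, imdb_rating REAL);
CREATE TABLE financials (movie_id INTEGER, budget REAL, revenue REAL);
INSERT INTO movies VALUES (1, 'Movie A', 2020, 8.0),
                          (2, 'Movie B', 2020, 7.0),
                          (3, 'Movie C', 2021, 9.0);
INSERT INTO financials VALUES (1, 100, 300), (2, 50, 100), (3, 200, 500);
""")

yoy_sql = """
WITH yearly AS (
    -- One row per release year with totals and the average rating.
    SELECT m.release_year,
           SUM(f.budget)      AS total_budget,
           SUM(f.revenue)     AS total_revenue,
           AVG(m.imdb_rating) AS avg_rating
    FROM movies AS m
    JOIN financials AS f ON f.movie_id = m.movie_id
    GROUP BY m.release_year
)
SELECT release_year, total_budget, total_revenue, avg_rating,
       -- Change in revenue versus the previous year (NULL for the first year).
       total_revenue - LAG(total_revenue) OVER (ORDER BY release_year)
           AS revenue_change
FROM yearly
ORDER BY release_year;
"""
yoy_rows = conn.execute(yoy_sql).fetchall()
for row in yoy_rows:
    print(row)
```

Aggregating in a CTE first keeps the window function operating on one row per year, so `LAG` compares consecutive years and not consecutive movies.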
- Complex joins, aggregate functions, and window functions.
- Creation and use of views for reusable query logic.
- ROI (Return on Investment) = (Revenue - Budget) / Budget * 100.
- Industry-wise and language-wise financial performance.
- IMDb rating trends and averages by studios and languages.
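
As a rough sketch of how the ROI formula, a reusable view, and a window function fit together, the example below defines a view that computes ROI per movie and then ranks movies by it. The view name `movie_roi`, the table layout, and the three toy rows are hypothetical, and SQLite (via Python's `sqlite3`) is used in place of SQL Server purely so the example is runnable. `CREATE VIEW`, `ROUND`, and `RANK() OVER (...)` are also available in T-SQL.

```python
import sqlite3

# Hypothetical schema and toy rows; not the project's real tables or data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE financials (movie_id INTEGER, budget REAL, revenue REAL);
INSERT INTO movies VALUES (1, 'Movie A'), (2, 'Movie B'), (3, 'Movie C');
INSERT INTO financials VALUES (1, 100, 250), (2, 50, 200), (3, 80, 60);

-- ROI = (Revenue - Budget) / Budget * 100, stored once in a view.
CREATE VIEW movie_roi AS
SELECT m.movie_id, m.title, f.budget, f.revenue,
       (f.revenue - f.budget) / f.budget * 100 AS roi_pct
FROM movies AS m
JOIN financials AS f ON f.movie_id = m.movie_id
WHERE f.budget > 0;  -- guard against division by zero
""")

# Queries can now reuse the view instead of repeating the join and formula.
roi_rows = conn.execute("""
SELECT title,
       ROUND(roi_pct, 2) AS roi_rounded,
       RANK() OVER (ORDER BY roi_pct DESC) AS roi_rank
FROM movie_roi
ORDER BY roi_rank;
""").fetchall()
for row in roi_rows:
    print(row)
```

One data type caveat when porting this to SQL Server: if `budget` and `revenue` are integer columns, the division truncates, so casting one operand to `DECIMAL` or `FLOAT` before dividing avoids silently wrong ROI values.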
- The Shawshank Redemption is the highest-rated movie (9.3 IMDb), while The Godfather has the best ROI (3941.67%).
- Hollywood leads in total revenue (₹1,569,460.99M) and average IMDb ratings (8.13).
- Kannada movies have the highest average IMDb rating (8.4), while English dominates in count (20 movies).
- Studios like Marvel Studios and Vinod Chopra Films excel in revenue generation.
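
Industry-level figures of this kind typically come from a grouped aggregate over the joined movie and financial tables. The sketch below shows the general shape only: the schema and the three toy rows are assumptions and do not reproduce the numbers reported above, and SQLite (via Python's `sqlite3`) is used instead of SQL Server so it can be run directly.

```python
import sqlite3

# Hypothetical schema and toy rows; not the project's real tables or data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, title TEXT,
                     industry TEXT, imdb_rating REAL);
CREATE TABLE financials (movie_id INTEGER, revenue REAL);
INSERT INTO movies VALUES (1, 'Movie A', 'Hollywood', 8.0),
                          (2, 'Movie B', 'Hollywood', 9.0),
                          (3, 'Movie C', 'Bollywood', 7.0);
INSERT INTO financials VALUES (1, 300), (2, 500), (3, 200);
""")

# One row per industry: movie count, total revenue, average rating.
industry_rows = conn.execute("""
SELECT m.industry,
       COUNT(*)                      AS movie_count,
       SUM(f.revenue)                AS total_revenue,
       ROUND(AVG(m.imdb_rating), 2)  AS avg_rating
FROM movies AS m
JOIN financials AS f ON f.movie_id = m.movie_id
GROUP BY m.industry
ORDER BY total_revenue DESC;
""").fetchall()
for row in industry_rows:
    print(row)
```

Swapping `m.industry` for a language or studio column gives the corresponding language-wise or studio-wise breakdown.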
SQL Fundamentals: Improved mastery of GROUP BY, aggregate functions, and data type handling.
Query Optimization: Learned to use indexing to speed up queries and views to keep query logic reusable and maintainable.
Working with Views: Simplified complex queries and improved reusability by leveraging views.
Error Handling: Resolved issues like incorrect joins and GROUP BY mismatches.
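
One common form of an incorrect join, shown here as an assumed illustration and not necessarily the exact issue hit in this project, is row fan-out: joining a one-row-per-movie table to a many-rows-per-movie table before aggregating, which inflates sums. The table names (`movie_actor` in particular) and rows below are hypothetical, and SQLite (via Python's `sqlite3`) stands in for SQL Server.

```python
import sqlite3

# Hypothetical schema and toy rows; not the project's real tables or data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, studio TEXT);
CREATE TABLE financials (movie_id INTEGER, revenue REAL);
CREATE TABLE movie_actor (movie_id INTEGER, actor_id INTEGER);
INSERT INTO movies VALUES (1, 'Studio X'), (2, 'Studio X');
INSERT INTO financials VALUES (1, 100), (2, 50);
INSERT INTO movie_actor VALUES (1, 10), (1, 11), (1, 12), (2, 10);
""")

# Buggy: each movie's revenue is repeated once per actor row before SUM.
inflated = conn.execute("""
SELECT m.studio, SUM(f.revenue) AS total_revenue
FROM movies AS m
JOIN financials AS f   ON f.movie_id  = m.movie_id
JOIN movie_actor AS ma ON ma.movie_id = m.movie_id
GROUP BY m.studio;
""").fetchall()

# Fixed: collapse actors to one row per movie first, then join and aggregate.
correct = conn.execute("""
SELECT m.studio,
       SUM(f.revenue)      AS total_revenue,
       SUM(ac.actor_count) AS actor_appearances
FROM movies AS m
JOIN financials AS f ON f.movie_id = m.movie_id
JOIN (SELECT movie_id, COUNT(*) AS actor_count
      FROM movie_actor
      GROUP BY movie_id) AS ac ON ac.movie_id = m.movie_id
GROUP BY m.studio;
""").fetchall()

print(inflated)  # revenue counted once per actor row
print(correct)   # revenue counted once per movie
```

Comparing the aggregate against a direct `SUM` over the financial table alone is a quick way to spot this kind of inflation.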