A Network Tour of Millenial Movies

Project for the course A Network Tour of Data Science

GitHub repository of the project done by a team of four students for the course A Network Tour of Data Science (EE-558), given at École polytechnique fédérale de Lausanne. This README contains the project abstract, the list of libraries required to run the code, the datasets used for the implementation, and the research questions and products that were analyzed. The code is in the Jupyter notebooks of this repository, and the report is in Project Report.pdf.

Libraries used

We used the following libraries for this project, with Python 3.6.6:

Computational:

numpy (as np)
pandas (as pd)
networkx (as nx)
scipy
sklearn
surprise
operator
collections
pandas_profiling

Graphical:

seaborn (as sns)
matplotlib (pyplot as plt)
IPython

Textual:

json
base64
codecs
re
io

We also used the libraries for Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering.
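As a quick sanity check before running the notebooks, the sketch below reports which of the modules listed above cannot be found in the current environment. The grouping mirrors the lists in this README; the names REQUIRED and missing_modules are illustrative and are not part of the repository. It checks import names only (for example sklearn and surprise, which may be distributed under different package names) and does not check versions.

```python
from importlib.util import find_spec

# Import names taken from the lists in this README, grouped the same way.
REQUIRED = {
    "Computational": [
        "numpy", "pandas", "networkx", "scipy", "sklearn",
        "surprise", "operator", "collections", "pandas_profiling",
    ],
    "Graphical": ["seaborn", "matplotlib", "IPython"],
    "Textual": ["json", "base64", "codecs", "re", "io"],
}


def missing_modules(required=REQUIRED):
    """Return {group: [module names]} for modules that cannot be located."""
    missing = {}
    for group, names in required.items():
        # find_spec returns None when a top-level module is not installed.
        absent = [name for name in names if find_spec(name) is None]
        if absent:
            missing[group] = absent
    return missing


if __name__ == "__main__":
    report = missing_modules()
    if not report:
        print("All listed modules were found.")
    for group, names in report.items():
        print(f"{group}: missing {', '.join(names)}")
```

The Textual group and the operator and collections modules belong to the Python standard library, so only the third-party entries should ever show up as missing.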

Abstract

As Walt Disney once said: "Movies can and do have tremendous influence in shaping young lives in the realm of entertainment towards the ideals and objectives of normal adulthood." But what do viewers really know about movies and what makes them successful? This project, based on the TMDb dataset, offers insights into movies from the past several decades. It shows how some movie features are correlated, explores how movies can be classified into genres using spectral graph analysis and CNNs, and gives a simple demo of a recommender system.
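To give a flavour of what spectral graph analysis means here, the following is a minimal sketch on a toy graph, not the method used in the notebooks. It assumes a hand-made graph of six nodes (two triangles joined by a single edge) and splits it into two groups using the sign of the Fiedler vector, the eigenvector of the graph Laplacian with the second-smallest eigenvalue. The graph, the variable names and the two-group setting are all assumptions made for illustration.

```python
import numpy as np

# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n_nodes = 6

# Symmetric adjacency matrix of the undirected graph.
A = np.zeros((n_nodes, n_nodes))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Combinatorial graph Laplacian L = D - A, with D the diagonal degree matrix.
D = np.diag(A.sum(axis=1))
L = D - A

# eigh returns eigenvalues in ascending order for a symmetric matrix.
eigvals, eigvecs = np.linalg.eigh(L)

# The Fiedler vector is the eigenvector of the second-smallest eigenvalue.
fiedler = eigvecs[:, 1]

# The sign of each entry assigns the node to one of two groups.
labels = (fiedler > 0).astype(int)
print(labels)
```

Which triangle receives label 0 and which receives label 1 depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful.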

Datasets

The data folder contains the subsampled data that was used for the implementation.

Research Questions

Structure of repo

The notebooks of the repository should be read in the following order:

Additionally, there is a Gephi graph notebook that was used only for visualization.

Authors

License

Copyright 2019 Milena Filipović, Kristijan Lopatichki, Jelena Malić and Davor Todorovski

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.

You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.