Home

Awesome

Missing Link

Overview and Motivation

My motivation for this project is twofold. The first comes from personal experience, as a diner. When I visited my girlfriend in Richmond, VA, I was disappointed to find a lack of good Chinese restaurants. There were only a handful of restaurants with a large number of ratings, and even those restaurants only averaged 4 stars. Finally, when we did try the few 4 star restaurants, I was sorely disappointed. Surely, I thought, someone could come in and start a better restaurant, and capture this market!

My second motivation comes from one of my friends, who has worked with local chefs to start a variety of local restaurants. While there are many factors to success, competition is important, and I realized something that could benefit my friend would be an analysis of competition strength in his area. In particular, when it comes to different types of restaurants, is there anything missing?

Data Source

For exploratory analysis, I used the Yelp Challenge Dataset, available at the URL https://www.yelp.com/dataset_challenge. This dataset contains information about restaurants and reviews from 11 cities. Out of the collection, I only used the "business" dataset. Future analysis will use specific Yelp review data, also available as part of the Challenge Dataset.

Ultimately, for this project, we will also need data from a larger set of cities, partially available through Yelp's API.

Exploratory Data Analysis

Market Segmentation: I took every category used to describe restaurants, and ordered them by frequency. I took only categories with over 350 restaurants, and by doing so removed the rarely used categories.

Scoring Metric: To score each city + category combination, I took a weighted average of average review squared. The average is taken over all restaurants in the city and category, and is weighted by number of reviews received.

Analysis by City: I created a plot of low-scoring categories in every city.

Methodology

We can improve on the exploratory analysis and expand it in through the following:​