


Implementation for paper: Recommendation Based on Review Texts and Social Communities: A Hybrid Model


In this project, we implement a community regression model to predict user ratings of businesses. The project is built on the Spark Scala API. This is a local version of our proposed model that runs on a single machine in Spark Standalone Mode. After you download the Spark dependencies and our processed Yelp data, running the Scala2.jar file prints the prediction results. Have a good time!


Software requirement: Java 1.8
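
To confirm the Java requirement before running the jar, a check along the following lines may help. This is a minimal sketch: the helper name `is_java8` is hypothetical, and it assumes the first line of the `java -version` output contains the version string in the form `version "1.8.…"`.

```shell
# Hypothetical helper: succeeds if the given version banner reports Java 1.8.
is_java8() {
    case "$1" in
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# `java -version` writes its banner to stderr, so redirect it to stdout.
banner=$(java -version 2>&1 | head -n 1)
if is_java8 "$banner"; then
    echo "Java 1.8 found"
else
    echo "Java 1.8 not found: $banner"
fi
```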


You can download our processed dataset from: https://drive.google.com/open?id=1uFmDlS73DRSzjqX7yL2_N3EO05N6iA7L

The executable jar and dependencies are available from: https://drive.google.com/open?id=1M566erL8LHjpDLmL7KkeTfeapRMO9_eQ

Model Training and Testing

To train our hybrid recommendation model, use the java -jar command:

e.g. $ java -jar -Xmx10g Scala2.jar --root_path DataDirectory/ --coda_result socialUR20CaGroup200.txt

The parameters are listed and described below:


The root directory where the processed data are stored (--root_path in the example above). You must set this parameter first to initialize our model.


To randomly split the processed data into training and testing sets, set "--task DataSplit". Otherwise, the program looks for the data in the root/output/Access/ folder by default.


The regression model to apply. Default is "LR" (Linear Regression).


A word2vec parameter that sets the dimensionality of the word embedding vectors. Default is 10.


The number of reviews per user (reviewNum in the default file names below). Default is 20.


A word2vec parameter: the minimum number of times a word must occur. Default is 5.


A word2vec param. Default is 5.


To choose the community detection algorithm, set this parameter to "--social_type coda" or "--social_type cnm". The default algorithm is coda.


The file name of the cnm community detection results. Default is "Yelp2016UserBusinessStarReview"+reviewNum+"cnm2.txt"


The file name of the coda community detection results. Default is "Review"+reviewNum+"mc50xc200ClusterSkipcmtyvv.in.txt"
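
The parameters above can be combined into a two-step workflow, sketched below. It uses only the flags named in this README (--root_path, --task, --social_type). The wrapper `run_model` and the `DRY_RUN` variable are hypothetical conveniences, not part of the jar, and it is an assumption that the data split is run as a separate invocation before training.

```shell
# Hypothetical wrapper: prints the command when DRY_RUN=1, otherwise runs it.
run_model() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo java -jar -Xmx10g Scala2.jar "$@"
    else
        java -jar -Xmx10g Scala2.jar "$@"
    fi
}

# Preview both steps without launching the JVM.
DRY_RUN=1

# Step 1: randomly split the processed data into training and testing sets.
run_model --root_path DataDirectory/ --task DataSplit

# Step 2: train and test with the cnm community detection results,
# relying on the default cnm result file name.
run_model --root_path DataDirectory/ --social_type cnm
```

Set DRY_RUN=0 to actually launch the jar once the printed commands look right.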