Aspect-based Sentiment Analysis

Keras implementation (TensorFlow backend) of aspect-based sentiment analysis.

Models

Pre-processing

  1. Download the GloVe embeddings (glove.42B.300d.zip), unzip the archive, and put the text file in raw_data (a sketch of how such a file can be loaded is shown after this list):
wget https://nlp.stanford.edu/data/wordvecs/glove.42B.300d.zip
unzip glove.42B.300d.zip
mv glove.42B.300d.txt ./raw_data
  2. Run the pre-processing scripts:
python3 process_raw.py
python3 preprocess.py
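
For reference, here is a minimal sketch of reading a GloVe text file into a word-to-vector dictionary. The path and function name are illustrative; the repo's own preprocess.py may build its embedding matrix differently.

```python
import numpy as np

def load_glove(path='raw_data/glove.42B.300d.txt'):
    """Read a GloVe text file into a {word: np.ndarray} dict."""
    embeddings = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            word, vector = parts[0], np.asarray(parts[1:], dtype='float32')
            embeddings[word] = vector
    return embeddings
```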

Training

python3 train.py

Environment

Data Analysis

See data_analysis.md.

Performance

Note: results in parentheses () are from models with word embeddings fixed but aspect embeddings fine-tuned; results in brackets [] are from models with both word embeddings and aspect embeddings fixed. A minimal Keras sketch of fixed vs. trainable embedding layers follows the tables below.

| model | laptop (paper) | laptop | restaurant (paper) | restaurant | twitter (paper) | twitter |
| --- | --- | --- | --- | --- | --- | --- |
| td_lstm | | 0.69905 (0.7194) | | 0.7732 (0.8008) | 0.708 | 0.69219 (0.7109) |
| tc_lstm | | 0.69122 (0.6912) [0.7178] | | 0.7642 (0.7696) [0.79375] | 0.715 | 0.71387 (0.7124) [0.72543] |
| ae_lstm | 0.689 | 0.69122 (0.6974) [0.69749] | 0.766 | 0.7651 (0.7625) [0.78125] | | 0.68497 (0.68641) [0.6820] |
| at_lstm | | 0.69122 (0.6880) [0.70689] | | 0.7678 (0.7812) [0.78571] | | 0.67196 (0.7052) [0.70086] |
| atae_lstm | 0.687 | 0.69749 (0.6927) [0.70376] | 0.772 | 0.7732 (0.7732) [0.78482] | | 0.66907 (0.6965) [0.6921] |
| memnet | 0.7237 | 0.52821 (0.5266) [0.53448] | 0.8095 | 0.6580 (0.65) [0.65803] | | 0.57369 (0.5751) [0.5780] |
| ram | 0.7449 | 0.70376 (0.7225) [0.72727] | 0.8023 | 0.7937 (0.8071) [0.81607] | 0.6936 | 0.69653 (0.71387) [0.6979] |
| ian | 0.721 | 0.6865 (0.7100) | 0.786 | 0.7732 (0.7821) | | 0.68208 (0.71965) |
| cabasc | 0.7507 | 0.6974 (0.6990) [0.67554] | 0.8089 | 0.7919 (0.8080) [0.80357] | 0.7153 | 0.69219 (0.69508) [0.6690] |

| model | laptop (paper) | laptop | restaurant (paper) | restaurant | twitter (paper) | twitter |
| --- | --- | --- | --- | --- | --- | --- |
| td_lstm | | 0.64172 (0.6636) | | 0.6653 (0.6986) | 0.690 | 0.6746 (0.6898) |
| tc_lstm | | 0.62847 (0.6335) [0.6666] | | 0.6651 (0.6492) [0.70170] | 0.695 | 0.6983 (0.6848) [0.7122] |
| ae_lstm | | 0.62486 (0.6435) [0.6355] | | 0.6419 (0.6370) [0.68319] | | 0.66644 (0.6670) [0.65437] |
| at_lstm | | 0.61267 (0.6259) [0.65907] | | 0.6630 (0.6689) [0.6676] | | 0.6553 (0.68194) [0.6823] |
| atae_lstm | | 0.6246 (0.63185) [0.6539] | | 0.6505 (0.6657) [0.68006] | | 0.6651 (0.67400) [0.67085] |
| memnet | | 0.38574 (0.4118) [0.2322] | | 0.3547 (0.4040) [0.3395] | | 0.48982 (0.49209) [0.4989] |
| ram | 0.7135 | 0.6474 (0.6747) [0.66476] | 0.7080 | 0.6887 (0.7035) [0.72720] | 0.6730 | 0.66536 (0.69679) [0.6685] |
| ian | | 0.62007 (0.6604) | | 0.6550 (0.6768) | | 0.65071 (0.69550) |
| cabasc | | 0.64601 (0.6356) [0.5886] | | 0.6815 (0.7201) [0.69357] | | 0.66790 (0.67948) [0.6610] |
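
To make the note above concrete, here is a minimal Keras sketch of fixed vs. fine-tuned embedding layers. The variable names and dimensions are illustrative and not taken from this repo's code; only the trainable flag and the GloVe initialization reflect the setup described above.

```python
import numpy as np
from keras.layers import Embedding

vocab_size, embed_dim = 10000, 300                 # illustrative sizes
glove_matrix = np.zeros((vocab_size, embed_dim))   # placeholder for the real GloVe matrix

# fixed word embeddings: initialized from GloVe and never updated during training
word_emb_fixed = Embedding(vocab_size, embed_dim,
                           weights=[glove_matrix], trainable=False)

# fine-tuned word embeddings: same initialization, but updated by the optimizer
word_emb_finetuned = Embedding(vocab_size, embed_dim,
                               weights=[glove_matrix], trainable=True)

# aspect embeddings are a separate Embedding layer whose trainable flag is set
# independently: trainable=True corresponds to the values in parentheses above,
# trainable=False (together with fixed word embeddings) to the values in brackets
n_aspects = 5                                      # illustrative
aspect_emb = Embedding(n_aspects, embed_dim, trainable=True)
```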

There are two ways to use ELMo in this project. Before training a model, change the configuration as follows (a short configuration sketch is given after this list):

  1. In config.py, set self.use_elmo to True and self.elmo_hub_url to None or https://tfhub.dev/google/elmo/2. When train.py runs, TensorFlow Hub will download the ELMo module and cache it in a local temporary directory, but that cache is deleted after a machine reboot.
  2. I therefore recommend loading the ELMo module from a local copy. First, download it from https://tfhub.dev/google/elmo/2?tf-hub-format=compressed. Then rename the archive to tfhub_elmo_2 (or whatever you like), untar it, and put it in the raw_data directory, so the layout looks like this:
    [figure: elmo_dir directory layout]
    Don't forget to set self.use_elmo in config.py to True and self.elmo_hub_url to raw_data/tfhub_elmo_2 (the directory name you chose).
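
As a rough sketch of the second option: the attribute names self.use_elmo and self.elmo_hub_url come from the steps above, while the surrounding class and the hub.Module call only illustrate how a locally extracted TF-Hub module is typically referenced, and are not copied from this repo's code.

```python
# config.py (only the ELMo-related attributes are shown; the class structure is assumed)
class Config:
    def __init__(self):
        self.use_elmo = True
        # point at the locally extracted module instead of the tfhub.dev URL,
        # so it is not re-downloaded (and lost) with the temporary cache
        self.elmo_hub_url = 'raw_data/tfhub_elmo_2'

# a local path works with the TF1-style TensorFlow Hub API just like a URL
import tensorflow_hub as hub
elmo = hub.Module(Config().elmo_hub_url, trainable=False)
```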

Here are the results of combining the ELMo embeddings:

[figure: glove_vs_elmo — comparison of GloVe and ELMo results]

  1. Surprisingly, I failed to achieve performance similar to what is reported in the MemNet paper. Or maybe there are bugs in my implementation?
  2. TD-LSTM performs unexpectedly well.
  3. Models with fixed embeddings are generally better than those with fine-tuned embeddings, which is consistent with the RAM paper.
  4. L2 regularization didn't help.
  5. I tried the ELMo embeddings, but they didn't bring a performance gain.