Awesome
API for Polish sentiment analysis using Keras and Word2vec
Sentiment analysis is a natural language processing (NLP) problem where text is understood and the underlying intent is predicted.
I will show you how you can predict the sentiment of Polish language texts as either positive, neutral or negative in Python using the Keras Deep Learning library and Google Word2vec.
Check Our blog post Polish sentiment analysis using Keras and Word2vec
Getting started
First of all you need to make sure you have installed Python 3.6. For that purpose we recommend Anaconda, it has all the necessary libraries except:
- scikit-learn 0.19.1
- Pandas 0.22.0
- NumPy 1.14.0
- Keras 2.1.4
- gensim 3.4.0
- many_stop_words 0.2.2
- TensorFlow 1.6.0
- wordcloud 1.4
All libraries can be installed with the following commands:
pip install scikit-learn
pip install Keras
pip install gensim
pip install many_stop_words
pip install TensorFlow
pip install wordcloud
or quickly:
pip install -r requirements.txt
Once you have installed Python and the dependencies download at least pre-trained Polish Word Embedding model and extract to main project directory.
The easiest way to see our method in action is to run the LSTM.py script.
Data
Download our dataset from Google Drive and extract to /Data directory.
Our dataset was collected from various sources:
- Opineo - Polish service with all reviews from online shops
- Twitter - Polish current top hashtags from political news and Polish Election Campaign 2015
- Polish Academy of Science HateSpeech project
- YouTube - comments from various videos
Download Polish Word Embeddings from Polish Academy of Science and extract it in main folder.
Useful repos
- https://github.com/Kyubyong/wordvectors, pre-trained word vector models for non-English languages
- https://github.com/dakshitagrawal97/TweetSentimentAnalysis - jest zgoda!!
- https://machinelearningmastery.com/predict-sentiment-movie-reviews-using-deep-learning/
- http://dsmodels.nlp.ipipan.waw.pl/
- https://github.com/BUPTLdy/Sentiment-Analysis
- https://github.com/shayanzare007/EntitySentiment
- https://github.com/Theo-/sentiment-analysis-keras-conv
- https://github.com/giuseppebonaccorso/twitter_sentiment_analysis_word2vec_convnet
- https://github.com/PAS43/TwitterSentimentAnalysis
- https://github.com/kailashnathan
- https://github.com/ankeshanand/review-helpfulness
- https://github.com/spopov812/Sentiment-Analysis-Deep-Recurrent-Neural-Network-LSTM-
- https://github.com/xingziye/movie-reviews-sentiment
- https://github.com/dswald/sentiment_analysis/tree/0d6a85e4afef33eef5da7e6ac8aa2dbdb5e89de2
- https://github.com/sukilau/amazon-sentiment-analysis
Contact & blog post
- Main author: Szymon Płotka
- CEO of Ermlab Software Krzysztof Sopyła
- check Our blog post Polish sentiment analysis using Keras and Word2vec