Resources for Sentiment Analysis Projects (+Python)
Asked on 2021-02-06 19:53 by Michelle W.
Hi folks! I'm working on a sentiment analysis portfolio project this weekend. This seems like a popular topic, so I thought I'd share a few data sets and libraries so maybe you won't have to spend so much time on Google! :D
Prebuilt Python Libraries and Models
- Hugging Face models. There are a few that seem to be best here!
  - Twitter sentiment model trained on 58 million Tweets and built on RoBERTa, so it uses really recent model developments.
  - Product sentiment model that is multilingual. For my purposes, I've only been looking at English models, and this has a big one. Their self-reported accuracy is only 67% though (on 5-star reviews).
- Stanford NLP (Python bindings). Great research institute and library. The core is Java-based but I believe this library is the Python binding. Has out of the box sentiment classification.
- TextBlob. Has out-of-the-box sentiment (polarity) and subjectivity scoring. Honestly, I don't know if I feel good about the quality of the models, but if you need something quick and easy, then this is quick and easy.
I've been avoiding these but I know some people prefer these when explainability is more important than model performance.
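Here is a minimal sketch of trying one of the Hugging Face models from the list above, assuming the transformers library (plus a backend such as PyTorch) is installed. The Hub ID below is only a guess at which Twitter RoBERTa model is meant, and the LABEL_0/LABEL_1/LABEL_2 mapping is also an assumption, so check the model card before trusting either. The helper names `readable` and `classify` are made up for illustration.

```python
from typing import Dict, List

# Assumption: a guess at the Hub ID of the Twitter RoBERTa model listed above.
MODEL_ID = "cardiffnlp/twitter-roberta-base-sentiment"

# Assumption: this checkpoint returns generic label names in this order.
# Verify against the model card before relying on it.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "neutral", "LABEL_2": "positive"}


def readable(result: Dict) -> Dict:
    """Swap a generic label such as LABEL_2 for a readable name when known."""
    label = result["label"]
    return {"label": LABEL_NAMES.get(label, label), "score": round(result["score"], 3)}


def classify(texts: List[str]) -> List[Dict]:
    """Run the pretrained model over a batch of texts."""
    # Imported here so readable() still works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis", model=MODEL_ID)
    # The pipeline returns one {"label": ..., "score": ...} dict per input text.
    return [readable(r) for r in classifier(texts)]
```

Calling `classify(["Best flight I've had in years!", "My bag got lost again."])` downloads the model on first use and returns one label/score pair per text. Labels the mapping does not know about are passed through unchanged, so the same helper should still work if you swap in a different checkpoint.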
If you want to build your own models, these data sets seem like a good place to start!
- TweetEval. A really big data set of Tweets, along with sentiment associated with them.
- Yelp. Restaurant reviews.
- Movie reviews.
- Amazon product reviews. Like a lot of them.
- Airline Tweets. Lots of Tweets about airlines (customer service) with sentiment analysis labels.
- Person Sentiment. Labels for people in the news based on specific articles. This data set feels really unique compared to the ones above.
- Political Sentiment. Looks at various aspects of political statements, and includes sentiment towards specific parties in the Tweets.
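If you go the build-your-own route, a tiny baseline is a useful sanity check before trying anything fancy. Below is a rough sketch of a bag-of-words Naive Bayes classifier using only the standard library. The four training sentences are made up for illustration and are not taken from any of the data sets above; in a real project you would load one of those instead and probably reach for a library such as scikit-learn.

```python
import math
import re
from collections import Counter, defaultdict
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples: List[Tuple[str, str]]):
    """Count words per label for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def predict(text: str, model) -> str:
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # skip words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy examples, only here to show the expected (text, label) shape.
TOY_DATA = [
    ("I love this airline, great service", "positive"),
    ("Wonderful food and friendly staff", "positive"),
    ("Terrible flight, rude staff", "negative"),
    ("I hate the delays, awful service", "negative"),
]
MODEL = train(TOY_DATA)
```

With this toy model, `predict("great friendly service", MODEL)` comes out as "positive" and `predict("awful rude delays", MODEL)` as "negative". Four sentences obviously prove nothing about quality; the point is just the shape of the train/predict loop you can then feed with real labeled data.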
I hope this is helpful for people. If anyone has other examples, I'm super curious about them!