Twitter Sentiment Analysis with Node.js

Twitter Sentiment Analysis with Node.js

Hey folks! This tutorial will demonstrate how to use Node.js – together with Twit (a Twitter API client) and Sentiment (a sentiment analysis library) – to conduct sentiment analysis on the Twitter zeitgeist.

Why? Because it’s interesting! Twitter – despite it’s slow descent into a toxic cesspit – is how you find out what’s happening right now; it represents a global stream of consciousness by way of millions of one-hundred-and-forty character long text snippets.

Using the Twitter API, we can query the Twitter hosepipe for material relevant to our interests. We can then aggregate and interrogate the results and, with the aid of natural language processing tools, divine how people generally feel about something. “Something” could be an event, a product, a brand – anything really. Suffice it to say, marketers love this stuff.

I love indie games, and I’m based in Guildford. Being an avid sci-fi fan, No Man’s Sky is definitely on my radar. So I’m going to use sentiment analysis to find out how people feel about procedural generation in computer games. Let’s get started!


Firstly, you’ll need to create an app in Twitter. This will give you the credentials you need to access and query the Twitter API. Log in to Twitter Application Management, and click the button entitled “Create New App”.

Twitter Application Management Page - Create a New Application
Twitter Application Management Page – Create a New Application

Once you’re all set up, you’ll be directed to your app’s credentials page. You’ll be needing these later.

Using Terminal, create a new folder and create a new JS file in it.

mkdir twitter-app && cd twitter-app
touch app.js

Install Twit and Sentiment;

npm install twit sentiment

Now let’s edit the JS file.

subl app.js

Require Twit and Sentiment, and set up your access credentials.

var twit = require('twit'),
    sentiment = require('sentiment');

var t = new twit({
  consumer_key : 'YOUR_CONSUMER_KEY',
  consumer_secret : 'YOUR_CONSUMER_SECRET',
  access_token : 'YOUR_ACCESS_TOKEN',
  access_token_secret : 'YOUR_ACCESS_TOKEN_SECRET'
});

Okay, now we can query the Twitter API! Let’s search for the phrase (a phrase is indicated by double quotes) “procedural generation.” (N.B: The Twitter Search API only extends to around the past seven days.)

t.get('search/tweets', { q : '"procedural generation" since:2015-10-08', count: 100 }, function(err, data, response) {
  console.log(data);
});

If you run the app – 

node app.js

– you’ll get a JSON object back containing the data for 100 tweets. This is the sanitised output for -one- of those tweets.

{ statuses: 
   [ { created_at: 'Thur Oct 15 14:06:44 +0000 2015',
       id: 000000000000000000,
       id_str: '000000000000000000',
       text: 'text',
       truncated: false,
       entities: [Object],
       metadata: [Object],
       source: '<a href="#" rel="nofollow">@Twitter</a>',
       in_reply_to_status_id: null,
       in_reply_to_status_id_str: null,
       in_reply_to_user_id: null,
       in_reply_to_user_id_str: null,
       in_reply_to_screen_name: null,
       user: [Object],
       geo: null,
       coordinates: null,
       place: null,
       contributors: null,
       retweeted_status: [Object],
       is_quote_status: false,
       retweet_count: 10,
       favorite_count: 0,
       favorited: false,
       retweeted: false,
       possibly_sensitive: false,
       lang: 'en' } ],
  search_metadata: 
   { completed_in: 0.045,
     max_id: 000000000000000000,
     max_id_str: '000000000000000000',
     next_results: '?max_id=000000000000000000&q=%22procedural+generation%22&count=100&include_entities=1',
     query: '%22procedural+generation%22',
     refresh_url: '?since_id=000000000000000000&q=%22procedural+generation%22&include_entities=1',
     count: 100,
     since_id: 0,
     since_id_str: '0' } 
}

“Hosepipe” sounds about right. Let’s whittle this down a bit.

t.get('search/tweets', { q : '"procedural generation" since:2015-10-08', count: 100 }, function(err, data, response) {
  for (var i in data.statuses) {
    if (data.statuses[i].lang === 'en') {
      var s = sentiment(data.statuses[i].text);
      console.log(s.score + ' ' + data.statuses[i].text);
    }
  }
});

This will return the text of 100 tweets, prepended with a sentiment score. A negative score indicates a negative sentiment, a positive one indicates a positive sentiment.


Update July 2017

When I originally wrote this tutorial, the Sentiment was still in it’s infancy (the developers hadn’t yet written the code to handle negation in statements) so I figured I’d re-write this section to reflect the changes. I’m grateful (as always) to FOSS contributors that make libraries like Sentiment happen.

Let’s run the query above again, removing the date modifier, limiting the number results to 10, and filtering out retweets and replies (which can often be taken out of context).

t.get('search/tweets', { q : '"procedural generation" -filter:retweets -filter:replies', count: 100 }, function(err, data, response) {
  for (var i in data.statuses) {
    if (data.statuses[i].lang === 'en') {
      var s = sentiment(data.statuses[i].text);
      console.log(s.score + ' ' + data.statuses[i].text);
    }
  }
});

This is what we get back:

4 like take the procedural generation shit and work on it, when it's good make a good game out of it
3 This GDC talk gives a really good overview on procedural generation :) https://t.co/y16em13WcV
-4 Screw the procedural generation, the core problem here is the none-orbiting planets. You don't know my pain. #NoMansSky
0 Almost there with the procedural generation... #GameDev #IndieDev https://t.co/nXekRxYYgx
-4 Sure, it has procedural generation, I'm fighting back tears because of the shallow gameplay. I'm calling Anonymous. #NoMansSky
0 Got my copy of https://t.co/gWrr7vLPPA today. Should keep me busy for a while …
-6 Ugh... finally got the core of this procedural generation algorithm of mine working... holy shit wasn't that exhausting...
0 GameMaker Studio 2 Action RPG Tutorial - Part 1 - Procedural Generation: https://t.co/1RxY4rws3f via @YouTube
0 I added a video to a @YouTube playlist https://t.co/1RxY4rws3f GameMaker Studio 2 Action RPG Tutorial - Part 1 - Procedural Generation
3 The procedural generation in Dwarf Fortress is really darn impressive. https://t.co/Lz8EWQSLhb

Remember, the first number in each line is the sentiment score; so line 2 reveals a positive sentiment about a GDC talk, and line 3 reveals a negative sentiment regarding procedural generation in No Man’s Sky.

Now, if you’ve been paying attention, you’ll have noticed a problem with my original choice of search phrase. “Procedural generation” is a bit of a broad concept, and we can’t yet determine context from a tweet. Do I mean procedural generation pertaining to a specific game? A specific algorithm? A tool in the Unity Asset Store? A GDC talk? I *could* use a ton of filters to pare down the results – but I think, for now, it’s better to limit searches to proper nouns. Events, brands, products – concrete “things”. Let’s try something else.

I’m trying to keep things positive here, so instead of looking at No Man’s Sky, I’m going to examine sentiment regarding my newest favourite indie game, Caves of Qud:

0 Average Gunslinger On The Roll in Caves of Qud Weekly Run #21 Part 1: https://t.co/ZLdFVZmyfB via @YouTube
3 I think today is going to be a nice chill day with some Caves of Qud and some RimWorld :D
-4 the real caves of qud lore question is where are these cannibals finding all these fucking missile launchers
0 When I get addicted in to a game I get really in to them. 96 hours in Caves of Qud. But I think I'm burned out on it now.
0 my dude @Craigoryham is making some VERY spooky tracks for Caves of Qud https://t.co/ZNas6x1Htg
0 I can think of a billion other things I'd rather be doing rn than cleaning this basement.... all of them just happen to be Caves of Qud...
1 caves of qud status juicing cannibal gary: don't shoot a missile in a tiny room juicing cannibal steve: why not
1 Caves of qud is. Oh god not my contacts.
4 Had a lot of fun with Caves of Qud tonight. Definitely need to do more with a beguile build.
5 Caves of Qud is fascinating but so long and with so little narrative variation that I think this'll let me enjoy it… https://t.co/edy2OwVuVI

(Caves of Qud is Dune-inspired roguelike awesomeness. Seriously, go buy it.)

Ten tweets is far too small a sample to be statistically significant. Let’s average the sentiment score over the past week.

t.get('search/tweets', { q : '"Caves of Qud" -filter:retweets -filter:replies since:2017-07-04 until:2017-07-11', count: 100 }, function(err, data, response) {
  var aggregate = 0,
      numScores = data.statuses.length;

  for (var i in data.statuses) {
    if (data.statuses[i].lang === 'en') {
      var s = sentiment(data.statuses[i].text);
      aggregate += s.score;

      // remove any neutral scores - they bring down the average
      if (s.score === 0) numScores -= 1;
    }
  }

  console.log(aggregate / numScores);
});

This returns a score of 1.173913043478261 (remember, any score greater than zero reveals a positive sentiment). So we can say that, over the past week, Twitter users have been generally favourable in their discussion of Caves of Qud. Let’s try another – how is Stardew Valley doing this week? Computer says 0.6052631578947368 – so still generally positive, but not as much so as Caves of Qud. And No Man’s Sky? -0.2625 🙁

It’s not 100% accurate, as there are a lot of edge cases, and the tools are still being developed. But hopefully you can see how it’s useful to determine sentiment concerning “a thing.” Do you have a product that you update on a regular basis? Now you can gauge whether your latest update was a hit or a miss. Sentiment can be applied to any text – not just tweets. You could use it ascertain average sentiment on user reviews or customer surveys. Why not take my example and extend it? I’d love to see someone do some data visualisation (using D3.js?) with this. Just treat it as a jumping off point. Let me know what you make of it in the comments or on Twitter.


Note: The Twitter Search API has also seen some updates since my original post – it now supports filtering by positive and negative sentiment.


So whilst you can’t use it (without some wrangling) to get an average, you could use it to reveal (and hopefully recify) negative user sentiment about your product. Or perhaps you’ve just shipped a unicorn and want to bask in adulation? :] Perhaps Twitter should just apply this filter globally – though I’m not sure the world needs rose-tinted goggles right now.

Attribution: Google Images can’t find the original creator of the header image used for this post. The closest I could find is from @bskipper27. If you created this image, please let me know so I can correctly attribute your work. Thanks!