Thursday, March 22, 2012

Enter the Fifth Dimension; Sentiment Map Rendering

We have been plotting x, y, z and time for a while now. But this is all syntax; Tim Berners-Lee believes that the future of the web is semantics, and I could not agree more. You get a glimpse of it today when you use, for example, Apple’s Siri. In our GIS space, MetaCarta at one time had a great appliance/service that inferred location from the semantics of a sentence. For example, “Joe ate a Chicago style pizza in downtown Boston” would return the latitude and longitude of downtown Boston and would ignore “Chicago” as a location. Impressive!
With the explosion of social media and the infusion of geo-locations as ‘fields’ in our records, we are reduced to plotting locations with cute popups. Boring! We humans are quite the emotional animals (some more than others, especially if these humans come from the Middle East :-) and these emotions are now quite visible in our tweets and Facebook posts. So, what if I could plot these sentiments? There could be something to “see” in these maps.
Enter LinguaSys. I met one of their senior scientists by accident while sheltering from a rainstorm. We ended up talking about the weather, this and that, and of course the question “What do you do?” eventually had to come up. “I do sentiment analysis”, he replied. “Wow, that is exactly what I was researching before leaving”, I replied. “I am looking for an ‘engine’ that I can pass, say, a set of tweets, and that will return a sentiment index for each tweet.” “Not sure about tweets”, he replied, “as we deal with entire documents, but I am sure we can adjust our engine to such a process”. The rain stopped, we exchanged contact info and parted on the promise that we would stay in touch. A couple of months later, a very good customer with receptive, avant-garde ideas needed something ‘new’. I proposed sentiment index mapping based on social media to highlight areas of interest.
The following is a derivative of this work based on a totally different interest: tweets approving or disapproving of the TSA. Working with LinguaSys, I was handed a set of XML files, one of which contained the tweets and their associated sentiment indexes, for example:
<text>RT @msnbc_travel: Good news for elderly fliers (75 and above): TSA announces pilot program the relaxes security procedures http:\/\/t.co\/WTaM9AgW</text>
<disapprovalFactor factor="-0.8" reason="indicatorOfSatisfaction"/>
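For readers who want to experiment with records like this one, here is a minimal sketch in Python (not the Flex code behind the application) that reads the text and factor and assigns each tweet to a class that a range-based renderer could color. It relies on the factor running from -1 (satisfied) to 1 (dissatisfied). The `<tweets>` and `<tweet>` wrapper elements, the function names and the break points at ±0.33 are all assumptions for illustration; the post only shows the `<text>` and `<disapprovalFactor>` elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper elements: only <text> and <disapprovalFactor>
# are taken from the post; the sample text is shortened.
SAMPLE = """<tweets>
  <tweet>
    <text>Good news for elderly fliers</text>
    <disapprovalFactor factor="-0.8" reason="indicatorOfSatisfaction"/>
  </tweet>
</tweets>"""


def parse_tweets(xml_string):
    """Return a list of dicts with the text, factor and reason of each tweet."""
    root = ET.fromstring(xml_string)
    records = []
    for tweet in root.iter("tweet"):
        node = tweet.find("disapprovalFactor")
        if node is None:
            continue  # skip tweets that carry no sentiment index
        records.append({
            "text": tweet.findtext("text", default=""),
            "factor": float(node.get("factor")),
            "reason": node.get("reason"),
        })
    return records


def sentiment_class(factor):
    """Bucket a factor in [-1, 1]; the break points are illustrative only."""
    if factor <= -0.33:
        return "satisfied"
    if factor >= 0.33:
        return "dissatisfied"
    return "neutral"


records = parse_tweets(SAMPLE)
for record in records:
    print(record["factor"], sentiment_class(record["factor"]))
```

Each class would then map to one symbol or color in the renderer; three classes is an arbitrary choice here, and finer ranges work the same way.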
Note that the factor is negative to indicate a level of satisfaction. The internal factor ranges from -1 (totally satisfied) to 1 (totally pissed off :-). That is easy to parse and to associate with a range-based renderer.
The second file contained the “location” of the tweeters. Notice that I put location in quotes; that is because some values were great, like an exact latitude/longitude. Others were like “Boston, MA”, and some (in fact most) were like “Earth”, or “Best Location, NYC!” or my favorite, “Look behind u… Boo!”. There was a “sense” of location in these that I think would have given MetaCarta a run for its money. This is where being a Unix CLI geek with tools like awk, grep and sed comes to the rescue to massage the data. I downloaded a cities.csv to cross-reference a city/state name to a location, and now I can plot the locatable tweets on a map.
Using the latest built-in capabilities of the ArcGIS API for Flex, such as clustering with flares, info window rendering on clicks and custom function referencing in symbols, I was able to quickly build an application to display the sentiments. You can see the application in action here. Hover over a cluster to flare it and click on a flare element to see its details. To make things more understandable, I reversed the sign of the factor value displayed in the info window. Warning: these are real tweets with sometimes very offensive language, so… do not call HR on me, OK? And as usual, you can see the source code here.
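The location clean-up described above was done with awk, grep and sed. As a rough equivalent, here is a Python sketch of the same idea: accept a value that is already a latitude/longitude pair, otherwise try a “City, ST” lookup, and give up on everything else. The column names (`city`, `state`, `lat`, `lon`), the sample rows and the function names are assumptions; the actual layout of cities.csv is not shown in the post.

```python
import csv
import io
import re

# Hypothetical layout and sample rows standing in for cities.csv.
CITIES_CSV = """city,state,lat,lon
Boston,MA,42.3601,-71.0589
Chicago,IL,41.8781,-87.6298
"""

# Matches values such as "42.36, -71.06".
LATLON = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*$")


def load_cities(csv_text):
    """Build a {(city, state): (lat, lon)} lookup with lower-cased keys."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["city"].strip().lower(), row["state"].strip().lower())
        lookup[key] = (float(row["lat"]), float(row["lon"]))
    return lookup


def locate(raw, cities):
    """Return (lat, lon) for a free-text location, or None if not locatable."""
    match = LATLON.match(raw)
    if match:
        lat, lon = float(match.group(1)), float(match.group(2))
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            return lat, lon
        return None
    parts = [part.strip().lower() for part in raw.split(",")]
    if len(parts) == 2:
        return cities.get((parts[0], parts[1]))
    return None


cities = load_cities(CITIES_CSV)
for raw in ["42.36, -71.06", "Boston, MA", "Earth", "Best Location, NYC!"]:
    print(repr(raw), "->", locate(raw, cities))
```

An exact match on city and state is deliberately strict: values like “Earth” or “Best Location, NYC!” simply drop out, which mirrors plotting only the locatable tweets.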
BTW, check out GeoTagger for ArcGIS Runtime to see MetaCarta in action.

1 comment:

Tim said...

Thanks for the example, it looks like the sentiment analysis doesn't have a sarcasm indicator. Many of the "good" comments actually were very negative feelings towards the TSA.

I also tried to make a mobile project out of this example, and I get an error when I mouse over on the cluster. I can show you this at the developer conference next week.

ReferenceError: Error #1065: Variable mx.effects.easing::Linear is not defined.
at com.esri.ags.clusterers.supportClasses::FlareContainer/flareIn()[C:\checkout\flex_api3\api\src\com\esri\ags\clusterers\supportClasses\FlareContainer.as:240]