An analysis of sentiment analysis
Sentiment analysis is becoming the topic du jour (note the italics for the foreign word? Thanks Ben) in social media circles. It’s even hit the mainstream, with interest in how brands are using data mining for commercial advantage. The semantic web is most definitely at our doorstep, yet we seem to be expecting our machines to understand complex human emotions when even men don’t seem to realise that when a woman says she’s fine it most certainly does not mean she’s fine.
Take this example from the (comprehensive) Digital Media article that outlines where to get your election news online. Of the six articles listed as “Hot”, one is a remix of Gillard’s Moving Forward catchphrase and one is a plea for Tony Abbott to man up. A third should have been listed as neutral, as the Twitpic of Bob Brown merely stated his whereabouts (and his company).
Add to this our habit of using litotes, tropes and double negatives and it’s obvious that sentiment analysis will need to rely on more intelligent text processing than we currently have at our disposal. We’re trying to combine a lexicon with an ontology hierarchy and human emotion/reason with categorical attributes. So what’s the solution?
Most solutions stink. Not just stink… dinosaur’s breath after a meal stink. We are algorithmically trying something that as yet does not lend itself to algorithmic measurement… “emotion”. It is darn near impossible to cleanly bucket feelings and nuance into clean Positive, Negative, Neutral buckets.
We, computer programs, are simply not there yet. [Though I am absolutely confident that we will get there at some point.]
For now you are most likely wasting time (and money). Sorry.
…Rather than trying to find short cuts, where none exist, and provide aggregate data, where it just gets crapified, follow a well established methodology while leveraging segmentation and nuance.
Source: Occam’s Razor by Avinash Kaushik (read the rest of the article to find out how Avinash knows how many people have shared his words of wisdom!) (In fact, read the rest of the article anyway and subscribe to his blog. You won’t be disappointed.) (Ok well maybe you’re the type to be disappointed but that’s not really my problem is it? Seriously, stop being a jerk.)
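To make the “lexicon plus buckets” problem concrete, here’s a minimal sketch of the naive approach most tools boil down to: sum up word polarities from a hand-built lexicon, then force the total into Positive/Negative/Neutral. The tiny lexicon and examples below are purely illustrative, not any vendor’s actual method, and the litotes example shows exactly the failure mode described above.

```python
# A toy lexicon-based sentiment scorer (illustrative only):
# sum word polarities, then force the total into three buckets.
LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2, "fine": 1}

def bucket(text: str) -> str:
    score = sum(LEXICON.get(word.strip(".,!?").lower(), 0)
                for word in text.split())
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

# The easy case works...
print(bucket("What a great result!"))  # Positive
# ...but litotes and negation defeat it:
print(bucket("Not bad at all"))        # Negative (a human reads this as praise)
print(bucket("I'm fine."))             # Positive (which, as noted above, it may well not be)
```

“Not bad at all” scores -1 because the scorer sees “bad” but has no idea what “not” does to it, which is the whole problem in fourteen lines.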
Thoughts?
~ by mandi bateson on July 27, 2010.
Posted in Social Media, Technology
Tags: 2009 social media statistics, linguistics, LinkedIn, metrics, sentiment analysis, social media strategy, twitter
Agree – there’s a long way to go here. And you know what, the cost of an intern to run an eye over your keyword reporting is likely to be a much better investment in the longer term than the licensing of some software. +1 for the humans 😉
Gavin Heaton said this on July 27, 2010 at 8:35 pm |
I can’t screenshot that fast enough for my business case!
mandi bateson said this on July 27, 2010 at 8:53 pm |
Agreed. Language is one of the most fluid forms of expression in that it constantly changes. Sure, in the future it may be possible, but this would entail teaching a computer to learn, analyze and adapt to changes in language, not discounting the fact that there are about 6,900+ different languages in the world (not counting slang, etc.). It’s easy to say “we should automate this”; it’s another matter to actually find a way to do it, perfect the method and sell it for a reasonable price. At our company, although we promote sentiment analysis, we do not remove humans from the equation. Computers are a brilliant thing, but we have yet to find anything that can surpass the human mind.
Infinit-O said this on August 31, 2010 at 4:18 pm |
[…] personal favourite was a post inspired by Avinash Kaushik from Occam’s Razor which is an analysis of sentiment analysis because of what it got me thinking about language and the human […]
My top posts in 2010 « stuff and junk said this on December 30, 2010 at 7:09 am |