Thursday, February 12, 2009

Sarcasm, satire, truth, and lies for semantic data mining

Although semantic data mining has a lot of potential, it is quite a minefield of tricky issues. Even if we successfully distill, say, a blog post or purported article on a Web site into succinct statements, we still have to determine the veracity of those statements. That is difficult enough in its own right, and then you have sarcasm and satire, where the author makes statements that both the author and most human readers know are not the author's actual opinion, but that superficially do indeed appear to be explicit statements of belief by the author.

In essence, statements using sarcasm and satire are inherently "lies" in a superficial sense, but for most human readers they certainly do not betray any intention of misleading the reader.

An immediate use is in semantic data mining applications that seek to uncover brand reputation issues. For example, a sarcastic product review read only superficially would get the reputation 180 degrees wrong. A wiseacre might express lavish, albeit sarcastic, praise for a poor product that he despises, or withering, albeit sarcastic, criticism of a great product that he personally admires (maybe simply to tweak the insufferably zealous fans of the product).
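To make the 180-degree problem concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the tiny word lists, the function names `surface_polarity` and `intended_polarity`, and above all the `is_sarcastic` flag, which the caller has to supply. Actually detecting the sarcasm is the hard part, and this sketch does not attempt it.

```python
# Toy lexicon; the words are illustrative assumptions, not real data.
POSITIVE = {"great", "love", "wonderful", "best"}
NEGATIVE = {"poor", "hate", "terrible", "worst"}


def surface_polarity(text: str) -> int:
    """Count positive minus negative words, reading the text literally."""
    words = [w.strip(".,!?;:\"'()").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def intended_polarity(text: str, is_sarcastic: bool) -> int:
    """Flip the literal score when the caller has judged the text sarcastic."""
    score = surface_polarity(text)
    return -score if is_sarcastic else score


review = "Oh, I just love this wonderful phone. Best purchase ever!"
print(surface_polarity(review))         # literal reading of the review
print(intended_polarity(review, True))  # reading once sarcasm is recognized
```

A superficial reader scores this review as glowing; once the sarcasm is recognized, the same words count against the product. A simple sign flip is itself a crude assumption, since real sarcasm rarely inverts meaning so cleanly.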

Still, there is value in recognizing the sarcasm and satire, even if a particular application (brand reputation) does not need it.

-- Jack Krupansky

