Saturday, January 17, 2009

Facts, opinions, secrets, gossip, speculation, and questions in the Semantic Web

Whether one is mining text for embedded semantics or offering a structured interface for entering semantics directly, a user's information needs to be properly classified if it is to be used effectively in the Semantic Web. Assuming one considers user input to be a sequence, collection, or graph of statements, each statement would need to be classified as one or more of:

  • Fact. A statement that is believed to be true in some objective sense.
  • Opinion. A statement that the speaker believes is likely to be true, at least for themselves, regardless of the opinions of others.
  • Secret. A personal statement that is not intended to be shared with others, except possibly on a very selective basis.
  • Gossip. A statement about others that is intended to be shared to some extent, probably without attribution as to its originator.
  • Speculation. A statement that the speaker believes might or could hypothetically be true. It is not assumed to be true, but neither is it assumed to be false. The intention is to incite at least a subtle bias in the conjectural thinking of others.
  • Question. A purely interrogatory statement, a proposition whose truth or answer is essentially unknown, but whose answer is desired by the speaker.

It seems quite clear that a useful semantic mining tool would need to be able to classify its input stream according to these qualities.

On the other hand, a tool may simply categorize to the extent that it can, and correlation between similar statements from multiple sources might then reveal or suggest the proper, likely, or possible classification.
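
As a rough sketch of that kind of correlation, assume each source reports the set of categories it assigned to a statement, with an empty set meaning it could not categorize it. The function names and the 0.8 and 0.5 thresholds below are illustrative assumptions, not part of any existing tool.

```python
from collections import Counter


def correlate(labels_by_source):
    """Return, per category, the share of labelling sources that chose it.

    labels_by_source maps a source name to the set of categories that
    source assigned to the statement. Sources that assigned nothing
    (an empty set) are ignored.
    """
    votes = Counter()
    voters = 0
    for labels in labels_by_source.values():
        if not labels:
            continue  # this source could not categorize the statement
        voters += 1
        votes.update(labels)  # each category counts once per source
    if voters == 0:
        return {}
    return {category: count / voters for category, count in votes.items()}


def describe(share, proper=0.8, likely=0.5):
    """Map an agreement share to a rough label (thresholds are assumed)."""
    if share >= proper:
        return "proper"
    if share >= likely:
        return "likely"
    return "possible"
```

With four labelling sources, three of which say "fact", `correlate` reports a share of 0.75 for "fact", which `describe` would call "likely" under these assumed thresholds.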

There should probably be an unknown category as well.
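
One possible way to represent the scheme in code, including the unknown category, is sketched below. The `Category` and `Statement` names are hypothetical, chosen only for illustration; a statement holds a set of categories because it may be classified as one or more of them.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    FACT = "fact"
    OPINION = "opinion"
    SECRET = "secret"
    GOSSIP = "gossip"
    SPECULATION = "speculation"
    QUESTION = "question"
    UNKNOWN = "unknown"


@dataclass
class Statement:
    text: str
    # Empty until a tool or the user assigns one or more categories.
    categories: set = field(default_factory=set)

    def effective_categories(self):
        """Return the assigned categories, or {UNKNOWN} if none were assigned."""
        return set(self.categories) or {Category.UNKNOWN}
```

Treating unknown as the fallback for an empty set, rather than as something assigned explicitly, is just one design choice; it keeps "not yet classified" from being mixed in with real categories.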

My original motivation in coming up with this classification scheme was to think about how a user interface might assist even average users in capturing at least some aspects of the semantics of their personal information at the time it is captured. For example, the interface could offer the user some category headings that they can click on.

An interface tool could also show the user how other users have classified the same statement. The most common of those classifications could be the default unless the user overrides it with their own.
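
A minimal sketch of that default-with-override behavior follows, assuming the tool already has a list of the categories other users picked for the same statement. The function names are hypothetical, and picking the most common choice is only one plausible reading of "how other users have classified" it.

```python
from collections import Counter


def suggested_category(other_users_choices):
    """Return the most common category chosen by other users, or None.

    Ties go to whichever category was seen first in the list.
    """
    if not other_users_choices:
        return None
    return Counter(other_users_choices).most_common(1)[0][0]


def final_category(other_users_choices, user_choice=None):
    """Use the user's own choice if given, else the suggested default."""
    if user_choice is not None:
        return user_choice
    return suggested_category(other_users_choices) or "unknown"
```

For example, `final_category(["gossip", "fact", "gossip"])` falls back to "gossip", while passing `user_choice="secret"` returns "secret" regardless of what others chose.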

-- Jack Krupansky

