Wednesday, January 28, 2009

Semantic Web challenges 2009

Here are my current thoughts about the challenges facing the Semantic Web, vintage January 2009:

  • Mind the Gap
    • Thesis: There is a dramatic semantic gap between how users think and communicate about knowledge and the mechanisms that the Semantic Web supports for organizing knowledge.
      • Superset of the semantic data mining problem
    • How to jump from comfort with natural language to comfort with the Semantic Web
    • Extent to which the user "sees" the Semantic Web, as opposed to the Semantic Web simply providing more power "under the hood" that stays invisible to the user
  • Mind the Gap II
    • How do we map and transition between natural language and the Semantic Web?
      • How to represent natural language in the Semantic Web
        • Concepts, statements, reasoning, processes, prose passages, stories, outlines
  • Semantic search engines
    • Not just raw text, but semantic inferences as well
    • No single best form of database; open access to the underlying data is needed so that specialized databases can be created
  • Inference Broker
    • Inference brokers are needed to mediate between creators of knowledge and users of knowledge, due to:
      • Desire for privacy
      • Protection of intellectual property
      • Massive scalability requirements - divide and conquer
      • Division of labor, factoring large problems into smaller problems
  • Social structure of knowledge
    • Individuals have only some of the puzzle pieces of knowledge
    • Propositions of uncertain classification
    • Social groups aggregate and classify knowledge
  • A medium for intelligent agents
    • Software agents can act more intelligently with a richer, knowledge-centric information stream
  • Statements that are not strict, objective facts
    • Personal facts, opinions, speculation, gossip, questions
    • "Creations" - text, graphics, images, audio, video [? Separate challenge?]
    • False statements
      • May be outright lies, deceptions, misunderstandings, misstatements, changed information
  • Medical record difficulties
    • Pen on paper still most convenient for input
    • Quick human scan of paper still most convenient for browsing
    • Input decision process still far too intrusive
    • Semi-structured and unstructured data still far too inconvenient
  • Distributed resource storage
    • Extremely diversified storage to assure timely and efficient access
    • Needs to be part of net infrastructure that is automatic and not subject to human whim and error
  • Robust personal and organizational identity, as well as roles and interests
  • Authority and provenance identification and tracking
  • The cost of knowledge engineering, especially maintenance and testing
    • Who can really afford it?
  • Semantic matching challenges
    • Apparent differences that are easily bridged by a human
    • Subtle or apparently insignificant distinctions that a human would say are too significant for a match
      • For example, improper reuse of a resource for a different "meaning"
    • Incomplete matches due to cultural differences
    • Concept matches but with differences in contractual commitments (for services)
  • Distributed semantic matching/mapping services
    • Manual creation of libraries of semantic "logic" services for bridging semantic gaps - hide the details of how to get from "A" to "B"
  • Support for time and version dimensions of information
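
To make a few of these items concrete (statements that are not strict facts, provenance tracking, and time and version dimensions), here is a minimal sketch in Python. It is only an illustration under assumptions of my own: the `Statement` record, the `Kind` categories, the `as_of` helper, and every `ex:` identifier are hypothetical and do not come from any Semantic Web standard or library.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Kind(Enum):
    """Hypothetical classification of a statement (not from any standard)."""
    FACT = "fact"
    OPINION = "opinion"
    SPECULATION = "speculation"
    QUESTION = "question"


@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    obj: str
    kind: Kind
    source: str                      # provenance: who asserted the statement
    valid_from: date                 # first day the statement is considered to hold
    valid_to: Optional[date] = None  # exclusive end; None means still current
    version: int = 1

    def holds_on(self, day: date) -> bool:
        """True if the validity interval [valid_from, valid_to) covers `day`."""
        return self.valid_from <= day and (self.valid_to is None or day < self.valid_to)


def as_of(statements, day, kinds=(Kind.FACT,)):
    """Statements of the requested kinds that hold on `day`.

    When several versions from the same source cover the same subject and
    predicate, only the highest version is kept.
    """
    latest = {}
    for s in statements:
        if s.kind in kinds and s.holds_on(day):
            key = (s.subject, s.predicate, s.source)
            if key not in latest or s.version > latest[key].version:
                latest[key] = s
    return list(latest.values())


# Made-up sample data; all identifiers are placeholders.
statements = [
    Statement("ex:acme", "ex:hasCEO", "ex:alice", Kind.FACT,
              "ex:acme-press-office", date(2005, 1, 1), date(2008, 6, 1), version=1),
    Statement("ex:acme", "ex:hasCEO", "ex:bob", Kind.FACT,
              "ex:acme-press-office", date(2008, 6, 1), version=2),
    Statement("ex:acme", "ex:isWellManaged", "true", Kind.OPINION,
              "ex:some-blogger", date(2008, 9, 1)),
]

for s in as_of(statements, date(2009, 1, 28), kinds=(Kind.FACT, Kind.OPINION)):
    print(s.kind.value, s.source, s.subject, s.predicate, s.obj)
```

The point of the sketch is only that each statement carries its own kind, source, validity interval, and version, so a consumer can ask "what was asserted as fact, by whom, as of a given day" rather than treating every statement as a timeless objective fact.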

-- Jack Krupansky

