Monday, April 6, 2009

URI-based resource location

I have never been happy with the Semantic Web concept of associating resources with specific Web locations using URLs that specify a server location such as a domain name. The main issues:

  1. Makes it difficult to move a resource to another domain.
  2. Increases the likelihood that a server might become a performance bottleneck, especially as popularity grows and the Semantic Web begins to scale up in size dramatically (so-called "exponential growth"). Wiring in a server location simply does not scale up.
  3. Encourages ad-hoc caching. Worse, as the Semantic Web scales up, it comes to depend on ad-hoc caching.

Although some form of caching is clearly part of the solution, the main component of a solution is to switch from URL-based resource locating to URI-based resource locating.

Rather than specifying a single URL and then depending on the existing, non-Semantic Web Domain Name System (DNS) to look up the actual path to "the" server, we need a non-DNS lookup mechanism that takes one or more URIs and performs something closer to a "keyword" lookup, treating each URI as a Semantic Web analog of a keyword. The lookup would then redirect through a caching infrastructure designed to meet the needs of caching resources for the Semantic Web.
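To make the idea a little more concrete, here is a minimal sketch in Python of what such a "keyword" lookup might look like. Everything in it is hypothetical: the `UriResolver` class, the `urn:example:` URIs, and the cache host names are invented for illustration and do not correspond to any existing Semantic Web service.

```python
from collections import defaultdict


class UriResolver:
    """Toy, in-memory stand-in for a non-DNS, URI-keyed lookup service."""

    def __init__(self):
        self._index = defaultdict(set)    # URI -> ids of resources tagged with it
        self._caches = defaultdict(list)  # resource id -> cache nodes holding a copy

    def register(self, resource_id, uris, cache_nodes):
        for uri in uris:
            self._index[uri].add(resource_id)
        self._caches[resource_id].extend(cache_nodes)

    def resolve(self, uris):
        """Return {resource_id: cache_nodes} for resources tagged with ALL uris."""
        uris = list(uris)
        if not uris:
            return {}
        # Intersect the per-URI id sets, much like a multi-keyword search.
        ids = set.intersection(*(self._index.get(u, set()) for u in uris))
        return {rid: list(self._caches[rid]) for rid in sorted(ids)}


# Hypothetical URIs and cache hosts, for illustration only.
PERSON = "urn:example:vocab:person"
EN = "urn:example:lang:en"

resolver = UriResolver()
resolver.register("person-core", [PERSON], ["cache-a.example.net"])
resolver.register("person-en", [PERSON, EN],
                  ["cache-b.example.net", "cache-c.example.net"])
```

Note that the caller never names a server. It hands over URIs and gets back whichever cache nodes currently hold matching resources.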

A Semantic Web resource URI list might also be supplemented with various attributes, such as version number or version requirements and other attributes needed to constrain and control resource access.

The Semantic Web resource infrastructure should be able to manage multiple versions of a resource and to propagate changes efficiently and in a controlled manner.
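As a rough illustration of version attributes, the sketch below picks the newest stored version of a resource that satisfies optional bounds supplied with the request. The `ResourceVersion` type, the `select_version` function, and the sample data are all assumptions made for this example, not part of any real specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceVersion:
    uri: str
    version: tuple  # e.g. (1, 2) stands for version "1.2"
    location: str


def select_version(candidates, uri, min_version=None, max_version=None):
    """Pick the newest candidate for `uri` within the optional version bounds."""
    matching = [
        c for c in candidates
        if c.uri == uri
        and (min_version is None or c.version >= min_version)
        and (max_version is None or c.version <= max_version)
    ]
    # Returns None when no stored version satisfies the constraints.
    return max(matching, key=lambda c: c.version, default=None)


# Hypothetical data: three versions of one resource held in different caches.
PERSON = "urn:example:vocab:person"
candidates = [
    ResourceVersion(PERSON, (1, 0), "cache-a.example.net"),
    ResourceVersion(PERSON, (1, 2), "cache-b.example.net"),
    ResourceVersion(PERSON, (2, 0), "cache-c.example.net"),
]

latest = select_version(candidates, PERSON)
latest_v1 = select_version(candidates, PERSON, max_version=(1, 9))
```

Here `latest` is the 2.0 entry, while `latest_v1` is the 1.2 entry, because the request capped the version below 2.0.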

One use of multiple URIs is to control the degree of specialization of a generic resource name. A single URI would be the most general resource reference and provide the most adaptability, while adding on specialization URIs would provide access to resources that meet additional requirements. This is analogous to base and derived classes in object-oriented programming, although a strict class-like hierarchy is not necessarily required.
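One way to picture specialization is a sketch like the following, where each extra URI in a request narrows the set of matching resources, and matches are listed most general first. The `rank_by_generality` function, the catalog, and its URIs are hypothetical names chosen for the example.

```python
def rank_by_generality(catalog, requested_uris):
    """Return names of matching resources, most general first.

    `catalog` maps a (hypothetical) resource name to the set of URIs
    that describe it. A resource matches when it carries every requested URI.
    """
    wanted = set(requested_uris)
    matches = [
        (len(uris - wanted), name)  # fewer extra URIs means more general
        for name, uris in catalog.items()
        if wanted <= uris
    ]
    return [name for _, name in sorted(matches)]


# Hypothetical URIs: a base reference plus two specializations.
BASE = "urn:example:schema:address"
US = "urn:example:region:us"
POSTAL = "urn:example:profile:postal"

catalog = {
    "address-generic": {BASE},
    "address-us": {BASE, US},
    "address-us-postal": {BASE, US, POSTAL},
}

general = rank_by_generality(catalog, [BASE])
specialized = rank_by_generality(catalog, [BASE, US])
```

A request with only the base URI matches all three resources, while adding the regional URI drops the generic one from the results.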

One key attribute of such a resource infrastructure, besides scalable performance itself, would be that even a very small, under-powered web site could be the source host for even extremely popular Semantic Web resources, and migrating such resources to another host should be completely transparent to the "users" (user agents, UAs, or software agents) that have the URI list for the resource "wired" into their "code."
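The transparent-migration property can be sketched as follows, under the assumption that some directory service maps a URI list to whichever host is currently the source. The `OriginDirectory` class and the host names are invented for illustration.

```python
class OriginDirectory:
    """Hypothetical directory mapping a URI list to its current source host."""

    def __init__(self):
        self._origins = {}

    def publish(self, uris, host):
        self._origins[frozenset(uris)] = host

    def migrate(self, uris, new_host):
        key = frozenset(uris)
        if key not in self._origins:
            raise KeyError("unknown resource")
        self._origins[key] = new_host

    def resolve(self, uris):
        return self._origins[frozenset(uris)]


# The URI list a client has "wired" into its code; it never names a host.
CLIENT_URIS = ("urn:example:ontology:music",)

directory = OriginDirectory()
directory.publish(CLIENT_URIS, "tiny-site.example.org")
before = directory.resolve(CLIENT_URIS)

# The publisher moves the resource; the client's URI list is untouched.
directory.migrate(CLIENT_URIS, "bigger-host.example.com")
after = directory.resolve(CLIENT_URIS)
```

The client calls `resolve` with the same URI list before and after the move, and only the directory entry changes.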

Alas, I am not optimistic that such an architecture will soon, or even ever, be made available for the Semantic Web as we know it today. The change may have to wait for whatever follows the Semantic Web, or maybe even for Ray Kurzweil's Singularity.

Still, it is useful to contemplate what a proper solution might look like.

-- Jack Krupansky


At April 6, 2009 at 4:27 PM , Anonymous Ben Stein said...

Jack - Interesting post!
Have a look in something very related to what you write about -, a new URI Based semantic analysis tool. Would be happy to hear your note about it.
Ben - Semantic Advertising

