Wednesday, March 31, 2010

Does philosophy bake bread?

There is an old saying that "Philosophy bakes no bread", implying that philosophy has no significant practical value, but I disagree, at least somewhat. I do agree that a large portion of what is called "philosophy", especially as practiced in modern times, is rather disjoint from progress in the real world, but a significant portion of philosophy, especially in a historical context, is and has been extremely valuable, and eminently practical.

Early philosophy was really the precursor of a lot of modern science and logic. Basically, early philosophy studied and promoted the kind of disciplined and structured thought that is needed for virtually all modern disciplines, from mathematics, science, and engineering to law and our social and political systems.

Another way of saying this is that over time, every modern discipline and social system borrowed concepts, methods, and techniques from early philosophy. That is an understatement; every modern discipline and social system is based on the products of philosophy. Without the early works of Aristotle, Socrates, and Plato, and the enlightened efforts of Hume, Locke, and Rousseau, among countless other brilliant philosophers over the centuries, we would not have much of what we call modern in the modern world.

The simple fact is that all modern disciplines absorbed concepts from philosophy over the centuries so effectively that the concepts are considered part of those disciplines rather than being owned by the "discipline" called philosophy.

Critical thinking is, well, critical to analyzing the facts in any discipline. The world can be a complex and confusing place. Winnowing truth from fiction and relevance from irrelevance can be a very difficult proposition even on a good day. Technology can certainly help as a tool, but all tools must be used properly to be effective. Critical thinking is essential to guiding us to making practical and workable decisions from mushy and vague raw data.

A very pragmatic issue is that a lot of difficult questions are so poorly or vaguely phrased or framed that it simply is not practical to even begin to answer them in a workable manner until a deep and broad philosophical analysis can tell us what the questions are really all about. Answering inappropriate interpretations of questions can certainly lead to answers or solutions that do not meet the original needs that the questions may have been intended to address.

Another simple fact is that the adoption of the concepts of philosophy has been so thorough over the centuries that we have reached the stage where the rate of adoption of the remaining un-adopted concepts of philosophy is very slow or so slow that the average person simply cannot see it, even if they look very hard. But, as they say, appearances can be deceiving.

It is also true that a lot of "modern" philosophy has gotten so esoteric and so apparently disjoint from apparent reality that most people see philosophy as being completely disconnected from reality, even if that is not completely the case.

The truth is a bit more complex. Granted a lot of philosophy does appear disconnected from reality and maybe a lot of the time that is the case, but just as often it is simply that philosophy can run well ahead of the times. Politicians may not be ready to pass reasonable and workable laws permitting doctors to "pull the plug on granny" or define precisely how privacy and trust should function in online computer networks, but philosophers and leading edge experts in all disciplines can and do spend serious time discussing these kinds of issues seriously. They are, to put it simply, being philosophical. That is philosophy in action. That is philosophy baking bread. It may not be bread that the average person can eat today, but it is bread that the average person will be taking for granted somewhere down the road in coming years, decades, and centuries.

At the extreme, even the nature of existence itself is still an unsolved problem, with quantum mechanics, string theory, and "god particles" a matter of the kind of speculation and debate normally reserved exclusively for the kind of philosophers who supposedly do not bake bread.

In computer science there is an old saying that AI (Artificial intelligence) is just all of the things that we do not know how to do yet.

I would suggest a similar statement, that philosophy is where we hold preliminary discussions about the toughest unsolved problems of humanity or new ways of thinking that can be applied to those problems.

Philosophy is less concerned with what to do in life and more concerned with how to think about values and the processes of thought and action, so that we can divine better systems of values and better techniques for thought and action that can then be applied in novel ways to solve problems more innovatively than was readily possible in the past.

Put in more pragmatic terms, to be sure, philosophers do not directly bake bread or tell people how to bake bread, but rather offer people novel ways to think about new approaches to baking bread and how to leap ahead and think about meeting the biological and social needs that bread was intended to address in the first place. Philosophy can guide us in forward thinking about ethical and social concerns related to global poverty and global health.

In short, there is still plenty of mileage to be gotten from philosophy, especially in leading edge research efforts of virtually all disciplines, especially in situations where significant uncertainty, lack of determinism, and ethical, social, and political concerns outstrip classical mechanistic approaches to problem solving.

So, yes, philosophers can certainly seem to be lost in the clouds, but a large part of that is because that is where some of the hardest problems facing humanity lie. So many of us remain so lost in the mundane concerns of daily life and the rote details of our disciplines that we continue to stumble through life precisely because we do not have the vantage point from the clouds that would enable us to distinguish the forest from the trees.

-- Jack Krupansky

Monday, March 22, 2010

Progression of knowledge

I have a simplified model of the progression of knowledge. Knowledge somehow needs to start somewhere as informal knowledge or tentative knowledge and eventually end up as formal knowledge, something seriously believed to be true.

In my simplified model knowledge progresses (loosely) in the following incremental steps:

  1. An observation or a thought, something that just pops into your head.
  2. An idea, something you think about.
  3. A concept or belief, something you have given some serious thought to and believe is likely to be valid and true.
  4. A conjecture, a relatively formalized and structured form of a concept.
  5. A theory, a fully formalized version of one or more conjectures.
  6. A hypothesis, a prediction based on a theory that can be tested to prove or disprove part or all of a theory.
  7. A law, conclusion, or generalization based on the theory and tested through hypotheses, experiments, experience, and the passage of time.

So, one or more statements can be categorized as to how developed they are in my knowledge progression.
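As a rough illustration only, the seven steps above could be modeled as an ordered enumeration so that statements can be compared by how developed they are. The names `Stage` and `more_developed` are hypothetical, introduced just for this sketch.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical encoding of the seven steps, ordered from least to most developed."""
    OBSERVATION = 1
    IDEA = 2
    CONCEPT = 3
    CONJECTURE = 4
    THEORY = 5
    HYPOTHESIS = 6
    LAW = 7


def more_developed(a: Stage, b: Stage) -> bool:
    """Return True if stage a is further along the progression than stage b."""
    return a > b


# Categorize a few example statements (the statements are placeholders).
statements = {
    "something that just popped into my head": Stage.OBSERVATION,
    "a formalized account of several conjectures": Stage.THEORY,
    "a notion I have thought about a little": Stage.IDEA,
}

# Order the statements from least to most developed.
ranked = sorted(statements, key=statements.get)
```

An integer-backed enumeration is used only because the steps are described as incremental; the model itself says the progression is loose, so a strict ordering is a simplification.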

-- Jack Krupansky

The semantic abyss: reality vs. our perception and our models

From the very moment we first open our eyes or first hear some sound or first touch anything we feel that we are experiencing the world around us and that we know that world, reality, but do we? Given enough experience, we gradually realize that some if not many of our earlier perceptions are not completely in accord with reality as it really exists. So, the most basic conception of the Semantic Abyss is that we have two worlds to deal with: 1) the real world, reality itself, and 2) the perceived world, our mental model of what we think or imagine the real world is.

We actually have a third and fourth world to deal with: 3) a model of the real world constructed from conceptions based on our perceptions that we can express to others, and 4) the models of the world that others have constructed and endeavored to communicate to us.

Somehow, we merge, mesh, and blend these three models and derive a composite model of the real world.

Over time and with enough input with enough diversity we come up with ever-better models that better represent the reality of the real world, but despite our best efforts, there will always be a lingering Semantic Abyss between the real world as it really is and our best mental model of what the world is.

Another issue is that even when we are fortunate enough to establish a workable one-to-one correspondence between the real world and our mental model of the real world, there is no guarantee that each correspondence of reality and mental model is accurate and rich enough to adequately model the full complexity of the real world.

A final issue is that we wish to share our models with computers and other artificial entities (e.g., robots which seek to move around and interact with the real world) so that computer programs can make sense of the real world, either in terms of recorded data or real-time sensor data.

In short, we deal with four models of the real world:

  1. The real world as it is that we can observe and interact and experiment with.
  2. Our perception and internal conception of the real world.
  3. The communicable model of the real world that we share with each other.
  4. Computer models of the real world which can be readily manipulated by computer programs.

There are plenty of gaps between those four models of reality that we need to cope with when dealing with knowledge of the real world.

-- Jack Krupansky

Thursday, March 18, 2010

Bridging the semantic gap

Given that there is a semantic gap that I have been referring to as the Semantic Abyss, how exactly do we go about bridging the gap? My overall position remains that the gap is far too great to completely bridge now, or at any time in the near future. That said, it is worth considering the many ways in which we can partially bridge the gap and what potholes, barriers, and mine fields exist in the remainder of the gap. I will not endeavor to do all of that right now and right here, but some examples are worth considering.

First, we have to acknowledge that it is virtually impossible to bridge the semantic gap as a general proposition and that at best we can only hope to approximate bridging the gap. Ray Kurzweil's vision of a Singularity would obviously have to complete the 100% bridging of the semantic gap by the year 2045 if not sooner, but that is far beyond the scope of my near-term interests.

Second, we have to acknowledge that there are a multitude of semantic gaps. For example, there is the semantic gap between any two individuals, and we need to acknowledge that the gap differs between every distinct pair of individuals, especially depending on how much knowledge they already share.

Third, in general, bridging the gap is a bidirectional process, not a one-way communication. For example, the more knowledgeable party has to learn at least an overview of what the less-knowledgeable party already knows and doesn't know before or as part of the process of bridging the semantic gap. As a general proposition, every party has their semantic strengths and their semantic weaknesses and bridging the semantic gap simply means a semantic balancing, so that at the end of the process they each know what the other knew. But that is only as a general proposition.

Fourth, bridging the semantic gap is frequently and intentionally an asymmetric process, where one or more of the parties seeks a semantic advantage over the other, as in a negotiation or propaganda. The same holds even in education, where it is usually preferable to incrementally stage the semantic transfer rather than attempt to accomplish it all at once, since the cognitive capabilities of the students are under development over an extended period of time. You could say that a typical student accepts a semantic weakness if only because of the incremental nature of education. An amateur might also accept a semantic weakness relative to a professional.

As a general proposition the process of bridging the semantic gap is a learning process. As an extreme case, absolute semantic peers are "on the same page" and can communicate without any significant learning required. As a practical matter even nominal semantic peers frequently are not exactly "on the same page" and miscommunication occurs until one or both parties recognizes that the peer relationship has broken down and learning is required.

Partial knowledge is a common "solution" to bridging the semantic gap. By both parties "agreeing" that not all seemingly relevant knowledge is needed in any particular situation, the semantic gap can be dramatically reduced, "by definition" (agreement.) As a specialized case, we may simply decide that computers are still far too "weak" to support full comprehension of human knowledge and decide on a structured subset of knowledge to shrink the semantic gap to a manageable size.

Hardwired knowledge is another common solution, especially, but not exclusively, with artificial entities. The entity with the hardwired knowledge doesn't really "know" what it is dealing with in a very deep and meaningful sense, but "knows" at least deep enough so that a relatively meaningful conversation can occur.

(To be continued, eventually.)

-- Jack Krupansky

Wednesday, March 17, 2010

What do I know? What do you know? What do we know?

So, what do I know? Literally. Or what do you know? And what do we collectively know? Even if we sincerely wanted to represent everything that we know and are very diligent about going about the task, it is still virtually impossible for us to adequately convey to anyone, person or computer or simply in written word, all knowledge that we possess, either individually or collectively. At best, we can approximate what we know.

One of the biggest problems is dealing with tacit knowledge where we are clearly able to perform various tasks but are literally unable to express in natural language exactly how we are able to perform those tasks.

Another big problem is that most people do not have photographic memories and are frequently unable to recall knowledge on demand even though in some other situation or simply after the passage of time or if prompted their recall may come much more readily or with more fidelity.

There are many other difficulties with any of us being able to fully express the totality of our knowledge.

The real problem is that even if we could express everything that we know, there is no reliable way for any of the rest of us to read or view or listen to those expressions and have 100% certainty that we understood what the other person intended to express.

So, we have these distinct, although overlapping collections of knowledge:

  • What do you or I know by ourselves
  • What personal knowledge can we consciously contemplate
  • What personal knowledge can we adequately express in natural language or any other knowledge artifact
  • What personal knowledge do we choose and intend to express
  • What did we actually express relative to what we intended to express
  • What portion of our expressed personal knowledge can be reliably deciphered by others
  • How others interpret what they read or hear that we have expressed
  • How much of what they have interpreted can be remembered and recalled and with what reliability and accuracy
  • How reliably and accurately can others relate our knowledge that they have acquired to a third party
  • How much acquired knowledge of another (or others) and our own personal knowledge are coalesced into shared knowledge
  • How much shared knowledge can be reliably and accurately shared with other parties
  • Our ability to distinguish which portions of knowledge came from whom or among whom it is shared

And that is just between two real people. Add more people, many more people. And add the many combinations of two or more people, the groupings of people we find in the real world. Layer onto that the huge issue of how to represent human knowledge in a form that artificial entities can adequately process. Obviously that is what we want to try to do in a full-blown knowledge web.

And even after we have done all of that, we must acknowledge and cope with the fact that our knowledge is a living thing, subject to constant and continual change.

-- Jack Krupansky

Tuesday, March 16, 2010

Relationship between sentience and knowledge

Although my primary interest is in representation of knowledge, it makes sense to focus attention on how knowledge is generated and used by sentient entities, whether they are real people or artificial sentient entities such as robots and software agents.

I propose a fairly simple model of the structure of a sentient entity in terms of functional capabilities that somehow relate to consumption and production of data, information, knowledge, or wisdom:

  • Sense, observe, measure entities (both sentient and non-sentient) and phenomena in the environment
  • Feel, sense (at a higher level, processing what was sensed at the raw sensory level), react - emotions, instinctive processes
  • Remember and recall
  • Think - perceive, analyze, conceive, speculate, contemplate, believe, desire, intend, plan, decide, control
  • Express feelings, emotions, reactions (relatively unidirectional)
  • Communicate mental state, record information
  • Act, behave
  • Interact with other sentient entities in relatively intense conversational mode
  • Intuit
  • Read minds [Really? Well, at least conceptually.]

Data, information, and knowledge flow into each of these functional capabilities either from the environment or from other functional capabilities, are processed to some degree, and some new form of data, information, or knowledge is generated and made available to other functional capabilities or the environment.

The representation of data, information, and knowledge within any of these functional capabilities may or may not be comparable or synchronized with external representations of that data, information, and knowledge in a Semantic Web or Knowledge Web. There may be some boundary lines delineating internal versus external flows or availability of any or all of the data, information, and knowledge.

Note: I am not sure how the information from a neurological brain scan fits into this model. Maybe it is simply an indistinct composite or a composition of neurological state information. Conceptually, the same issue can occur with an artificial sentient entity, by examining raw data in state variables (e.g., a raw memory dump). Whether this might have some utility is unknown, but it could have some analogy to a signature such as is done with a computer virus scan.

One could also view the DNA of a sentient entity, or the code of an artificial sentient entity as data, information, and knowledge as well.

Ditto for the biography or history of the sentient entity as well.

-- Jack Krupansky

Monday, March 15, 2010

What determines the future (or caused some outcome)?

Only the most mindless simpleton believes that the future is predetermined and that everything that happens does so because it was "destined" or predetermined to happen. Most of us can agree that predestination is not an adequate account of reality. But that leaves open the general question of what determines the future or even any outcome in the present? If hard, full determinism does not preordain all outcomes, what model for the progression of reality should we be using? Just for the record, I will state my simplified model of what determines the future.

Every event or outcome or change of state in reality (the universe) is determined by some combination of factors, even if we may not be able to clearly determine what those factors may specifically be in any given instance. The categories of these factors are:

  1. Natural progression. Law-like behavior such as gravity, an object rolling down a hill, hot air rising, momentum, orbiting bodies, or the life cycle of living things. Or something as simple as evaluating a mathematical equation across its domain. Outcome is very predictable and causality is well-defined.
  2. Specific causal factors. Forces, objects, actors, drives, etc. which are reasonably "clear", including the proverbial "smoking gun." Outcome may be moderately predictable and causality relatively easily determined.
  3. Non-specific causal factors. Something influenced or caused a change even if we have difficulty or are even unable to determine what the causal events actually were. Outcome has low or no predictability and any apparent causality will tend to be mostly speculative in nature.
  4. Random variability. Ranging from quantum indeterminism and radioactive decay to statistical, stochastic, and chaotic processes. Even if we recreate the exact prior situation (say, in a parallel universe), the outcome could vary. Even omniscience and omnipotence would not determine the outcome. No predictability other than possibly a statistical distribution. Causality may sometimes be established by the nature of the event (e.g., radioactive decay), but may be completely indeterminate (e.g., judging free will decision vs. known bias.)
  5. Free will. Choice by a sentient entity (e.g., person or computer) unconstrained by any factors. Various factors may inform or influence or guide or even bias choice, but ultimately there is an act of free will making the decision. May or may not be predictable. Causality may be very difficult if not impossible to establish, although a sentient entity might communicate its decision-making process or a brain scan might suggest whether free will was a significant factor or not.
  6. Intervention by a deity. Not everyone believes in a God, but those who do might find the intentions of a God a more credible explanation for events and outcomes than other, more worldly factors.

Now, I have attempted to summarize a model for what determines the future (or caused some past outcome) in the real world for real people. That said, is this same model valid for any or all virtual worlds? I think so, but not necessarily. Some categories of factors may not be relevant in some specific virtual worlds, but are there other categories that are operative in all or some specific virtual worlds but not operative in our real world? Conceivably one could define such a virtual world, although I have not personally heard of one. Nonetheless, it would be interesting to speculate what additional categories of factors might conceivably apply to the Semantic Web and future Knowledge Webs, especially as artificial sentient entities (software agents, robots, etc.) begin to proliferate.

As a final note, all of this ties in with provenance as well, a topic of emerging interest in the Semantic Web, although currently the Semantic Web is more interested in the who of a change in data rather than some deeper why.

-- Jack Krupansky

Sunday, March 7, 2010

David Gelernter: Time to Start Taking the Internet Seriously

I just finished reading an essay on Edge by noted computer scientist David Gelernter entitled "Time to Start Taking the Internet Seriously" which basically argues for his concept of lifestreams as a better model for publishing and accessing information than today's web model. Rather than organizing information in a spatial form, he recommends that we think about and organize information along the time dimension. As he puts it:

The Internet's future is not Web 2.0 or 200.0 but the post-Web, where time instead of space is the organizing principle -- instead of many stained-glass windows, instead of information laid out in space, like vegetables at a market -- the Net will be many streams of information flowing through time. The Cybersphere as a whole equals every stream in the Internet blended together: the whole world telling its own story.

He proceeds to describe the nature of the problem and how lifestreams will address it:

13. The traditional web site is static, but the Internet specializes in flowing, changing information. The "velocity of information" is important -- not just the facts but their rate and direction of flow. Today's typical website is like a stained glass window, many small panels leaded together. There is no good way to change stained glass, and no one expects it to change. So it's not surprising that the Internet is now being overtaken by a different kind of cyberstructure.

14. The structure called a cyberstream or lifestream is better suited to the Internet than a conventional website because it shows information-in-motion, a rushing flow of fresh information instead of a stagnant pool.

15. Every month, more and more information surges through the Cybersphere in lifestreams — some called blogs, "feeds," "activity streams," "event streams," Twitter streams. All these streams are specialized examples of the cyberstructure we called a lifestream in the mid-1990s: a stream made of all sorts of digital documents, arranged by time of creation or arrival, changing in realtime; a stream you can focus and thus turn into a different stream; a stream with a past, present and future. The future flows through the present into the past at the speed of time.

16. Your own information -- all your communications, documents, photos, videos -- including "cross network" information -- phone calls, voice messages, text messages -- will be stored in a lifestream in the Cloud.

17. There is no clear way to blend two standard websites together, but it's obvious how to blend two streams. You simply shuffle them together like two decks of cards, maintaining time-order -- putting the earlier document first. Blending is important because we must be able to add and subtract in the Cybersphere. We add streams together by blending them. Because it's easy to blend any group of streams, it's easy to integrate stream-structured sites so we can treat the group as a unit, not as many separate points of activity; and integration is important to solving the information overload problem. We subtract streams by searching or focusing. Searching a stream for "snow" means that I subtract every stream-element that doesn't deal with snow. Subtracting the "not snow" stream from the mainstream yields a "snow" stream. Blending streams and searching them are the addition and subtraction of the new Cybersphere.

18. Nearly all flowing, changing information on the Internet will move through streams. You will be able to gather and blend together all the streams that interest you. Streams of world news or news about your friends, streams that describe prices or auctions or new findings in any field, or traffic, weather, markets -- they will all be gathered and blended into one stream. Then your own personal lifestream will be added. The result is your mainstream: different from all others; a fast-moving river of all the digital information you care about.

In short:

To accomplish this, we merely need to turn the whole Cybersphere on its side, so that time instead of space is the main axis.

There is much more to his model for information in the "Cybersphere", but time-based lifestreams are his core starting point.
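The blending and focusing described in point 17 can be sketched in a few lines. This is only my illustration under stated assumptions, not Gelernter's implementation: the names `Item`, `blend`, and `focus` are hypothetical, each input stream is assumed to be already sorted by time, and "dealing with snow" is reduced to a simple substring match.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Item:
    """One stream element: a timestamp plus some text (hypothetical shape)."""
    when: datetime
    text: str


def blend(*streams):
    """Add streams: shuffle time-ordered streams together, earlier document first."""
    return list(heapq.merge(*streams, key=lambda item: item.when))


def focus(stream, term):
    """Subtract from a stream: keep only the elements that mention the term."""
    return [item for item in stream if term.lower() in item.text.lower()]


# Two example streams, each already in time order (sample data is made up).
news = [
    Item(datetime(2010, 3, 1, 9), "Snow closes roads"),
    Item(datetime(2010, 3, 3, 8), "Markets rally"),
]
friends = [
    Item(datetime(2010, 3, 2, 12), "Lunch photos"),
    Item(datetime(2010, 3, 4, 7), "More snow tonight"),
]

mainstream = blend(news, friends)        # one stream, still in time order
snow_stream = focus(mainstream, "snow")  # the "snow" stream
```

Because both operations take streams and return streams, they compose freely: a focused stream can be blended with another stream, and a blended stream can be focused again.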

-- Jack Krupansky

The welling up of knowledge

I was reading an essay on Edge by noted computer scientist David Gelernter entitled "Time to Start Taking the Internet Seriously" and ran across a reference to the concept of information welling up in the context of his conception of lifestreams. He wrote:

Ten years ago I described the computer of the future as a "scooped-out hole in the beach where information from the Cybersphere wells up like seawater."  Today the spread of wireless coverage and the growing power of mobile devices means that information does indeed well up almost anywhere you switch on your laptop or cellphone; and "anywhere" will be true before long.

That's an interesting concept. Rather than explicitly accessing data by going to its source or explicitly searching for it, all one need do is create the proper situation (the well) and the data simply appears or wells up, welcomed but not directly or explicitly bidden per se.

So, we have a collection of concepts here, in my view:

  • knowledge wells (or data wells or information wells) which are places where information can simply materialize (or the data equivalent)
  • knowledge welling, the incremental (or streaming or merely "seeping") appearance of data in a knowledge well (or data well or information well)
  • welled knowledge (or welled data or welled information), which is knowledge that appears in a knowledge well
  • wellable knowledge (or wellable data or wellable information), which is knowledge that is somehow prepared or packaged or published in a form that makes it readily distributable to knowledge wells.

At a simplistic level, a knowledge well could simply be a search query directed at some data source, but to truly fulfill Gelernter's vision, something far more sophisticated is needed. What that something might be I cannot say at this time.

Curiously, maybe there is a community collaboration angle there as well, since the term reminds me of the famous The Whole Earth 'Lectronic Link known as The WELL. Whether or not a connection between the two concepts would make sense would depend on how specifically and narrowly one wants to define the terms. One could define a simple RSS feed as an information well, I suppose. One could define the Twitter public timeline as an information well. Sure, one can tap into any "conference" on The WELL, but then that is a fairly narrow information stream. Somehow, a Gelernteresque knowledge well would have a more global, blended un-focus, I would think.

Thinking about how information might well up reminds me of a concept I considered years ago, something I call GMWIMW, for Give Me What I Might Want, a mythical filter for information on topics that I do not even know about yet. That would be at least one type of knowledge well that I would be interested in.

-- Jack Krupansky

Wednesday, March 3, 2010

What is the unit of meaning?

Superficially, that is the question: What is the unit of meaning? But, that one question is part of a bundle of questions, including (but not limited to):

  1. What is the unit of knowledge?
  2. What is the unit of semantics?
  3. What is the unit of meaning?
  4. What is the unit of communications?
  5. What is the unit of expression?
  6. What is the unit of thought?
  7. What is the unit of facts?
  8. What is the unit of reasoning?
  9. What is the unit of objectivity?
  10. What is the unit of subjectivity?
  11. What is the unit of context?
  12. What is a unit in a holistic system?

In natural language the obvious choices for a unit are word, sentence, phrase, and morpheme. I would lean towards word or term or sometimes phrase, but at least when it comes to foreign language translation, anything less than a sentence is questionable for capturing meaning. Sure, we can look a word up in a dictionary, but frequently we find that a word will have multiple senses and the context of a phrase, clause, or sentence is needed to decipher which sense is appropriate. Maybe this simply means that words are still an appropriate unit, but that context is needed as well, much in the way that pieces of wood and nails can be units for building, but tools such as a saw and a hammer and a plan are needed to make sense of the units.

In the Semantic Web, we have units such as URIs and literals, but it is the statement or triple as a unit of expression that seems the most useful focus. Or maybe not. An RDF statement is somewhat analogous to a natural language statement. A URI or literal string is comparable to a natural language word. The same literal can be used in multiple RDF statements, with a multiplicity of senses, each suggested by the resource and predicate of the RDF statement which contains it.
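To make that last point concrete, here is a minimal sketch that models statements as plain subject, predicate, object tuples rather than using a real RDF library. The `example.org` URIs, the `Statement` type, and the `contexts_for` helper are all made up for illustration.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """A simplified stand-in for an RDF statement (triple)."""
    subject: str    # the resource the statement is about
    predicate: str  # the property being asserted
    object: str     # here always a literal string


# The same literal appears in two statements with two different senses.
statements = [
    Statement("http://example.org/animal/jaguar",
              "http://example.org/vocab/commonName", "jaguar"),
    Statement("http://example.org/car/jaguar",
              "http://example.org/vocab/brandName", "jaguar"),
]


def contexts_for(literal, stmts):
    """Return the (subject, predicate) pairs that suggest a literal's sense."""
    return [(s.subject, s.predicate) for s in stmts if s.object == literal]


senses = contexts_for("jaguar", statements)
```

The literal alone carries no sense; only the pair of resource and predicate around it distinguishes the animal from the car brand, much as the surrounding sentence does for a word.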

For now, I would suggest that the word is the natural unit of meaning in natural language, and the URI is the natural unit of meaning in the Semantic Web.

One related question that concerns me: The Semantic Web does not seem to have the concept of sense for URIs that we have in a natural language dictionary. Hmmm...

Language, natural or otherwise, is used to convey meaning from one party to another. Meaning and knowledge exist primarily in the minds of the parties who are communicating. Actual words and sentences or expressions in natural language are knowledge artifacts rather than the actual knowledge and meaning itself. As carefully as we may try, analysis of natural language text can only approximate whatever meaning was intended by the initiator of the expression. So, in some sense, deciding on the unit for natural language text does not necessarily tell us the unit for meaning and knowledge in the human mind. Nonetheless, we need to start somewhere and the knowledge artifacts of natural language are a rich trove to start with.

I'll stop there for now. More thought is needed.

-- Jack Krupansky