Saturday, April 08, 2006

Normalised Google Distance

The fact that latent semantic context information is entered into the WWW (the largest database on earth) by millions of independent users every day has impressed me for as long as I can remember. Thanks to her, I learnt about NGD today! The way I understand it:

  • We demonstrate positive correlations, evidencing an underlying semantic structure, in both numerical symbol notations and number-name words in a variety of natural languages and contexts.
  • Also, we demonstrate the ability to distinguish between colors and numbers, and to distinguish between 17th century Dutch painters; the ability to understand electrical terms, religious terms, and emergency incidents.
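
To make the idea concrete, here is a minimal sketch of how NGD can be computed from page counts, as I understand the definition: NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f(x) is the number of pages containing x, f(x, y) the number containing both terms, and N the total number of indexed pages. The function name `ngd`, its parameters and the counts in the usage lines are illustrative assumptions; no search engine is actually queried here.

```python
import math

def ngd(fx, fy, fxy, n):
    """Normalised Google Distance from raw page counts.

    fx, fy -- pages containing each term on its own
    fxy    -- pages containing both terms
    n      -- total number of pages indexed (an assumed normalising constant)
    """
    # If either term, or the pair, never occurs, treat the distance as infinite.
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))

# Illustrative counts only, not fetched from any search engine.
d = ngd(fx=46_700_000, fy=12_200_000, fxy=2_630_000, n=8_058_044_651)
print(round(d, 3))  # roughly 0.44
```

Terms that always appear together get a distance of 0, and the value grows as the two terms co-occur less often relative to how common each one is.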

Likewise, the idea of automatically extracting the meaning of words and phrases from Google page counts of the world-wide-web is amazing indeed. One of the most important resources available to researchers in computational linguistics and text analysis is WordNet, an online lexical database. An online version of the same can be viewed at

