From SIAM News, Volume 37, Number 3, April 2004
Finding Information in Networks
By Prabhakar Raghavan
Consider a Web surfer performing a random walk on the Web -- after visiting a page, he proceeds on one of the outgoing hyperlinks from that page, chosen at random.* In the long run, such a walk will visit every Web page with a certain frequency. This quantity is a subtle measure of how "popular" each page is. To receive a high score, it does not suffice for a page to have many hyperlinks pointing into it; rather, it is the pages pointed to by many other highly referenced Web pages that will garner high scores. Inspired by methods from citation analysis, this scoring technique is used in Google and other Web search engines to decide which pages should rank highest in response to a query.
Such link analysis is but one visible example of the broader research area known as social network analysis, which originated with the following classic experiment carried out by Stanley Milgram in the 1960s. Milgram asked each of several volunteers in the Midwest to convey a letter to a target in Boston. The catch: A volunteer could send the letter only to someone she knew on a first-name basis; that person, in turn, would send it on to someone he knew. The goal was to get the letter to the target as quickly as possible. Milgram found that the median path length (the number of intermediaries en route to the target) for successful deliveries was six, leading to the popular notion that any two humans have at most "six degrees of separation."
Over the last ten years, these ideas from social network analysis have progressed beyond networks of links between people. In our opening example, we discussed how links between Web pages could be useful in search engines. Other such networks include the network of calls between phone numbers, and the network of e-mails between users -- dense regions in these networks tend to form around participants with common interests. Another example comes from so-called recommendation systems deployed at e-commerce sites on the Web: By analyzing the links between people and the products they buy (as implicit in their purchases), the system recommends products that site visitors might be interested in, based on their and other users' past behavior. The simplest incarnation is a recommendation in the form "people who purchased this item also purchased ...."
While these examples are promising, much work remains to be done in this nascent area.
Recently, researchers have begun to explore so-called webs of trust -- networks in which a link from one user to another carries a real number denoting how much the first user trusts (or distrusts) the second. If A distrusts B and B distrusts C, what does that say about A's trust for C? If C likes a book, should A like it as well? Questions such as these must first be resolved at a philosophical level before being approached mathematically. Connecting people with information is a fundamental problem in today's economy. Mathematicians who study networks, interacting with researchers in other disciplines, will have an exciting role to play.
Prabhakar Raghavan is vice president and chief technology officer at Verity and a consulting professor of computer science at Stanford University.
* In the absence of an outgoing hyperlink, he jumps to a random Web page.