**2. Some history and a network primer**

Among the first individuals to try to understand the nature of human social networks were economists, psychologists and sociologists. They researched the nature of human and society interactions and turned to the theory of graphs (diagrams consisting of dots and lines) to help them get insights into the structure of these networks.

In the 1960'sStanley Milgram was a pioneer of experiments concerning the nature of networks. He instructed a* source* to send a letter to a person the source knew on a first-name basis with instructions to forward the letter to another person (using the same set of instructions) with the goal of reaching a specific *target* person. After many trials, the experiment led to the observation that successful chains between a source and target used 5 to 6 steps on average. Not only was it unclear why social networks should have what seemed such a small "diameter," but also granted that social networks do have a small diameter, there was a process of locating paths between people using only information of a local kind (known to each individual in the chain) which worked.

*Graph theory, *a rapidly growing subdiscipline of mathematics and computer science, has provided a variety of tools to investigate problems about networks. Graphs are a powerful tool for representing information about many kinds of phenomena. A dot, called a *vertex*, represents something (e.g. a person) and dots are joined with a line segment, called an *edge*, when the dots they represent share a certain property. For example, if vertices represent people, the edges could be used to represent those people who spoke to each other on the phone on June 3, 2003. A related mathematical model is that of a *digraph*, short for *directed graph*. Here the line segments have arrows, so dots again might represent people and there will be a line segment with an arrow from one dot to another if, say, the first person called the second on the phone. Below are very small examples of a graph and a digraph. You can think of the graph as showing which friends talked to each other on the phone on a certain day, and the digraph as showing who called whom. Note that the digraph has the potential for double the number of edges as the corresponding graph with the same vertex set. This leads to practical questions related to storing very large digraphs (say on 200,000,000 vertices), and explains why sometimes a situation which seems clearly to call for being represented by a digraph is represented by a graph instead.

Figure 1

In a graph it is natural to take an interest in how many edges are present at a vertex. This number is known as the *degree* or *valence* of a vertex. In the graph above the degree of the vertex that represents Jacob is 3 and the degree of the vertex that represents Alice is 1. One of the first theorems of graph theory was the observation that the sum of the valences in a graph gives twice the number of edges in the graph, because when adding the valences of a graph one counts each edge twice. From this it follows that the number of vertices in a graph of odd-valence (a graph in which each vertex has odd valence) must be even. In a digraph there is potential for an edge at a vertex to point into the vertex or out. The number of edges at a vertex which point into the vertex is known as the* indegree* or *invalence* while the number of edges at a vertex that point out of the vertex is known as the *outdegree* or *outvalence*. In the digraph in Figure 1 the indegree of the vertex which represents Mary is 1 while the outdegree of this vertex is 3. Can you see why the sum of the indegrees of all of the vertices in a graph must be the number of directed edges in a digraph? Similarly, the sum of all of the outdegrees of all the vertices also must be the number of directed edges in the digraph.

Another basic idea about the structure of a graph concerns whether it is possible to get between any pair of vertices via a path or chain of edges. When graphs have this property they are called *connected*. For digraphs there are a variety of "connectivity" concepts. One is whether or not the *underlying graph*, obtained by disregarding the arrows on the edges of a digraph, is connected. Another concept is whether there is a directed chain of edges (i.e. following the arrows) from any vertex *u* of the digraph to any other digraph vertex *w*, as well as such a chain from *w* to *u*. A digraph with this property is called *strongly connected*. Figure 1 shows a connected graph and a digraph whose underlying graph is connected but not strongly connected. A necessary condition for a graph to be strongly connected is that it have positive indegree and outdegree at every vertex of the digraph. If a graph is connected or a digraph strongly connected, an important characteristic of the graph is how far apart two vertices in the graph (or digraph) can be. The distance between two vertices in a connected graph is the number of edges in a shortest path between the two vertices. In the graph in Figure 1 the distance between the vertices labeled with Alice and Charles is 3 while the distance between the two vertices labeled Mary and Barry is 1. The maximum distance between any pair of vertices in a graph is called the *diameter* of the graph. The diameter of the graph in Figure 1 is 3. Note that in a digraph, even one whose underlying graph is connected, there may not be a "distance" between some pairs of vertices, and the distance between vertices *u* and *v* may not be the same as the distance between *v* and *u*.

It is this concept of diameter that is related to Milgram's experiments described above. Given the fact that humans inhabit 6 continents and are separated by distance, language, custom and culture, could it really be true that any pair of individuals can be joined by a chain of length at most 6 (e.g. six degrees of separation)? Could it really be that the graph of Internet page connections is of diameter 19? Could it be that with no more than 19 clicks of the mouse one can get between any two web pages?