Skip to main content

Section B.2 Computer Science: PageRank

Subsection B.2.1 Activities

Activity B.2.1. The $978,000,000,000 Problem.

In the picture below, each circle represents a webpage, and each arrow represents a link from one page to another.
Figure 80. A seven-webpage network
Based on how these pages link to each other, write a list of the 7 webpages in order from most important to least important.

Observation B.2.2. The $978,000,000,000 Idea.

Links are endorsements. That is:
  1. A webpage is important if it is linked to (endorsed) by important pages.
  2. A webpage distributes its importance equally among all the pages it links to (endorses).

Example B.2.3.

Consider this small network with only three pages. Let \(x_1, x_2, x_3\) be the importance of the three pages respectively.
Figure 81. A three-webpage network
  1. \(x_1\) splits its endorsement in half between \(x_2\) and \(x_3\)
  2. \(x_2\) sends all of its endorsement to \(x_1\)
  3. \(x_3\) sends all of its endorsement to \(x_2\text{.}\)
This corresponds to the page rank system:
\begin{alignat*}{4} && x_2 && &=& x_1 \\ \frac{1}{2} x_1&& &+&x_3 &=& x_2\\ \frac{1}{2} x_1&& && &=& x_3 \end{alignat*}

Observation B.2.4.

Figure 82. A three-webpage network
\begin{alignat*}{4} && x_2 && &=& x_1 \\ \frac{1}{2} x_1&& &+&x_3 &=& x_2\\ \frac{1}{2} x_1&& && &=& x_3 \end{alignat*}
\begin{equation*} \left[\begin{array}{ccc}0&1&0\\\frac{1}{2}&0 & 1\\\frac{1}{2}&0&0\end{array}\right] \left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right] = \left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right] \end{equation*}
By writing this linear system in terms of matrix multiplication, we obtain the page rank matrix \(A = \left[\begin{array}{ccc} 0 & 1 & 0 \\ \frac{1}{2} & 0 & 1 \\ \frac{1}{2} & 0 & 0 \end{array}\right]\) and page rank vector \(\vec{x}=\left[\begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array}\right]\text{.}\)
Thus, computing the importance of pages on a network is equivalent to solving the matrix equation \(A\vec{x}=1\vec{x}\text{.}\)

Activity B.2.5.

Thus, our $978,000,000,000 problem is what kind of problem?
\begin{equation*} \left[\begin{array}{ccc}0&1&0\\\frac{1}{2}&0&\frac{1}{2}\\\frac{1}{2}&0&0\end{array}\right] \left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right] = 1\left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right] \end{equation*}
  1. An antiderivative problem
  2. A bijection problem
  3. A cofactoring problem
  4. A determinant problem
  5. An eigenvector problem

Activity B.2.6.

Find a page rank vector \(\vec x\) satisfying \(A\vec x=1\vec x\) for the following network’s page rank matrix \(A\text{.}\)
That is, find the eigenspace associated with \(\lambda=1\) for the matrix \(A\text{,}\) and choose a vector from that eigenspace.
Figure 83. A three-webpage network
\begin{equation*} A = \left[\begin{array}{ccc} 0 & 1 & 0 \\ \frac{1}{2} & 0 & 1 \\ \frac{1}{2} & 0 & 0 \end{array}\right] \end{equation*}

Observation B.2.7.

Row-reducing \(A-I = \left[\begin{array}{ccc} -1 & 1 & 0 \\ \frac{1}{2} & -1 & 1 \\ \frac{1}{2} & 0 & -1 \end{array}\right] \sim \left[\begin{array}{ccc} 1 & 0 & -2 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{array}\right]\) yields the basic eigenvector \(\left[\begin{array}{c} 2 \\ 2 \\1 \end{array}\right]\text{.}\)
Therefore, we may conclude that pages \(1\) and \(2\) are equally important, and both pages are twice as important as page \(3\text{.}\)

Activity B.2.8.

Compute the \(7 \times 7\) page rank matrix for the following network.
Figure 84. A seven-webpage network
For example, since website \(1\) distributes its endorsement equally between \(2\) and \(4\text{,}\) the first column is \(\left[\begin{array}{c} 0 \\ \frac{1}{2} \\ 0 \\ \frac{1}{2} \\ 0 \\ 0 \\ 0 \end{array}\right]\text{.}\)

Activity B.2.9.

Find a page rank vector for the given page rank matrix.
\begin{equation*} A=\left[\begin{array}{ccccccc} 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 \\ \frac{1}{2} & 0 & 0 & 1 & 0 & 0 & \frac{1}{2} \\ 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 & 0 & \frac{1}{2} \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{2} & 0 & 0 \\ 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} & \frac{1}{2} & 0 \end{array}\right] \end{equation*}
Figure 85. A seven-webpage network
Which webpage is most important?

Observation B.2.10.

Since a page rank vector for the network is given by \(\vec x\text{,}\) it’s reasonable to consider page \(2\) as the most important page.
\begin{equation*} \vec{x} = \left[\begin{array}{c} 2 \\ 4 \\2 \\ 2.5 \\ 0 \\ 0 \\ 1\end{array}\right] \end{equation*}
Based upon this page rank vector, here is a complete ranking of all seven pages from most important to least important:
\begin{equation*} 2, 4, 1, 3, 7, 5, 6 \end{equation*}
Figure 86. A seven-webpage network

Activity B.2.11.

Given the following diagram, use a page rank vector to rank the pages \(1\) through \(7\) in order from most important to least important.
Figure 87. Another seven-webpage network