Text description - Google PageRank video
VISION: GOOGLE spelt out, page turns
NARRATOR: Every day, all over the world, people use Google as their first choice of search engine…
VISION: Lily Serna at Google's Australian headquarters
Lily Serna: “What is the deal with Google? It's well and truly a household name; it's even a verb, "to Google" something. Why is it so popular? Let's go find out.”
VISION: Google page ranking website info; Larry Page photo
NARRATOR: One of the main reasons is their clever mathematical software. PageRank is a link analysis algorithm, named after Larry Page. It's used by the Google web search engine and assigns a numerical weighting to each element of a hyperlinked set of documents, with the purpose of measuring its relative importance within the set.
VISION: Dave Day, GOOGLE software engineer SYNC:
“Maths is really the core of computer science, so it’s vital to what we do here at Google.”
VISION: Websites ..links...
Dave Day, GOOGLE software engineer SYNC: “So if you come up with the solution or an algorithm to a problem, then maths is the tool that lets you understand how well that’s gonna run, how much time or space it’s gonna take, and it’s gonna let you then prove whether that’s the optimal way of solving that problem, whether there are better ways. Or even sometimes, if you have a problem, maths can show you: is it possible for a computer to solve that problem at all?”
VISION: web pages/links
NARRATOR: At the heart of Google software is a system called PageRank, which basically gives every site on the Internet a rank between 0 and 1. So how is this calculated? Well, the page rank of your site is determined by the links to your web site. Each time somebody adds a link to your web site, Google interprets this as a vote for your site. The more links you have to your site, the more votes you get.
VISION: Dave Day, GOOGLE software engineer SYNC :
“With PageRank, what we like to do is think of all the different pages on the internet as different nodes in the graph. So if you have Page A and it has a link pointing to Page B, then we have those as two separate nodes with an edge joining them together. And what we have is, when you have all those nodes and edges together, that’s what we call, in mathematics, a graph. And once we can understand the internet’s connectivity as a graph, then we can use a whole heap of powerful mathematics to understand it.”
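As an illustration (not shown in the video), the nodes-and-edges idea can be sketched in a few lines of Python. Only the link from Page A to Page B comes from the description above; "Page C" and the other links are made up for the example.

```python
# Each key is a node (a page); each list holds the pages it links to.
# Only "Page A" -> "Page B" is from the description; the rest is hypothetical.
links = {
    "Page A": ["Page B"],
    "Page B": ["Page C"],
    "Page C": ["Page A", "Page B"],
}

nodes = sorted(links)

# An edge is a (source, destination) pair, one per hyperlink.
edges = [(src, dst) for src in nodes for dst in links[src]]

# Count how many links point at each page.
in_links = {page: 0 for page in nodes}
for _, dst in edges:
    in_links[dst] += 1

print(edges)
print(in_links)
```

Storing only the links each page actually has, rather than every possible pair of pages, keeps the structure small even when the number of pages is large.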
VISION: Diagrams and graphics
NARRATOR: Web pages are all linked; they're part of a network. So an inlink from my page to your page is an endorsement of your page. The more inlinks your page has, the more important it is. But if your page has an inlink from a page with many outlinks, then that endorsement of your page is of lesser value, as it is coming from a page that is less discerning.
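A small sketch of why an endorsement from a page with many outlinks counts for less. It assumes the usual PageRank convention that a page's vote is split equally among its outlinks, which the narration implies but does not spell out; the page names are hypothetical.

```python
from fractions import Fraction

# Hypothetical pages: "mine" links only to "yours"; "hub" links to four pages.
out_links = {
    "mine": ["yours"],
    "hub": ["yours", "a", "b", "c"],
}

def endorsement_value(source):
    """Share of the source page's vote carried by each of its outlinks
    (assumed to be an equal split)."""
    return Fraction(1, len(out_links[source]))

# How much of each linking page's vote reaches "yours".
votes_for_yours = {
    src: endorsement_value(src)
    for src, targets in out_links.items()
    if "yours" in targets
}

print(votes_for_yours)
```

Here the link from "mine" passes on a whole vote, while the link from "hub" passes on only a quarter of one.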
VISION: Lily Serna at work talking to camera, lists of products online
Lily Serna: “Ranking anything, really, where do you start? Well, one of my jobs is to go through lists of thousands of products and try to identify which ones are more important than others.”
NARRATOR: A good place to start is to use the variables as signals of importance, like product price or how many times a product is viewed, then, like Google, use an algorithm to sort the products from most to least important.
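One way this kind of ranking might look in code is a weighted score over the signals. The products, numbers and weights below are invented for illustration; the video does not say how the signals are actually combined.

```python
# Hypothetical product data: price in dollars, number of views.
products = [
    {"name": "kettle", "price": 40.0, "views": 1200},
    {"name": "toaster", "price": 60.0, "views": 300},
    {"name": "blender", "price": 90.0, "views": 800},
]

# Assumed weights: views count for more than price.
WEIGHTS = {"price": 0.3, "views": 0.7}

def normalise(values):
    """Scale values so the largest becomes 1.0, making signals comparable."""
    top = max(values)
    return [v / top for v in values]

price_signal = normalise([p["price"] for p in products])
view_signal = normalise([p["views"] for p in products])

# Combine the signals into one importance score per product.
scores = {
    p["name"]: WEIGHTS["price"] * ps + WEIGHTS["views"] * vs
    for p, ps, vs in zip(products, price_signal, view_signal)
}

# Sort from most to least important.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With these made-up numbers the kettle ranks first because it is viewed most often, even though it is the cheapest item.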
VISION: hand clicking on mouse
NARRATOR: Google PageRank is a probability distribution used to represent the likelihood that a person randomly clicking on links will arrive at any particular page. A probability is expressed as a numeric value between 0 and 1.
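The "person randomly clicking on links" can be simulated directly. The sketch below uses a made-up web of six pages and assumes the surfer follows a link 85% of the time and otherwise jumps to a random page; that 85% figure is the commonly quoted value and is an assumption here, not something stated in the narration.

```python
import random

# Hypothetical six-page web: each page maps to the pages it links to.
LINKS = {
    "P1": ["P2", "P3"],
    "P2": ["P1"],
    "P3": ["P5"],
    "P4": ["P6"],
    "P5": ["P6"],
    "P6": ["P4"],
}

def simulate(steps=200_000, follow_prob=0.85, seed=0):
    """Estimate how often a random surfer lands on each page."""
    rng = random.Random(seed)
    pages = sorted(LINKS)
    visits = dict.fromkeys(pages, 0)
    page = rng.choice(pages)
    for _ in range(steps):
        visits[page] += 1
        if rng.random() < follow_prob:
            page = rng.choice(LINKS[page])   # click a random link on this page
        else:
            page = rng.choice(pages)         # jump to a random page instead
    # Turn visit counts into probabilities between 0 and 1.
    return {p: visits[p] / steps for p in pages}

estimate = simulate()
for page, prob in estimate.items():
    print(page, round(prob, 3))
```

The estimated probabilities add up to 1, and the page that attracts the most clicks ends up with the largest share.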
VISION: Diagram, hand clicking on mouse
NARRATOR: Converting this idea into a formula that can be calculated for each of the 14 billion web pages is an amazing achievement, and Google guarantees an accuracy of between 3 and 7 digits. Awesome.
VISION: Lily Serna and friends P1->P6 Red Symons pic; GFX #1…. P1->P6 CIRCLES and arrows
NARRATOR: But let's bring this down to manageable numbers for a second, using this small-scale set of 6 people to represent just 6 web pages.
As with any network, some might know each other (I know you, how do I know you? Do you work online? Yeah, you look familiar.) and so have a link,
some... don't (No, I don't know you, I'm sorry. Sorry, I don't think I know you. No, I don't know you.). Maybe there's just one that they all know of (Oh yeah, I think I know that guy. Yes, I know him. Everyone knows him. Doesn't everyone know Red?).
That one then is the most popular and so will get the highest PageRank.
In the small example shown here, you can see that P6 has the strongest links: P5 and P4 link directly to P6, and there are paths in the graph from P1 and P3 to P6. Even though we would guess that P6 has the highest PageRank, it is not at all clear what ranks the other pages would have. This is where the Google PageRank algorithm comes in.
VISION: GFX #2 P1-P6 animated
NARRATOR: The first step in calculating the Google PageRank of each page is to represent the graph in a table. The table shows, for example, that there are links from P1 to P2 and to P3. A neat way to record the results.
VISION: GFX #3 A = 0s and 1s
NARRATOR: The next step is to replace the table by the corresponding six by six matrix A.
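The step from table to matrix can be sketched as follows. The links P1 to P2, P1 to P3, P4 to P6 and P5 to P6 are taken from the narration; the remaining links are hypothetical, since the full table is only shown on screen. The sketch also assumes that row i, column j holds a 1 when page i links to page j.

```python
PAGES = ["P1", "P2", "P3", "P4", "P5", "P6"]

# Link table: source page -> pages it links to.
# P1->P2, P1->P3, P4->P6, P5->P6 come from the narration; the rest is assumed.
TABLE = {
    "P1": ["P2", "P3"],
    "P2": ["P1"],
    "P3": ["P5"],
    "P4": ["P6"],
    "P5": ["P6"],
    "P6": ["P4"],
}

# Six by six matrix A: A[i][j] is 1 if page i links to page j, otherwise 0.
A = [[1 if dst in TABLE[src] else 0 for dst in PAGES] for src in PAGES]

for name, row in zip(PAGES, A):
    print(name, row)
```

Reading down the last column shows which pages link to P6; reading across a row shows where that page links out to.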
VISION: GFX #4 G = 1/40 etc.
NARRATOR: After several manipulations we arrive at the Google page ranking matrix G.
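The "several manipulations" are not spelled out in the narration. A common textbook construction, sketched below as an assumption, divides each row of A by its number of outlinks, scales by a damping factor of 0.85, and adds (1 - 0.85)/6 = 1/40 to every entry, which would match the 1/40 visible in the graphic. The matrix A here uses the same partly hypothetical links as a stand-in for the on-screen example.

```python
from fractions import Fraction

# Hypothetical link matrix: A[i][j] = 1 if page i+1 links to page j+1.
A = [
    [0, 1, 1, 0, 0, 0],  # P1 -> P2, P3
    [1, 0, 0, 0, 0, 0],  # P2 -> P1 (assumed)
    [0, 0, 0, 0, 1, 0],  # P3 -> P5 (assumed)
    [0, 0, 0, 0, 0, 1],  # P4 -> P6
    [0, 0, 0, 0, 0, 1],  # P5 -> P6
    [0, 0, 0, 1, 0, 0],  # P6 -> P4 (assumed)
]

def google_matrix(A, d=Fraction(85, 100)):
    """Build G from A using damping factor d (0.85 is an assumed value)."""
    n = len(A)
    G = []
    for row in A:
        out = sum(row)
        if out == 0:
            # A page with no outlinks is treated as linking to every page.
            G.append([Fraction(1, n)] * n)
        else:
            G.append([d * Fraction(a, out) + (1 - d) / n for a in row])
    return G

def page_ranks(G, iterations=100):
    """Repeatedly multiply a rank vector by G until it settles."""
    n = len(G)
    Gf = [[float(x) for x in row] for row in G]
    r = [1.0 / n] * n
    for _ in range(iterations):
        r = [sum(r[i] * Gf[i][j] for i in range(n)) for j in range(n)]
    return r

G = google_matrix(A)
ranks = page_ranks(G)
print(G[0])
print([round(x, 4) for x in ranks])
```

Every row of G sums to 1, entries with no link are exactly 1/40, and with these assumed links P6 comes out on top with a rank of about 0.39.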
NARRATOR: Rather than a tiny six by six matrix as in our example, the Google matrix G is fourteen billion by fourteen billion. The average number of outlinks on a webpage is 10. This makes it computationally possible for Google to process the matrix multiplication, to solve the equations and yield a final list of page ranks! So cool.
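A sketch of why a small number of outlinks per page keeps the calculation feasible: each round of the multiplication only needs to visit the links that exist, not every entry of the matrix. The six-page links and the 0.85 damping factor are the same assumptions as before, used here for illustration only.

```python
# Hypothetical six-page web: source page -> pages it links to.
LINKS = {
    "P1": ["P2", "P3"],
    "P2": ["P1"],
    "P3": ["P5"],
    "P4": ["P6"],
    "P5": ["P6"],
    "P6": ["P4"],
}

def sparse_page_ranks(out_links, d=0.85, iterations=100):
    """Power iteration that walks over links instead of a full matrix."""
    pages = sorted(out_links)
    n = len(pages)
    rank = dict.fromkeys(pages, 1.0 / n)
    for _ in range(iterations):
        # Rank held by pages with no outlinks is shared among all pages.
        dangling = sum(rank[p] for p in pages if not out_links[p])
        new_rank = dict.fromkeys(pages, (1.0 - d) / n + d * dangling / n)
        for src, targets in out_links.items():
            if targets:
                share = d * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank

rank = sparse_page_ranks(LINKS)

links_visited = sum(len(t) for t in LINKS.values())  # work per round, sparse
matrix_entries = len(LINKS) ** 2                     # work per round, dense
print(links_visited, matrix_entries)
print({p: round(v, 4) for p, v in rank.items()})
```

For six pages the saving is 7 link visits against 36 matrix entries per round. With about 10 outlinks per page, the work grows roughly in step with the number of pages rather than with its square.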
VISION: to camera
Lily Serna: “It's easy enough finding out who knows who with just six people... But imagine doing this 14 billion times?”
VISION: Lily Serna superimposed onto image of the earth
Lily Serna: “That's twice the population of the world.”
Dave Day, GOOGLE software engineer SYNC: “The thing about Google is that everything we do, we do at a massive scale. So we are handling billions of pages and billions of emails, billions of images, you name it. There’s a lot of it, there’s a lot of data, and so the maths becomes critical at every single part of the product, because without it we'd be lost.”
VISION: Search for directions to Google.. find them.
NARRATOR: Oh great, thanks!