Sunday, September 03, 2006

probing search engine understandings

Paul Chandler suggests at his wiki that we can improve our teaching about search engines by asking students probing questions:
To what extent do you believe that the following are true statements about internet search engines (orientating questions):
  • They search for data on every page of the Internet
  • They locate key words by searching the world
  • They are just like a library
  • They search a certain part of the Internet for you
  • A search engine searches the pages which are connected to it
Write a few sentences to describe why a search engine might be like (the probe):
  • a library
  • a library catalogue
  • a librarian
  • a game of Chinese whispers
  • the index page to a book
  • someone who is a speed reader
I have added an entry there about the importance of speed and size in the search process.

How can Google search more than 25 billion web pages in less than a second? The speed of search is an important consideration. How do they achieve this? I think it's important to add the speed of the process to the probe. That would tend to sideline the Chinese whispers and speed reader options (?)

Here is an article by Matt Cutts, "How does Google collect and rank results?", which includes a couple of exercises for students.

Here is my summary of the process:

1. Spiders crawl the web and retrieve pages
2. Pages are numbered for future reference
3. An index is built
4. The index is distributed over hundreds or thousands of computers
5. When a search is made, it is processed by hundreds of computers at once to speed things up
6. Results are ranked by relevance
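
For students who can read a little code, steps 2, 3 and 6 can be sketched in a few lines of Python. This is only a toy under my own assumptions, not how Google actually does it: the three "pages" are made up, the function names build_index and search are hypothetical, and "relevance" here is nothing more than counting how often the query words appear.

```python
from collections import defaultdict

# Made-up stand-ins for pages a spider might have retrieved (step 1).
pages = [
    "search engines crawl the web",
    "a library catalogue lists every book",
    "the index of a book lists key words",
]

def build_index(pages):
    """Number the pages (step 2) and map each word to {page number: count} (step 3)."""
    index = defaultdict(dict)
    for page_number, text in enumerate(pages):
        for word in text.lower().split():
            index[word][page_number] = index[word].get(page_number, 0) + 1
    return index

def search(index, query):
    """Return the numbers of pages containing every query word, best match first."""
    words = query.lower().split()
    if not words:
        return []
    postings = [index.get(word, {}) for word in words]
    matches = set(postings[0])
    for posting in postings[1:]:
        matches &= set(posting)
    # Toy relevance (step 6): total occurrences of the query words on the page.
    def score(page_number):
        return sum(posting[page_number] for posting in postings)
    return sorted(matches, key=lambda n: (-score(n), n))

index = build_index(pages)
print(search(index, "book"))        # [1, 2]
print(search(index, "index book"))  # [2]
```

The point for the probe is that the search never reads the pages themselves at query time. It only looks words up in the index, which is why the "index page to a book" comparison holds up better than the "speed reader" one.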

One good metaphor from the Matt Cutts article: to speed up the process of searching a book index, you could rip out the index and give one page to each person to search.
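
The ripped-out index can be acted out in code as well. The sketch below is my own illustration of the metaphor, not a description of Google's system: the book index is invented, the names rip_into_pages, look_up and parallel_search are hypothetical, and threads on one computer stand in for the many separate computers.

```python
from concurrent.futures import ThreadPoolExecutor

# An invented book index: word -> page numbers where the word appears.
book_index = {
    "catalogue": [12],
    "crawler": [3, 27],
    "index": [5, 8, 30],
    "library": [12, 14],
    "ranking": [41],
    "spider": [3],
}

def rip_into_pages(index, n_people):
    """Deal the index entries out into n_people smaller indexes."""
    pages = [{} for _ in range(n_people)]
    for i, (word, refs) in enumerate(sorted(index.items())):
        pages[i % n_people][word] = refs
    return pages

def look_up(page, word):
    """One person checks only the page they were handed."""
    return page.get(word, [])

def parallel_search(pages, word):
    """Everyone searches their own page at the same time; answers are pooled."""
    with ThreadPoolExecutor(max_workers=len(pages)) as pool:
        answers = list(pool.map(look_up, pages, [word] * len(pages)))
    return sorted(ref for answer in answers for ref in answer)

index_pages = rip_into_pages(book_index, 3)
print(parallel_search(index_pages, "index"))  # [5, 8, 30]
```

Each person has less to look through, so the whole search finishes sooner, which is the idea behind steps 4 and 5 in my summary.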

