
It is very difficult to overestimate the societal impact of search engines. Google alone processes 13.7 billion searches per day and is far and away the leader of this industry, currently holding 89.74% of the market. Bing is a distant second with 4%, Russian search engine Yandex holds 2.49%, one-time internet darling Yahoo holds 1.33%, little but feisty DuckDuckGo holds 0.79%, and China’s Baidu holds 0.62%. Search data suggests that the average user conducts between 3 and 5 searches a day, but the digital natives known as Gen Z amp that up to more than 5 searches a day.
Search engines have had a profound impact on global culture by shaping how people access information, learn, and interact with the world. They facilitate global knowledge sharing, influence educational practices, and even affect political and economic landscapes by shaping public opinion and creating a more interconnected, globally aware populace. Given the central place of search in our digital culture, it behooves us to understand this most important of artifacts better. A very good way of doing that is to trace the evolution of search through time up to its current state. This we will do over a couple of posts.
Search history (pun intended), well… modern search history (humanity has been searching for archived information since Biblical times), can trace its beginnings to the invention of the digital computer in the 1940s and its rise in the following decades. As computers became pervasive, institutions started collecting large amounts of data. Being digital, this data was by definition eminently searchable. This led to an explosion in the field of Information Retrieval (IR), as the digital computer revolution encouraged computer scientists to research and develop algorithms for IR.
One researcher stood slightly taller than the rest: Harvard mathematician Gerard Salton, widely considered the father of digital search. He built what is considered the first digital search engine. It was known as Salton’s Magical Automatic Retriever of Text (SMART). Rather humble, you might say. His work drove much of the research in the field of IR. Unfortunately, the algorithms developed during this period were designed for relatively small, well-controlled searches on digital computers. They couldn’t handle wild searches on the internet or the web.
The first internet search engine was a system called Archie (both a play on the word “archive” and a nod to the famous comic book character), built in 1990 by a researcher named Alan Emtage at McGill University in Canada. Though very primitive by today’s standards, Archie contained all the essential components of a search engine as we know it today, which are:
- The Crawler
- The Index
- The Query Server and User-Interface
The crawler is a software program that sends requests to pages on the internet, asking for the information they contain. If a page contains hyperlinks, the crawler also sends requests to the pages those hyperlinks point to. The information returned to the crawler is stored in a database. This database is the index. It is from here that a search engine pulls information when you type a search query into Google. This means that when you use Google (or any other search engine), you are not searching the actual web. You are searching Google’s index of the web. It also means that search engines cannot all give you the same results, because their indexes differ in how comprehensive they are. Today’s search engines store the entire contents of a page or document in their index, but early search engines like Archie stored only the title.
Once information is in the index, it can be searched. This is done through a user interface (usually a textbox), where a person enters a query that is passed to the query server, which tries to match the query against the information stored in the index. Archie’s user interface was primitive, though. Rather than the visual textbox of today’s search engines, it had a command line at which you typed your search query, much like MS-DOS.
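To make the three components concrete, here is a minimal Python sketch of a crawler, an index, and a query function. It is only an illustration under stated assumptions, not how Archie or Google actually work: the `PAGES` dictionary is a made-up, in-memory stand-in for the web, and the names `crawl`, `build_index`, and `search` are hypothetical. The `full_text` flag mimics the difference between title-only indexing and indexing a whole document.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (title, body text, outgoing links).
# A real crawler would fetch pages over the network instead.
PAGES = {
    "site/a": ("Chess openings", "A guide to the Sicilian defence", ["site/b", "site/c"]),
    "site/b": ("Endgame notes", "Rook endings and the Sicilian pawn structure", ["site/a"]),
    "site/c": ("Gardening tips", "How to grow tomatoes", []),
    "site/d": ("Unlinked page", "Nobody links here", []),
}


def crawl(seed):
    """The crawler: fetch the seed page, then follow hyperlinks breadth-first."""
    seen, queue, fetched = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        if url not in PAGES:  # dead link: nothing to fetch
            continue
        title, body, links = PAGES[url]
        fetched[url] = (title, body)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


def build_index(fetched, full_text=True):
    """The index: map each lowercase word to the set of URLs containing it."""
    index = {}
    for url, (title, body) in fetched.items():
        text = f"{title} {body}" if full_text else title
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """The query server: return URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


fetched = crawl("site/a")
full_index = build_index(fetched)                     # whole documents
title_index = build_index(fetched, full_text=False)   # titles only

print(search(full_index, "sicilian"))   # ['site/a', 'site/b']
print(search(title_index, "sicilian"))  # []
print(search(full_index, "nobody"))     # []
```

Two things stand out in this toy. The word “sicilian” appears only in page bodies, so the title-only index cannot find it. And `site/d` is never found by any query, because no crawled page links to it: a search only ever sees what made it into the index, not the web itself.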
The creation of Archie would spawn an imitation three years later named Veronica (named after Archie’s on-and-off girlfriend in the comic book). It was built by students at the University of Nevada, USA. Archie and Veronica were similar in many respects but had major differences. The biggest was that they used different file sharing standards. Archie used a standard called File Transfer Protocol (FTP), while Veronica used one called Gopher. The difference between the two is that FTP only allowed you to connect to the machine that held the document you wanted. You would then have to search that machine for it. Gopher, on the other hand, allowed you to connect to the document directly.
Both Archie and Veronica suffered major limitations, though. First, they only indexed the title of a document, not its contents. That meant that to find a document, you had to have an idea of its title. Second, Archie and Veronica were, strictly speaking, internet search engines and not web search engines. The internet and the world wide web aren’t the same thing, even though the two terms have been used interchangeably since around 1991. The internet is a computer communications network that was built over two decades starting in the 1960s and was navigated using text-based commands. The world wide web is a visual content network built on top of the internet in 1991. The two have been growing as one network since then.
As the web took off (the number of websites exploded from 130 in 1993 to 600,000 in 1996), a slew of web-based search engines sought to replace Archie and Veronica. The earliest of these was the WWW Wanderer, built by Matthew Gray, a researcher at MIT. It would soon be eclipsed by WebCrawler, built in 1994 by Brian Pinkerton, who at the time was juggling an academic position at the University of Washington, USA, and a job at NeXT, a company run by Steve Jobs. WebCrawler was the first search engine to index an entire document. Pinkerton would later sell WebCrawler to America Online (AOL) for a million dollars.
In 1995, the first truly great search engine would be built. This was Alta Vista, built by French researcher Louis Monier, then an employee of the venerable American IT hardware giant Digital Equipment Corporation (DEC). On December 15th, 1995, when Alta Vista was first opened to the public, it garnered 300,000 visits. Within a year, it had served 4 billion search queries. At the time, that was almost as many queries as there were people on earth. It was, at the time, one of the most loved destinations on the web, competing with the likes of Yahoo and AOL.
Unfortunately, DEC had a hard time understanding the huge potential of Alta Vista. To be fair to them, the viable business models for search-based businesses that would crop up a mere 4 years later, when Google had become the king of search, just didn’t exist in 1996. Furthermore, DEC’s experience up to that point consisted mainly of selling hardware: minicomputers, to be exact, and microprocessors. So, perhaps somewhat predictably, they would try to use Alta Vista as a marketing tool for their hardware products.
DEC’s strategy would ultimately fail, and the company itself would soon be acquired by Compaq. Compaq would try unsuccessfully to take Alta Vista public before selling it off to a company called CMGI. CMGI would in turn sell Alta Vista to another company called Overture, which would eventually be acquired by Yahoo. Yahoo would try to revitalize Alta Vista, but its time of basking in the sun had passed.
Bibliography
- Battelle, John. 2005. The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture. New York: Portfolio.